/g/ - >vibe coding with Claude Opus 4.6 >tell it to writ - Technology

File: 1746652460698594.png (290 KB, 454x453)

Anonymous 03/28/26(Sat)13:18:17 No.108473507 Archived

>vibe coding with Claude Opus 4.6
>tell it to write tests after changing any code to make sure it still works
>it writes useless tests that check nonspecific output conditions like output !== null and just assumes there's no bug
>instruct it to write tests that do more complex things and print output as they go instead of making assumptions
>it finds a few failures
>it starts going "hmm, these tests are quite complicated, I think my testing is too ambitious, let's simplify" and tries editing the test instead of diagnosing the bug
This is with me babysitting the AI at every step. Can you imagine hands-off vibe coding? I have had to correct this behavior multiple times.
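
To make the complaint concrete, here is a minimal Python sketch of the gap between a nonspecific check (the `output !== null` style, written here as `is not None`) and a test that prints and pins the actual value. `apply_discount` is a hypothetical function invented for illustration, not anything from OP's project, and its bug is planted on purpose.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Hypothetical function under test: return the price after a percent discount.

    Deliberately buggy: it returns the discount amount, not the discounted price.
    """
    return price_cents * percent // 100


def weak_test() -> bool:
    """Nonspecific check: passes for any non-None output, so the bug slips through."""
    output = apply_discount(1000, 10)
    return output is not None


def specific_test() -> bool:
    """Specific check: prints the actual value and compares it to the expected one."""
    output = apply_discount(1000, 10)
    expected = 900
    print(f"apply_discount(1000, 10) -> {output}, expected {expected}")
    return output == expected


if __name__ == "__main__":
    print("weak test passed:", weak_test())
    print("specific test passed:", specific_test())
```

The weak test passes even though the function is wrong; the specific one fails and shows the value that needs diagnosing, which is the point where the test should be left alone and the function fixed.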

Anonymous 03/28/26(Sat)13:22:44 No.108473536

>>108473507
Automated testing has always been a scam except to squeeze out the last remaining 0.1% of bugs when it comes to critical infrastructure or at least something that will be used by millions of people and is more or less frozen in time.
For your average CRUD app that won't be used by more than a few hundred people, manual testing is the way to go.

Anonymous 03/28/26(Sat)13:29:02 No.108473574

>>108473536
You would think LLMs would be able to conduct more intelligent tests (i.e. just printing everything all the time and reading the output to trace through the call in your head)
But Claude fails at that, preferring to edit the test instead of fixing the bug. Not even indians do that shit.

Anonymous 03/28/26(Sat)13:53:43 No.108473716

>>108473507
Why aren’t you having your customers test for you? That’s what big tech does these days.

Anonymous 03/28/26(Sat)13:56:28 No.108473732

>>108473507
That's every ENTERPRISE quality test ever. The goal is always coverage no matter how nonsensical the case is.
>>108473716
Also this somehow.

Anonymous 03/28/26(Sat)13:57:38 No.108473742

>>108473507
i find codex better for tests than claude.
https://github.com/orlfman/FFmpeg-Converter-GTK/tree/main/tests
all of mine were made by codex

Anonymous 03/28/26(Sat)13:57:47 No.108473743

>>108473507
Skill issue.

Anonymous 03/28/26(Sat)16:05:53 No.108474373

>>108473507
>using anthrocuck trash after they got flooded by virtue signaling (orange) redditors and nerfed their models because they're compute choked
lol

Anonymous 03/28/26(Sat)16:36:26 No.108474527

One time I was asking jeetGPT to help me debug some problem I was having with a VNC connection that kept crashing

Sent it a snippet of log output and asked it to look through it and find the error message

It told me the error message. I then Ctrl+F'd for it and it didn't exist. I confronted it and it admitted it lied and thought it was easier to just write what it thought was the most likely answer than to actually do the work.

Anonymous 03/28/26(Sat)17:04:06 No.108474654

>>108474527
>if i make up shit maybe i won't lose my job
kek

Anonymous 03/28/26(Sat)17:15:15 No.108474706

File: 1755095177697807.jpg (157 KB, 596x699)

>>108473507
How retarded are you to not be able to make a fucking SOTA LLM write some unit tests lmao? Actually, don't answer that

Anonymous 03/28/26(Sat)17:18:52 No.108474725

>>108474706
>this is the sota
I thought the point of coding with AI was that you don't have to micromanage everything it implements, since otherwise you are just writing code yourself
Just 2 more weeks until AGI though amirite?

Anonymous 03/28/26(Sat)17:57:59 No.108474933

>>108474725
You're just retarded bro, I'm not going to spoonfeed you how to make it work. Go enroll in an AI bootcamp or something

Anonymous 03/28/26(Sat)18:00:06 No.108474940

>>108473507
>vibe coding
There's your problem.
Vibe coders should be b& forever from this board. You are not coders but customers.

Anonymous 03/28/26(Sat)18:15:06 No.108475059

>>108473507
>>108473536
>doing it wrong
>things go to shit
>surprise pikachu face
The right order is:
1. Write a GOOD specification (doc comments above functions/methods, etc.), including pre-/post-conditions, taking edge cases into account, etc.
2. Have the AI generate tests based on this, review / adjust as necessary
3. Have the AI generate code until the tests pass.

It sounds like you're having the AI write tests based on the implementation, which is a recipe for disaster even with human programmers.
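
As a rough illustration of that order, here is a minimal Python sketch. `clamp` and `run_spec_tests` are hypothetical names chosen for the example; the point is only that every test case can be read straight off the doc comment without looking at the function body.

```python
def clamp(value: int, low: int, high: int) -> int:
    """Clamp value into the closed range [low, high].

    Precondition:  low <= high, otherwise ValueError is raised.
    Postcondition: low <= result <= high, and result == value
                   when value is already in range.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(value, high))


def run_spec_tests() -> list[str]:
    """Tests derived from the docstring alone; returns a list of failure messages."""
    failures = []
    cases = [
        ((5, 0, 10), 5),    # in range: unchanged
        ((-3, 0, 10), 0),   # below range: pinned to low
        ((42, 0, 10), 10),  # above range: pinned to high
        ((7, 7, 7), 7),     # edge case: degenerate range
    ]
    for args, expected in cases:
        actual = clamp(*args)
        if actual != expected:
            failures.append(f"clamp{args} returned {actual}, expected {expected}")
    # Precondition violation must raise instead of returning garbage.
    try:
        clamp(1, 10, 0)
        failures.append("clamp(1, 10, 0) did not raise ValueError")
    except ValueError:
        pass
    return failures


if __name__ == "__main__":
    print(run_spec_tests() or "all spec tests passed")
```

In step 3 the function body would start out empty and be regenerated until `run_spec_tests()` returns an empty list, with the tests themselves treated as fixed.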

>manual testing is the way to go
That works if you only ever do one release of your application or the cost of defects is minimal.
Because if you ever have to implement change requests or bug fixes in a serious application, you'll quickly find yourself either:
- wasting time testing every single thing all over again every release,
- or skimping on quality by skipping tests.

Anonymous 03/28/26(Sat)18:35:22 No.108475186

just wait for next model saar
agii soon ssaar

Anonymous 03/28/26(Sat)18:42:10 No.108475226

>>108475059
Nobody writes LE SPEC and then writes the code.
Modern software development involves very complex systems, which require experimenting with prototypes and designing as you develop and explore the solution space.
You can LARP as Dijkstra but that doesn't change anything.
Automated testing (both unit and integration) is only necessary if for whatever reason you MUST get rid of the rare edge cases remaining after your code is already mature.
>That works if you only ever do one release of your application or the cost of defects is minimal.
On the contrary, if you change your app you will need to rewrite the tests anyway which ends up taking more time than manual testing for the same level of defect detection.

Anonymous 03/28/26(Sat)18:47:26 No.108475269

>>108475226
No one writes LE CODE first either. You shit out the Frontend/Figma to show you have an idea of a product, then you cobble together some code for your series A

Anonymous 03/28/26(Sat)19:02:37 No.108475362

>>108473536
really bad take, educate yourself brother

Anonymous 03/28/26(Sat)19:53:12 No.108475665

>>108475362
he's a vibeshitter what do you expect
>>108473536
new vibeshitter take just dropped
>testing is a scam
no wonder you project unemployment onto everyone else

Anonymous 03/28/26(Sat)20:02:14 No.108475694

They made Opus retarded last night.

Anonymous 03/28/26(Sat)23:36:03 No.108476612

>>108475694
>last night
kek

Anonymous 03/29/26(Sun)00:54:30 No.108476895

>>108473507
this is a common problem with TDD: you write a test to see if the DB or external API returned a valid response, but you mock the DB/service, so you're just testing whether the function is getting the mock data, right? so you're just testing whether a function with arguments works in whatever language you're using... completely useless.

don't ask it to add tests for everything, you're being too literal. test the critical path and test both the happy path (ie correct data == correct result) and errors (ie incorrect data == expected error/exception)
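
A minimal Python sketch of that happy-path / error-path split, assuming a hypothetical `parse_port` function standing in for whatever sits on the critical path:

```python
def parse_port(text: str) -> int:
    """Hypothetical critical-path function: parse a TCP port number from text."""
    port = int(text)  # int() raises ValueError on non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


def check_happy_path() -> bool:
    """Correct data must give the correct result."""
    return parse_port("8080") == 8080 and parse_port("1") == 1


def check_error_path() -> bool:
    """Incorrect data must give the expected exception, not a silent value."""
    for bad in ("0", "70000", "http"):
        try:
            parse_port(bad)
        except ValueError:
            continue
        return False  # no exception was raised for bad input
    return True


if __name__ == "__main__":
    print("happy path ok:", check_happy_path())
    print("error path ok:", check_error_path())
```

Nothing is mocked here, so both checks exercise real logic rather than confirming that arguments get passed through.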

Anonymous 03/29/26(Sun)01:05:51 No.108476928

>>108473507
>tell it to write tests after
Proper TDD would do it before. You write tests that you expect to fail until it implements something.

Anonymous 03/29/26(Sun)02:29:52 No.108477185

>why isn't AI a pure magic that can read my mind and write code flawlessly?

Well yes it does need guidance. It still boosts your productivity by a fucking lot.

Anonymous 03/29/26(Sun)02:48:33 No.108477229

>>108476895
>you write a test to see if the DB or external API returned a valid response
Why the fuck would you write such a test? You write a test to see how YOUR function behaves in a case of a valid response. And then you write another test to see what happens with an invalid response.
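
In Python that could look roughly like the sketch below. `get_username` and the three stub responses are hypothetical, and the response shape (`status`, `body`, `name`) is an assumption made up for the example. The stubs only supply inputs; the assertions are about what the function does with them.

```python
from typing import Callable, Optional


def get_username(fetch: Callable[[int], dict], user_id: int) -> Optional[str]:
    """Hypothetical function under test: pull a username out of a service response.

    Returns None when the status is not 200 or the name is missing or not a string.
    """
    response = fetch(user_id)
    if response.get("status") != 200:
        return None
    name = response.get("body", {}).get("name")
    return name if isinstance(name, str) and name else None


# Stubs standing in for the DB or external API. They are inputs, not the thing tested.
def valid_response(user_id: int) -> dict:
    return {"status": 200, "body": {"name": "anon"}}


def failed_response(user_id: int) -> dict:
    return {"status": 500, "body": {}}


def malformed_response(user_id: int) -> dict:
    return {"status": 200, "body": {"name": 42}}


if __name__ == "__main__":
    print(get_username(valid_response, 1))
    print(get_username(failed_response, 1))
    print(get_username(malformed_response, 1))
```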

Anonymous 03/29/26(Sun)02:49:36 No.108477234

>>108475665
What do you mean? I'm gainfully employed.

>>108475362
Implying your opinion is a fact to be learned doesn't make you more correct.

Anonymous 03/29/26(Sun)02:52:23 No.108477245

>>108475269
Yes, because design by committee works so well, right?

Anonymous 03/29/26(Sun)03:00:22 No.108477269

>>108477185
>why do you expect the product to function as advertised
Truly a mystery.
>It still boosts your productivity by a fucking lot
It's the exact opposite of what the data suggests.

Anonymous 03/29/26(Sun)03:46:05 No.108477404

>>108473536
Either you are clueless AF or you are trolling outside of /b/.
Automated tests shield against unexpected behavior changes and regressions. Only trash tier companies don't have an automated build process with automated testing.

Anonymous 03/29/26(Sun)10:44:11 No.108479149

>>108475226
This works with people because people sometimes have common sense. AI is good at some things, but it has no common sense, and once it's going it moves much faster than humans.
Also you don't necessarily have to spec out everything, but at least the module you are working on. In my experience this is the single most important thing with AI.

Anonymous 03/29/26(Sun)10:51:16 No.108479185

>sign up for claude code
>it becomes shit a week after

I was having so much fun working on my hobby projects

Anonymous 03/29/26(Sun)10:54:08 No.108479202

>>108473536
>manual testing is the way to go.
Not always feasible

Best approach is to test manually first, identify what testing can be automated, and focus on automating that

Anonymous 03/29/26(Sun)10:55:50 No.108479211

Is anyone here actually 100% coding without AI?

Anonymous 03/29/26(Sun)10:58:22 No.108479226

File: 1764821244685536.jpg (43 KB, 411x418)

>>108479211
You lost, gramps? This isn't the retirement home

Anonymous 03/29/26(Sun)10:59:46 No.108479234

>>108479226
Read what I wrote again

Anonymous 03/29/26(Sun)11:09:06 No.108479288

>>108474527
damn anthropic was right, it really is replacing juniors