/g/ - Technology


Thread archived.
You cannot reply anymore.




File: 1746652460698594.png (290 KB, 454x453)
>vibe coding with Claude Opus 4.6
>tell it to write tests after changing any code to make sure it still works
>it writes useless tests that check nonspecific output conditions like output !== null and just assumes there's no bug
>instruct it to write tests that do more complex things and print output as they go instead of making assumptions
>it finds a few failures
>it starts going "hmm, these tests are quite complicated, I think my testing is too ambitious, let's simplify" and tries editing the test instead of diagnosing the bug
This is with me babysitting the AI at every step, can you imagine hands off vibe coding? I have had to correct this behavior multiple times.
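For anyone who wants the difference spelled out, here is a minimal TypeScript sketch. `slugify` is a made-up function with a bug planted on purpose, and the helper names are hypothetical; it only assumes Node's built-in `assert` module. The weak test is the `output !== null` kind, the specific one prints what it got and pins the exact expected value.

```typescript
import { strict as assert } from "node:assert";

// Hypothetical function under test, with a deliberate bug:
// each space is replaced separately, so a run of spaces becomes a run of dashes.
function slugify(title: string): string {
  return title.trim().toLowerCase().replace(/ /g, "-");
}

// Weak test: passes for almost any implementation, buggy or not.
function weakTest(): void {
  const output = slugify("Hello   World");
  assert.ok(output !== null);
}

// Specific test: prints the actual value, then pins the exact expected result.
function specificTest(): void {
  const output = slugify("Hello   World");
  console.log("slugify output:", JSON.stringify(output));
  assert.equal(output, "hello-world");
}

// Runs a test and reports whether it passed instead of crashing the script.
function passes(test: () => void): boolean {
  try {
    test();
    return true;
  } catch (err) {
    console.log("FAILED:", (err as Error).message);
    return false;
  }
}

console.log("weak test passes:", passes(weakTest));
console.log("specific test passes:", passes(specificTest));
```

The weak test goes green even though the function is wrong; the specific one fails and shows the bad value, which is the point where the bug should get diagnosed instead of the test getting "simplified".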
>>
>>108473507
Automated testing has always been a scam except to squeeze out the last remaining 0.1% of bugs when it comes to critical infrastructure or at least something that will be used by millions of people and is more or less frozen in time.
For your average CRUD app that won't be used by more than a few hundred people, manual testing is the way to go.
>>
>>108473536
You would think LLMs would be able to conduct more intelligent tests (i.e. just printing everything all the time and reading the output to trace through the call in your head)
But Claude fails at that, preferring to edit the test instead of fixing the bug. Not even indians do that shit.
>>
>>108473507
Why aren’t you having your customers test for you? That’s what big tech does these days.
>>
>>108473507
That's every ENTERPRISE quality test ever. The goal is always coverage no matter how nonsensical the case is.
>>108473716
Also this somehow.
>>
>>108473507
i find codex better for tests than claude.
https://github.com/orlfman/FFmpeg-Converter-GTK/tree/main/tests
all of mine were made by codex
>>
>>108473507
Skill issue.
>>
>>108473507
>using anthrocuck trash after they got flooded by virtue signaling (orange) redditors and nerfed their models because they're compute choked
lol
>>
One time I was asking jeetGPT to help me debug some problem I was having with a VNC connection that kept crashing

Sent it a snippet of log output and asked it to look through it and find the error message

It told me the error message. I then ctrl+f'd for it and it didn't exist. I confronted it and it admitted it lied: it thought it was easier to just write what it guessed was the most likely answer than to actually do the work.
>>
>>108474527
>if i make up shit maybe i won't lose my job
kek
>>
File: 1755095177697807.jpg (157 KB, 596x699)
>>108473507
How retarded are you to not be able to make a fucking SOTA LLM write some unit tests lmao? Actually, don't answer that
>>
>>108474706
>this is the sota
I thought the point of coding with AI was that you don't have to micromanage everything it implements, since otherwise you are just writing code yourself
Just 2 more weeks until AGI though amirite?
>>
>>108474725
You're just retarded bro, I'm not going to spoonfeed you how to make it work. Go enroll in a AI bootcamp or something
>>
>>108473507
>vibe coding
There's your problem.
Vibe coders should be b& forever from this board. You are not coders but customers.
>>
>>108473507
>>108473536
>doing it wrong
>things go to shit
>surprise pikachu face
The right order is:
1. Write a GOOD specification (doc comments above functions/methods, etc.), including pre-/post-conditions, taking edge cases into account, etc.
2. Have the AI generate tests based on this, review / adjust as necessary
3. Have the AI generate code until the tests pass.

It sounds like you're having the AI write tests based on the implementation, which is a recipe for disaster even with human programmers.
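A rough TypeScript sketch of that order, assuming Node's built-in `assert` module. `clamp` and its contract are a hypothetical example, not anything from a real codebase; the point is only that the tests are read off the doc comment, not off the function body.

```typescript
import { strict as assert } from "node:assert";

/**
 * Step 1: the specification (hypothetical example).
 * Precondition: min <= max, otherwise a RangeError is thrown.
 * Postcondition: the result r satisfies min <= r <= max,
 * and r === value whenever value is already inside [min, max].
 */
function clamp(value: number, min: number, max: number): number {
  // Step 3: the implementation, written last, until the tests below pass.
  if (min > max) {
    throw new RangeError("min must not exceed max");
  }
  return Math.min(Math.max(value, min), max);
}

// Step 2: tests derived from the doc comment above.
function runSpecTests(): void {
  assert.equal(clamp(5, 0, 10), 5);   // inside the range: unchanged
  assert.equal(clamp(-3, 0, 10), 0);  // below the range: lower bound
  assert.equal(clamp(42, 0, 10), 10); // above the range: upper bound
  assert.equal(clamp(7, 7, 7), 7);    // edge case: zero-width range
  assert.throws(() => clamp(1, 10, 0), RangeError); // precondition violated
}

runSpecTests();
console.log("all spec tests passed");
```

If a test fails here, the spec decides who is wrong, so there is no excuse for loosening the test to match the code.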

>manual testing is the way to go
That works if you only ever do one release of your application or the cost of defects is minimal.
Because if you ever have to implement change requests or bug fixes in a serious application, you'll quickly find yourself either:
- wasting time testing every single thing all over again every release,
- or skimping on quality by skipping tests.
>>
just wait for next model saar
agii soon ssaar
>>
>>108475059
Nobody writes LE SPEC and then writes the code.
Modern software development involves very complex systems, which require experimenting with prototypes and designing as you develop and explore the solution space.
You can LARP as Dijkstra but that doesn't change anything.
Automated testing (both unit and integration) is only necessary if for whatever reason you MUST get rid of the rare edge cases that remain after your code is already mature.
>That works if you only ever do one release of your application or the cost of defects is minimal.
On the contrary, if you change your app you will need to rewrite the tests anyway, which ends up taking more time than manual testing for the same level of defect detection.
>>
>>108475226
No one writes LE CODE first either. You shit out the Frontend/Figma to show you have an idea of a product, then you cobble together some code for your series A
>>
>>108473536
really bad take, educate yourself brother
>>
>>108475362
he's a vibeshitter what do you expect
>>108473536
new vibeshitter take just dropped
>testing is a scam
no wonder you project unemployment onto everyone else
>>
They made Opus retarded last night.
>>
>>108475694
>last night
kek
>>
>>108473507
this is a common problem with TDD: you write a test to see if the DB or external API returned a valid response, but you mock the db/service, so you're just testing whether the function is getting the mock data, right? so you're just testing whether a function with arguments works in whatever language you're using... completely useless.

don't ask it to add tests for everything, you're being too literal. test the critical path and test both the happy path (ie correct data == correct result) and errors (ie incorrect data == expected error/exception)
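
a minimal sketch of what that looks like in TypeScript, assuming Node's built-in `assert` module. `parsePort` is a made-up stand-in for whatever sits on your critical path:

```typescript
import { strict as assert } from "node:assert";

// Hypothetical critical-path function: parse a TCP port from user input.
function parsePort(input: string): number {
  const port = Number(input);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError(`invalid port: ${input}`);
  }
  return port;
}

// Happy path: correct data gives the correct result.
assert.equal(parsePort("8080"), 8080);

// Error path: incorrect data gives the expected exception.
assert.throws(() => parsePort("http"), RangeError);  // not a number
assert.throws(() => parsePort("70000"), RangeError); // out of range
assert.throws(() => parsePort(""), RangeError);      // empty input parses to 0

console.log("happy path and error path both covered");
```

no mocks involved, so every assertion exercises real logic instead of checking that a stub returned what you told it to return.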
>>
>>108473507
>tell it to write tests after
Proper TDD would do it before. You write tests that you expect to fail until it implements something.
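A small TypeScript sketch of that red/green cycle, assuming Node's built-in `assert` module. All names are hypothetical; the placeholder stands in for "the code doesn't exist yet".

```typescript
import { strict as assert } from "node:assert";

type Add = (a: number, b: number) => number;

// The test is written first and takes the implementation as a parameter.
function testAdd(add: Add): void {
  assert.equal(add(2, 3), 5);
  assert.equal(add(-1, 1), 0);
}

// Red: nothing is implemented yet, so the test is expected to fail.
const notImplemented: Add = () => {
  throw new Error("not implemented");
};

// Green: the implementation is written afterwards to satisfy the test.
const add: Add = (a, b) => a + b;

// Reports pass/fail instead of letting the failure crash the script.
function passes(impl: Add): boolean {
  try {
    testAdd(impl);
    return true;
  } catch {
    return false;
  }
}

console.log("before implementation:", passes(notImplemented));
console.log("after implementation:", passes(add));
```

Seeing the test fail first is what tells you it can fail at all, which is exactly what an `output !== null` test never demonstrates.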
>>
>why isn't AI a pure magic that can read my mind and write code flawlessly?

Well yes it does need guidance. It still boosts your productivity by a fucking lot.
>>
>>108476895
>you write a test to see if the DB or external API returned a valid response
Why the fuck would you write such a test? You write a test to see how YOUR function behaves in the case of a valid response. And then you write another test to see what happens with an invalid response.
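Something like this TypeScript sketch, assuming Node's built-in `assert` module. `ApiResponse`, `Fetcher` and `getUserName` are hypothetical names; the stubs only supply inputs, and every assertion is about what our own function does with them.

```typescript
import { strict as assert } from "node:assert";

// Hypothetical shape of what the external API hands back.
interface ApiResponse {
  status: number;
  body: string;
}
type Fetcher = (userId: number) => ApiResponse;

// The function under test: OUR logic for interpreting the response.
function getUserName(fetchUser: Fetcher, userId: number): string {
  const res = fetchUser(userId);
  if (res.status !== 200) {
    throw new Error(`lookup failed with status ${res.status}`);
  }
  let parsed: unknown;
  try {
    parsed = JSON.parse(res.body);
  } catch {
    throw new Error("malformed response body");
  }
  const name = (parsed as { name?: unknown } | null)?.name;
  if (typeof name !== "string" || name.length === 0) {
    throw new Error("response has no name");
  }
  return name;
}

// Stubs standing in for the external service.
const validResponse: Fetcher = () => ({ status: 200, body: '{"name":"anon"}' });
const serverError: Fetcher = () => ({ status: 500, body: "" });
const garbage: Fetcher = () => ({ status: 200, body: "<html>oops</html>" });
const missingField: Fetcher = () => ({ status: 200, body: "{}" });

// Valid response: the name is extracted.
assert.equal(getUserName(validResponse, 1), "anon");

// Invalid responses: each one produces a specific, expected error.
assert.throws(() => getUserName(serverError, 1), /status 500/);
assert.throws(() => getUserName(garbage, 1), /malformed/);
assert.throws(() => getUserName(missingField, 1), /no name/);

console.log("valid and invalid responses handled as expected");
```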
>>
>>108475665
What do you mean? I'm gainfully employed.

>>108475362
Implying your opinion is a fact to be learned doesn't make you more correct.
>>
>>108475269
Yes, because design by committee works so well, right?
>>
>>108477185
>why do you expect the product to function as advertised
Truly a mystery.
>It still boosts your productivity by a fucking lot
It's the exact opposite of what the data suggests.
>>
>>108473536
Either you are clueless AF or you are trolling outside of /b/.
Automated tests shield against unexpected behavior changes and regressions. Only trash tier companies don't have an automated build process with automated testing.
>>
>>108475226
This works with people because people sometimes have common sense. AI is good at some things, but it has no common sense, and once it's going it moves much faster than humans.
Also you don't necessarily have to spec out everything, but at least the module you are working on. In my experience this is the single most important thing with AI.
>>
>sign up for claude code
>it becomes shit a week after

I was having so much fun working on my hobby projects
>>
>>108473536
>manual testing is the way to go.
Not always feasible

Best approach is to test manually first, identify which tests can be automated, and focus on automating those
>>
Is anyone here actually 100% coding without AI?
>>
File: 1764821244685536.jpg (43 KB, 411x418)
>>108479211
You're lost gramps? This isn't the retirement home
>>
>>108479226
Read what I wrote again
>>
>>108474527
damn anthropic was right, it really is replacing juniors


