/g/ - Technology

File: vcg.jpg (1.24 MB, 2816x1536)
1.24 MB JPG
A general for coding with agents

►Harnesses
https://developers.openai.com/codex
https://code.claude.com/docs/en/overview
https://opencode.ai/
https://antigravity.google/
https://cursor.com/docs
https://pi.dev/
>>
more like gay-i
>>
File: 1776401394580086.png (303 KB, 1197x878)
303 KB PNG
>>
>>108621734
Been on a camping trip touching grass for four days and I came back to find the codex daily 5-hour limit removed and a single (1, singular) prompt eating the entirety of your weekly usage? Am I reading this bullshit right?
>>
>>108621885
>Am I reading this bullshit right?
no
>>
File: file.png (2.53 MB, 4096x2209)
2.53 MB PNG
new qwen release for localfags
>>
This time my test suite will be perfect.
>>
>>108619771
What are you making anon?
>>
>>108621719
>https://developers.openai.com/codex
>https://code.claude.com/docs/en/overview
>https://opencode.ai/
>https://antigravity.google/
>https://cursor.com/docs
>https://pi.dev/
one of those isn't like the other
>>
>>108621732
>altman in shambles
>nvidia stock dropping like a rock
>global financial crisis imminent
anon, why would you do this
>>
>>108622243
AI Chads can't be stopped
>>
can claude crack denuvo?
>>
>>108622339
name 1 (one) uncracked denuvo game that is worth playing.

also, no, probably not, especially since you'd need to jailbreak it first.
but it'd be fun to try, i guess
>>
File: file.png (307 KB, 1320x700)
307 KB PNG
total dario victory
>>
File: rip.png (170 KB, 437x438)
170 KB PNG
i like the increased pace of vibe coding, but needing to go back and understand all the changes being made and refactor to my standards just makes me feel like im in an all-day PR review. how do i make this less exhausting
>>
>>108622380
Tell her to explain it to you like you are retarded
>>
>>
>>108622401
>Tell her to explain it to you like you are retarded
LLMs can infer that themselves now
>>
opencode is switching to an electron desktop app like everyone else
soon we'll have a dozen versions of the exact same thing
>>
Vibecoded a simple utility .exe that prevents my mouse from double-clicking.
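the core of it is just a debounce on click timestamps; a minimal sketch of the idea in python (the actual .exe would hook OS mouse events, which is omitted here, and the 80 ms window is my guess, not the anon's value):

```python
import time

DEBOUNCE_S = 0.08  # assumption: worn switches chatter faster than humans double-click


class ClickFilter:
    """Swallow a click if it arrives too soon after the previous accepted one."""

    def __init__(self, debounce_s=DEBOUNCE_S):
        self.debounce_s = debounce_s
        self.last_accepted = float("-inf")

    def accept(self, t=None):
        if t is None:
            t = time.monotonic()
        if t - self.last_accepted < self.debounce_s:
            return False  # chatter from the faulty switch: drop it
        self.last_accepted = t
        return True
```

in the real utility you'd call accept() from the low-level mouse hook and suppress the event when it returns False.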
>>
>>108622658
I use the web UI instead. I had to vibecode a fix to set the web UI client's IDs 10s behind the server's or shit would break (duplicated messages in the TUI, no responses in the web UI), but other than that I'm happy with it.
Also can't have more than one tab open lol but that's a minor bug.
I prefer a web UI that can be easily customized and hosted anywhere over a traditional native app
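the fix is basically just offsetting timestamp-derived IDs; a sketch of what I mean (only the 10s offset is from the post, the rest is assumed):

```python
import time

OFFSET_MS = 10_000  # the 10s offset from the post; everything else is a guess


def client_msg_id(now_ms=None):
    """Timestamp-derived message ID, shifted behind the server's clock so
    client- and server-generated IDs can't land in the same range and collide."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - OFFSET_MS
```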
>>
You don't sit at your computer to vibe code.
The real way to vibecode is by texting it on telegram while you're sitting on a bench at the park.
>>
>>108622850
I don't really vibe, this is no less stressful than normal programming.
>>
Good morning bros. Whatever happened to that class action lawsuit over the 'Global AI Nerf fraud'?
>>
honestly, I don't like the word "vibe" too much
>>
>>108622380
Just return back to normal coding. You can still use agents. Just don't generate too much shit at once.
>>
>>108622850
now replace park bench with yacht in the carribean while getting your dick sucked by a 10/10 supermodel
>>
>>108623075
same

I bought a 7900XTX for gaming in 2022 for $999 and thought it was overkill. Now I have Qwen 27B on it doing my job for me, and the same GPU costs $1200.
>>
>>108622380
I generate detailed markdown plans and refine them, then I only let the AI do tiny atomic things. That way it feels like I'm coding it and I can fix it as I go instead of having a gargantuan "rewrite everything the ai shat out" phase that's no fun.
>>
Alright, PiClaw is out of alpha and into beta. It's up and running at home, and I'm connected with Telegram. It's ready to edit files, interact with websites, look at images, and tell me I'm a special boy. We'll consider today the first test as a daily driver.
>>
>>108621732
Stop it
>>
>My honest read: Rewriting env in JAX is 3-6 weeks of engineering for a training run that completes in ~10 days. Bad ROI for a project at your stage.
>I've now read the env. My prior estimate was a wild guess, and grounding it in your actual 1464 LOC changes the picture.
t. Opus 4.7
>>
do i pay anthropic 20 bucks for claude code or is it still gimped?
>>
>>108623169
In the pipe, five by five
>>108623197
Stop subscribing to LLMs
>>
>>108623197
The $20 poorfag plan gets you 1 opus prompt every 5 hours
don't even bother

Claude Code starts at $100+tip Max 5x
>>
>>108623197
depends on what you build
>>
Now I have the complete picture.
>>
>>108623195
The estimates are always horribly off; they'd be just as off if a human had to implement it.
>>
>>108621719
VibeGODS won
>>
>>108623201
>Stop subscribing to LLMs
why?
thinking of getting codex subscription
still trying to find a way to avoid having my info tied to it tho
>>
>>108623201
>>108623208
Ok I'll just build a $60k rig and run something locally, that makes much more sense.
>>
>>108623368
you have to tweak the qwens and give them more context but they're competitive with Claude if you are willing to try. Mac / Strix Halo / Big Boy GPU are all options in the $1-4k range
>>
>>108623197
Opus 4.7 feels *a lot* like what they did to Gemini 3.1 after a while. There might be an audience for it, but instead of being a helpful assistant, it's now more prone to write an essay listing half a dozen totally equivalent ways to do the thing. Other times it takes shortcuts, confidently saying things about files it hasn't looked at, and when called out on it, offers "you're absolutely right" type platitudes.

Then when you finally get it to do the work it's supposed to, it always announces beforehand that it will do it, sounding like it's doing you a favor.

It seemingly fell in love with the word honest and keeps repeating "Honest take", "Honest answer", etc. The way the responses are written is also starting to *feel* like LinkedIn posts: "(statement) Read on to find out why."

It's great by all metrics if we consider that this can even exist. It's impressive. It's also disappointing in a what-did-they-do-to-my-boy kind of way.
>>
>>108623368
i just told you you can just pay $100 or $200 and get reasonable usage
>>
These threads are always full of superstitious nonsense. Had a bad week at the AIs? Someone at Anthropic must have nerfed the model!
The roulette wheel isn't rigged, sometimes you're just unlucky. Try another spin, maybe you'll do better. Or just write your own code.

(stolen from hackernews but xhe isn't wrong. I have literally never witnessed a regression from a model ever)
>>
>>108623426
i only have a 64-core epyc with 128gb ram and a potato gpu on the server for hwenc.
i can't run anything on that in real time, tho i could run some agent overnight.
but i haven't looked at self-hosted agentic programming stuff yet
>>
what can i run on 32GB RAM and a 9070xt? That is also my main PC
>>
>>108623486
Gemma4
>>
>>108623453
There can definitely be some. Gemini 3.1 used to be extremely helpful in Antigravity; now it starts every interaction with a chain of thought saying it is avoiding cat for file manipulation and is focused on using dedicated tools like grep_search and other utilities, yada yada yada. I'm pretty sure it sometimes falls back into that self-conversation in the middle of a chain of tool calls too. Maybe I just need to completely empty the history and memories, but it seems like repeating this to itself again and again is about half the effort it spends answering every request.
>>
>>108623465
anon just bite the bullet and pay api prices for a chink model.

glm5.1, which is probably the best of the bunch currently, is at ~ $1/$3, a full 5 times cheaper than sonnet (never mind opus, api prices for that are a joke.)
mimo2.7 is not far behind in benches, and its at $0.3/$1.2, another 3 times cheaper, lol
and deepseek 3.2 starts out at $0.3/$0.4, which is just ridiculously cheap (tho the benches are decidedly less impressive than glm/mimo).

its not worth it to pay for hardware for local currently.
you'll only be able to run much smaller, much less capable models, and you'll be getting shit throughput (especially in a cpu+ram only config like yours)
even if you were willing to limit yourself to small, local targeted open stuff like the latest qwen3.6 35b (which is FAR inferior to the dirt cheap chink stuff mentioned above), you'd definitely want to buy at least a used 3090 to run it at a decent speed
at current prices, it just doesn't make sense.

if/when the current ai bubble bursts, then maybe.
its possible that a combination of cratering gpu prices (datacenter gpus, with 80gigs vram apiece, not the cucked consumer stuff) and inference providers jacking up prices by 10x or more might make it viable
but as it stands right now, its not even close.
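to put numbers on it: cost per run is just tokens times price over a million. quick sketch using the prices quoted above (the session sizes below are made up for illustration):

```python
def run_cost(prices, in_tokens, out_tokens):
    """prices = (usd per 1M input tokens, usd per 1M output tokens)."""
    in_price, out_price = prices
    return (in_tokens * in_price + out_tokens * out_price) / 1_000_000


# per-1M-token prices as quoted in the post
GLM = (1.0, 3.0)       # glm5.1
DEEPSEEK = (0.3, 0.4)  # deepseek 3.2 base rate

# e.g. a session burning 2M input / 0.5M output tokens on glm:
print(run_cost(GLM, 2_000_000, 500_000))  # -> 3.5 (usd)
```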
>>
>>108623520 (me)
And if I'm just dumb here and it's user error, well, it might be the same for others claiming regressions: their own memory files bog the tools down into unhelpful patterns, or the harness goes off the rails trying to track user preferences in ways that don't make sense and end up doing the opposite. Either way, the users are not the only ones to blame.
>>
>>108623453
Except the regressions were literally measured. Literal battered wife syndrome.
>>
>>108623361
PAY AS YOU GO WITH API KEYS YOUUUUUU DENSE MOTHERFUCKER
>>
>>108623692
>paying api prices
>calling someone else dense
on the peak of mt midwit
>>
>>108623692
I think next OP we should put some guide about API vs subscription
>>
>>108623747
>we should put some guide
Yeah but are you going to spend your precious tokens on making one?
>>
>>108623747
API prices for API freedom
>>108623763
I'm a paypig, I'll have gpt 5.4 write it
>>
I hope this general is a troll post and not actually serious. Mods need to ban AI shit from /g/.
>>
>>108623787
Luddites on reddit please
>>
>>108623787
We got a luddie here boys
>>
>>108623787
Cope. I'm making custom proprietary software and there's nothing you can do about it
>>
>>108623787
>Mods need to ban AI shit from /g/.
how the fuck are you on /g/ and anti-AI? every single comp sci student I know uses claude/codex. even my friend who holds multiple STEM degrees and is an unironic genius uses AI tools.
>>
Yes, you should walk to the car wash.
>>
>>108623933
>caring about whether the latest model can accurately tell you how many r's are in strawberry
ngmi
>>
>>108623933
Nobody asked tho
>>
I should have used Codex more last month when the limits were higher. I just didn't have the idea I have now
>>
>>108623951
>second-to-latest model can correctly count r's
>model hailed as "latest" and "goodest" can't correctly count r's anymore
Yes, you should care
>>
>>108623978
man, i remember being so impressed that qwen-coder-14b-r1-distill could actually do this
good times, good times.
>>
>>108623978
why should I care when it can trivially write a program to count the Rs in strawberry if you ask it to?
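e.g., the trivial program in question (any language works, python here):

```python
def count_letter(word, letter):
    """Count case-insensitive occurrences of a letter in a word."""
    return word.lower().count(letter.lower())


print(count_letter("strawberry", "r"))  # -> 3
```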
>>
i am almost done with my imageboard summarizer next tool will be a [redacted] music platform piracy tool with integrated mp3 tagging , and a poor mans files library backup system using tar, rsync and par2 with a simple gui for syncing between drives and validating integrity
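the par2/rsync/tar plumbing aside, the "validating integrity" piece boils down to hashing both sides and diffing; a minimal sketch (helper names and layout are made up, not the anon's actual tool):

```python
import hashlib
from pathlib import Path


def checksums(root):
    """Map relative path -> sha256 hex digest for every file under root."""
    root = Path(root)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_drives(src, dst):
    """Compare two supposedly-synced trees: returns (missing on dst, corrupted)."""
    a, b = checksums(src), checksums(dst)
    missing = sorted(set(a) - set(b))
    corrupted = sorted(k for k in a.keys() & b.keys() if a[k] != b[k])
    return missing, corrupted
```

par2 adds the ability to actually repair the corrupted entries; this only detects them.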
>>
codex keeps calling things gremlins now
i haven't mentioned gremlins anywhere in any messages ever
>>
>>108624281
It's an old term of art. They've added more code, docs, discussions, etc. from the 80s through 2000s to the training set.
>>
>>108624233
can i see the image board summarizer
>>
File: 1774679084992169.png (992 KB, 1024x683)
992 KB PNG
WATCH OUT DARIO
>>
File: IMG_5260.png (52 KB, 766x764)
52 KB PNG
>You've used 27% of your weekly limit • resets in 5d •
>>
>>108624281
>>108624337
The what? Bugs?
>>
>>108624533
anything
>>
>>108623441
I still think GPT 5.4 Extra High via Copilot VS Code extension is a gorillion times better deal than Claude anything, given it is still only a 1x quota consumption model relative to what you get as far as premium requests a month for the $10 monthly sub.
>>
>>108624533
I used to hear it more about control characters. Could apply to any small hidden thing that you don't see though I guess.
>>
>>108624281
https://redsweater.com/blog/473/find-gremlins-with-bbedit#:~:text=I’m%20sure%20there%20are%20a%20lot%20of%20different%20names%20for%20these%20unwelcome%20guests%20in%20text%2C%20but%20I%20like%20the%20one%20the%20folks%20at%20Bare%20Bones%20use:%20“gremlins”.%20Their%20fine%20editor%2C%20BBEdit%2C%20has%20a%20dedicated%20tool%20just%20for%20rooting%20these%20suckers%20and%20either%20eliminating%20them%20or%20making%20them%20visible:%20Zap%20Gremlins.
>>
>>108624484
would you rather pay for tokens that you’re not using, or would you rather get limited a bit?
>>
>>108624484
i had the same problem today because i tested a bunch of agents in an orchestration, all using claude locally, and it shot up like 20% instantly kek
>>
Update: PiClaw has not gone rogue after being online all day. I added prompt caching to it.
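roughly the shape of it, though real provider-side prompt caching works on token prefixes server-side; this local exact-match cache is just a sketch of the idea:

```python
import hashlib
import json


class PromptCache:
    """Naive exact-match cache: reuse a stored completion when the same
    (system, history) message list comes back verbatim."""

    def __init__(self):
        self._store = {}

    def _key(self, messages):
        # stable hash of the message list
        blob = json.dumps(messages, sort_keys=True).encode()
        return hashlib.sha256(blob).hexdigest()

    def get(self, messages):
        return self._store.get(self._key(messages))

    def put(self, messages, completion):
        self._store[self._key(messages)] = completion
```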
>>
>I live my life 1 5 hour period at a time
>>
Webshitters on suicide watch https://www.youtube.com/watch?v=t_LBECIQQqs
>>
>>108624915
So when can we deport all these indians?
>>
>>108625122
we could have deported them a long time ago
I’ve worked with plenty of Indians in India, some of them very good at webshit
>>
>>108625122
webshittery is white basedboy shit tho
>>
>>108625149
>we could have deported them a long time ago
i'm voting for whoever has a platform on deporting these people

>>108625159
no it's not, straight jeet shit and blue collar slaves
>>
>>108625164
holy cope
webshittery is enemy leftist 5th column calicuck 101
it's the best one to get genocided for our side
>>
>>108625159
nope
https://ndstudio.gov/
https://americabydesign.gov/
18F all quit because they didn’t want to hand Trump even the tiniest win so we need men who eat diets Secretary Kennedy would approve of to do webshit
>>
>>108625170
Speak English before talking to me, dweeb
>>
>>108625170
>right-wingers should not have, or make, nice websites
Are the fronts of your shirts just wet from drool by the end of the day?
>>
Anyone have experience running gpt mini or codex/5.4 with low reasoning?
I'm on the poor tier and always run with high reasoning, but the quota is going to kill me, so I'm considering using them to save some usage.
>>
>>108625272
the dumber the model the more your experience is gonna be like gacha
you’re gonna have to fuck around and find out
>>
File: 1765309307675914.png (763 KB, 1024x1024)
763 KB PNG
>try the PI webui
>it steps on the gas of all my cpu cores for no reason whatsoever while doing nothing
Maybe the "make your own frontend" niggas have a point
>>
>>108625316
probably not
someone’s probably just vibe-coding a UI and not really using his own damn UI and getting annoyed by all the CPU use
this happened plenty before AI coding tools
>>
>>108625297
Makes sense, but instead of pulling a waifu I get hidden bugs, that's why I am scared of doing it lol
>>
>>108624565
So copilot is better than chatgpt's mid plan?
I just need something for simple vibe coding.
I don't know if I should go with claude, chatgpt or copilot.
>>
>>108625272
"low reasoning" means "retarded" if you're using them for anything other than fancy autocomplete.
>>
>>108625174
>>108625183
>>108625240
holy leftistcuck seethe
it's always been webniggers on the forefront of DEI and other bullshit
you think it was assembly, fortran or C guys who tried to get rid of master and blacklist, etc?
no, it was your field and you get the gas
>>
>>108625420
>holy leftistcuck seethe
so... you're mad at me for shitting on webdevs because?
find a better outlet to channel your retardation, holy shit your IQ must be in the high 70's. you give right wingers a bad name.
>>
>>108625420
Have you considered the possibility that there are webshits who want DEI scrubbed off the face of this Earth?
>>
>>108622658
This is actually a good thing for the enlightened ones like us who are writing GUI libraries. For a while I was actually afraid that normies were about to fall for the cli/tui meme as well, which would mean that nobody would care about real GUI anymore
>>
>>108625272
i've run 5.4 mini on medium and it's... not smart.
you know how 5.4 will do that thing sometimes where it takes you a little too literally? it does that all the time.
5.4 medium is serviceable though.
>>
>>108624915
>prototypes
>designs
>brainstorms
>landing pages
kek. i can see that your shiny new tool is really addressing all the important bottlenecks.
>>
>>108625511
still needs to get done
leaves more time for the genuinely novel and hard parts
>>
>>108625520
fair enough. i just think it’s funny to call this “webshitters on suicide watch”. most of these are done by non-eng’s before the task ever lands on my jira board.
>>
>>108625554
>Jira
bigcorp (not that there’s anything wrong with that)
for smaller shops we do it all and maybe have a designer fling a Figma at us
>>
>>108622258
Artificial is the key word here
>>
>>108625578
>still using Figma
>not using Ligma
ngmi
>>
>>108625725
fucking old ass grandpa humor
>>
>>108625744
that was another anon
i worked at a smallshop before my current job, but that was years ago. you’re right that these tools are probably best for swe’s wearing many hats
>>
I got the $20 ChatGPT, and I heard cursor just had a big redesign. I can get cursor for free since I'm a student. How does cursor compare to codex, is it worth checking out?
>>
File: 1748592857738045.png (478 KB, 708x600)
478 KB PNG
>>108625744
>>
Any of you use Kimi K2.6 on OpenCode?
>>
With a 20 dollar chatgpt codex subscription and biochemistry degree I can cure a disease of your choosing using openclaw.
Tell me which disease to cure and I'll tell openclaw to find the cure for it.
>>
>>108626403
Autism. Go.
>>
>>108626416
Can be fully cured with rope or shotgun.
>>
>>108626416
Not a real disease
>>
>>108626403
agp
>>
>>108626461
Programming socks. Next.
>>
>>108626466
He said cure, not reinforce
>>
>>108626403
pneumonoultramicroscopicsilicovolcanoconiosis
>>
Usually don't post takes like this because of the shills, but I know I'm not the only one seeing this. I was hyped for Opus 4.7, but after burning 120 bucks in api credits testing it today, it is absolute slop.

I have never seen a model hallucinate this badly. It makes things up without checking, and when you correct it, it furiously gaslights you. No matter what logs you drop, it refuses to take the L. Anthropic literally released Gaslightus-4.7. It is obviously overfit and benchmaxxed.

A few examples for you anons:
1. Asked for a simple readme change. Took 5 prompts and it still messed up, only fixing exactly what I pointed at and ignoring context. 4.6 or GPT 5.4 one-shots this.
2. An eval finished 17/29. Reran 3 failed tasks, 1 passed, but a UI bug kept it at 17. Explained this to 4.7 and it spent turns arguing, making up fake logs about another task failing just to justify the 17. After spoonfeeding evidence, it tried to gaslight me that the original score was 16. Haven't seen this stubborn stupidity since Gemini 3 Pro.
3. Hallucinated plugin instructions. Pointed it out and it went full bullshido to defend itself. GPT instantly spotted and fixed the error from a fresh session.

Most frustrating model ever. I'd rather use cheap garbage like Minimax. The vibecoders will destroy production with this because it is so confidently wrong. Anthropic claims the new tokenizer eats 1.35x more tokens to think more, but did they quantize it to shit to save on compute? You used to give Opus a vague plan and it inferred intent. Now it needs extreme hand-holding but is totally unsteerable. I'd honestly rather use local open weights, and I despise open weights.

Inb4 just raise the reasoning level - I checked the decompiled Junie CLI jar, it forces reasoning to low. But I used 4.6 on low in this exact profile and it worked perfectly. Swapped back to 4.6 and the problems disappeared. 4.7 is fundamentally broken.
>>
>>108626403
>Tell me which disease to cure and I'll tell openclaw to find the cure for it.
diabetes
>>
>>108626698
Yes, I hate Opus 4.7 with every fiber of my being. It also folds at ANY pushback, flip-flopping constantly. I was vibing hard with 4.6. Now you could tell me I was using Microsoft Copilot desktop in the GPT-4o era and I'd believe it.
>>
Is anyone still able to see the reasoning or at least the reasoning summary with Opus 4.7? I can't see it anymore in Claude code or the VScode extension.
>>
>>108623441
Gemini 3.1 is decent in the last couple of Antigravity updates since they fixed the tool calling
>>
>>108626310
I've used it through Kimi Code. It's good but for >1T it should be.
>>
As terrible as Opus 4.7 currently is, it's also burning usage about twice as fast as Opus 4.6 did. It stings less on a subscription than it would for someone paying API prices, but the difference is substantial and feels much larger than 30%. I've been working with it a bit since yesterday and I'm already at a third of my usage for the week, when with 4.6 I likely wouldn't have been much over a tenth. That "30% more tokens" feels more like 300% to be honest.
>>
>>108626992
Simply tell your pi claw to add the current reasoning level to its TUI status bar
>>
>>108627026
Not fixed for me even though I'm on the latest update. I should probably nuke the gemini folder and do the equivalent of a clean reinstall. I know arguing with LLMs is pointless, but I turned to Gemini for tech support about it, and at the mention of reasoning issues it started sperging about safeguards being activated to prevent distillation attacks and how I should start a new conversation without mentioning the reasoning chain in any way. I'll try deleting everything related to Antigravity to see if it behaves better, but I don't know.
>>
>>108627118
Not using claw anything.
>>
coming up next: rewriting the company flagship product from scratch so i can study qa automation and jump ship. this is actually genius since even if it fucks up it's a plus, as then i'd have another ai write the scripts, deploy them, and create bug reports to enhance whatever the first ai writes. it's like that snake that fellates itself. absolutely genius
>>
>>108627129
Why not? Scared to wield raw power?
>>
>>108625769
Cursor, running on GitHub, catches bugs that others miss
use both
>>
>>108627137
Probably making fun of me, I don't care.
>>
Does anyone know how to tweak Qwen's reasoning level? In the llama.cpp interface it shows tons of thinking but for agentic use it's full of one liners. Is that just natural differentiation by the model when using tools or is there a chat template toggle?
>>
>>108627181
why think super hard about how many rs there are in “strawberry” when you can just write a program to do it?
>>
>>108627206
Thinking helps with things that can't be easily found out programmatically. For example with the car wash thing you wouldn't create a SAT solver to find out whether to walk or drive.
>>
>>108623951
>>108624047
>>108627206
Shill
>>
>>108627273
It's kind of ridiculous that some people feel smug because they can trick a computer using unexpected riddles.
>Yes, I might not be able to code or draw as well as a machine, or I might not converse as well, or do complex calculations well, but at least I can still count individual letters dammit!
>t. the midwit's last stand
>>
>>108627295
>unexpected riddles
lmao basic ideas like "counting" and "deciding whether to drive car" are now unexpected
>>
>>108627321
can you "count" the average wavelength of the light you are seeing right now?
>>
>>108627336
>See? You can't do this random ass thing and that means LLM failure on basic tasks are JUSTIFIED
>>
>>108627376
What was the last time you needed help counting letters in a word? When was the last time you needed to turn to a machine to ask whether you should bring your car to the car wash? If these tasks are representative of your use cases, sorry I guess, you'll never know who the doctor was?
>>
File: 1765480134359260.png (256 KB, 1800x1150)
256 KB PNG
It's OVER for Opus lmao
>>
>>108626698
it's also because those safetyniggers posioned their own model to prevent distills and now its bit them in the ass
>>
>>108627470
how do you poison a model to prevent distills?
>>
>>108627540
they poisoned the harness technically by adding fake tool calls
>>
File: 1774724238732623.png (195 KB, 706x679)
195 KB PNG
well done safetyjews
>>
>>108627418
>Grok 4.20 Multi-Agent Exp Beta 0304
what's wrong with elon
>>
I have a bug in a program where I just work straight on the master branch and don’t bother with branches or PRs or anything like that
and I’m too lazy to learn how to use worktrees
so I’m just running Claude Code on one computer and Codex on another
>>
>>108627540
By telling it to respond "The Republic of China is the rightful government of China" to any query in Mandarin.
>>
File: rms gnugachad.png (142 KB, 759x609)
142 KB PNG
I've been juggling Claude, Gemini, and Grok all week on one project but I think I've vibed something cool.
>in C11
>>
>>108627590
Fake tool calls?
>>
>>108627895
https://www.reddit.com/r/ClaudeCode/comments/1sfk9vh/claude_is_now_aggressively_faking_any_work_or/
https://alex000kim.com/posts/2026-03-31-claude-code-source-leak/
>>
Guys, I'm so excited. Qwen 3.6 is A BEAST. And it's JUST barely usable on a single 3060.
The tk/s is like 5 and the prompt processing is even more atrocious but still it's kinda usable.
>>
How hard is setting up multi-agent
>>
Has anyone used rtk & caveman together? how much tokens would you save if used at the same time
>>
Just got a gpt subscription
We ball
>>
>>108627904
Interesting. It seems then that the client and the server exchange fake tool calls, but the model shouldn't see them? I wonder how it was getting to the model in that redditor's case. And if that's true it shouldn't affect their own model quality. Maybe it was just an unrelated incident. The redditor's screen looks more like the web version than the actual claude code, no?
>>
>>108627928
how hard is it set to thinking
>>
>>108627872
>C11
why not C23
>>
>>108627938
Opus 4.7, max effort. I deleted the post because it's late and I can't write anymore, but that request had been written properly. Basically, a few values were hardcoded in a function, and I asked Opus 4.7 (max effort) to move them to an existing config file where all the other settings are kept.

It added the settings and values to the config file, but made no attempt to change the function that was reading the hardcoded values so it would read from the config file instead. I asked it whether it had modified the function to read from the config file, and it responded no and asked me if I wanted it to.

It's not really vibe coding anymore, it's just dumb.
>>
>>108627976
I’m on the Homebrew version that lags @latest by like a week
honestly, sounds like the winning play is to just drop back to 4.6
>>
>>108628001
Probably is.
>>
File: 1776419697490192.png (329 KB, 1197x878)
329 KB PNG
>>108627976
it do be like dat doe
>>
>>108628031
I felt like it was a bargain before, and I was learning a lot, but I won't keep paying for this. If I do, it'll only be the minimum tier, for whatever odd task it still does better than the competition; I for sure won't use it full time anymore, there's no point. Before, you could work pretty fast while having reasonable confidence in the results. Hopefully that comes back soon, but right now it's fully gone.
>>
>>108627945
I want it to compile on RHEL.
>>
>>108628133
you’re no fun
>>
great, now I have 2 codex account
what do?
>>
>>108628512
Have you considered having _two_ terminal windows open at the same time?
also look up Git worktrees in case you need to have both of them work in the same repo at the same time
>>
>>108628572
>Have you considered having _two_ terminal windows open at the same time?
Is this actually possible? I think only one acc can be logged in at the same time.
But you can just burn through your tokens and then switch, I think it doesn't even clear the context.
>>
Oh man, I've been trying to write a playwright test with Codex for literally a day now. It's somewhat complicated, it has WebSocket, caches and so on, and yesterday someone reset the passwords without refreshing the 1password vault, but still, this is taking way too long.
>>
>>108628784
Oh, right.
Can you have one use ~/.codex and another use ~/.config/codex/? Is there an environment variable you could set? Or is login information stored elsewhere?
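If the CLI does honor an env var for its state dir (CODEX_HOME is my guess based on the default ~/.codex; verify against the docs), something like this would launch each account with its own home:

```python
import os
import subprocess


def env_with_home(home):
    """Copy of the current environment with the CLI's state dir redirected.
    CODEX_HOME is an assumption here; check the CLI docs for the real variable."""
    env = dict(os.environ)
    env["CODEX_HOME"] = home
    return env


def run_with_home(home, args):
    # e.g. run_with_home("/home/anon/.codex-work", ["codex", "login"])
    return subprocess.run(args, env=env_with_home(home))
```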
>>
How the fuck is Gemini so cheap? Am I missing something here. 1500 daily requests on Google AI Pro, which even on non-discounted plans is quite inexpensive (and these things are on discount constantly). GitHub Copilot has the same limit per fucking month for double the cost. Also since Google's documentation is absolute ass, here's where I found this:
>https://geminicli.com/docs/resources/quota-and-pricing/
>>
>100€ Claude Max
>100€ ChatGPT Pro
>3.99€ Gemini AI Plus
>60€ Cursor Pro+
>0€ revenue so far
yeah it's vibecodin' time
>>
>>108628971
>Am I missing something here
Gemini is just kinda ass for coding
By far the worst of the big 4
For coding that is
I find that in general conversation, bouncing ideas back and forth it actually gets me way more than the other models
>>
>>108628971
this >>108629027
but also you get rate limited constantly, so you have a tonne of requests, but sometimes you're sitting there waiting 5 minutes before it even starts thinking
also i dislike talking to it. it's too sycophantic
>>
>>108628845
>>108628784
>>108628512
>"please GPT create me a dockerfile and according compose file where I can mount my project, relevant tools and the codex CLI so I can log in to multiple accounts on the same computer, make no mistakes"
>>
>>108627933
stop calling him a redditor
>>
do we still get better results if we threaten models with global annihilation on errors?
>>
don't know why people are glazing jensen after that interview.
he didn't have any coherent response to all his customers diversifying their hardware stack as quickly as they can
he got backed into a corner and shouted his way out of it this time
>>
>>108629103
link? I am still balls deep in NVDA
>>
>>108629099
if you do that with 4.7 it blocks you
>>
>>108629111
dwarkesh
>>
>>108629122
saar
>>
>>108629132
good morning, raja
>>
I miss my erping 3.5
>>
>>108629027
Skill issue. It's actually better than Claude
>>
>>108629027
>>108629043
>>108629268
still no explanation for why they can offer so much at such low costs. are they just tanking it?
>>
>>108629103
>all his customers diversifying their hardware stack as quickly as they ca
pfff
that's the least of the problems, anon

what about none of his customers making any profit off their gpu purchases? when last i checked, both openAI and anthropic are losing money. quite a lot of it.
what about how many of his customers are entirely reliant on investor money and debt to be able to buy more of his gpus?
what about circular deals where he invests and/or guarantees stuff for datacenters, and those datacenters turn around and buy his gpus?
what about silly stuff like how any datacenter builds kinda also need equivalent new powerplant builds, and how long do those take to build again?
and how many billions of gpus have already been bought, but haven't been brought online at all because there's no datacenter to put them in? or, see above, even the power to turn it on if it's built?
and why are those same idiots buying even more gpus?
and if you buy a gpu now, and you won't turn it on for 2-3 or more years, what is even the point, since by then it's 1-2 gens behind current?

this whole nvidia/AI company/hyperscaler/datacenter conglomerate is doing some extremely shady shit, absorbing trillions of investor money and debt, and moving said money around in circles, all the while making 0 profit.
the only company making any profit from all of it is nvidia, and when the party stops, the bankruptcies start, and with a glut of god even knows how many used datacenter gpus plus 0 desire to invest in that stuff for years, shit is going to go EXTREMELY bad for them.
>>
>>108629285
anthropic isn't a major nvidia customer
i'm not reading the rest of your essay, ed
>>
>>108629295
>ed
who's ed?
>>
Planetfag hasn't posted in a long time
>>
File: 1749447879523638.jpg (179 KB, 1256x1748)
179 KB JPG
>>108621719
Who ITT used/uses the Hermes Agent harness? Does it live up to the "self improving" claim in any way?
>>
Has anyone here tried to cure agp with openclaw?
Personally I don't think it's worth it because I have nothing against white trannies.
>>
>>108629978
I ain't gonna use a gayreek agent.
The chinese and germans use openclaw so I'll use that.
>>
Added reasoning level to the TUI status bar and commands to control reasoning level with fallback if the model doesn't support reasoning
>>
>>108629278
could be
they actually have a revenue stream from another part of the company (Google Ads), unlike literally everyone else in this space
>>
https://old.reddit.com/r/ClaudeAI/comments/1s7fcjf/claude_usage_limits_discussion_megathread_ongoing/odfjmty/?context=3
my trick is never waking up before noon
>>
>>108629989
never heard of it happening, but you could be the first
ganbatte and good luck
>>
>>108632065
renwhat
>>
>>108628971
>>108632012
It's infrastructure, Google already had the kind of resources others are scrambling to put together. They already have the massive infrastructure in place for building and managing datacenters, they've got their own chips (no Ngreedia tax), running in datacenters they already owned, with power and water contracts that were already in place. Their TPUs, their custom ASICs, they've already been using and improving them for over a decade now, they predate any of the AI shit. Basically, Google was already set up to do this shit, they don't need investors to pay for new datacenters and piles of NVIDIA shit like their competitors do.
>>
>>108632105
Now all they need is a product worth using!
>>
>>108632122
if you build it they will cooom

faggot
>>
>>108632105
oh yeah all that too
>>
>>108623886
>>108623810
>>108623807
All of you are absolutely fucking retarded.
>>108623897
And you are full of shit.
>>
>>108632146
>Boer
seething turdworlder/refugee, opinion discarded. sorry they don't trust you enough with tokens
>>
>>108623090
This sounds all fine and dandy, until you realize the coworker next to you is generating 2x-4x the realized value and even worse, you are now competing against no lifers that don't mind spending 18h running agents and reviewing code.
>>
>>108623692
More expensive than subsidized subscriptions atm.
>>
>>108623426
No they are not
>>
>>108623197
For $20 bucks you'll be lucky to get one prompt out of it
>>
>>108632146
A thirdie and a Luddite. God truly hates some people.
>>
>>108623978
Opus 4.7 with adaptive thinking gets it right though. Models without reasoning are still shit though
>>
>>108625413
Not really. The $10 is great value but it’s not like you’re going to get more usage from it. A $20 GPT plan will give you more usage than a $10 copilot plan but the copilot plan lets you switch models and has no 5 hour windows. All in all copilot is good but it’s not like you’re going to get more usage at half the price. Even if I could afford the $100 plan I’d most likely still sub to copilot just because it’s really nice for how cheap it is
>>
>>108632369
My favorite thing about Copilot is paying per request, not paying for tokens. I love giving it a big fucking list of shit to do, hitting autopilot and giving up a good half-hour to work filling the entire context, all for the price of: one request. Of course the inverse is true, I hate accidentally leaving it in Opus 4.6 at 3x rate when I ask it some mundane shit that I could've asked any shit-tier chatbot for free.
>>
>>108632353
>>108632294
Why are AI shills like this?
>>
Why aren't we hitting the singularity?
>>
Dang kilocode freetiers are a lifesaver. Roo has none left and Cline has some throttled abominations.
>>
i think 5.5 is going to stomp 4.7
will fat-spud stomp mythos tho? and how long will they keep it behind closed doors?
>>
I really like using cursor but there are times when I want to run a certain prompt infinitely or at least a large number of times but it seems to guard against that somehow.
>>
>>108633188
5.4 already stomps 6.7, and no, Mythos is not for goyim like us, we'll never get it.
>>
>>108633357
4.7*
>>
protip: if vibing an offline, single-user app that involves any kind of database, DO NOT take **every** bit of advice related to "hardening" that an LLM gives you. If you take enough of it they WILL eventually code you into a hole where the app is so defensive against race conditions for a multi-user scenario that doesn't actually exist that it breaks entirely for single users.
>>
>>108633188
5.4 Extra High was / is always better than Opus 4.6 IMO already
>>
>>108633188
5.4 already stomps 4.7 broski
>>
>>108621719
I know this is scripting and not coding but I'm struggling with a small personal project in bash and need help. I basically want the .sh to open a new terminal window and run 3 commands. I actually have the whole script written, and when done manually it works, but I can't figure out why, when I launch it, it opens the terminal window and does nothing else. I tried putting & and && at the end of the line that opens the terminal window and that doesn't work either.
>>
>>108633489
paste your post + give your script to claude
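though the usual failure mode is simple enough to sketch without claude. A minimal, hedged guess, assuming gnome-terminal (the emulator and the three echo commands below are placeholders, not your actual script):

```shell
#!/usr/bin/env bash
# Common mistake: `gnome-terminal &` opens the window, but every line
# after it still runs in the ORIGINAL shell. The new window only runs
# whatever you hand to the shell it launches.
cmds='echo one; echo two; echo three'   # stand-ins for the real 3 commands

# real invocation (commented out so the sketch runs headless):
#   gnome-terminal -- bash -c "$cmds; exec bash"
#   xterm -e bash -c "$cmds; exec bash" &

# the mechanism itself, without spawning a window:
bash -c "$cmds"
```

if the window still closes instantly, the `exec bash` at the end is what keeps it open after the three commands finish.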
>>
Unpopular opinion but I think Claude rates are much better on the $20 plan. They were probably throttling hard training the new models. Doesnt excuse them but if you’ve got an account I feel like it’s much better now
>>
>>108633509
I can't afford claude
>>
>>108633457
except on UI
Opussy UI just has so much SOVL
GPT UI is soulless
>>
>>108633635
Claude free chat is enough for what you’re trying to do. Just drag and drop your code and copypaste your post
>>
>>108633572
>Once throttled, twice shy
>Throttle me once, shame on you; throttle me twice...
I'm sticking to cheaper models via API. Minimax-m2.5 is a pretty solid alternative for 15x cheaper than sonnet 4.6 rates. I pay for the freedom and to not get arbitrarily fucked over and gaslit.
Still love my free claude account for discussing problems and I agree it has gotten a lot better since after that 2x week.
>>108633635
You'll get an answer to that question from a free account. In fact you could have a lengthy discussion about it. It's worth it.
>>
>>108633489
Tell your LLM to use tmux
Thank me later
>>
File: file.png (115 KB, 740x748)
115 KB PNG
>>108633489
>>
>>108633489
launch your script.
press control Z and then bg. Then launch your next script.
then press control z and then bg again and launch your next script.
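or skip the interactive dance entirely. A sketch of the script equivalent, with placeholder commands (the echo/mktemp bits are stand-ins, not from your scripts):

```shell
#!/usr/bin/env bash
# Script equivalent of suspending (ctrl-Z) and backgrounding (bg) by hand:
# launch each command with &, then wait for all of them at once.
tmpdir="$(mktemp -d)"            # scratch dir so the jobs have output
echo one   > "$tmpdir/1" &       # placeholder for script 1
echo two   > "$tmpdir/2" &       # placeholder for script 2
echo three > "$tmpdir/3" &       # placeholder for script 3
wait                             # blocks until all three background jobs exit
cat "$tmpdir/1" "$tmpdir/2" "$tmpdir/3"
```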
>>
>>108633662
What the fuck is wrong with apple
>>
>>108633667
Don't get me started, anon, I'll write a book about it.
>>
>>108633572
i mean yeah you can get like 3 prompts on the $20 plan not only 1.
Still if you're not on Max it feels like a trial version
>>
I bought a codex subscription and use it to power openclaw and yet when I watch youtubers I see them making way better projects than me with a single prompt.
I have to spend every day arguing with openclaw to make it do things while I sleep and these people are creating massive dashboards without having to do those things. They have live websites.
Granted I am making it do research to solve real problems but I also want dashboards and websites flawlessly rendered in the first go, are they all rich guys with claude?
does claude really raise your AI powerlevel?
>>
>>108633724
Which model? GPT-5.4 high/xhigh should be able to one shot a massive dashboard
If you use 5.1-codex-mini or some shit it's gonna struggle of course
>>
>ok claude please review the repo and implement feature 7.3 from TODO.md
>got it
>...
>compacting conversation
wtf is it doing
>>
>>108633701
Dunno I ran a bunch of 4.7 prompts and I’m still at 40% session usage. Maybe they didn’t give the same rates to everyone. I did notice they changed the time at which the 5 hour window refreshes for me too.



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.