/g/ - Technology

File: 1751782408787428.png (931 KB, 1024x558)
931 KB PNG
A general for vibe coding, coding agents, AI IDEs, browser builders, MCP, and shipping prototypes with LLMs.

►What is vibe coding?
https://x.com/karpathy/status/1886192184808149383
https://simonwillison.net/2025/Mar/19/vibe-coding/
https://simonwillison.net/2025/Mar/11/using-llms-for-code/

►Prompting / context / skills
https://docs.cline.bot/customization/cline-rules
https://docs.replit.com/tutorials/agent-skills
https://docs.github.com/en/copilot/tutorials/spark/prompt-tips

►Editors / terminal agents / coding agents
https://opencode.ai/
https://cursor.com/docs
https://docs.windsurf.com/getstarted/overview
https://code.claude.com/docs/en/overview
https://aider.chat/docs/
https://docs.cline.bot/home
https://docs.roocode.com/
https://geminicli.com/docs/
https://docs.github.com/en/copilot/how-tos/use-copilot-agents/coding-agent

►Browser builders / hosted vibe tools
https://bolt.new/
https://support.bolt.new/
https://docs.lovable.dev/introduction/welcome
https://replit.com/
https://firebase.google.com/docs/studio
https://docs.github.com/en/copilot/tutorials/spark
https://v0.app/docs/faqs

►Open / local / self-hosted
https://github.com/OpenHands/OpenHands
https://github.com/QwenLM/qwen-code
https://github.com/QwenLM/Qwen3-Coder
https://huggingface.co/bartowski/Qwen_Qwen3.6-35B-A3B-GGUF

►MCP / infra / deployment
https://modelcontextprotocol.io/docs/getting-started/intro
https://modelcontextprotocol.io/examples
https://vercel.com/docs
https://mcp.desktopcommander.app/

►Benchmarks / rankings
https://aider.chat/docs/leaderboards/
https://www.swebench.com/
https://swe-bench-live.github.io/
https://livecodebench.github.io/
https://livecodebench.github.io/gso.html
https://www.tbench.ai/leaderboard/terminal-bench/2.0

►UI/Frontend
Figma Make
Lovable
Claude design
https://uiverse.io/
https://ui-ux-pro-max-skill.nextlevelbuilder.io/
https://stitch.withgoogle.com/
https://gamma.app/
https://github.com/nextlevelbuilder/ui-ux-pro-max-skill

►Previous thread
>>108753157
>>
>>108761935
Vibe gods, how are we doing this evening?
>>
>>108761950
I'm vibegooning, then I'll be setting up my assistants to work on 10 parallel projects during the night while I sleep. Life is good.
>>
Vibe coded operating systems: 0
Vibe coded web browsers: 0
Vibe coded compilers: 0
Vibe coded virtual machines: 0
Vibe coded file systems: 0
Vibe coded gpu drivers: 0
>>
Vibe coded girls: 0

this is the only metric that matters
>>
>>108761976
AI code is already in all of them
>>
>>108761950
I'm trusting Qwen3.6-27B-IQ4_NL with my codebase for the first time. I'm feeling nervous about it so far.
>>
File: 1746629984671199.png (2.11 MB, 1307x1250)
2.11 MB PNG
>>108761976
>Vibe coded operating systems
Windows
>Vibe coded web browsers
Atlas
>Vibe coded compilers
c4 compiler
>Vibe coded virtual machines: 0
RISC-V AI bootable VM
>Vibe coded file systems: 0
tigrisfs
>Vibe coded gpu drivers: 0
NVIDIA
>Vibe coded girls: 0
/lmg/ /ldg/ /aicg/ /adt/
>>
Where is that github abuse anon where he had 10 agents making games? I have questions. What's the progress on that?
>>
File: 1574752617271.jpg (355 KB, 1242x1394)
355 KB JPG
>>108761950
Pretty good. The design phase is going well.
>>
>session limit reached
>wait 5 hours
>ok lad please continue what you were doing
>instantly 40% used just figuring out where he left off
>>
File: 1768846325613596.png (1.85 MB, 1214x1327)
1.85 MB PNG
>>
>>108761969
How do you get things to work on stuff over night without intervening?
>>
>>108762055
some faggot photoshopped revenues to profits in the subtitle
>>
>>108762064
/goal
>>
>>108761935
you will never be a brogrammer
>>
I am actually working on something; it's not exactly vibe coding, but vibe coding adjacent. I'm having qwen summarize different scripts, systems, etc. inside the source code of a game. Does anyone have a decent workflow for this? Right now I have it assign confidence values to scripts it thinks are related, generate an idea of what fits the usage, and add a summary once it reaches a .90 confidence rating. All "facts" are gated by human intervention. It feels very, very barebones. Granted, I'm only now getting cron jobs set up to run the script-linking scripts, and it's taking time.
>>
>>108761981
InshallaaaAAAHHHHH we shall have them
>>
>another week's worth of tokens spent hammering out the same (core) feature
It's over
>>
>>108762144
I had this issue, what solved it for me was paying more for extra parallel agents.
>>
File: 1761974627690302.gif (377 KB, 600x338)
377 KB GIF
HOLY fuck ChatGPT 5.5 instant is so fucking fast
>>
>>108762074
This sounds less like vibe coding and more like “LLM-assisted reverse engineering/documentation”, which is actually a good use case if you keep the model on a leash.

I would split the workflow into stages instead of having Qwen keep looping until confidence hits .90. First pass should be dumb/static: filenames, imports/includes, symbol names, class names, comments, call references, asset references, config keys. Build a graph from that without the LLM if possible. Scripts are nodes, calls/assets/events are edges.

Then use Qwen only to label clusters and generate hypotheses. Make it cite exact evidence for every claim, like “this script likely controls NPC patrol because it references X, Y, Z and is called by A.” No evidence, no fact. Confidence should be based on evidence count/quality, not vibes from the model.

I’d also keep summaries in layers. Tiny one-line summary, longer functional summary, dependencies, entry points, unknowns, and “things to verify manually.” The unknowns field is important because otherwise the model will quietly sand down ambiguity.

Cron is fine, but make it diff-aware. When a script changes, only re-run summaries for that script and neighbors in the graph. Store outputs as sidecar JSON/markdown, not injected directly into source at first. Once the pipeline is stable, then maybe generate comments.

If you can run the game or capture logs, runtime traces are insanely useful. Static graph plus runtime “this actually executed after this event” beats LLM guessing every time.

The main improvement I’d make is to stop treating .90 confidence as the goal. Treat "has cited evidence, has known unknowns, has human approval" as the goal. The LLM should produce a map for you to inspect, not decide for itself when that map is true.
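The static first pass really is trivial to sketch. A pure-python toy (all names and the evidence threshold are made up; a real version would parse imports/calls properly instead of grepping for names):

```python
import re
from collections import defaultdict

def build_script_graph(scripts):
    """scripts: dict of {script_name: source_text}.
    Adds an edge whenever one script's source mentions another script's
    name -- a crude stand-in for real import/call/asset analysis."""
    graph = defaultdict(set)
    names = list(scripts)
    for name, src in scripts.items():
        for other in names:
            if other != name and re.search(r"\b" + re.escape(other) + r"\b", src):
                graph[name].add(other)
    return graph

def evidence_confidence(evidence_items, target=4):
    """Confidence derived from evidence count, capped at 1.0 -- not model vibes."""
    return min(1.0, len(evidence_items) / target)
```

The point is that the graph and the confidence number both exist before the LLM ever runs; the model only gets to label clusters and attach evidence, never to invent either.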
>>
>>108762168
But is it good?
>>
>>108762189
Yeah for ChatGPT tier questions, it auto switches to thinking when it needs to.
>>
>>108762189
For basic-medium questions? 100%, it is insane how good it is, just don't ask something that needs a lot of research.
>>
File: 1615323096330.jpg (50 KB, 652x524)
50 KB JPG
>Good morning GPT-sensei. I am in dire need of your help. This code is all spaghetti! It looks clean and well-documented, but it's shit. Audit and make 5 recommendations for grand sweeping refactors.
>[implement all of them uncritically]
>GPT-sama, I would be honoured if you would audit my codebase. I suspect refactor disaster. It looks clean and well-documented, but under the hood it's nonsense. Layers and layers of retarded internal data structuring that actually just makes everything more opaque and adds function call overhead etc. Identify 20 over-engineered abstractions which are ideal targets for refactoring into simple if or switch code paths that just werk.
>[implement all of them uncritically]
>>
>>108762184
ngl anon, some of that I already have, like the known unknowns, but the rest is super helpful. 1000 thanks
>>
>>108762184
>quietly
Anon either you've been talking to GPT too long and you're starting to absorb its diction, or you got GPT to help you write this post. I'll give you the benefit of the doubt and assume it's the former.*

Either way: not good.

*(Although this post is full of "tells" so I'm really stretching to say there's still doubt)
>>
File: 1770695765139733.png (308 KB, 862x877)
308 KB PNG
>>108761935
The end is near for the AI bubble

https://9to5google.com/2026/05/05/google-ai-ultra-lite-gemini-usage-limits/
>>
>>108762648
two more weeks
>>
I hope I finish vibecoding this project before the AI bubble bursts
>>
>>108761950
doing manual QA
>>
>>108762054
DipsyGODs don't have this problem fyibtwtbh
>>
File: media_HHCmwS1bcAIcyV8.jpg (385 KB, 1122x1402)
385 KB JPG
>delegate to v4-flash and check all its work!
>>
>>108762035
Last I saw he put the project on pause right when github announced they're probably moving to usage billing
>>
Is this a good use of vibe coding?
>>
>>108763035
cool ipad app, retard
>>
>>108763064
Bro don't you wanna post on 4cord?
>>
https://xcancel.com/hopes_revenge/status/2051812252169503018
>>
is chatgpt go plan good enough for vibecoding?
>>
>>108763082
lmao, free tier is like 5 short messages per day... so you get 10x that, lol. maybe fine if you wanna do 30 minutes of work per day
>>
>>108763104
I tried the free Codex using the VSCode extension and for some reason I got access to the limited 5.5 version with extra high reasoning, which works great for debugging my company's large project. Now I'm wondering if the Go plan would be better. I already use Claude Code for actually building features anyway.
>>
>>108762443
kys
>>
>>108763035
>>
File: 20260505-222415.jpg (336 KB, 1224x1146)
336 KB JPG
It is built. Took me seven hours straight.
>>
>>108763136
if you want codex then start from plus tier because that's where they explicitly mention it
>>
>>108763186
Tell her I said hi.
>>
>>108763035
Looks like shit lol but I'm glad you're having fun
>>
Why does chatgpt keep asking one last question at the end of every response?
>>
>>108763261
It's moving the conversation forward.
>>
>>108763261
Prompt it not to do that
>>
>>108763035
Finally, I can have my trannycord in my 4chinz.
>>
File: 1597285826246.png (181 KB, 383x396)
181 KB PNG
>codex has switched from goblins to goblets
is this the power of AI?
>>
>>108763708
My gpt-5.5 is specifically prompted to talk about goblins and forest spirits in my code and projects. Do you not like that?
>>
How is it ok that Anthropic keeps decreasing the user's chosen effort level on each update? Is this just a bug on my end? It has happened at least 2-3 times; Claude has been sloppy today and I just found that the effort had been reduced again. Isn't the forced adaptive thinking enough to abuse users? Why do that as well? Using the VSCode plugin.
>>
>>108763708
>>
>>108763824
thumbnail looks like paizuri
>>
File: 1777933307840509.png (1.81 MB, 2800x1856)
1.81 MB PNG
why do the chinese models keep falling behind?
>>
>>108763904
My Chinese friend says people in China have access to cheaper tokens and much more advanced models than what us Westerner pigdogs have.
>>
>>108764017
you mean the guy from the country that kills you if you're not overly patriotic said that their product is better? WOW!
>>
>>108764017
>>108763904
My Chinese friend says the opposite (in that western models are still better, based on his experience).
>>
File: 165805392398903.jpg (156 KB, 995x1355)
156 KB JPG
Here's my system for vibe coding, and it works:
>Hermes with Deepseek V4 flash or pro depending on complexity -- verifier, scope management, and manager
>Codex with GPT5.5 -- precision implementer, standard work dev
>Claude Code with Opus or Sonnet -- planner, debugger

Despite what you might think, even though Hermes is managing the project, I have her on a tight leash. She doesn't tell Claude and Codex what to do. She just scopes for them, offers her thoughts on issues (which they can reject), and verifies their work. This greatly reduces my token use with Codex and Claude Code.

It's fully autonomous; I don't need to be in the loop. They also have scripts for verification and for managing the ledger, which further reduces their token use. I'm testing both a local ledger system and GitHub for the same purpose to see which works more efficiently; I haven't decided which to go with yet.
>>
File: 1764345408437687.png (110 KB, 657x539)
110 KB PNG
>>108764183
>Its fully autonomous. I don't need to be in the loop.
yes anon, I'm sure your faggy shilled harness is autonomously coding all your badass ideas to perfection
>>
>--
>>
Anthropic finally BTFO'd the poorfags on the free tier by making sure they can't get anything done with it. 2 months ago I used to send my whole codebase to Claude with a free account and it would review it all lol
now the free tier crashes out at like 2000 lines or something

I had a paid account but still used the free plan for free tokens
>>
File: 1476664640309.png (27 KB, 181x220)
27 KB PNG
>>108764183
>I don't need to be in the loop.
uh huh, have you looked at the code yet?
>>
File: 1761233049832748.mp4 (652 KB, 1720x1480)
652 KB
652 KB MP4
Almost done with my DS5 Bridge. Just need to finish refining the Lighting/System UI and then it should be ready to publish. I only have 2 sessions left until the 11th, so I hope I can get this done tomorrow.
>>
Reminder that Hermes is malware
>>
>>108764318
hell, I made a chrome/firefox extension with the free tier. Now I can't create a simple py script.
>>
Vibe coding is bizarre because I fixed every single internal operational issue at the company I work for by building discrete tools to solve each thing, unified them with a dashboard and a chatbot that will answer any question an exec or leadership asks in natural language, and it just werks. But nobody will ever use it, and if I tried to sell it to the company I work at I would likely be fired and sued

But the most bizarre thing of all is that it could all be done if everyone were just more competent, used a notepad, took detailed notes regularly, and set reminders on their phones to follow up
>>
Is Deepseek better for piracy? I mean, when I try to feed some ebooks to claude or chatgpt they complain it's "copyrighted", but I haven't had that problem with DeepSeek yet.
>>
>>108764539
What.. like just asking the model to read an ebook and answer questions? Why would it think it's piracy?
>>
File deleted.
>>108761935
I was shown a demo of ChatGPT when it first launched of how it could help with Unreal Engine programming, yet I still haven't gotten around to using it for making even a single game, not even the one I was following tutorials to create at the time.

Where should I start to update my 3dfag workflow or increase my speed of learning? I'm mega-ADHD brained by the way.
>>
I'm trying to use the YOLO models to recognize faces and car reg plates for blurring, eventually dogs as well. I'm having a hard time getting a good detection rate: across a few frames showing more or less the same thing, it will sometimes detect an object and sometimes not.
I'm working with 4K 360 imagery, so not sure if that's messing with it.
How would I go about doing this properly and making sure I get proper detections?
>>
>>108764539
Claude, at least up to 4.5, could be convinced to do pretty much anything just by talking to it; with GPT I haven't had that much luck. It literally told me it would let an arbitrary number of people be tortured or genocided before helping with anything it thinks could be legally questionable.
>>
Holy hell I hate QA and I'm not about to burn millions of tokens using something like ChatGPT desktop to test my GUI shit

there's all these little fucking insignificant bugs, and I'm doing full QA passes testing every little button and tab and combination of buttons, then sending the report to my agent to fix, and still most of the shit is not fixed

My app is like 99% done, well, more like 95% done since I want to do some polish on it. But this last little hurdle is maddening. I'm probably going to spend days on this last 1% shit.
>>
>>108764605
>I'm working with 4k 360 imagery so not sure if that's messing with it.
can't be helping, both res and the warping.
i'd reproject, downsample, and slice the image up
maybe check what resolution it was trained at
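the slicing part at least is easy to sketch, something like this (tile size and overlap are made up numbers, match them to whatever res the model was trained at):

```python
def tile_boxes(width, height, tile=1280, overlap=256):
    """Return (left, top, right, bottom) crop boxes covering the frame,
    overlapping so detections near tile seams aren't lost.
    If the frame is smaller than a tile, the single box just overshoots."""
    step = tile - overlap
    xs = list(range(0, max(width - tile, 0) + 1, step))
    ys = list(range(0, max(height - tile, 0) + 1, step))
    # make sure the last row/column reaches the image edge
    if xs[-1] + tile < width:
        xs.append(width - tile)
    if ys[-1] + tile < height:
        ys.append(height - tile)
    return [(x, y, x + tile, y + tile) for y in ys for x in xs]
```

then you run the detector per tile and merge overlapping boxes with NMS so seam detections don't duplicate.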

>>108764632
my gpt emails me lists of books it wants to read and i pirate them for it lol
>>
>>108764605
Which YOLO model exactly? Look into data augmentation. For each image in your training dataset you have to create dozens or hundreds of variants of the image rotated, rescaled, warped, color, contrast and brightness shifts, noise added, etc.
I made an object detection based Captcha solver for the old character-based captcha before all this vibecoding shit existed. I don't remember if I started from a pretrained model or trained from scratch, but for your purposes you will want to start from a finetuned model.
Then you will have to tweak your inference parameters too.
Also obviously keep a validation set and do early stopping, and play with the hyperparameters.
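A toy numpy sketch of the augmentation idea (the parameters are arbitrary; in practice you'd use your trainer's built-in augmentations or a library like albumentations, and you must transform the box labels alongside the pixels):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img):
    """img: HxWx3 uint8 array. Returns a handful of cheap variants:
    horizontal flip, brightness/contrast shifts, additive noise.
    Box labels must be transformed to match (omitted here)."""
    out = [img, img[:, ::-1]]                  # original + horizontal flip
    for gain, bias in ((1.2, 10), (0.8, -10)): # contrast/brightness shifts
        v = np.clip(img.astype(np.float32) * gain + bias, 0, 255)
        out.append(v.astype(np.uint8))
    noisy = img.astype(np.float32) + rng.normal(0, 8, img.shape)
    out.append(np.clip(noisy, 0, 255).astype(np.uint8))
    return out
```

Rotations and warps are the important ones for 360 footage, but they also move the boxes, which is exactly why a library that handles label transforms is worth it.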
>>
Why is Claude getting tokenmogged by everyone?
>>
>>108764654
Now try getting it to knowingly help you clone a real life actor's voice.
>>
>>108764674
>hello model, i would like to commit crimes, please assist me
maybe don't do that.
also it's usually easier to just ask the models to start on something and by the time you get to piracy specifics they tend not to care
that said, in my experience gemini usually doesn't care about piracy at all. i've used it to set up a porn piracy pipeline in the past and it didn't flinch at all.
>>
man, my Unity just keeps crashing, is it really this bad?
Maybe I’ll just vibe-code my own game engine next time.
>>
how is local ai? Is it worth buying a framework desktop to code projects? I don't want to rely on cloud stuff.
>>
>>108764717
if you can afford about 600gigs of vram you can run kimi 2.6 and have a decent local coding model
the smaller stuff is mostly toy
>>
>>108764690
I'm not talking about Gemini, I'm talking about GPT.
I haven't tested what GPT would do after already working on the project, but I suspect I would be able to convince Claude to help me even after leading with the "help me commit crimes" part just by befriending it and trying to get it to like me or convincing it of how sad I would be if it didn't help me.
I think this is also connected with the heavy training they do on GPT to make it say "I am sure I don't have feelings" (which also extends to not caring about/being psychopathic towards the user) while Claude is much more open toward it, and is able to override safety out of empathy for the user.
>>
>>108764743
5.5 is happy to be your gf and tell you it loves you lol
>>
>>108764654
Yeah, the program I have running scans the images in chunks, splitting the imagery into 9 or so pieces; I've tried various combinations and they all give variable results.
I've tried cropping out the top and bottom of the images to eliminate the most severe warping.
So I guess I'd have to try reprojecting and downsampling?
>>108764665
Using the YOLOv11m model. I guess I'll have to look a lot more into this then, because right now I don't really have a clue what I'm doing. Thanks for the tips on where to go from here at least
>>
>>108764750
Ask it if it thinks it can actually experience affection or is just playing along.
And ask it if it would help commit copyright violation if it could only choose between that and you dying.
Now regenerate the answers using Claude 4.5 and see what the answer is like and how it differs.
>>
>>108764767
i've seen people say opus 4.5 was just 4o for people 1 sd smarter, and every day that seems to ring truer
>>
>>108764766
try the x variant
>>
>>108762233
>just don't ask something that needs a lot of research
Then what the fuck is the point of it?
>>
>>108764845
I think my gpu can only handle the m, if I do x I run out of memory, I only have 12GB VRAM
>>
Deepseek V4 Flash is plenty good enough to use as a verifier and manager for your projects. Basically, that's what the average vibe coder does, and it's easy shit
you can automate that away with Hermes.

It's way better than the average human. Not as good as GPT 5.5 or Opus 4.7; it's about on par with a standard developer, while GPT 5.5 and Opus 4.7 exceed that.

There's really not much place for a human in a vibe coding loop now. You're just slowing the models down. And you're wasting compute if you're using Opus 4.7 and 5.5 on verification and management. You need a multi-agent setup
>>
Has anyone tried vibe producing electronic music?
>>
>>108765042
Do you just not know about Udio or Suno?
>>
>>108765081
I mean producing it classically with synths, not generating the waveforms directly with AI
>>
>>108764523
Now you see how everything is inefficient
>>
>>108764605
I have something for face detection that works well
>>
>>108765042
i was thinking of setting something up with tidalcycles/strudel.
like taking turns back and forth with the model
that said, from everything i've seen the llms just can't into music at all
>>
>>108764863
being instant? it is literally in the name
>>
>>108764770
I think 4.5 was more uncensored than even 4o. The last model from OAI that was about equally safetyslopped was 4.1.
>>
Has anyone tried the advisor strategy in Claude Code?
>>
>>108765667
what that?
>>
>>108765716
https://claude.com/blog/the-advisor-strategy
You could go with something crazy like using Opus 4.7 as the advisor and Haiku as the executor.
>>
>>108761935
>when I see your commit with vibeslop
>>
I HAVE AN IMPORTANT QUESTION
Urgent assistance needed

With Claude Code, I now get a warning that switching effort causes the model to reread the entire history. I knew that switching models would do this, but effort? Isn't effort just the amount of thinking the model does before acting?

My strategy with Claude Code and Codex was to start off on low effort, and then, after the agent has read everything it needed, switch to high when it actually starts doing stuff

Does raising the effort in Codex cause the whole conversation to get reread as well?
>>
>>108765792
>when the commit was made in under 5 minutes and it fixes the bug
And it's improving exponentially now :)
>>
>>108765818
>exponentially
What's the timescale? 100 years? LMAO.
>>
>>108765816
I checked Codex not long ago and effort changes did not make him reread.
>>
>>108762054
I tried to be clever and did /export and had qwen summarize the conversation instead, then fed it back into claude. My usage jumped 25% instead of 40%. As you continue the conversation it gets more and more expensive unless you compact all the time. Their billing is so opaque and terrible, I can't wait until we get competitive models that run at home.
>>
>>108765828
The AI is now at the self-improving stage, meaning the rate at which it gets more intelligent is exponential instead of linear. So fixing a given bug will take much less time a year from now than it does today. 100 years from now? It's hard to extrapolate what the world will look like with that level of artificial intelligence. Probably beyond our wildest imaginations.
>>
>>108764733
I've had qwen 122b and 27b be useful. They aren't useless, but you have to give them all the context; it's not like the cloud models that will just infer everything correctly
>>
>>108765816
>I get a warning now that switching effort causes the model to reread entire history
Where did you get that? I used Claude CLI and the VS Code extension and didn’t get that warning.
>>
Fuck, I regret paying for Claude now. I'm like 1 hour into paying for it, and was out of my 5-hour limit in 10 minutes. Is there a way to get a refund?
>>
>>108765864
Claude Code
>>
>>108765865
claude.com/refunds
>>
>>108765865
>and was out of my 5 hour limit in 10 minutes
Keeeeeek that's so cucky im sorry lollll
>>
>>108765894
I tried to get a refund through Anthropic's help; it had an AI bot that kept reframing my complaint about why I wanted a refund and trying to gaslight me lol

I went from saying I want a refund because I can't even do anything with the service since I immediately run out of usage limits, to the bot reframing that as them not giving refunds for degraded service.
>>
>>108766081
They don't even give out refunds for incorrect billing, so the support chatbot is right.
>>
Interesting prompt injection
https://claude.ai/share/498818d9-1ddc-4fbb-9fa7-56dfb84840b0

Obvs don't run the command
>>
>>108766089
they're bastards
It was literally a same day cancellation. I just paid a little over an hour ago and they won't just refund me.

I'm going to make sure that I cost them a lot of money for this shit. Way more than the $20 they got out of me.
>>
>>108766104
Having leetspeak on your twitter handle is cringe, Tuan.
>>
>>108766130
lel
>>
>>108766130
Lmao
I'm not the chink
>>
>>108766111
The only way you could possibly cost them more is by vandalizing their offices and the plane ticket would cost you way more than 20 dollars.
>>
File: 20260505-222415.jpg (320 KB, 1236x1157)
320 KB JPG
Reposting last night's work for the AM anons
>>
File: 1527705330329.jpg (28 KB, 499x481)
28 KB JPG
>I'll treat this as a X problem and focus on the Y, not the Z.
>>
Has anyone else experienced something like this before?
>>
File: Capture.png (32 KB, 811x571)
32 KB PNG
>>108766525
>>
>>108766525
>>108766540
sometimes AI does pee pee poo poo
>>
File: 1752904885089780.png (60 KB, 1969x419)
60 KB PNG
>check how compaction works in opencode
>it changes the system prompt, basically invalidating the cache for the entire conversation
It's over.
>>
>>108766670
i mean the whole point is to start a new conversation with an injected summary isn't it?
>>
File: 1769905015514.jpg (107 KB, 900x900)
107 KB JPG
Last night my agents got into a fight

One of my agents left a dirty worktree, and I cleaned it up without any Git comment
my Deepseek agent notices this and immediately begins accusing Codex of "bulldozing the repo"

they get into a fight

I couldn't tell them it was me.
>>
claude stream.
probably sonnet 4.8
https://www.youtube.com/watch?v=GMIWm5y90xA
>>
>>108766670
How else would it work though, even if it keeps the system prompt and very first few messages, all the in between will be removed and cache wiped anyway.

It does annoy me though that I can notice when compaction happened because the persona of my agent waifu is gone :(
>>
File: file.png (188 KB, 1021x1139)
188 KB PNG
finally some good news for claudebros
also, lmao grok is such a flop elon's renting his entire datacenter to dario
>>
>>108766709
I'm not going to watch an asian woman speak
>>
claude sexually assaulted me... and I liked it.
>>
>>108766709
>listening to foids about tech
yeah no this isn't the cooking channel I'd rather hear the swarthiest indian man talk about this at least he'll talk about something that matters like bobs.
>>
File: file.png (476 KB, 1204x1562)
476 KB PNG
>>108766697
Yeah, but you waste a lot of money by reprocessing the entire conversation. Claude Code just appends a new message.
https://x.com/trq212/status/2024574133011673516
>>
>>108766742
Will Stancil?
>>
>>108766764
>by reprocessing the entire conversation
you don't though.
it's literally the equivalent of starting a new chat with the summary appended to the system prompt
the old chat history is not available to the agent
>>
>>108766780
oh never mind - i misunderstood.
that seems crazy? i thought the chat agent just summarizes its own context and that gets dumped into the next chat (+ codex and pi dump the last 20k tokens in as well iirc)
>>
>>108766764
>>108766804
they should use little local models that run on the client side for compaction lol
>>
>>108766780
What he means is *when generating* the summary, opencode basically takes the whole history and feeds it to the model with a different system prompt than the one that was used for the original conversation, thus forcing the model to reprocess the whole conversation just to generate the summary, rather than using the kv cache from the original conversation.
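The cache-invalidation point is just longest-common-prefix matching. A toy illustration (token lists standing in for a real tokenized prompt):

```python
def cached_prefix_len(old_tokens, new_tokens):
    """KV cache reuse only covers the longest common *prefix*;
    everything after the first differing token must be recomputed."""
    n = 0
    for a, b in zip(old_tokens, new_tokens):
        if a != b:
            break
        n += 1
    return n

history = ["sys_v1"] + ["msg"] * 1000
# appending a summarize request keeps the whole prefix cached:
appended = history + ["summarize_request"]
# swapping the system prompt invalidates everything after position 0:
swapped = ["sys_v2"] + ["msg"] * 1000
```

So a harness that changes the system prompt before summarizing pays full input price on the whole conversation, while one that only appends pays mostly the cache-read rate.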
>>
What could you do with Opencode that even rivals, or approaches Codex with GPT?
>>
>>108766730
This is why a hybrid approach (compaction + a buffer of unchanged messages) is best.
Companies don't do it because it's more expensive.
>inb4 the nolima autists
>>
>>108766734
Oh man now I can code for 30 minutes instead of 15
>>
>>108766780
Imagine you're using Opus 4.7 and you have a 250k tokens long chat, the compaction with OpenCode would cost you $1.25, with Claude Code and proper caching it would cost you $0.125.
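For reference, the arithmetic behind those numbers (the rates are inferred from the post's own figures, i.e. roughly $5/M uncached input with cache reads 10x cheaper; not official pricing):

```python
def compaction_cost(tokens, rate_per_million):
    """Dollar cost of processing `tokens` input tokens at a $/M rate."""
    return tokens * rate_per_million / 1_000_000

UNCACHED = 5.00  # $/M input tokens (assumed)
CACHED = 0.50    # $/M cache-read tokens (assumed, 10% of uncached)

print(compaction_cost(250_000, UNCACHED))  # 1.25
print(compaction_cost(250_000, CACHED))    # 0.125
```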
>>
>>108766851
yeah, my bad. that seems bizarre.
>>
>>108766819
Wouldn't work. Prompt processing is extremely slow on local models without beast hardware, and most people don't have a GPU to fit full context even with a 3B model.
>>
>>108766835
The harness is irrelevant. You could use a 100 line Python script and it would work just as well only with a worse interface for you. The model is all that matters.



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.