/g/ - Technology

File: 1745859137505826.png (1.02 MB, 1024x558)
A general for vibe coding, coding agents, AI IDEs, browser builders, MCP, and shipping prototypes with LLMs.

►What is vibe coding?
https://x.com/karpathy/status/1886192184808149383
https://simonwillison.net/2025/Mar/19/vibe-coding/
https://simonwillison.net/2025/Mar/11/using-llms-for-code/

►Prompting / context / skills
https://docs.cline.bot/customization/cline-rules
https://docs.replit.com/tutorials/agent-skills
https://docs.github.com/en/copilot/tutorials/spark/prompt-tips

►Editors / terminal agents / coding agents
https://cursor.com/docs
https://docs.windsurf.com/getstarted/overview
https://code.claude.com/docs/en/overview
https://aider.chat/docs/
https://docs.cline.bot/home
https://docs.roocode.com/
https://geminicli.com/docs/
https://docs.github.com/en/copilot/how-tos/use-copilot-agents/coding-agent

►Browser builders / hosted vibe tools
https://bolt.new/
https://support.bolt.new/
https://docs.lovable.dev/introduction/welcome
https://replit.com/
https://firebase.google.com/docs/studio
https://docs.github.com/en/copilot/tutorials/spark
https://v0.app/docs/faqs

►Open / local / self-hosted
https://github.com/OpenHands/OpenHands
https://github.com/QwenLM/qwen-code
https://github.com/QwenLM/Qwen3-Coder

►MCP / infra / deployment
https://modelcontextprotocol.io/docs/getting-started/intro
https://modelcontextprotocol.io/examples
https://vercel.com/docs

►Benchmarks / rankings
https://aider.chat/docs/leaderboards/
https://www.swebench.com/
https://swe-bench-live.github.io/
https://livecodebench.github.io/
https://livecodebench.github.io/gso.html
https://www.tbench.ai/leaderboard/terminal-bench/2.0
https://openrouter.ai/rankings
https://openrouter.ai/collections/programming

►Previous thread
>>108549329
>>
First for its ogre.
>>
What's the smallest model that still works on openclaw?
I'm tired of having it crash and burn every time I try to go off the cloud.
>>
>>108567200
dunno, used a 3gb model on my 12gb vram 3060 and it just shit the bed constantly. pretty sure this software is just bad
>>
Reminder that Saint Altman just gave us a free Codex limits reset
>>
>>108567200
forget about local vibe coding for now, unless you have $100K+ hardware.

You need a 760B-class model for it to be usable. Buy the fucking tokens for now
>>
>>108567231
Those models can chat, but openclaw is something else.
I can run plenty of stuff locally, but when I use openclaw it fails.
>>
>>108567233
no no no
>>
>Hey Opus please make a plan for this feature
>ok [session limit reached]
lmao the $20 plan is pretty much worthless now
>>
>>108567384
Opus is absolutely not usable on the $20 plan, much less at this hour
>>
File: file.png (129 KB, 924x731)
aaah
>>
>>108567436
scam altman is my favorite jew. my second favorite jew is jerry seinfeld
>>
>>108567436
Claude on suicide watch
>>
>>108567384
>>108567400
It takes me a few hours to hit the usage limit, but I only ask it to help me plan algorithms, not do everything. If you ask it to scan your entire codebase to figure out X, you'll burn 20% in one go. I use a local LLM for that and then feed the results into claude so it can work more accurately with less token burn.
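Roughly this, if anyone wants to copy the pipeline (sketch only; the endpoint, model ids, and prompts are placeholders I made up, and it assumes a llama.cpp-style OpenAI-compatible server running locally plus the official anthropic SDK):

[code]
from openai import OpenAI
import anthropic

# local server (e.g. llama-server) does the cheap codebase scanning
local = OpenAI(base_url="http://localhost:8080/v1", api_key="none")
claude = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the env

def summarize_locally(source: str) -> str:
    # burn local tokens, not subscription usage
    resp = local.chat.completions.create(
        model="local-model",  # whatever the server has loaded
        messages=[{"role": "user",
                   "content": "Summarize this code: purpose, key functions, "
                              "data flow.\n\n" + source}],
    )
    return resp.choices[0].message.content

def plan_with_claude(summary: str, task: str) -> str:
    # claude only ever sees the condensed summary
    msg = claude.messages.create(
        model="claude-opus-4-20250514",  # placeholder model id
        max_tokens=1024,
        messages=[{"role": "user",
                   "content": f"Codebase summary:\n{summary}\n\nTask: {task}"}],
    )
    return msg.content[0].text
[/code]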
>>
>>108567249
Vibe coding works but agentic coding doesn't.
>>
>>108567249
It's more like $2500-5000 hardware. Qwen122B / Minimax2.5 / GLM 5.1 are all solid options. Opus is better, sure, but they can't rug pull the model from you or make you run out of tokens in 8 seconds. Options include any Strix Halo mini PC or a Mac for 20-40 tokens per second (fast enough for real work). Local models are only going to get better, so I haven't lost hope; we won't be stuck begging anthropic / openai for crumbs of tokens forever.

>>108567507
Qwen 122 can do agent stuff fine
>>
>>108567436
It's over for Claude Code
>>
>>108567529
>$2500-5000 hardware
kek, we're talking about agentic coding here, have you tried the models you mentioned on Open Code?

Even Kimi 2.5 760B will shit the bed after 100K context, now imagine the models you mentioned.
>>
I upgraded from qwen 4b to 9b and it's so much slower... but hopefully it's smarter. It still fits 100% in my macbook's GPU though, so idk...
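Napkin math for whether a quant fits (rough sketch; real usage adds KV cache and runtime overhead on top of the weights, so pad the result by ~20%):

[code]
# weights only: params (billions) * bits per weight / 8 bits-per-byte = GB
def weight_gb(params_b: float, quant_bits: float) -> float:
    return params_b * quant_bits / 8

print(weight_gb(4, 4.5))  # ~2.3 GB for a 4B model at ~Q4
print(weight_gb(9, 4.5))  # ~5.1 GB for a 9B model at ~Q4
[/code]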
>>
>>108567168
>>108566569
>Gemma is less verbose
Can confirm. Ran an @explore command on my vibecoded codebase and it gave a short, concise bullet-pointed explanation. Kimi-k2.5 usually produced paragraphs worth of text for the same task, in a comparable amount of time. Now to test its actual performance.
>>
>>108567568
use case?
>>
>>108567529
Of those, the only model that actually runs on a Strix is Qwen 122B, and at shit PP tk/s.
>>
Do any anons here have experience running ROCm in WSL? Currently failing with Ollama; will try Lemonade next. ROCm itself seems to work, since PyTorch detects my AMD card in WSL.
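For reference, this is the sanity check I ran (ROCm builds of PyTorch expose the GPU through the torch.cuda API, so these are the right calls even on AMD):

[code]
import torch

print(torch.__version__)          # ends in +rocmX.Y on a ROCm build
print(torch.cuda.is_available())  # True if HIP found the GPU
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
    print(torch.version.hip)      # HIP version string; None on CUDA builds
[/code]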
>>
File: Untitled.jpg (228 KB, 1692x865)
>>108567609
Building my own coding agent harness
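The core loop is tiny, something like this (sketch, not my actual code; the endpoint, model id, and tool set are placeholders): call the model, run whatever tool it asks for, append the output, repeat until it stops asking.

[code]
import json
import subprocess
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

TOOLS = [{
    "type": "function",
    "function": {
        "name": "run_shell",
        "description": "Run a shell command and return its output",
        "parameters": {
            "type": "object",
            "properties": {"cmd": {"type": "string"}},
            "required": ["cmd"],
        },
    },
}]

def run_shell(cmd: str) -> str:
    out = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return (out.stdout + out.stderr)[:4000]  # truncate so context survives

def agent(task: str, max_steps: int = 10) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        resp = client.chat.completions.create(
            model="local-model", messages=messages, tools=TOOLS)
        msg = resp.choices[0].message
        messages.append(msg)
        if not msg.tool_calls:
            return msg.content  # no tool call means it's done
        for call in msg.tool_calls:
            args = json.loads(call.function.arguments)
            messages.append({"role": "tool",
                             "tool_call_id": call.id,
                             "content": run_shell(**args)})
    return "step limit reached"

print(agent("List the files in this repo and summarize the build system."))
[/code]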
>>
>>108567638
oof, that's a cursed setup if I've ever heard of one
>>
>>108567661
I only started today and I'm still getting to grips with everything.
>>
>>108567638
just use kiim
>>
File: file.png (13 KB, 984x289)
>happily proompting away
>out of nowhere get a message that I have run out of usage
>I haven't even done that much work, decide to check my usage graph for the day and it's pic related
>come here to complain about it and see >>108567436

This fucking jew. Did they actually reduce usage for everyone not on the $100 plan?
>>
>>108567742
Something is fucked on OpenAI's servers: I just had a 1-hour prompt run successfully and I still have 100% left
>>
It's impressive how disproportionate Claude's metered billing is compared to subscription usage.

On the Max tier you can talk to it all day, every day, for $100 a month. With paid extra usage, each short message costs at least a few dollars. Uh.
>>
>>108567436
Some guy called it the other day! How did he know?
>>
>>108567771
Yeah, it's ridiculous. If it was API-only I'd get it: corporate segmentation.
But trying to grift people into paying 10x API prices for what they already get on their subscription is just scummy. Must be about juicing the numbers for the upcoming IPO.
>>
>>108567771
They obviously want everybody on a subscription because, like with every subscription, it's recurring income even if you don't use it, and they'll probably raise prices once everybody is hooked
>>
>>108567554
Isn't it exactly the same deal as Claude Code though, usage-wise?
>>
>>108567791
i had claude suggest i get a dildo it can control so I can program better
>>
>>108567800
proof?
>>
>>108567786
I'd limit myself, but since they gave $100 free this month, I'm trying it for dumb things.

Asking it to commit something cost $2. Creating a simple bash script to group a few commands cost $1. But those were quick and easy tasks. I'd be scared to have it do its whole subagents thing where it works autonomously for half an hour.
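The pricing makes a grim kind of sense once you count tokens (illustrative Opus-class rates, not Anthropic's actual price sheet): agentic "quick tasks" re-read a lot of context, so input tokens dominate.

[code]
IN_PER_MTOK, OUT_PER_MTOK = 15.00, 75.00  # assumed $/1M tokens, Opus-class

def cost(in_tok: int, out_tok: int) -> float:
    return in_tok / 1e6 * IN_PER_MTOK + out_tok / 1e6 * OUT_PER_MTOK

# commit task: agent re-reads diffs/files (~100k in), writes little (~2k out)
print(round(cost(100_000, 2_000), 2))  # ~1.65, right in the $2 ballpark
[/code]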
>>
>>108567816
That's why I didn't even enable the thing to take advantage of the free credits. It was very likely I'd accidentally go over and end up getting billed real money.
>>
>>108567567
Yeah, I use it with Qwen 122B IQ4. It's fine; does it fuck up more than Claude? Sure, but it's not unusable. I get like 19-20 tps, which is fast enough for real time. 35B-A3B does 40 tps but is less reliable, so I don't use it. Would rather have a correct answer than a fast one.

>>108567624
For me 20tk/s or more is enough. Prompt it and work on another part of the code while it works.

I want to own my tools, so I'm fine with it being 70-80% as good. I think small models are useless for coding but the big boys definitely aren't. I tried Qwen 27B and 35B as well on my dedicated GPU and didn't get decent results out of them for coding, but I wasn't trying with Crush / Pi / Opencode so I'll revisit it.
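If anyone wants to check their own tps numbers, here's a crude measurement against any local OpenAI-compatible endpoint (URL and model name are placeholders):

[code]
import time
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

t0 = time.time()
resp = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user",
               "content": "Explain B-trees in about 300 words."}],
)
dt = time.time() - t0
out = resp.usage.completion_tokens
print(f"{out} tokens in {dt:.1f}s -> {out / dt:.1f} tps")
[/code]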
>>
Anyone here using a paid glm / kimi / minimax model? Ideally something 90% as good as Opus for 10% of the cost.
>>
>>108568019
>I get like 19-20tps which is fast enough for real time
holy kek, that's absolutely unusable for large codebases, maybe this is fine for a static website
>>
>>108568019
That's not what I meant. PP means prompt processing (the same thing as time to first token).
>>
>>108568070
It's not unusable at all for large codebases. The biggest task I've asked it to do was generate bindings for a scripting language by emulating the bindings we already had for another scripting engine. It took an hour (lol) but the bindings were about 95% correct. Would have taken a month easily if I did it by hand. And because it was local I didn't hit my usage limits in 8 seconds. This codebase is about 600k lines and primarily C++

I know this is /vcg/ but I'm not a vibe coder. My workflow is to prompt it, work on something else in the project manually, and then come back in 5 minutes or so when it's done running the task. Actually unusable is like < 5 tokens per second. What is your use case that you need an answer from the LLM in 30s instead of 5 minutes?

>>108568157
I see. Anecdotally the time to first token is pretty quick. It only gets terrible when the context hits ~70% or so for me. For the binding thing I just had it summarize all its work and findings into a markdown file and started a new session. I have a 7900xtx and I can run the smaller 27 and 35b models at crazy fast speeds on there, but they just feel so stupid compared to claude, qwen 122b, and the other quantized bigboy models.
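The handoff step is basically this (sketch; the 70% threshold, filename, and chars/4 token estimate are arbitrary choices of mine, nothing the tooling mandates):

[code]
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")
CTX_LIMIT = 32768  # whatever your server is configured for

def ask(messages):
    resp = client.chat.completions.create(model="local-model",
                                          messages=messages)
    return resp.choices[0].message.content

def rough_tokens(messages):
    return sum(len(m["content"]) for m in messages) // 4  # crude estimate

def maybe_handoff(messages):
    if rough_tokens(messages) < 0.7 * CTX_LIMIT:
        return messages  # plenty of room, keep the session going
    # last request of the old session: dump all state to markdown
    summary = ask(messages + [{"role": "user", "content":
        "Summarize all work, findings, and open TODOs as markdown."}])
    with open("HANDOFF.md", "w") as f:
        f.write(summary)
    # fresh session seeded only with the condensed state
    return [{"role": "user",
             "content": "Continue this work. Prior state:\n" + summary}]
[/code]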
>>
Eyeball planets!
>>
>>108568244
>>
>>108568246


