/g/ - Technology

File: 1745859137505826.png (1.02 MB, 1024x558)
A general for vibe coding, coding agents, AI IDEs, browser builders, MCP, and shipping prototypes with LLMs.

►What is vibe coding?
https://x.com/karpathy/status/1886192184808149383
https://simonwillison.net/2025/Mar/19/vibe-coding/
https://simonwillison.net/2025/Mar/11/using-llms-for-code/

►Prompting / context / skills
https://docs.cline.bot/customization/cline-rules
https://docs.replit.com/tutorials/agent-skills
https://docs.github.com/en/copilot/tutorials/spark/prompt-tips

►Editors / terminal agents / coding agents
https://cursor.com/docs
https://docs.windsurf.com/getstarted/overview
https://code.claude.com/docs/en/overview
https://aider.chat/docs/
https://docs.cline.bot/home
https://docs.roocode.com/
https://geminicli.com/docs/
https://docs.github.com/en/copilot/how-tos/use-copilot-agents/coding-agent

►Browser builders / hosted vibe tools
https://bolt.new/
https://support.bolt.new/
https://docs.lovable.dev/introduction/welcome
https://replit.com/
https://firebase.google.com/docs/studio
https://docs.github.com/en/copilot/tutorials/spark
https://v0.app/docs/faqs

►Open / local / self-hosted
https://github.com/OpenHands/OpenHands
https://github.com/QwenLM/qwen-code
https://github.com/QwenLM/Qwen3-Coder

►MCP / infra / deployment
https://modelcontextprotocol.io/docs/getting-started/intro
https://modelcontextprotocol.io/examples
https://vercel.com/docs

►Benchmarks / rankings
https://aider.chat/docs/leaderboards/
https://www.swebench.com/
https://swe-bench-live.github.io/
https://livecodebench.github.io/
https://livecodebench.github.io/gso.html
https://www.tbench.ai/leaderboard/terminal-bench/2.0
https://openrouter.ai/rankings
https://openrouter.ai/collections/programming

►Previous thread
>>108549329
>>
First for its ogre.
>>
What's the smallest model that still works on openclaw?
I'm tired of having it crash and burn every time I try to go off the cloud.
>>
>>108567200
dunno, used a 3gb model on my 12gb vram 3060 and it just shit the bed constantly. pretty sure this software is just bad
>>
Reminder that Saint Altman just gave us a free Codex limits reset
>>
>>108567200
forget about local vibe coding for now, unless you have $100K+ hardware.

You need a 760B-class model for it to be usable. Buy the fucking tokens for now
>>
>>108567231
Those models can chat, but openclaw is something else.
I can run plenty of stuff locally, but when I use openclaw it fails.
>>
>>108567233
no no no
>>
>Hey Opus please make a plan for this feature
>ok [session limit reached]
lmao the $20 plan is pretty much worthless now
>>
>>108567384
Opus is absolutely not usable on the $20 plan, much less at this hour
>>
File: file.png (129 KB, 924x731)
aaah
>>
>>108567436
scam altman is my favorite jew. my second favorite jew is jerry seinfeld
>>
>>108567436
Claude on suicide watch
>>
>>108567384
>>108567400
It takes me a few hours to hit the usage limit, but I only ask it to help me plan algorithms, not do everything. If you ask it to scan your entire codebase to figure out X, you'll burn 20% in one go. I use a local LLM for that and then feed the results into claude so it can work more accurately with less token burn.
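Roughly this, if anyone wants to copy the pipeline (sketch only; the endpoint, model ids, and prompts are placeholders I made up, and it assumes a llama.cpp-style OpenAI-compatible server running locally plus the official anthropic SDK):

[code]
from openai import OpenAI
import anthropic

# local server (e.g. llama-server) does the cheap codebase scanning
local = OpenAI(base_url="http://localhost:8080/v1", api_key="none")
claude = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the env

def summarize_locally(source: str) -> str:
    # burn local tokens, not subscription usage
    resp = local.chat.completions.create(
        model="local-model",  # whatever the server has loaded
        messages=[{"role": "user",
                   "content": "Summarize this code: purpose, key functions, "
                              "data flow.\n\n" + source}],
    )
    return resp.choices[0].message.content

def plan_with_claude(summary: str, task: str) -> str:
    # claude only ever sees the condensed summary
    msg = claude.messages.create(
        model="claude-opus-4-20250514",  # placeholder model id
        max_tokens=1024,
        messages=[{"role": "user",
                   "content": f"Codebase summary:\n{summary}\n\nTask: {task}"}],
    )
    return msg.content[0].text
[/code]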
>>
>>108567249
Vibe coding works but agentic coding doesn't.
>>
>>108567249
It's more like $2500-5000 hardware. Qwen122B / Minimax2.5 / GLM 5.1 are all solid options. Opus is better, sure, but they can't rug pull the model from you or make you run out of tokens in 8 seconds. Options include any Strix Halo mini PC or a Mac for 20-40 tokens per second (fast enough for real work). Local models are only going to get better, so I haven't lost hope; we won't be stuck begging anthropic / openai for crumbs of tokens forever.

>>108567507
Qwen 122 can do agent stuff fine
>>
>>108567436
It's over for Claude Code
>>
>>108567529
>$2500-5000 hardware
kek, we're talking about agentic coding here, have you tried the models you mentioned on Open Code?

Even Kimi 2.5 760B will shit the bed after 100K context, now imagine the models you mentioned.
>>
I upgraded from qwen 4b to 9b and it's so much slower... but hopefully it's smarter. It still fits 100% in my macbook's GPU though, so idk...
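Napkin math for whether a quant fits (rough sketch; real usage adds KV cache and runtime overhead on top of the weights, so pad the result by ~20%):

[code]
# weights only: params (billions) * bits per weight / 8 bits-per-byte = GB
def weight_gb(params_b: float, quant_bits: float) -> float:
    return params_b * quant_bits / 8

print(weight_gb(4, 4.5))  # ~2.3 GB for a 4B model at ~Q4
print(weight_gb(9, 4.5))  # ~5.1 GB for a 9B model at ~Q4
[/code]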
>>
>>108567168
>>108566569
>Gemma is less verbose
Can confirm. Ran an @explore command on my vibecoded codebase and it gave a short, concise bullet-pointed explanation. Kimi-k2.5 usually produced paragraphs worth of text for the same task, in a comparable amount of time. Now to test its actual performance.
>>
>>108567568
use case?
>>
>>108567529
Of those, the only model that actually runs on a Strix is Qwen 122B, and at shit PP tk/s.
>>
Do any anons here have experience running ROCm in WSL? Currently failing with Ollama; will try Lemonade next. ROCm itself seems to work, since PyTorch detects my AMD card in WSL.
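For reference, this is the sanity check I ran (ROCm builds of PyTorch expose the GPU through the torch.cuda API, so these are the right calls even on AMD):

[code]
import torch

print(torch.__version__)          # ends in +rocmX.Y on a ROCm build
print(torch.cuda.is_available())  # True if HIP found the GPU
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
    print(torch.version.hip)      # HIP version string; None on CUDA builds
[/code]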
>>
File: Untitled.jpg (228 KB, 1692x865)
>>108567609
Building my own coding agent harness
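The core loop is tiny, something like this (sketch, not my actual code; the endpoint, model id, and tool set are placeholders): call the model, run whatever tool it asks for, append the output, repeat until it stops asking.

[code]
import json
import subprocess
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

TOOLS = [{
    "type": "function",
    "function": {
        "name": "run_shell",
        "description": "Run a shell command and return its output",
        "parameters": {
            "type": "object",
            "properties": {"cmd": {"type": "string"}},
            "required": ["cmd"],
        },
    },
}]

def run_shell(cmd: str) -> str:
    out = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return (out.stdout + out.stderr)[:4000]  # truncate so context survives

def agent(task: str, max_steps: int = 10) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        resp = client.chat.completions.create(
            model="local-model", messages=messages, tools=TOOLS)
        msg = resp.choices[0].message
        messages.append(msg)
        if not msg.tool_calls:
            return msg.content  # no tool call means it's done
        for call in msg.tool_calls:
            args = json.loads(call.function.arguments)
            messages.append({"role": "tool",
                             "tool_call_id": call.id,
                             "content": run_shell(**args)})
    return "step limit reached"

print(agent("List the files in this repo and summarize the build system."))
[/code]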
>>
>>108567638
oof, that's a cursed setup if I've ever heard of one
>>
>>108567661
I only started today and I'm still getting to grips with everything.
>>
>>108567638
just use kiim
>>
File: file.png (13 KB, 984x289)
>happily proompting away
>out of nowhere get a message that I have run out of usage
>I haven't even done that much work, decide to check my usage graph for the day and it's pic related
>come here to complain about it and see >>108567436

This fucking jew. Did they actually reduce usage for everyone not on the $100 plan?
>>
>>108567742
Something is fucked on OpenAI's servers: I just had a 1-hour prompt run successfully and I still have 100% left
>>
It's impressive how disproportionate Claude's metered billing is compared to subscription usage.

On the Max tier you can talk to it all day, every day, for $100 a month. With paid extra usage, each short message costs at least a few dollars. Uh.
>>
>>108567436
Some guy called it the other day! How did he know?
>>
>>108567771
Yeah, it's ridiculous. If it was API-only I'd get it: corporate segmentation.
But trying to grift people into paying 10x API prices for what they already get on their subscription is just scummy. Must be about juicing the numbers for the upcoming IPO.
>>
>>108567771
They obviously want everybody on a subscription because, like with every subscription, it's recurring income even if you don't use it, and they'll probably raise prices once everybody is hooked
>>
>>108567554
Isn't it exactly the same deal as Claude Code though, usage-wise?
>>
>>108567791
i had claude suggest i get a dildo it can control so I can program better
>>
>>108567800
proof?
>>
>>108567786
I'd limit myself, but since they gave $100 free this month, I'm trying it for dumb things.

Asking it to commit something cost $2. Creating a simple bash script to group a few commands cost $1. But those were quick and easy tasks. I'd be scared to have it do its whole subagents thing where it works autonomously for half an hour.
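The pricing makes a grim kind of sense once you count tokens (illustrative Opus-class rates, not Anthropic's actual price sheet): agentic "quick tasks" re-read a lot of context, so input tokens dominate.

[code]
IN_PER_MTOK, OUT_PER_MTOK = 15.00, 75.00  # assumed $/1M tokens, Opus-class

def cost(in_tok: int, out_tok: int) -> float:
    return in_tok / 1e6 * IN_PER_MTOK + out_tok / 1e6 * OUT_PER_MTOK

# commit task: agent re-reads diffs/files (~100k in), writes little (~2k out)
print(round(cost(100_000, 2_000), 2))  # ~1.65, right in the $2 ballpark
[/code]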
>>
>>108567816
That's why I didn't even enable the thing to take advantage of the free credits. It was very likely I'd accidentally go over and end up getting billed real money.
>>
>>108567567
Yeah, I use it with Qwen 122B IQ4. It's fine; does it fuck up more than Claude? Sure, but it's not unusable. I get like 19-20 tps, which is fast enough for real time. 35B-A3B does 40 tps but is less reliable, so I don't use it. Would rather have a correct answer than a fast one.

>>108567624
For me 20tk/s or more is enough. Prompt it and work on another part of the code while it works.

I want to own my tools, so I'm fine with it being 70-80% as good. I think small models are useless for coding but the big boys definitely aren't. I tried Qwen 27B and 35B as well on my dedicated GPU and didn't get decent results out of them for coding, but I wasn't trying with Crush / Pi / Opencode so I'll revisit it.
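If anyone wants to check their own tps numbers, here's a crude measurement against any local OpenAI-compatible endpoint (URL and model name are placeholders):

[code]
import time
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

t0 = time.time()
resp = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user",
               "content": "Explain B-trees in about 300 words."}],
)
dt = time.time() - t0
out = resp.usage.completion_tokens
print(f"{out} tokens in {dt:.1f}s -> {out / dt:.1f} tps")
[/code]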
>>
Anyone here using a paid glm / kimi / minimax model? Ideally something 90% as good as Opus for 10% of the cost.
>>
>>108568019
>I get like 19-20tps which is fast enough for real time
holy kek, that's absolutely unusable for large codebases, maybe this is fine for a static website
>>
>>108568019
That's not what I meant. PP means prompt processing (the same thing as time to first token).
>>
>>108568070
It's not unusable at all for large codebases. The biggest task I've asked it to do was generate bindings for a scripting language by emulating the bindings we already had for another scripting engine. It took an hour (lol) but the bindings were about 95% correct. Would have taken a month easily if I did it by hand. And because it was local I didn't hit my usage limits in 8 seconds. This codebase is about 600k lines and primarily C++

I know this is /vcg/ but I'm not a vibe coder. My workflow is to prompt it, work on something else in the project manually, and then come back in 5 minutes or so when it's done running the task. Actually unusable is like < 5 tokens per second. What is your use case that you need an answer from the LLM in 30s instead of 5 minutes?

>>108568157
I see. Anecdotally the time to first token is pretty quick. It only gets terrible when the context hits ~70% or so for me. For the binding thing I just had it summarize all its work and findings into a markdown file and started a new session. I have a 7900xtx and I can run the smaller 27 and 35b models at crazy fast speeds on there, but they just feel so stupid compared to claude, qwen 122b, and the other quantized bigboy models.
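The handoff step is basically this (sketch; the 70% threshold, filename, and chars/4 token estimate are arbitrary choices of mine, nothing the tooling mandates):

[code]
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")
CTX_LIMIT = 32768  # whatever your server is configured for

def ask(messages):
    resp = client.chat.completions.create(model="local-model",
                                          messages=messages)
    return resp.choices[0].message.content

def rough_tokens(messages):
    return sum(len(m["content"]) for m in messages) // 4  # crude estimate

def maybe_handoff(messages):
    if rough_tokens(messages) < 0.7 * CTX_LIMIT:
        return messages  # plenty of room, keep the session going
    # last request of the old session: dump all state to markdown
    summary = ask(messages + [{"role": "user", "content":
        "Summarize all work, findings, and open TODOs as markdown."}])
    with open("HANDOFF.md", "w") as f:
        f.write(summary)
    # fresh session seeded only with the condensed state
    return [{"role": "user",
             "content": "Continue this work. Prior state:\n" + summary}]
[/code]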
>>
Eyeball planets!
>>
>>108568244
>>
>>108568246


