/g/ - Technology


File: 1769393355149528.png (1.06 MB, 1024x558)
A general for vibe coding, coding agents, AI IDEs, browser builders, MCP, and shipping prototypes with LLMs.

►What is vibe coding?
https://x.com/karpathy/status/1886192184808149383
https://simonwillison.net/2025/Mar/19/vibe-coding/
https://simonwillison.net/2025/Mar/11/using-llms-for-code/

►Prompting / context / skills
https://docs.cline.bot/customization/cline-rules
https://docs.replit.com/tutorials/agent-skills
https://docs.github.com/en/copilot/tutorials/spark/prompt-tips

►Editors / terminal agents / coding agents
https://cursor.com/docs
https://docs.windsurf.com/getstarted/overview
https://code.claude.com/docs/en/overview
https://aider.chat/docs/
https://docs.cline.bot/home
https://docs.roocode.com/
https://geminicli.com/docs/
https://docs.github.com/en/copilot/how-tos/use-copilot-agents/coding-agent

►Browser builders / hosted vibe tools
https://bolt.new/
https://support.bolt.new/
https://docs.lovable.dev/introduction/welcome
https://replit.com/
https://firebase.google.com/docs/studio
https://docs.github.com/en/copilot/tutorials/spark
https://v0.app/docs/faqs

►Open / local / self-hosted
https://github.com/OpenHands/OpenHands
https://github.com/QwenLM/qwen-code
https://github.com/QwenLM/Qwen3-Coder

►MCP / infra / deployment
https://modelcontextprotocol.io/docs/getting-started/intro
https://modelcontextprotocol.io/examples
https://vercel.com/docs

►Benchmarks / rankings
https://aider.chat/docs/leaderboards/
https://www.swebench.com/
https://swe-bench-live.github.io/
https://livecodebench.github.io/
https://livecodebench.github.io/gso.html
https://www.tbench.ai/leaderboard/terminal-bench/2.0
https://openrouter.ai/rankings
https://openrouter.ai/collections/programming

►Previous thread
>>108504430
>>
so is gemma 4 better than gpt 5.3?
>>
I use AI and I still look like the guy on the left.
>>
>>108526147
proof?
>>
>>108525828
You probably can if you're smart enough, but making a virus that kills humans extremely efficiently is probably orders of magnitude easier than making a virus that makes CPUs.
>>
>>108526069
ya, if not then skill issue
>>
>>108526043
I sell blood to the hospital from time to time. Also my grandparents gift me money on birthdays.
>>
i wonder how the models got rlvr'ed into writing monolithic files
>>
Is there any good reason to use claude code over claude desktop with mcp servers?
>>
>>108526499
good question

>>108526532
dunno never used claude desktop but probably not much of a difference
>>
File: file.png (128 KB, 1280x720)
>>
https://github.com/anomalyco/opencode/

Out of curiosity--and perhaps ignorance--why don't people use OpenCode? I've been using it with omnicoder:9b for planning and Qwen3 Coder A3B for coding. I've tried some others like aider, cursor, codex, and claude code and I've felt I've gotten the best experience from OpenCode--well minus their absolute shit need to manually configure models.
>>
>>108526968
i was going to say there are plenty of mentions of opencode in the threads and that it's in the op, but for some reason it isn't in the op. lol codex isn't in the op either and it's the most popular thing in the threads.
>>
They're not quite Blender-level planet surfaces but it generates near instantly so I'm happy with it for now. Gonna use these visual profiles to generate simple 2D icons to replace those shitty placeholder spheres I've had since the beginning. And find a way to render the spheres without turning my phone into a nuclear reactor.
>>
>>108526968
I'm using it. The harness seems really good but the desktop version is still in beta and it's annoying to do some things that aren't fully implemented or documented yet, and it seems really dumb that you can't edit files in the viewer if you want, maybe that's planned.
>>
They cut off claude for openclaw.
You can't use openclaw with it anymore what do we do bros?
>>
it's very tempting to just spam "what are the next steps" and "continue on next steps" without looking at what it's actually doing at this point

they love to add lines of code, i think i'll need to refactor soon and strip out the parts that don't add much value
>>
>>108527052
>>108527371

That's awesome to hear! What are y'all using for backends? I'm still on Ollama but I want to give vLLM a try for the Ray support--wanna give a shot at splitting the kv cache onto a different machine's GPU.
>>
>>108527502
They're mostly doing this so people move to Cowork right?
>>
>>108527502
Pay what you owe or be left behind.
>>
>>108527547
i'm an api fag, paying the altman toll
>>
>>108527550
Should I move on to cowork because it's better than open source stuff?
>>
>>108527547
Not much I can run through ollama on a 1050 Ti
>>
>>108527547
glm-5 on nvidia nim, slow but surprisingly capable
it has some kind of whitespace compression issue that makes python unworkable but it's fine in languages where it only needs to get the brackets in the right place and doesn't have to fight with indentation levels
>>
>>108527550
They want vertical integration. They want whatever moat they can get from people being locked in to their proprietary client side software.
The reason of "well umm they put too much load on our servers" might be a distant secondary reason but the primary reason is 100% that they want lock-in. They banned opencode as well and there is no way the tiny % of users using opencode are putting that much load on their servers from slightly worse caching or using opus instead of sonnet for agents.
>>
>>108527563
>>108527583
Fair enough, we use what's best. However I will ask: why not save up ~$200 and get an RTX 3060 12GB? It is/was a budget sweet spot for me personally. Threw them in an old HP Z620--def not the fastest CPU, but again, budget.

>>108527598
Any advantages of nim? I don't know much about it, it gave me a more "professional" vibe than I was looking for when I started with Ollama. I suspect it's capable of some pretty nifty stuff with it being from nvidia 'nd all. GLM-5 is still a bit much for my setup, can you "lazy prompt" it: "the app needs to be more efficient, make a plan", "execute your plan and suggest next steps"?
>>
>>108527700
tbh i do want to run local eventually, but i'd rather wait a year or two before building a machine for it.
hopefully the memory crunch won't be so fucked then and local models will offer better performance than today's frontier ones.
plus i'm only doing like $20 a month rn (might go up to 40) so it's not crazy.
>>
>>108527700
real glm-5 is way too big to run locally, like multiple-H100s tier, and the locally viable quants are pretty limited and much worse at large-context stuff. It's an order of magnitude more competent than anything I could run locally with 16gb vram.
I'm just fucking around so I don't have any privacy concerns about using a remote model, don't care much about speed and usage limits because I have no deadlines(which I've only hit once anyway) and am too cheap to pay for one of the subscription services even if claude or the new gemini might be a little smarter, so this is perfect for my uses
while I typed this it was in the background working on a bug we'd been going back and forth about, this is what I typed last
>Still no footprints, try making some unit tests
and then it worked on its own for 28 minutes, 13k tokens of just thinking, made a dozen+ edits to files, implemented a bunch of tests it could run on its own with shell commands, ran the tests, and found and fixed 95% of the bug.
>>
>>108527700
I was trying to use qwen3.5 2b but it barely did any work for me
>>
unironically best general on this god forsaken board but i fear that will not last
>>
>>108527953
it still requires some level of technical knowledge to engage with
>>
>>108527895
>I'm just fucking around so I don't have any privacy concerns about using a remote model
Tried cerebras? They promise 1000 tok/s and output around 5x cheaper than claude.
>>
>>108527568
Short term, if you need to do something, maybe? Long term, if things get more closed, everyone (except a few) loses.
>>
File: clip0002.webm (801 KB, 1280x720)
it has achieved what can technically be considered gameplay
>>
>>108528017
yooo
>>
>>108527958
it seems egos haven't come into play yet
>>
https://limewire.com/d/bZYeo#D4ZdJZY2Zw
Nothing to see here, totes not a script to restore Opus access on LMArena despite them removing the model.
>>
>>108527766
That makes sense, Qwen3.5 blew me away when I first used it. I found it much more capable than I expected. The 2b-4b ones I could use for conversations in OpenWebUI--although continuity suffers over long conversations.

>>108527895
That's pretty cool, and makes total sense. Fine-tuning, prompting, adding loras, it'll turn into a real mess fast when you try doing the same things with smaller models.

>>108527911
What were you trying to get it to do? I found that its ability to use tools at 2b wasn't great, but as a conversational "driving buddy" it did pretty well--a little dull without decent personality prompting.

>>108527978
Woah, just googled this and I didn't expect to see a free tier like that. I might give this one a shot for some planning/architecture/debug work.

>>108528017
This is really cool looking, but what am I looking at? Is this a game engine or like a render library?
>>
just started using opus and sonnet. company gives me basically unlimited access for now. In rust I'm having problems producing actual high quality code. it keeps defaulting to really lazy cloning and other heap allocations when I can write much cleaner zero-copy implementations for the same functions. opus gave me a filter that was ~100 ops/sec due to multiple allocations and a channel. my own implementation was ~22,000 ops/sec via write interception on a recursive callback. so my question is how do I guide it to stop this? telling it to reduce allocations and cloning doesn't really work.
>>
>>108528160
You don't. Stop looking at the code and focus on behavior.
>>
>>108528155
>Is this a game engine or like a render library?
asked codex to make a game engine from scratch and i've just been letting it continue for over a week now
>>
>>108528160
make no mistakes, engage ultrathi...
but seriously, if you want it to write performant code you'll probably have to throw it into an autoresearch-type loop - initially you'll almost always get slop because training (so far) rewards it for producing functioning code, not optimised code
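The autoresearch-type loop described above can be sketched in a few lines. This is a minimal harness under stated assumptions, not any particular tool's implementation: `propose_patch` stands in for a hypothetical LLM call that rewrites the current best candidate, and `benchmark` stands in for your real measurement.

```python
def benchmark(impl) -> float:
    """Score a candidate implementation (higher is better).
    In this sketch the candidate simply reports its own ops/sec."""
    return impl()

def optimize(seed_impl, propose_patch, iterations=10):
    """Greedy improvement loop: propose a rewrite, measure it,
    and keep it only if the benchmark improves (otherwise revert)."""
    best_impl, best_score = seed_impl, benchmark(seed_impl)
    for _ in range(iterations):
        candidate = propose_patch(best_impl)   # ask the model for a rewrite
        score = benchmark(candidate)
        if score > best_score:                 # accept only measured wins
            best_impl, best_score = candidate, score
    return best_impl, best_score
```

The key design point is that acceptance is gated on the measurement, not on the model's claim that the code is faster.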
>>
>>108528182
so what? set it up to benchmark until it can beat what we currently have?
>>
>>108528190
Wowee! That's really cool! I remember designing a game engine back in high school. How far are you going to take it? Do you plan on sharing the source?
>>
>>108528196
try giving it your implementation as a reference
>>
>>108528196
https://github.com/karpathy/autoresearch
>>
>>108528160
set hard requirements in the rules file
>>
>>108528219
posted a copy of the source yesterday in the last thread, not sure how far i'll end up taking it
>>
>>108528160
you have to make it clear to the agent that you want it to work for maximum performance. and always have it check its work after it's done. it also helps to have another, completely different agent, like another AI provider, review the work the first agent did, and have both keep checking each other's work until nothing is found. it's why i use both claude opus 4.6 high and codex gpt 5.4 high together.
no ai is at a state where you can just sit there, give it something vague, and have it produce top results from the start. treat ai like you would treat a human. expect the work from ai that you would from most humans. don't treat ai as if it was the top 1% of humanity, because we don't even hold ourselves to that standard. we typically go in assuming most people are retarded.
>>
>>108528245
and i find it works to first ask one agent to review, then have another agent review that review, until both agree on a single review. then move on to creating a battle plan to put the review into practice, and have both review the battle plan until they agree.
doing that i find i get an extremely high success rate of getting the work done on the first try, with the best implementation.
similar to having people work a plan before executing it.
>>
>>108528245
>>108528256
in essence, i treat myself as a manager managing a group of people to produce a project. treat yourself as a manager, and treat the agent(s) as the workers.
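The cross-review workflow above reduces to a loop that runs until neither reviewer objects. A minimal sketch, assuming `ask_claude` and `ask_codex` are hypothetical wrappers around whichever API clients you actually use, and assuming you prompt both to reply with the literal word APPROVED when satisfied:

```python
def cross_review(work: str, ask_claude, ask_codex, max_rounds=5) -> str:
    """Bounce work between two independent agents until both
    sign off (or the round budget runs out). The callables are
    placeholders for real API wrappers."""
    for _ in range(max_rounds):
        a = ask_claude(f"Review this plan/work:\n{work}")
        b = ask_codex(f"Review this review and the work:\n{a}\n{work}")
        if "APPROVED" in a and "APPROVED" in b:
            return work          # both agents agree, stop iterating
        work = b                 # fold the second agent's feedback back in
    return work                  # budget exhausted, return best effort
```

The `max_rounds` cap matters: without it, two models that keep nitpicking each other will loop (and bill you) forever.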
>>
>>108528191
>>108528196
Autonomous loops don't work for performance.

>>108528160
You will have to babysit it.
Make it write the code giving it a file with guidelines, ask it to review for performance (point out issues you already know are important, maybe create a kind of "checklist" like file with usual issues for your domain), then benchmark or read the code yourself. Ask it to fix the issues, repeat.
For complex tasks AI isn't very good yet.
>>
>>108528311
>Autonomous loops don't work for performance.
they literally do >>108528223
>>
>>108528311
>>108528318
I think it's meant more literally, like you can't just tell it "Do this until it's done".

>>108528223
Is a lot more than just single-prompt agent looping.
>>
Anthropic's anti-claw thing, and Orange Reddit defending them, pissed me off (I don't even use it, but still), so I switched to Kimi Code until I get annoyed enough to go back to Claude.
But honestly even Opus sucks for this kind of work anyway, so we'll see how long this phase lasts. Kimi 2.5 feels like the first open source model that actually kind of competes with Claude and Gemini.
I'm having it improve SASS documentation for the Nvidia GPU emulator I'm making to be able to debug my hand edited assembly code.
It feels like an extreme degree of yak shaving but also like it's the only way I can get where I want to be.
>>
>>108528318
I haven't tried autoresearch specifically, but when I put an agent to work on optimizing something hard, it tries something naive, it doesn't work, it reverts, and it repeats the same loop a couple of times until running out of context.
If you do some Pareto frontier kinda stuff what would probably happen is the frontier advances a bit, then stops advancing and now you're just waiting forever and wasting tokens.
LLMs get tunnel vision, get-there-itis and don't think outside the box or long term. It makes sense when you consider they were trained to solve one shot small problems because that's what's easy to test them for with RLVR.
At some point to advance the Pareto frontier during an optimization process you have to take a completely different approach that doesn't lead to immediate rewards and only yields rewards in the long term.
You can probably brute force it with a shit ton of tokens (like OpenAI got good scores on Arc AGI 2 by spending 100k dollars in compute per task, after all humans are a product of random evolution), but right now for hard problems it's cheaper to use your own or other people's minds rather than brute force it with AI. There are very strong diminishing returns for tokens spent.
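For concreteness, "advancing the Pareto frontier" here just means keeping every candidate that isn't dominated on any objective (say, latency and memory, lower is better on both) and discarding the rest. A minimal sketch of the bookkeeping:

```python
def dominates(a, b):
    """a dominates b if it is at least as good on every objective
    and strictly better on at least one (lower is better)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def update_frontier(frontier, candidate):
    """Reject a dominated candidate; otherwise add it and drop
    any existing points it now dominates."""
    if any(dominates(p, candidate) for p in frontier):
        return frontier          # strictly worse than something we have
    return [p for p in frontier if not dominates(candidate, p)] + [candidate]
```

The failure mode the anon describes is exactly when no proposed candidate survives `update_frontier` anymore: the loop keeps burning tokens while the frontier sits still.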
>>
>>108528381
Why aren't you using codex?
>>
Like for example I'm spending an afternoon improving documentation so I can make a GPU emulator so I can optimize kernels so I can increase the performance of my actual project.
No LLM would ever decide to go on a month long tangent to achieve a certain overall long term goal. They always opt for the quick workaround and the hack rather than a principled solution.
>>
>>108528381
Have you tried glm 5? I've been using kimi for a while but I get occasional tool-usage errors with it in opencode, where that rarely happens with codex (also through opencode).
GLM is supposed to be smarter, no?
>>
>>108528437
I don't like the extremely robotic tone codex/GPT has after GPT 5, and it works badly with my custom assistant harness. Nowadays GPT isn't even doing well on benchmarks.
Right now I pay Anthropic $200, even despite those disadvantages I might split it 100/100 if OpenAI had a $100 plan with full context, but it doesn't. I pay for the $20 plan but it doesn't do particularly better than Claude.
My dream long term is to do everything local so if I'm not gonna use Claude then I might as well try the top open weights model through API to see if it would be useful to me even if the hardware became affordable one day.
>>
>>108528480
I tried using GLM 4.7 and 5.0 and they go ultra schizo around the 100k token mark, outputting broken sentences and nonsensical code, becoming completely unusable. GLM 4.6 began to lose quality at long contexts but didn't go completely off the rails like the newer models.
>>
>>108528220
when I did, it just said it was good and copied it. there wasn't really any improvement besides auto-commenting, which is nice since I am usually a little lean on verbose comments.
>>108528228
kind of what I think I'm looking for. how would I define a hard requirement?
>>108528311
kind of what I was hoping not to hear. if it can't crank out 11/10 code it's not really worth our time. I need it to cut server usage, which essentially means beating our current implementation.
>>
```python
"""
Universal codebase mapper.
Generates MAP.md — a complete, accurate, AI-optimized snapshot of any project.

Usage:
    python generate_map.py
    python generate_map.py --ignore logs uploads tmp
    python generate_map.py --no-default-ignore --ignore .venv __pycache__
    python generate_map.py --output MYMAP.md
    python generate_map.py --root /path/to/project
"""
```

Just had claude make this, does anyone have/do something similar to prevent having to load every file and hold it in context?
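A minimal sketch of the same idea (not the anon's actual script) fits in a few lines using only the stdlib `ast` module: walk the tree, pull top-level signatures, and write them as a markdown index. File names and output format here are illustrative.

```python
import ast
from pathlib import Path

def map_python_file(path: Path) -> list[str]:
    """Return one summary line per top-level function/class."""
    tree = ast.parse(path.read_text(encoding="utf-8"))
    lines = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            lines.append(f"def {node.name}({args})")
        elif isinstance(node, ast.ClassDef):
            methods = [n.name for n in node.body
                       if isinstance(n, ast.FunctionDef)]
            lines.append(f"class {node.name}: {', '.join(methods)}")
    return lines

def generate_map(root: str, out: str = "MAP.md") -> None:
    """Write a MAP.md-style index of every .py file under root."""
    with open(out, "w", encoding="utf-8") as f:
        for path in sorted(Path(root).rglob("*.py")):
            f.write(f"## {path}\n")
            for line in map_python_file(path):
                f.write(f"- {line}\n")
```

Because it only emits signatures, the index stays small enough to sit permanently in a weak model's context, which is the whole point versus loading source files wholesale.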
>>
>>108528512
Crap, there goes my plan B, I'll keep using kimi then
>>
I think I've hit a wall with Claude Code in terms of complexity I can milk out of it from vanilla settings. I haven't had the time to read through skills, subagents, prompting, honestly because I don't know where to start, it's so overwhelming.
>>
>>108528591
What is this? To generate directory tree of all project files? Your MAP.md will constantly be out of date and (assuming your project isn't just 5 files) full of irrelevant files that claude won't need to know about for every single task. eza --tree and rg is all it needs to navigate and find which specific files it needs to load
>>
>>108528641
start with skills.
they're very simple: usually just a markdown file, and sometimes the skill folder will have some bundled assets like scripts or more docs.
but they're very effective if well written.
it's mostly just fancy, reusable prompt-writing
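For example, a skill can literally be one markdown file. The frontmatter fields and wording below are illustrative of the common convention, not a spec, so check your tool's docs for the exact format it expects:

```markdown
---
name: perf-review
description: Review a diff for allocation-heavy patterns before merging
---
When asked to review code for performance:
1. Flag heap allocations and clones inside loops.
2. Suggest a zero-copy alternative where one exists.
3. End with a short benchmark command the user can run.
```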
>>
>>108528691
Honestly, great response! My intended use case:

I have omnicoder:9b (Q8) with a 128k context; it's large, but a model that size doesn't seem to handle loading "a lot" of files at once. I use Qwen3 30b for "coding", but I find that when it has to make a change that affects a larger codebase it starts making mistakes like adding a duplicate `]` to a file.

My idea to solve this was to write this script and add it to my AGENTS.md and tell it to run it after making large changes to the codebase.

You're right though, having a tree within the file was definitely going to cause more headache than it solved. That said, now it's a "complete, accurate, AI-optimized snapshot of any project." for the purpose of giving a kind of foresight to my weaker models.

As you can probably tell, I suck with words so to quote claude, "A drop-in Python script that generates MAP.md — a semantic index of any codebase. It auto-detects your stack (Flask, FastAPI, Django, Express, React, etc.), extracts every class, function, method signature, route, import, module-level variable, and call graph from your Python and JS files, and sniffs your config from whatever format you use. Designed to sit in an LLM's context as a lightweight substitute for loading individual source files."
>>
>>108528787
>start with skills
are they generic, something that applies to all the projects I tackle? Or should I segment skills per project based on how relevant they are to it? I'm worried that if I go too deep with it they might end up being a net negative down the line
>>
>>108511745
can this anon from last thread publish his acpx wrapper as a pi extension, i couldnt find anyone else using acpx on https://pi.dev/packages , and im too retarded right now to figure out how to make a pi extension. unless you realllly dont want to publish the code then i guess ill figure it out. thx
>>
>>108528875
>It auto-detects your stack (Flask, FastAPI, Django, Express, React, etc.), extracts every class, function, method signature, route, import, module-level variable, and call graph from your Python and JS files
I see what you mean now. I personally use AGENTS.md and memory banks to keep a summary of the stack and Serena for symbolic search and lookup capability. It's not a bad idea though. Definitely simple and lightweight, if nothing else.
>>
>>108528641
What were you trying to do?
>>
>>108529080
I've got functional code, but I'm trying to optimize it, so it either gets the job done faster or uses fewer resources. Sometimes, when starting a project from scratch, I feel like it doesn't plan for all the edge cases or full functionality, or rushes the implementation. Other times, it will change things without permission, or just hallucinate when I give too few instructions. Sometimes I ask it to ask me questions, but I feel like it could have interrogated me more before continuing with the plan. It depends on the project I guess.
>>
ITT
Lazy zoomers who think they know how to code
>>
>>108529152
I see, interesting.
Could you give me a small sample of the types of projects you were trying to run it on?
Maybe a simple game engine? A SaaS program or something a bit more complex? And what types of projects do you think Claude handles better?
>>
>>108529160
>think they know how to code
nah unc i know i dont thats why i ask computer to do it fer me no cap
>>
>>108529160
>luddie unc who doesn't want to try out the new tech
yawn
>>
>>108529160
you lost
>>
What about agentic AI on mobile?
>>
Should I quit the software industry?
Feels like there's no hope / future in this industry anymore.
>>
>>108529369
did you ask claude?



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.