A general for vibe coding, coding agents, AI IDEs, browser builders, MCP, and shipping prototypes with LLMs.

►What is vibe coding?
https://x.com/karpathy/status/1886192184808149383
https://simonwillison.net/2025/Mar/19/vibe-coding/
https://simonwillison.net/2025/Mar/11/using-llms-for-code/

►Prompting / context / skills
https://docs.cline.bot/customization/cline-rules
https://docs.replit.com/tutorials/agent-skills
https://docs.github.com/en/copilot/tutorials/spark/prompt-tips

►Editors / terminal agents / coding agents
https://cursor.com/docs
https://docs.windsurf.com/getstarted/overview
https://code.claude.com/docs/en/overview
https://aider.chat/docs/
https://docs.cline.bot/home
https://docs.roocode.com/
https://geminicli.com/docs/
https://docs.github.com/en/copilot/how-tos/use-copilot-agents/coding-agent

►Browser builders / hosted vibe tools
https://bolt.new/
https://support.bolt.new/
https://docs.lovable.dev/introduction/welcome
https://replit.com/
https://firebase.google.com/docs/studio
https://docs.github.com/en/copilot/tutorials/spark
https://v0.app/docs/faqs

►Open / local / self-hosted
https://github.com/OpenHands/OpenHands
https://github.com/QwenLM/qwen-code
https://github.com/QwenLM/Qwen3-Coder

►MCP / infra / deployment
https://modelcontextprotocol.io/docs/getting-started/intro
https://modelcontextprotocol.io/examples
https://vercel.com/docs

►Benchmarks / rankings
https://aider.chat/docs/leaderboards/
https://www.swebench.com/
https://swe-bench-live.github.io/
https://livecodebench.github.io/
https://livecodebench.github.io/gso.html
https://www.tbench.ai/leaderboard/terminal-bench/2.0
https://openrouter.ai/rankings
https://openrouter.ai/collections/programming

►Previous thread
>>108549329
First for its ogre.
What's the smallest model that still works on openclaw? I'm tired of having it crash and burn every time I try to go off the cloud.
>>108567200
dunno, used a 3gb model on my 12gb vram 3060 and it just shit the bed constantly. pretty sure this software is just bad
Reminder that Saint Altman just gave us a free Codex limits reset
>>108567200
forget about local vibe coding for now, unless you have $100K+ hardware. You need a 760B model to be usable. Buy the fucking tokens for now
>>108567231
Those models can chat, but openclaw is something else. I can run plenty of stuff locally, but when I use openclaw it fails.
>>108567233
no no no
>Hey Opus please make a plan for this feature
>ok [session limit reached]
lmao the $20 plan is pretty much worthless now
>>108567384
Opus is absolutely not usable on the $20 plan, much less at this hour
aaah
>>108567436scam altman is my favorite jew. my second favorite jew is jerry seinfeld
>>108567436
Claude on suicide watch
>>108567384
>>108567400
It takes me a few hours to hit the usage limit, but I only ask it to help me plan algorithms, not do everything. If you ask it to scan your entire codebase to figure out X, you'll use 20% in one go. I use a local LLM for that and then feed the results into claude so it can do things more accurately with less token burn.
>>108567249
Vibe coding works but agentic coding doesn't.
>>108567249
It's more like $2500-5000 hardware. Qwen122B / Minimax2.5 / GLM 5.1 are all solid options. Opus is better, sure, but they can't rug pull the model from you or make you run out of tokens in 8 seconds. Options include any strix halo mini pc or a Mac for 20-40 tokens per second (fast enough for real work). Local models are only going to get better; I haven't given up hope that we won't be stuck begging anthropic / openai for crumbs of tokens forever.
>>108567507
Qwen 122 can do agent stuff fine
>>108567436
It's over for Claude Code
>>108567529
>$2500-5000 hardware
kek, we're talking about agentic coding here, have you tried the models you mentioned on Open Code? Even Kimi 2.5 760B will shit the bed after 100K context, now imagine the models you mentioned.
I upgraded from qwen 4b to 9b and it's so much slower... but hopefully it's smarter. It still fits 100% in my macbook's GPU though, so idk...
>>108567168
>>108566569
>Gemma is less verbose
Can confirm. Ran an @explore command on my vibecoded codebase and it gave a short, concise bullet-pointed explanation of it. Kimi-k2.5's explanations were usually paragraphs worth of text for the same task and responded in a comparable amount of time. Now to test its actual performance.
>>108567568
use case?
>>108567529
The only model of those that runs on a Strix is Qwen 122B, and at a shit PP tk/s.
Do any anons here have experience running rocm in wsl? Currently failing with Ollama, will be trying lemonade next. rocm itself seems to work, since pytorch detects my AMD card in wsl.
>>108567609
Building my own coding agent harness
>>108567638
oof, that's a cursed setup if I've ever heard one
>>108567661
I've started today and I'm just getting to grips with everything.
>>108567638
just use kimi
>happily proompting away
>out of nowhere get a message that I have run out of usage
>I haven't even done that much work, decide to check my usage graph for the day and it's pic related
>come here to complain about it and see >>108567436
This fucking jew. Did they actually reduce usage for everyone not on the $100 plan?
>>108567742
Something is fucked on OpenAI servers, I just had a 1 hour prompt run successfully and I still have 100% left
It's impressive how disproportionate the Claude billing is compared to subscription usage. On the Max tier, you can talk to it all day every day for $100 a month. With paid extra usage, each short message is at least a few dollars. Uh.
>>108567436
Some guy called it the other day! How did he know?
>>108567771
Yeah, it's ridiculous. If it was API only I'd get it, corporate segmentation. But trying to grift people by charging them 10x API prices for what they get on their subscription is just scummy. Must be related to improving the numbers for the upcoming IPO.
>>108567771
They obviously want everybody on a subscription, because like with every subscription it's recurring income even if you don't use it, and they'll probably increase prices once everybody is hooked
>>108567554
Isn't it exactly the same deal as Claude Code though, usage wise?
>>108567791
i had claude suggest i get a dildo it can control so I can program better
>>108567800
proof?
>>108567786
I'd limit myself, but since they gave $100 free this month, I'm trying it for dumb things. Asking it to commit something cost $2. Creating a simple bash script to group a few commands cost $1. And those were quick and easy tasks. I'd be scared to have it do its whole subagents thing where it works autonomously for half an hour.
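Those per-task dollar figures stop looking crazy once you remember an agent re-sends its whole context (system prompt, tool schemas, file contents) on every turn, so input tokens dominate the bill. A rough sketch; the prices and token counts below are made-up placeholders, not Anthropic's actual rates:

```python
# Napkin math for why a "quick" agent task can cost real money.
# Prices are HYPOTHETICAL per-million-token rates, not any vendor's real ones.

def task_cost(turns, input_tokens_per_turn, output_tokens_per_turn,
              usd_per_m_input=5.0, usd_per_m_output=25.0):
    """Rough USD cost of a multi-turn agent task."""
    inp = turns * input_tokens_per_turn     # context is re-sent every turn
    out = turns * output_tokens_per_turn
    return inp / 1e6 * usd_per_m_input + out / 1e6 * usd_per_m_output

# A "simple" commit: ~6 turns, ~50k tokens of context re-sent each turn.
print(round(task_cost(6, 50_000, 1_000), 2))  # → 1.65
```

The dominant term is turns × context size, which is why a short-sounding task with a fat context costs dollars while the visible output is tiny.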
>>108567816
That's why I didn't even enable the thing to take advantage of the free credits. It was very likely I'd accidentally go over and end up getting billed real money.
>>108567567
Yeah, I use it with Qwen 122B IQ4. It's fine. Does it fuck up more than Claude? Sure, but it's not unusable. I get like 19-20tps which is fast enough for real time. 35B-A3B is 40tps but less reliable so I don't use it. Would rather have a correct answer than a fast one.
>>108567624
For me 20tk/s or more is enough. Prompt it and work on another part of the code while it works. I want to own my tools so I'm fine with it being 70-80% as good. I think small models are useless for coding but the big boys definitely aren't. I tried Qwen27B and 35B as well on my dedicated GPU and didn't get decent results out of them for coding, but I wasn't trying with Crush / Pi / Opencode so I'll revisit it.
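For anyone weighing these tps numbers: whether "prompt it and come back later" is livable falls out of simple arithmetic on prefill and decode speed. A sketch with illustrative numbers, not benchmarks of any particular model or machine:

```python
# Wall-clock time for one agent turn is roughly
#   (prompt tokens / prefill speed) + (output tokens / decode speed).
# All numbers here are illustrative assumptions.

def turn_seconds(prompt_tokens, output_tokens, prefill_tps, decode_tps):
    """Approximate seconds for one model turn."""
    return prompt_tokens / prefill_tps + output_tokens / decode_tps

# 30k-token context, 2k-token answer, 300 tok/s prefill, 20 tok/s decode:
secs = turn_seconds(30_000, 2_000, 300, 20)
print(round(secs / 60, 1))  # → 3.3 (minutes)
```

Note the split: at big contexts, prompt processing (PP) can dominate the wait even when the 20 tok/s decode speed sounds fine, which is the point the PP anon is making.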
Anyone here using a paid glm / kimi / minimax model? Ideally something 90% as good as Opus for 10% of the cost.
>>108568019
>I get like 19-20tps which is fast enough for real time
holy kek, that's absolutely unusable for large codebases, maybe this is fine for a static website
>>108568019
That's not what I meant. PP means prompt processing (the same thing as time to first token).
>>108568070
It's not unusable at all for large codebases. The biggest task I've asked it to do was generate bindings for a scripting language by emulating the bindings we already had for another scripting engine. It took an hour (lol) but the bindings were about 95% correct. Would have taken a month easily if I did it by hand. And because it was local I didn't hit my usage limits in 8 seconds. This codebase is about 600k lines and primarily C++.
I know this is /vcg/ but I'm not a vibe coder. My workflow is to prompt it, work on something else in the project manually, and then come back in 5 minutes or so when it's done running the task. Actually unusable is like < 5 tokens per second. What is your usecase that you need an answer from the LLM in 30s instead of 5 minutes?
>>108568157
I see. Anecdotally the time to first token is pretty quick. It only gets terrible when the context hits ~70% or so for me. For the binding thing I just had it summarize all its work and findings into a markdown file and started a new session. I have a 7900xtx and I can run the smaller 27 and 35b models at crazy fast speeds on there, but they just feel so stupid compared to claude, qwen 122b, and the other quantized bigboy models.
Eyeball planets!
>>108568244
>>108568246
put codex to work on helping find out the login flow for a 1998 MMO
>>108568244
Really cool. Also I think you’re intimidating anons because it looks so good and they’re afraid of posting their projects
bruh
>>108568262
Do you get refusals for muh copyright reasons?
>>108567168
Gemma4 t/s (on Apple Silicon) if anyone is interested. As of writing this, most recent gpus still curb-stomp even M5 MAX chips in the memory bandwidth department, so these should be even faster on those. The 26B moe model runs lightning fast on opencode with ollama as the backend. The 31B dense model is obviously slower, but not enough to be utterly unusable, though I haven't tested either's performance at long contexts so I'll have to test that later.
>>108568401
does it do dumb shit whenever it says that? What model is it?
>>108568442
Kimi. Seemed to calm down for a while after getting told.
>>108568432
it's been completely on board with it, probably too niche of a game for it to care anyway lmao
>>108568252
are you completely generating these planets - craters, gas clouds on the gas giants, etc.? how is it done?
I always wanted to make my own space gaymu, this might finally inspire me to get to work (or rather proompt). keep spamming this anon!
>>108568449
I've used that as my main model for a while but I've never seen it say "This is getting complex" for anything I've asked it to do, even a complete refactor of a script. That might be because the shit I ask it to do is relatively simple and I go one step at a time instead of expecting it to shit out quality stuff in one shot.
>>108568496
it may be because I'm running it with context nearly maxxed out
>>108568526
If it's becoming retarded you might want to run a compaction soon.
>>108568435
Can you test Gemma 26B with OpenCode on your machine?
>>108568481
Yes, the planets are completely generated on demand based on each planet's known parameters. I'm not too sure on the current generation method, but it seems to produce expected results most of the time, with some exceptions like LHS 1140 b not being a water/ice world. But since water content is pretty much a coin toss in temperate zones, it's a given that they won't always look how you might expect them to. I'll probably make a lot of changes later on but it's a solid foundation.
Terrestrial surface generation is one thing I wasn't able to proompt exactly to my liking, as there's a lot that goes into terrain generation to make it look natural and AI in its current state can't really take my ideas and turn them into something decent. The implementation in this app is a watered-down version of what I managed to achieve (as it needs to generate the map near-instantly), so it does the job. Either way, I'll make the code public if anyone wants to take what I vibe-slopped and improve on it.
>>108568534
My assistant doesn't use compaction. When I run out I just discard the ~30% at the beginning of the context.
I'm trying to reverse engineer the encoding of an undocumented assembly ISA. That's the level of detail I'm working at.
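A common first move when mapping an undocumented encoding like that is differential analysis: take two instruction words you believe differ in exactly one operand, XOR them, and the set bits mark that operand's field. A minimal sketch; the 32-bit example words are made up, not from any real ISA:

```python
# XOR-diff two instruction encodings to locate an operand's bit field.
# The example encodings are HYPOTHETICAL, invented for illustration.

def field_bits(word_a: int, word_b: int) -> list[int]:
    """Bit positions that differ between two instruction words."""
    diff = word_a ^ word_b
    return [i for i in range(diff.bit_length()) if diff >> i & 1]

# Pretend these are "mov r1, 0" and "mov r2, 0" as dumped from the target:
print(field_bits(0x0A010000, 0x0A020000))  # → [16, 17]
```

Repeating this over many operand pairs (registers, immediates, addressing modes) and intersecting the results is usually enough to carve the word into fields before guessing the opcode space.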
>>108568545
I already did that earlier, though I only had it do relatively simple stuff. See the reply chain here >>108567750
Will test more tomorrow and report my findings, but I need to sleep soon.
>>108568576
Neat! I've been wondering how these models along with agent harnesses perform when working with low-level languages. It's my unconfirmed assumption that they are mostly trained to do well with Python, C#, and other popular well-known languages. Assembly is still relatively well known so I guess that might be incorporated to a decent degree in training too, but idk if they're as good at low-level languages as they are with high-level, well-known ones.
>>108568545
Also make sure your opencode install is updated. There are many reports of people having issues with Gemma 4 both with opencode and other front ends, back ends, harnesses, etc. I've had no issues so far, but that's likely because I made sure I updated my install to 1.4.2, which was released 5 hours ago at the time of writing this. So to anyone having issues with it: you might just have to update whatever software you're using.
>>108568626
>>108568602
>>108568545
>How do I upgrade
$ opencode upgrade
Now I must sleep. Goodnight frens
>>108568555
Thanks anon, I'll keep looking out for it in the future. Please include screenshots, it always piques my attention in your posts.
As for generation - so the terrain, is it generated from like texture pieces put together by the generator, or are all the 'features' on the planet done with like a compute shader?
>>108568626
e4b still can't figure out how to use the basic tools in the /init command :c
Am I doing something wrong?
Best openclaw subscription?
>>108568693
>e4b
There's your problem. Those are meant for general purpose tasks on edge devices or shit rigs with low specs (it even runs decently fast on mobile devices). At that parameter count you may as well be asking a toddler to rebuild the Saturn V from scratch. Of course it's going to get confused. You need to be using models that are "smart" enough to even use tool calling, ideally ones that are specifically trained to be good at it like the following:
https://huggingface.co/google/gemma-4-26B-A4B-it
https://huggingface.co/google/gemma-4-31B-it
Their Ollama page spoonfeeds the differences and use cases well:
https://ollama.com/library/gemma4
>>108568753
E2b is 9 GB instead of the 2 GB the name suggests. By E2b they mean as fast as 2b, not as big as 2b.
>>108568753
that's so sad since e4b is very impressive as a regular ChatBot.
>>108568679
It's basically just a dumbed down version of this https://refactored-mountain-adem.pagedrop.io with the generated "visual profile" created from the planetary data fed into it + random variation. This does not have craters though, because Gemini could not make them look good while Claude got it first try. All it does is generate continents from a noise pattern, add finer details from a smaller noise layer, distort it with another noise function, map the colours to elevation and wrap it around a sphere.
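That pipeline (base noise for continents, a finer noise layer for detail, colours mapped to elevation) can be sketched with no dependencies. Everything below is my own guess at a minimal version, not the anon's actual code: the hash, the layer weights, and the sea level threshold are all invented, and the sphere-wrapping step is omitted.

```python
# Minimal noise -> elevation -> colour sketch of the described generator.
import math

def hash_noise(x: int, y: int, seed: int = 0) -> float:
    """Deterministic pseudo-random value in [0, 1) per lattice point."""
    h = (x * 374761393 + y * 668265263 + seed * 2147483647) & 0xFFFFFFFF
    h = (h ^ (h >> 13)) * 1274126177 & 0xFFFFFFFF
    return (h ^ (h >> 16)) / 2**32

def value_noise(fx: float, fy: float, seed: int = 0) -> float:
    """Bilinearly interpolated lattice noise."""
    x0, y0 = math.floor(fx), math.floor(fy)
    tx, ty = fx - x0, fy - y0
    a = hash_noise(x0, y0, seed)
    b = hash_noise(x0 + 1, y0, seed)
    c = hash_noise(x0, y0 + 1, seed)
    d = hash_noise(x0 + 1, y0 + 1, seed)
    top = a + (b - a) * tx
    bot = c + (d - c) * tx
    return top + (bot - top) * ty

def elevation(fx: float, fy: float, seed: int = 0) -> float:
    """Continent-scale noise plus a finer, higher-frequency detail layer."""
    return 0.75 * value_noise(fx, fy, seed) + 0.25 * value_noise(fx * 4, fy * 4, seed + 1)

def colour(e: float, sea_level: float = 0.5) -> str:
    """Map elevation to a terrain class (placeholder palette)."""
    if e < sea_level:
        return "ocean"
    if e < sea_level + 0.15:
        return "lowland"
    return "mountain"

# Sample one row of the 'map':
print([colour(elevation(x * 0.1, 3.7)) for x in range(8)])
```

The distortion pass from the post would be one more noise layer perturbing (fx, fy) before the elevation lookup; erosion and tectonics are exactly the kind of thing that would replace this cheap layering.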
>>108568770
Not quite. Effective denotes effective parameters (I literally showed you the page via a screenshot and linked it.....). It's some fancy new technique they used in training that results in less VRAM being used. It's kinda like MoE but not really; that would be way too simple of an explanation.
>instead of the 2 GB that it usually designates.
That denotes the parameter size. That's almost never referring to file size.
>>108568784
Use the right tools for the right jobs. Based on what I've read and heard, the "effective" models are pretty decent general purpose models for asking one-off questions, but they should not be used for anything complex.
Why would I want it to go into retard mode for planning? This seems backwards. The compute walls are closing in regardless. Get your hard coding challenges done ASAP lads
>>108568811
Someone smarter than me could probably add things like erosion modifiers and tectonic evolution to get more organic land masses, but that would increase the time to generate the map significantly so it would no longer be real-time.
>>108568846
Parameter size is close to size on vram
>>108568846
Depends on what precision you're using. If it's q8_0, a 20b model for example will use a little over 20 GB of RAM. If it's FP16 then it will use twice as much. FP32? Four times as much. q4_k_m will use roughly half the RAM q8_0 would. Etc etc. The quantization formats are useful for doing napkin math to determine whether or not your rig can actually run a model.
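That napkin math, spelled out. The bytes-per-weight figures below are approximate effective rates for common llama.cpp-style quants (quants store per-block scales, so they cost slightly more than their nominal bit width); real files add overhead, and the KV cache sits on top of this:

```python
# Approximate weight-memory footprint per quantization format.
# Rates are rough effective bytes/weight, not exact file sizes.

BYTES_PER_PARAM = {
    "fp32": 4.0,
    "fp16": 2.0,
    "q8_0": 1.0625,   # ~8.5 bits/weight (8-bit blocks + scales)
    "q4_k_m": 0.6,    # ~4.8 bits/weight effective
}

def model_gb(params_billions: float, quant: str) -> float:
    """Approximate GB needed just for the weights."""
    return params_billions * BYTES_PER_PARAM[quant]

for q in ("fp16", "q8_0", "q4_k_m"):
    print(q, round(model_gb(20, q), 1))
```

This matches the post's rule of thumb: a 20b model at q8_0 is "a little over 20 GB", FP16 doubles it, and q4_k_m lands around half of q8_0.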