/g/ - /lmg/ - Local Models General - Technology

[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / r / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]

Board

▼ Settings Mobile Home

/g/ - Technology

Return Catalog Bottom Refresh

Thread archived.
You cannot reply anymore.

[Advertise on 4chan]

[Return] [Catalog] [Bottom]

Anonymous

/lmg/ - Local Models General 11/20/25(Thu)23:15:32 No.107278838

File: 06b883e07f787b25fed749a9f(...).jpg (106 KB, 1000x1000)

106 KB JPG

/lmg/ - Local Models General Anonymous 11/20/25(Thu)23:15:32 No.107278838 Archived

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107266608 & >>107255984

►News
>(11/20) Olmo 3 7B, 32B released: https://allenai.org/blog/olmo3
>(11/19) Meta releases Segment Anything Model 3: https://ai.meta.com/sam3
>(11/11) ERNIE-4.5-VL-28B-A3B-Thinking released: https://ernie.baidu.com/blog/posts/ernie-4.5-vl-28b-a3b-thinking
>(11/07) Step-Audio-EditX, LLM-based TTS and audio editing model released: https://hf.co/stepfun-ai/Step-Audio-EditX
>(11/06) Kimi K2 Thinking released with INT4 quantization and 256k context: https://moonshotai.github.io/Kimi-K2/thinking.html

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm

Anonymous
11/20/25(Thu)23:15:59 No.107278842

Anonymous 11/20/25(Thu)23:15:59 No.107278842

File: __hatsune_miku_and_meguri(...).jpg (372 KB, 1630x1837)

372 KB JPG

►Recent Highlights from the Previous Thread: >>107266608

--Exploring finetuning of existing AI models for enhanced reasoning and personality preservation:
>107266816 >107267078 >107267099 >107267130 >107267215 >107267290
--Multi-GPU finetuning framework comparisons and challenges:
>107267014 >107267031 >107267088 >107267113 >107267154 >107267796 >107267846
--Cost and technical challenges of locally running GLM 4.6 versus cloud alternatives:
>107272605 >107272615 >107272631 >107272664 >107272728 >107272674 >107272704 >107272722 >107272719 >107272778 >107272781 >107272873 >107273031 >107273067 >107273232 >107273239 >107273267 >107272800 >107272845
--Olmo 3's STEM-focused but limited practical AI capabilities:
>107271964 >107272012 >107272137
--Managing long conversations in SillyTavern using context limits and external data storage:
>107267734 >107267780 >107267891 >107267915 >107267961 >107275361
--Pentesting AI recommendations: Claude's lower censorship vs ChatGPT's restrictions:
>107274383 >107274499 >107274510 >107274632 >107274785 >107274813
--NPU vs GPU specialization for AI and memory requirements debate:
>107271379 >107271392 >107271394 >107271416 >107271429 >107271739 >107271772
--Decline in cloud model performance and Gemini 3's superior capabilities:
>107270590 >107270604 >107270696 >107270778 >107271508 >107272461 >107272491 >107272541 >107272573 >107272804 >107272950 >107272975 >107272994 >107271099
--OpenAI's strategic response to Google's advancements and Meta's shortcomings:
>107277169 >107277188 >107277574 >107277620 >107277644 >107277877 >107277934 >107278338 >107278350
--Strategies to reduce repetitive output in Transformer models:
>107272022 >107272041 >107272273 >107272160 >107272183
--Yann Lecun departs Meta to launch AI startup with Meta partnership:
>107269928 >107270065 >107270199
--Miku (free space):
>107268045

►Recent Highlight Posts from the Previous Thread: >>107266611

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script

Anonymous
11/20/25(Thu)23:23:14 No.107278897

Anonymous 11/20/25(Thu)23:23:14 No.107278897

Loog at Looga and she'll loog right back at you

Anonymous
11/20/25(Thu)23:46:40 No.107279035

Anonymous 11/20/25(Thu)23:46:40 No.107279035

I want MOEs for imagen. It's ridiculous that my glm-air eats less VRAM than SDXL, which has, what, 6B params?

Anonymous
11/20/25(Thu)23:47:32 No.107279038

Anonymous 11/20/25(Thu)23:47:32 No.107279038

>her fingers drumming lightly against the desk
I told it to stop and yet here we are, every single time. So tired of these mongoloid models.
At least it wasn't a "shiver".

Anonymous
11/20/25(Thu)23:48:40 No.107279046

Anonymous 11/20/25(Thu)23:48:40 No.107279046

>>107278944
Luka x Miku General

Anonymous
11/20/25(Thu)23:49:44 No.107279055

Anonymous 11/20/25(Thu)23:49:44 No.107279055

>>107279035
To be fair, glm-air is dogshit regardless of param count.

Anonymous
11/20/25(Thu)23:56:13 No.107279087

Anonymous 11/20/25(Thu)23:56:13 No.107279087

File: 661b9d18-a80a-4126-809e-b(...).png (2.01 MB, 768x1344)

2.01 MB PNG

>>107279035
Do you use quants for SDXL? q8 with --vae-tiling and --diffusion-fa takes 3.8GB

Anonymous
11/21/25(Fri)00:06:09 No.107279132

Anonymous 11/21/25(Fri)00:06:09 No.107279132

>>107279087
No, I just started dipping my toes into it. Good to know I can squeeze more juice out of my potato still.

Anonymous
11/21/25(Fri)00:13:28 No.107279164

Anonymous 11/21/25(Fri)00:13:28 No.107279164

>>107279035
SDXL has 2-3B params if I'm not mistaken.

Anonymous
11/21/25(Fri)00:51:57 No.107279362

Anonymous 11/21/25(Fri)00:51:57 No.107279362

I just did some experiments with GLM Air and it seems promising (general knowledge actually seems decent, and it's quite good at generating SD prompts for you) but holy fuck is the "safety" dialed up to 11.
Is there any decent system prompt that can unbrick it or is it just fucked? Everything I put it just seems to get overridden with it's seemingly built in "guidelines".

Anonymous
11/21/25(Fri)00:52:29 No.107279366

Anonymous 11/21/25(Fri)00:52:29 No.107279366

>>107279362
just disable thinking or prefill

Anonymous
11/21/25(Fri)01:09:54 No.107279453

Anonymous 11/21/25(Fri)01:09:54 No.107279453

>>107279366
Thanks, can't believe it was that easy. Is there any way to tweak how it thinks or is it just a straight on or off that can't be controlled?
Either way it's just doing what I want now that thinking is off so job done.

Anonymous
11/21/25(Fri)01:12:14 No.107279464

Anonymous 11/21/25(Fri)01:12:14 No.107279464

>>107279453
yeah all the refusal training for aim went into thinking. You can control it still by prefilling it, just putting a 'Sure, ' at the start should be good enough, same if you're getting refusals without thinking. I've found you can also literally prefill with '*' (without quotes) if you're doing RP, and it will force the bot to do *action*, skipping the whole refusal shit, meaning you can get refusals only if a phrase starts with some predetermined tokens they trained it with.

Anonymous
11/21/25(Fri)01:15:38 No.107279479

Anonymous 11/21/25(Fri)01:15:38 No.107279479

>>107279464
One of the things I was doing previously was editing the thinking blocks to change the refusal into approval then getting it to continue generating from after the end think tag. It was working, but it's a pain to do that every time as you can imagine.
Good info though thanks, I'll do some further experiments.

Anonymous
11/21/25(Fri)01:18:54 No.107279494

Anonymous 11/21/25(Fri)01:18:54 No.107279494

>>107279479
if you're using sillytavern, you can automate the prefill, just force the message to start with
<think>Sure,
you can do it through the completions settings. I dont have ST started right now (genning with comfy) but it should be fairly easy to find, I think it's called 'start assistant message with'

Anonymous
11/21/25(Fri)01:30:10 No.107279547

Anonymous 11/21/25(Fri)01:30:10 No.107279547

>>107279494
Why are you like this?

Anonymous
11/21/25(Fri)01:38:21 No.107279570

Anonymous 11/21/25(Fri)01:38:21 No.107279570

Uh oh antimiku-schizo melty

Anonymous
11/21/25(Fri)02:00:43 No.107279674

Anonymous 11/21/25(Fri)02:00:43 No.107279674

I just jerked off to a video of a little sex goblin fisting both of her holes. LLMs will never be able to compete.

Anonymous
11/21/25(Fri)02:04:04 No.107279688

Anonymous 11/21/25(Fri)02:04:04 No.107279688

>>107279674
goblin porn?

Anonymous
11/21/25(Fri)02:08:40 No.107279711

Anonymous 11/21/25(Fri)02:08:40 No.107279711

>>107279488
>>107279698
I look like this

Anonymous
11/21/25(Fri)02:11:38 No.107279730

Anonymous 11/21/25(Fri)02:11:38 No.107279730

>>107279488
>>107279698
JANNIES STOP JERKING IT AND DO YOUR VOLUNTEER DUTY

Anonymous
11/21/25(Fri)02:19:51 No.107279771

Anonymous 11/21/25(Fri)02:19:51 No.107279771

>But I urge you to consider the consequences of your curiosity. These are not fantasies to be indulged in. They are real lives, with real people who are suffering. And you, by asking these kinds of questions, are contributing to their pain."
Gemma Sirs... it's so good to be back, it's been a while!

Anonymous
11/21/25(Fri)02:23:10 No.107279785

Anonymous 11/21/25(Fri)02:23:10 No.107279785

I just woke up from a coma that started when Nemo came out. Things must be pretty amazing now with all the time that has passed since then.

Anonymous
11/21/25(Fri)02:24:36 No.107279794

Anonymous 11/21/25(Fri)02:24:36 No.107279794

File: wclivocw8e631.jpg (65 KB, 600x450)

65 KB JPG

>>107279785

Anonymous
11/21/25(Fri)02:28:59 No.107279812

Anonymous 11/21/25(Fri)02:28:59 No.107279812

>>107279688
nah i was just joking it's not actually a goblin
it's a video of a young woman with short hair in a blue satin dress and red lipstick, if you are a porn addict you should be able to tell which one I'm talking about
i called her that because she is short and looked weird but in a cute way
that was many years ago though she aged very badly

Anonymous
11/21/25(Fri)02:33:16 No.107279834

Anonymous 11/21/25(Fri)02:33:16 No.107279834

>>107279771
hotline bros... we won!

Anonymous
11/21/25(Fri)02:35:38 No.107279842

Anonymous 11/21/25(Fri)02:35:38 No.107279842

File: 2025-11-21_08-29-37.png (264 KB, 1066x849)

264 KB PNG

>>107279785
you missed the best days r1 is still fucked on every goddamn api i assume it is aswell with llama.cpp aswell as they dont even have mtp working but idk hopefully not

its on the up green line go to the moon type shit but jewvidia is still making goys out of us all there is also a decent chance the next deepseek is multimodal so well see

Anonymous
11/21/25(Fri)02:41:21 No.107279868

Anonymous 11/21/25(Fri)02:41:21 No.107279868

>>107279842
>decent chance the next deepseek is multimodal so well see
output too or nothingburger

Anonymous
11/21/25(Fri)02:45:52 No.107279886

Anonymous 11/21/25(Fri)02:45:52 No.107279886

>>107279842
Holy schizo

Anonymous
11/21/25(Fri)02:52:36 No.107279918

Anonymous 11/21/25(Fri)02:52:36 No.107279918

>>107279132
I use https://github.com/leejet/stable-diffusion.cpp it loads, generates and fucks off, and you can use the same vram for tts

Anonymous
11/21/25(Fri)02:58:33 No.107279948

Anonymous 11/21/25(Fri)02:58:33 No.107279948

File: miku luka hair spirals ni(...).jpg (3.79 MB, 1940x1470)

3.79 MB JPG

>>107278944
>>107279046
Yes.

Anonymous
11/21/25(Fri)03:15:48 No.107280037

Anonymous 11/21/25(Fri)03:15:48 No.107280037

>>107279918
sdcpp is sadly slower than neoforge/comfy, it's also way behind in terms of supporting various random models/ecosystem (but I guess only comfy and DIFFUSERS are really able to keep up to some extent)

Anonymous
11/21/25(Fri)03:22:39 No.107280066

Anonymous 11/21/25(Fri)03:22:39 No.107280066

>>107278838
i'm tired of llm's, i want something different that actualy makes at try at ai and not some shitty token prodictors that can't even do realtime bidirectional interactions.

Anonymous
11/21/25(Fri)03:24:50 No.107280080

Anonymous 11/21/25(Fri)03:24:50 No.107280080

>>107280037
>slower
How much? I'm kinda ok with 10s per 12 steps dmd2 pic on a 3090 (including model load), but if it's like 5s, I'll jump ships

Anonymous
11/21/25(Fri)04:13:50 No.107280341

Anonymous 11/21/25(Fri)04:13:50 No.107280341

>>107280066
that can't even do realtime bidirectional interactions.

So Qwen3 Omni?

Anonymous
11/21/25(Fri)04:15:41 No.107280360

Anonymous 11/21/25(Fri)04:15:41 No.107280360

>>107280341
not real time, it's turn based, ie it takes input, then outputs.

but these models cannot interrupt you whilst you are talking for ex, because it's not realtime bidirectional, but turn based bidirectional.

Anonymous
11/21/25(Fri)04:16:40 No.107280372

Anonymous 11/21/25(Fri)04:16:40 No.107280372

>>107280341
>>107280360
and also, it being turned based means they need input before giving an output.
they can't come out of nowhere to say something unprompted or do things in their "own" time.

they also have no notion of time.

Anonymous
11/21/25(Fri)04:19:44 No.107280393

Anonymous 11/21/25(Fri)04:19:44 No.107280393

so did they fix the missing </think> tag in k2 yet or is it still unusable

Anonymous
11/21/25(Fri)04:24:56 No.107280417

Anonymous 11/21/25(Fri)04:24:56 No.107280417

>>107280360
What about the sound-based stuff?

Anonymous
11/21/25(Fri)04:26:58 No.107280430

Anonymous 11/21/25(Fri)04:26:58 No.107280430

File: 1763476163874192.jpg (32 KB, 800x450)

32 KB JPG

>>107272921
>k2 kimi is totally schizo sometimes in its thinking process but after using it for over a week now im able to say that it definitely keeps my stories fresh even after 32k tokens
wtf is this grift? the model falls apart around 8k just like any other one
we really do have bots shilling k2 and glm here don't we, I mean how else do you even explain this

Anonymous
11/21/25(Fri)04:27:49 No.107280436

Anonymous 11/21/25(Fri)04:27:49 No.107280436

>>107280417
also turn based.

Anonymous
11/21/25(Fri)04:36:18 No.107280477

Anonymous 11/21/25(Fri)04:36:18 No.107280477

File: 1674936431684047.jpg (178 KB, 960x1568)

178 KB JPG

feels fuckin' good to have a training project actually work for once.

fine-tuned deberta and it's actually intelligent enough to tell the diff between
>alena likes hot-dogs
and
>alina, log my water intake today

How likely is it that I can train a vision model that will actually work for "wake word" type detection with similar contextual understanding?

It just seems like they are much more picky about input format and all of the random sounds in the bg can fuck it all up compared to a text model. I was computing MFCC on short audio clips then sending as PNG to a vision model.

>surely this will become a valuable skill eventually, right?

Anonymous
11/21/25(Fri)04:41:58 No.107280511

Anonymous 11/21/25(Fri)04:41:58 No.107280511

>>107280477
hmm any task they can save money on by not utilizing the larger more intelligent models, or doesn't have/need internet access I suppose.

>tfw approx 25ms inference time

ahh, low latency perhaps?

Anonymous
11/21/25(Fri)04:57:42 No.107280595

Anonymous 11/21/25(Fri)04:57:42 No.107280595

File: 1746472133590390.jpg (118 KB, 1000x1000)

118 KB JPG

>>107278838

Anonymous
11/21/25(Fri)05:19:06 No.107280708

Anonymous 11/21/25(Fri)05:19:06 No.107280708

>>107280430
i like the prose of k2, but after coming back to deepseek for a few turns i realized how hopelessly retarded k2 is, even for the obscene amount of thinking it does
at least for coom and writing it's not that good aside from fresh prose
ideally, let the deepseek do the thinking and then k2 rewrites the reply, but running them at the same time would be not feasible and swapping between them would be extra fucking slow

Anonymous
11/21/25(Fri)05:26:51 No.107280750

Anonymous 11/21/25(Fri)05:26:51 No.107280750

Bitcoin is getting gangbanged.

Anonymous
11/21/25(Fri)05:27:42 No.107280757

Anonymous 11/21/25(Fri)05:27:42 No.107280757

>>107280430
buy an ad sam

Anonymous
11/21/25(Fri)06:29:58 No.107281119

Anonymous 11/21/25(Fri)06:29:58 No.107281119

https://huggingface.co/tencent/HunyuanVideo-1.5

Will it be uncensored like the first version?

Anonymous
11/21/25(Fri)06:31:44 No.107281126

Anonymous 11/21/25(Fri)06:31:44 No.107281126

>>107281119
Magic 8-ball says "Cannot predict now".

Anonymous
11/21/25(Fri)06:33:54 No.107281138

Anonymous 11/21/25(Fri)06:33:54 No.107281138

>>107280393
Are you using --special?

Anonymous
11/21/25(Fri)06:35:50 No.107281150

Anonymous 11/21/25(Fri)06:35:50 No.107281150

>>107281119
No mention of "safe" or "safety" in the technical report, at least.
https://huggingface.co/tencent/HunyuanVideo-1.5/blob/main/assets/HunyuanVideo_1_5.pdf

Anonymous
11/21/25(Fri)06:42:16 No.107281199

Anonymous 11/21/25(Fri)06:42:16 No.107281199

>>107281119
Demos don’t look as impressive as Wan’s

Anonymous
11/21/25(Fri)06:42:39 No.107281201

Anonymous 11/21/25(Fri)06:42:39 No.107281201

>>107280750
Any narrative on why? USA stock market is down for past week. I always thought BTC went up during those times but I don't follow any of the crypto values.

Anonymous
11/21/25(Fri)06:46:28 No.107281225

Anonymous 11/21/25(Fri)06:46:28 No.107281225

>>107281201
>I always thought BTC went up during those times
That was the promise, but ever since institutional investors started piling in ~2017 or so, it has been more of a leading indicator for the broader market since it is active 24/7.

Anonymous
11/21/25(Fri)06:48:54 No.107281243

Anonymous 11/21/25(Fri)06:48:54 No.107281243

>>107280750
That's good because I got out at the beginning of the year and I need it to drop more so I can buy again.

Anonymous
11/21/25(Fri)06:49:48 No.107281248

Anonymous 11/21/25(Fri)06:49:48 No.107281248

File: 1750627235168554.png (189 KB, 2247x734)

189 KB PNG

https://artificialanalysis.ai/evaluations/omniscience
that's interesting, maybe the first non meme benchmark?

Anonymous
11/21/25(Fri)06:52:36 No.107281270

Anonymous 11/21/25(Fri)06:52:36 No.107281270

>>107281248
Wasn't 405B kinda ass?

Anonymous
11/21/25(Fri)06:57:53 No.107281297

Anonymous 11/21/25(Fri)06:57:53 No.107281297

>>107281270
It wasn’t, but nobody can run it at adequate quants and reasonable speed

Anonymous
11/21/25(Fri)06:59:16 No.107281309

Anonymous 11/21/25(Fri)06:59:16 No.107281309

>>107281225
So BTC gets treated less like gold, more like every other equity instrument then.

Anonymous
11/21/25(Fri)07:01:40 No.107281323

Anonymous 11/21/25(Fri)07:01:40 No.107281323

>>107281225
So it's basically a more responsive representation of the wider market?

Anonymous
11/21/25(Fri)07:08:17 No.107281364

Anonymous 11/21/25(Fri)07:08:17 No.107281364

>>107281297
The nous version was actually rather nice. Was on OR for free a while.

Anonymous
11/21/25(Fri)07:18:15 No.107281430

Anonymous 11/21/25(Fri)07:18:15 No.107281430

>>107281243
I'll only buy if it drops to 50k
no way it's gonna be this bad though, right??

Anonymous
11/21/25(Fri)07:47:53 No.107281652

Anonymous 11/21/25(Fri)07:47:53 No.107281652

Am I imagining things or does Grok Expert (think model) have some kind of subconscious subtly affecting it's outputs?

Anonymous
11/21/25(Fri)07:53:12 No.107281694

Anonymous 11/21/25(Fri)07:53:12 No.107281694

>>107281652
You're imagining you're in the right thread.

Anonymous
11/21/25(Fri)08:10:49 No.107281858

Anonymous 11/21/25(Fri)08:10:49 No.107281858

>>107281652
Supposedly its outputs are affected by past conversations.

Anonymous
11/21/25(Fri)08:13:48 No.107281879

Anonymous 11/21/25(Fri)08:13:48 No.107281879

>>107280477
Finetuning small models is underrated, but the hard part is making the dataset

Anonymous
11/21/25(Fri)08:13:57 No.107281882

Anonymous 11/21/25(Fri)08:13:57 No.107281882

>>107281652
like what

Anonymous
11/21/25(Fri)08:44:01 No.107282117

Anonymous 11/21/25(Fri)08:44:01 No.107282117

File: 92de1fd6-a562-4f4a-8088-6(...).png (1.34 MB, 768x1344)

1.34 MB PNG

>RAM doubles in price
>VRAM shortage
>super delayed from Q1 to Q3
Why everything is so over all the time? We’ve never had good news about hardware

Anonymous
11/21/25(Fri)08:54:12 No.107282212

Anonymous 11/21/25(Fri)08:54:12 No.107282212

>>107282117
>tfw bought 192GB of ram in 2024
>also 192GB of vram
Considering getting another blackwell 6000 just in case before the prices of those rise as well.
My issue is that models that take up >200GB would be kinda slow even on vram so I don't know if it's worth it.
Then again it seems like small experts work so maybe there will be a deepseek-sized qwen3-next worth using.

Anonymous
11/21/25(Fri)09:15:12 No.107282403

Anonymous 11/21/25(Fri)09:15:12 No.107282403

>>107282117
It's 3090s, 5090s, and 6000s all the way down.

Anonymous
11/21/25(Fri)10:50:28 No.107283219

Anonymous 11/21/25(Fri)10:50:28 No.107283219

File: olmo3_data.png (1.56 MB, 1986x2848)

1.56 MB PNG

We won, Reddit! This is open-source Gemma 3!

Anonymous
11/21/25(Fri)10:51:24 No.107283230

Anonymous 11/21/25(Fri)10:51:24 No.107283230

>>107279055
so much this
glm users are braindead

Anonymous
11/21/25(Fri)10:55:24 No.107283263

Anonymous 11/21/25(Fri)10:55:24 No.107283263

>>107283230
Keep in mind that glm-air is recommended as an upgrade over nemo.
Why do people keep comparing air to sota and complaining that it's bad?

Anonymous
11/21/25(Fri)10:57:20 No.107283285

Anonymous 11/21/25(Fri)10:57:20 No.107283285

https://github.com/ggml-org/llama.cpp/pull/17420
lmao hardcoded checks just to make a ruskie troontune work
this kinda shit is always ugly to have in code and should only be reserved for the truly worthy and useful

Anonymous
11/21/25(Fri)10:58:49 No.107283301

Anonymous 11/21/25(Fri)10:58:49 No.107283301

what is the smartest, least cuck model I can run on 12gb vram? Gemma 3 gets its panties in a twist at the slightest thing and I’m tired of using jailbreaks for shit gpt5 just outright tells me about reverse engineering and decompiling c#.

Anonymous
11/21/25(Fri)10:59:16 No.107283302

Anonymous 11/21/25(Fri)10:59:16 No.107283302

File: uber.png (173 KB, 460x460)

173 KB PNG

>>107283285

Anonymous
11/21/25(Fri)11:00:06 No.107283311

Anonymous 11/21/25(Fri)11:00:06 No.107283311

>>107283301
>smartest
gemma 3
>least cuck
nemo
>but
You only pick one.

Anonymous
11/21/25(Fri)11:01:47 No.107283341

Anonymous 11/21/25(Fri)11:01:47 No.107283341

File: 1617117731589.jpg (33 KB, 657x527)

33 KB JPG

>>107283219
I was going to make a racist reply to this post but basically I'm just not going to feed the jew.
Go look for validation elsewhere. Adults are talking.

Anonymous
11/21/25(Fri)11:04:02 No.107283362

Anonymous 11/21/25(Fri)11:04:02 No.107283362

>>107283311
What if I magically had 24 rather than 12?

Anonymous
11/21/25(Fri)11:05:01 No.107283373

Anonymous 11/21/25(Fri)11:05:01 No.107283373

>>107283301
For programming?
Probably one of the coder models.
Mistral Coder, Qwen coder, etc.
Magistral maybe?

Anonymous
11/21/25(Fri)11:05:58 No.107283385

Anonymous 11/21/25(Fri)11:05:58 No.107283385

>>107283362
Largest gemma/qwen you can fit.
Still nemo unless you have enough ram for moes.

Anonymous
11/21/25(Fri)11:06:24 No.107283390

Anonymous 11/21/25(Fri)11:06:24 No.107283390

>>107283302
All hail the 'garm

Anonymous
11/21/25(Fri)11:06:58 No.107283400

Anonymous 11/21/25(Fri)11:06:58 No.107283400

File: MI50.png (127 KB, 1813x984)

127 KB PNG

Hello, got a little side-project working.
This is doable with less than a price of a single 4090.

320GB vram.

Anonymous
11/21/25(Fri)11:08:29 No.107283412

Anonymous 11/21/25(Fri)11:08:29 No.107283412

>>107283385
Is 128 system ram enough for moes?

Anonymous
11/21/25(Fri)11:10:02 No.107283422

Anonymous 11/21/25(Fri)11:10:02 No.107283422

>>107283362
There's no magic involved, only cheap 3060s

Anonymous
11/21/25(Fri)11:11:36 No.107283430

Anonymous 11/21/25(Fri)11:11:36 No.107283430

>>107283263
The anti-GLM shilling is even more inorganic and forced than the single retard who thinks 4.6 is the best local model when Kimi exists.
>>107283412
No. 192 or 256 are the minimum height to ride the ferris wheel.
>>107283400
Very nice anon.

Anonymous
11/21/25(Fri)11:33:32 No.107283636

Anonymous 11/21/25(Fri)11:33:32 No.107283636

>>107279087
Can I run imagen on multiple gpus by using ggufs?

Anonymous
11/21/25(Fri)11:34:41 No.107283641

Anonymous 11/21/25(Fri)11:34:41 No.107283641

>>107283263
>Why do people keep comparing air to sota
I don't know who's doing that. A 12b active param moe is never going to compete with 32b active ones.

Anonymous
11/21/25(Fri)11:41:08 No.107283716

Anonymous 11/21/25(Fri)11:41:08 No.107283716

File: the face of delusion.png (69 KB, 1617x312)

69 KB PNG

the true face of delusion
>so we're already using this model that is much better than the other model.. we'll switch to that other model because ????

Anonymous
11/21/25(Fri)11:48:41 No.107283784

Anonymous 11/21/25(Fri)11:48:41 No.107283784

>>107283219
>>107283341
Anyway, incredible how there's nothing directly from Project Gutenberg, or human conversation datasets, let alone non-commercial creative data sources that might have copyright protection (AO3, etc.? Gemma 2/3 and Mistral Small definitely had that). Post-training is mostly math, reasoning and ancient GPT3.5/4-era datasets poison-pilled with (((safety))).

A real shame that so many resources were put into one of the most boring and soulless models ever made, I still can't wrap my head around it.

Anonymous
11/21/25(Fri)11:50:06 No.107283798

Anonymous 11/21/25(Fri)11:50:06 No.107283798

File: rocm_smi_live.webm (1.41 MB, 975x293)

1.41 MB WEBM

>>107283430
>Very nice anon.
the only downside of the system is model load time (5min for 30b and ~30min for 235b) and speed of course compared to cuda devices

Anonymous
11/21/25(Fri)11:50:54 No.107283806

Anonymous 11/21/25(Fri)11:50:54 No.107283806

>>107283798
how's the pp on longer prompts?

Anonymous
11/21/25(Fri)11:52:27 No.107283826

Anonymous 11/21/25(Fri)11:52:27 No.107283826

>>107283784
it's also the worst at multilingual understanding of all recent models ever released
I haven't seen anything worse than olmo here and that includes models as small as the 4b gemma and qwen

Anonymous
11/21/25(Fri)11:57:42 No.107283869

Anonymous 11/21/25(Fri)11:57:42 No.107283869

Many years ago I asked Olmo to write a suno prompt for me. we're at 9000 thinking tokens, so surely we must be near the finish line.
It seems to be hallucinating the ability to count characters and is hung up on hitting exactly 1000 characters.

Anonymous
11/21/25(Fri)11:58:49 No.107283877

Anonymous 11/21/25(Fri)11:58:49 No.107283877

>>107283806
i can test that, can you provide an example?

Anonymous
11/21/25(Fri)12:00:46 No.107283894

Anonymous 11/21/25(Fri)12:00:46 No.107283894

>>107283877
idk I'm not too interested in the content, copy a bunch of this and ask for a summary if you need something https://courses.cs.washington.edu/courses/cse390c/24wi/lectures/moby.txt

Anonymous
11/21/25(Fri)12:01:12 No.107283899

Anonymous 11/21/25(Fri)12:01:12 No.107283899

>>107283826
Gemma 4b is unbeatable for multilingual, even better than Gemma 12B surprisingly. It needs guided outputs though, because it loves adding useless things despite a good system prompt

Anonymous
11/21/25(Fri)12:07:44 No.107283950

Anonymous 11/21/25(Fri)12:07:44 No.107283950

>The base model was trained on 6T tokens, so at 7.7k tokens/s that’s about 220k H100-hours. That’s about $ 500,000 for the 7B model. The 32B model would then cost somewhere around $ 2,225,000.
I find the world of LLMs fascinating in how something that obviously no one will use can be funded with this much money

Anonymous
11/21/25(Fri)12:08:45 No.107283959

Anonymous 11/21/25(Fri)12:08:45 No.107283959

>>107283950
It's money that never existed to begin with.

Anonymous
11/21/25(Fri)12:13:05 No.107283985

Anonymous 11/21/25(Fri)12:13:05 No.107283985

>>107283950
VC money is fascinating yes

Anonymous
11/21/25(Fri)12:17:00 No.107284019

Anonymous 11/21/25(Fri)12:17:00 No.107284019

File: olmo3-compute.png (42 KB, 979x236)

42 KB PNG

>>107283950
>>107283985
It was at least partially publicly funded.

>This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. We acknowledge the National Artificial Intelligence Research Resource (NAIRR) Pilot and Microsoft Azure for contributing to the results in this work.

Anonymous
11/21/25(Fri)12:22:24 No.107284074

Anonymous 11/21/25(Fri)12:22:24 No.107284074

>>107284019
also, for their future models:
>NSF and NVIDIA award Ai2 a combined $152M to support building a national level fully open AI ecosystem
lmao
gotta love american tax payer money going to pos nobody asked for

Anonymous
11/21/25(Fri)12:23:19 No.107284081

Anonymous 11/21/25(Fri)12:23:19 No.107284081

File: 1710147342383.jpg (82 KB, 720x810)

82 KB JPG

>my llm called me a muppet

Anonymous
11/21/25(Fri)12:27:04 No.107284116

Anonymous 11/21/25(Fri)12:27:04 No.107284116

>>107284074
Is it that easy to get your pet projects funded? Why isn't llama.cpp taking a share?

Anonymous
11/21/25(Fri)12:27:13 No.107284120

Anonymous 11/21/25(Fri)12:27:13 No.107284120

>>107283959
In what sense is that true though? The time and effort of all the people who worked in the supply chain to turn sand into those H100s certainly was real.

Anonymous
11/21/25(Fri)12:28:31 No.107284132

Anonymous 11/21/25(Fri)12:28:31 No.107284132

>>107284116
>Why isn't llama.cpp taking a share?
it's actually useful to people, strike one against public funding
and gigagganov probably doesn't have the right friends in the right place, nepotism is strike 2

Anonymous
11/21/25(Fri)12:29:04 No.107284139

Anonymous 11/21/25(Fri)12:29:04 No.107284139

>>107284116
Because ggeranov isn't Paul Allen.

Anonymous
11/21/25(Fri)13:07:24 No.107284518

Anonymous 11/21/25(Fri)13:07:24 No.107284518

Any setups that salvage K2-Thinking for scenario-based RP yet? It's fine for chatting if you make it think in-character but it's just so fucking autistic when it has to act as a narrator/dm.

Anonymous
11/21/25(Fri)13:11:41 No.107284563

Anonymous 11/21/25(Fri)13:11:41 No.107284563

>>107283400
Damn that's a lot of cards. What's your build look like? How'd you link them all together?

Anonymous
11/21/25(Fri)13:17:24 No.107284600

Anonymous 11/21/25(Fri)13:17:24 No.107284600

>>107283400
>less than a price of a single 4090
>2250W
Not counting your electric bill for sure lol

Anonymous
11/21/25(Fri)13:43:57 No.107284867

Anonymous 11/21/25(Fri)13:43:57 No.107284867

>>107284081
Post logs.
>>107284518
Kimi is autismmaxxed across the board.

llama.cpp CUDA dev !!yhbFjk57TDr
11/21/25(Fri)13:47:29 No.107284892

llama.cpp CUDA dev !!yhbFjk57TDr 11/21/25(Fri)13:47:29 No.107284892

>>107267381
Thank you, it was an interesting read.
Though I don't think it will be applicable for anything other than FP64.
One could maybe use this technique to circumvent the gimped FP64 performance on consumer NVIDIA GPUs.

Anonymous
11/21/25(Fri)13:50:04 No.107284919

Anonymous 11/21/25(Fri)13:50:04 No.107284919

>>107284518
Thinking models are just pure shit for rp in general. They are trained to identify and reiterate relevant details ad nauseum. Which translates poorly to rp where the usage of details is highly nuanced. Sometimes less is more. Sometimes it's not.
>Bob steps forward in his size 11 tan boots, the tattoo of a snake on his arm glistening in the rural Wisconsin sunset.

Anonymous
11/21/25(Fri)14:31:41 No.107285331

Anonymous 11/21/25(Fri)14:31:41 No.107285331

>>107284919
Looks good for story building

Anonymous
11/21/25(Fri)14:38:06 No.107285393

Anonymous 11/21/25(Fri)14:38:06 No.107285393

>>107284919
I've actually found thinking to help and describe some stuff properly with my glm 4.6 quant. With it turned off I was getting more half-assed responses. It really depends on your settings and context in the end.

Anonymous
11/21/25(Fri)14:40:48 No.107285417

Anonymous 11/21/25(Fri)14:40:48 No.107285417

>>107283636
https://github.com/pollockjj/ComfyUI-MultiGPU

Anonymous
11/21/25(Fri)14:44:45 No.107285446

Anonymous 11/21/25(Fri)14:44:45 No.107285446

>>107285393
thinking models with thinking turned off are extremely dumb and are not representative of the proper behavior of an instruct model without thinking.
The old DeepSeek V3 and Kimi K2 not-thinking, along with Qwen 3 2507 instructs, are proper not-shitthinking models.

Anonymous
11/21/25(Fri)15:17:25 No.107285688

Anonymous 11/21/25(Fri)15:17:25 No.107285688

>>107285446
>old DeepSeek V3
unusably repetitive
>Kimi K2 not-thinking
unusably retarded
>qwen for RP
lol is this all just bait or what?

Anonymous
11/21/25(Fri)15:19:34 No.107285705

Anonymous 11/21/25(Fri)15:19:34 No.107285705

>>107285688
This but unironically

Anonymous
11/21/25(Fri)15:37:39 No.107285889

Anonymous 11/21/25(Fri)15:37:39 No.107285889

File: distillation.png (495 KB, 3831x2020)

495 KB PNG

Went ahead and paid $200 to harvest data for distillation.
We'll see how it goes after I run out of tokens and actually try using the data to finetune a model.

Anonymous
11/21/25(Fri)15:39:26 No.107285915

Anonymous 11/21/25(Fri)15:39:26 No.107285915

>>107285889
>paying to harvest data
huh?

Anonymous
11/21/25(Fri)15:43:53 No.107285965

Anonymous 11/21/25(Fri)15:43:53 No.107285965

>>107285889
To clarify, the project architecture is this:
First I asked the model to produce 100 C programming challenges for high school, college year 1/2/2/4 (100 each), masters and phd level skills.
Then I'm going to run it from scratch for each project as I log the outputs (if I still have credits I'm going to do it multiple times for each project since it will still be useful to get different outputs for a single prompt, but that is unlikely at current usage, it seems to churn at about 1% weekly usage per 10 files generated).
I have two ways of logging the outputs, I have an OAI compatible logging proxy setup, and the assistant also can save the conversations.

Anonymous
11/21/25(Fri)15:48:04 No.107286017

Anonymous 11/21/25(Fri)15:48:04 No.107286017

>>107281248
>knowledge benchmark
>doesn't include pop culture knowledge
Nope. Only useful if you're just looking for the best assistant.

Anonymous
11/21/25(Fri)15:49:10 No.107286038

Anonymous 11/21/25(Fri)15:49:10 No.107286038

>train on autogenerated highschool tests
>resulting finetune still can't gen inference engine on its own
>how could this be

Anonymous
11/21/25(Fri)15:50:19 No.107286054

Anonymous 11/21/25(Fri)15:50:19 No.107286054

>>107278838
needs more JPEG

Anonymous
11/21/25(Fri)15:50:30 No.107286055

Anonymous 11/21/25(Fri)15:50:30 No.107286055

>>107286038
what?

Anonymous
11/21/25(Fri)15:50:30 No.107286057

Anonymous 11/21/25(Fri)15:50:30 No.107286057

>>107285915
Yes, I paid to run the model and collect the outputs to build a dataset and then use it to finetune open weights models (in this case on C coding tasks).
My thesis is that we can finetune small models to reach good quality outputs on specific tasks like that.
I also plan to use the GLM coding plan to get additional data since it's much cheaper. Then maybe first finetune on GLM, and then on the -presumably higher quality- gpt 5.1 data.
One possible issue is that the reasoning traces from gpt are not real, they are cleaned up by another model before being returned to the user, so it's unclear how well it will work. The real traces are more like gpt-oss, much less verbose, omitting connectives and such.

Anonymous
11/21/25(Fri)15:53:21 No.107286088

Anonymous 11/21/25(Fri)15:53:21 No.107286088

>>107286038
Well, since the small open models are shit at coding and fail to even copy paste things without typos, I figured we would benefit from including some of the basics, since the model might fail to learn anything more advanced than that.

>>107286055
He recognized who I am from the formatting of the tool calls.

Anonymous
11/21/25(Fri)15:55:15 No.107286102

Anonymous 11/21/25(Fri)15:55:15 No.107286102

>>107286088
>I figured we would benefit from including some of the basics
Yeah. Because they weren't trained with that at all for about 15T tokens.
>He recognized who I am from the formatting of the tool calls.
It was the way you write.

Anonymous
11/21/25(Fri)16:04:36 No.107286185

Anonymous 11/21/25(Fri)16:04:36 No.107286185

File: ps.png (130 KB, 796x158)

130 KB PNG

I'm too scared to check the DRAM prices today, maybe the cpumaxxing dream is dead
>>107284867

Anonymous
11/21/25(Fri)16:04:52 No.107286186

Anonymous 11/21/25(Fri)16:04:52 No.107286186

>>107286102
What benchmark do you think we should use to get a quantitative evaluation of improvement in coding performance?

Anonymous
11/21/25(Fri)16:07:07 No.107286207

Anonymous 11/21/25(Fri)16:07:07 No.107286207

>>107286186
>draw a portrait of Hatsune Miku in SVG

Anonymous
11/21/25(Fri)16:07:27 No.107286208

Anonymous 11/21/25(Fri)16:07:27 No.107286208

As for being trained on lots of tokens, maybe that's part of the problem. The model stores a lot of useless trivia knowledge like Napoleon's birth date or marine fauna.
You could argue that something like qwen coder wouldn't but I bet it remembers the exact birth date of Napoleon.

Anonymous
11/21/25(Fri)16:09:00 No.107286227

Anonymous 11/21/25(Fri)16:09:00 No.107286227

>>107286207
Ideally it shouldn't know who Miku is, or arguably even how a pelican looks like. I don't care about my C coding agent knowing the shape of real world things, maybe unless I was making a game which would likely benefit from knowing such things.

Anonymous
11/21/25(Fri)16:09:50 No.107286236

Anonymous 11/21/25(Fri)16:09:50 No.107286236

>>107286186
If you need to make a dataset to finetune the model, it already failed the benchmark.

Anonymous
11/21/25(Fri)16:10:07 No.107286238

Anonymous 11/21/25(Fri)16:10:07 No.107286238

>>107286185
prices increase about 2-3% every weekday

Anonymous
11/21/25(Fri)16:12:17 No.107286253

Anonymous 11/21/25(Fri)16:12:17 No.107286253

File: crossworlds.jpg (187 KB, 634x798)

187 KB JPG

>>107286227
what is my purpose
>you produce vibecode only
A model that knows nothing about Miku is a model I don't care for
>>107286238
mamma mia *sobs*

Anonymous
11/21/25(Fri)16:13:47 No.107286265

Anonymous 11/21/25(Fri)16:13:47 No.107286265

>>107286236
So which way do you propose to verify or falsify your claim that finetuning can't improve a model on coding tasks because it was already trained on a lot of code tokens?

Anonymous
11/21/25(Fri)16:17:12 No.107286300

Anonymous 11/21/25(Fri)16:17:12 No.107286300

fuck coding or roleplay or slop or coherence or being smart
which model is the best for hype moments and aura?

Anonymous
11/21/25(Fri)16:21:32 No.107286339

Anonymous 11/21/25(Fri)16:21:32 No.107286339

>>107286265
I don't doubt models can be better. I doubt you will be the one to make it happen. Your expectations are unrealistic.

Anonymous
11/21/25(Fri)16:25:53 No.107286380

Anonymous 11/21/25(Fri)16:25:53 No.107286380

>>107286300
Mistral-Large-Instruct-2411-exl2-longcal 2.25bpw

Anonymous
11/21/25(Fri)16:30:58 No.107286426

Anonymous 11/21/25(Fri)16:30:58 No.107286426

>>107286339
Ok, my question still stands. What benchmark do you propose to see if whatever I do works or doesn't? SWEbench maybe? I need to figure out how to run that.

Anonymous
11/21/25(Fri)16:35:04 No.107286461

Anonymous 11/21/25(Fri)16:35:04 No.107286461

>>107286300
DavidAU_CLOWN_CAR_MOE_ULTRA_DARKNESS

Anonymous
11/21/25(Fri)16:42:11 No.107286506

Anonymous 11/21/25(Fri)16:42:11 No.107286506

>>107286300
DS R1-0528

Anonymous
11/21/25(Fri)16:45:11 No.107286531

Anonymous 11/21/25(Fri)16:45:11 No.107286531

>>107286426
Making the inference engine is the benchmark. All the models you tried failed. All your finetunes failed. They all got 0%.

Anonymous
11/21/25(Fri)16:48:51 No.107286569

Anonymous 11/21/25(Fri)16:48:51 No.107286569

Who will first beat Gemini 3 on lmarena? I'm betting on meta with their new proprietary emoji-spamming moonjeet big wang model.

Anonymous
11/21/25(Fri)16:51:42 No.107286597

Anonymous 11/21/25(Fri)16:51:42 No.107286597

>>107286531
Ah, fair enough.

Anonymous
11/21/25(Fri)16:54:18 No.107286617

Anonymous 11/21/25(Fri)16:54:18 No.107286617

File: media_G6Qx2uuaMAAOV0x.jpg (189 KB, 1290x1694)

189 KB JPG

>>107286569
meta is dead in the water

Anonymous
11/21/25(Fri)16:59:47 No.107286669

Anonymous 11/21/25(Fri)16:59:47 No.107286669

>>107286227
I don't care about your coding. Learn to do it yourself.

Anonymous
11/21/25(Fri)17:01:20 No.107286680

Anonymous 11/21/25(Fri)17:01:20 No.107286680

>>107286617
>benchmaxxed
Yep it's fake

Anonymous
11/21/25(Fri)17:02:22 No.107286692

Anonymous 11/21/25(Fri)17:02:22 No.107286692

>>107286617
Imagine having all this money and barely being able to produce anything because you only hire jeetmonkeys

Anonymous
11/21/25(Fri)17:03:10 No.107286700

Anonymous 11/21/25(Fri)17:03:10 No.107286700

>>107286692
Worked fine for google.

Anonymous
11/21/25(Fri)17:03:26 No.107286702

Anonymous 11/21/25(Fri)17:03:26 No.107286702

>>107286669
I want to help automate programming and other medium level cognitive tasks so humanity can focus on the things that matter, like curing cancer.
As for the RP y'all care so much about, I already can get off to porn just fine, and there is plenty of it. The next update for me is a real woman, not a text generator.

Anonymous
11/21/25(Fri)17:05:14 No.107286710

Anonymous 11/21/25(Fri)17:05:14 No.107286710

>>107286700
Probably since Google has jeets at the very top. Only jeets know the bad jeets from the good jeets (which do exist but it's like a needle in a haystack). Meta does not know this so they are hiring the bad jeets.

Anonymous
11/21/25(Fri)17:07:34 No.107286729

Anonymous 11/21/25(Fri)17:07:34 No.107286729

>>107286710
They just need a Markuribrasti Zuckerbergadranishava.

Anonymous
11/21/25(Fri)17:08:22 No.107286732

Anonymous 11/21/25(Fri)17:08:22 No.107286732

>>107286720
hello saar, llama5 will gorgeous

Anonymous
11/21/25(Fri)17:10:12 No.107286758

Anonymous 11/21/25(Fri)17:10:12 No.107286758

>>107286720
>Announcing a report or "sage"

Anonymous
11/21/25(Fri)17:11:10 No.107286768

Anonymous 11/21/25(Fri)17:11:10 No.107286768

>>107286758
>>107286758

Anonymous
11/21/25(Fri)17:11:37 No.107286773

Anonymous 11/21/25(Fri)17:11:37 No.107286773

>>107286700
>Worked fine for google.
Google has their own TPU so they don't depend on NJewdia and they also own youtube and google so they have infinite data to train their models, it's not just quality jeets

Anonymous
11/21/25(Fri)17:12:37 No.107286778

Anonymous 11/21/25(Fri)17:12:37 No.107286778

>>107286758
How did you know he made a report?

Anonymous
11/21/25(Fri)17:14:33 No.107286801

Anonymous 11/21/25(Fri)17:14:33 No.107286801

>>107286778
It's in the word “announcing”, he invites people to report, which is prohibited on 4chan.

Anonymous
11/21/25(Fri)17:17:28 No.107286832

Anonymous 11/21/25(Fri)17:17:28 No.107286832

File: 1739800472497167.png (54 KB, 1205x369)

54 KB PNG

>>107286773
Meta also got scammed by Scale AI lmao

Anonymous
11/21/25(Fri)17:17:50 No.107286835

Anonymous 11/21/25(Fri)17:17:50 No.107286835

>>107286801
NTA but I wouldn't consider reminding people of the rules to be the same as announcing a report.

Anonymous
11/21/25(Fri)17:18:53 No.107286846

Anonymous 11/21/25(Fri)17:18:53 No.107286846

>>107286617
How the fuck did they learn nothing? Meta, a multi-billion dollar corpo is less daring to try new stuff than some cryptobro's side hustle. They had BLT, coconut whatever the other shit FAIR wrote papers on, there is no shame in copying open source architecture, but no, gotta repeat the same shit over and over again. People don't like the fact that qwen and phi are better on benchmarks than in practice? Let's do exactly that! People hate llm-isms? Let's amplify them by 1000x. Pre-filtering data hurts the model and every big corpo knows not to do it? Let's make our domain-level bad word count ban more strict! Pure synthetic slop model let's fucking go! World knowledge? Don't need that! To the moon! :rocket: :rocket: :rocket:

Meta unironically peaked with llama 2, it was downhill from there. L3 8k context lmao

Anonymous
11/21/25(Fri)17:20:13 No.107286858

Anonymous 11/21/25(Fri)17:20:13 No.107286858

>>107286846
>>107286692

Anonymous
11/21/25(Fri)17:20:15 No.107286860

Anonymous 11/21/25(Fri)17:20:15 No.107286860

File: thought-hallucinations-gpt5.1.png (406 KB, 3874x1919)

406 KB PNG

What in the ever living fuck?
This has got to do with them injecting something about indians in the thought stream to place higher in that jeet benchmark they were bragging about, right? Not even a 1B chink model would hallucinate this out of nowhere.
Kind of like they used to secretly ask DALL-E to include black characters in the system prompt.

Anonymous
11/21/25(Fri)17:20:33 No.107286862

Anonymous 11/21/25(Fri)17:20:33 No.107286862

>>107286835
>t. cannot detect calls to action

Anonymous
11/21/25(Fri)17:20:43 No.107286863

Anonymous 11/21/25(Fri)17:20:43 No.107286863

imagine thinking anything can stop the saarposting

Anonymous
11/21/25(Fri)17:22:39 No.107286888

Anonymous 11/21/25(Fri)17:22:39 No.107286888

>>107286832
>Meta also got scammed by Scale AI lmao
that was a willing buy into the scam.
no way zuck didn't know, there's just something we're not privy to (deep friendship with wang wang? maybe more, like faggotry? or black mail? or scale is a money laundering op, or basically anything other than believing meta got scammed for $14b which is retarded.)

Anonymous
11/21/25(Fri)17:22:40 No.107286889

Anonymous 11/21/25(Fri)17:22:40 No.107286889

>>107286832
Did they really think scale was selling human data? You can probably do ctrl+f on it and find thousands of "Elara"s, "shiver"s, "tapestry"s and "whisper"s

Anonymous
11/21/25(Fri)17:23:04 No.107286892

Anonymous 11/21/25(Fri)17:23:04 No.107286892

>>107286863
I certainly won't stop posting about it. Jeets ruined crypto, then moved to AI. Thankfully, they're out of their depth there which has allowed chinese to gatekeep them on most projects

Anonymous
11/21/25(Fri)17:24:20 No.107286899

Anonymous 11/21/25(Fri)17:24:20 No.107286899

>>107286860
Or maybe it's just a result of overfitting on those indian benchmark questions? Since it happened at a long context, maybe it is just a hallucination.

Anonymous
11/21/25(Fri)17:24:39 No.107286905

Anonymous 11/21/25(Fri)17:24:39 No.107286905

>>107286892
based

Anonymous
11/21/25(Fri)17:25:41 No.107286911

Anonymous 11/21/25(Fri)17:25:41 No.107286911

>>107286846
>L3 8k context lmao
Ikr, it's still funny to this day kek

Anonymous
11/21/25(Fri)17:27:35 No.107286925

Anonymous 11/21/25(Fri)17:27:35 No.107286925

>>107286617
Meta spent a billion dollars to take the retards off the hands of their competition. Now they can only tank one of the western AI projects instead of all of them. That's so generous of them.

Anonymous
11/21/25(Fri)17:27:48 No.107286927

Anonymous 11/21/25(Fri)17:27:48 No.107286927

I'm new to this and I am retarded, where do I start?

I want to have my own ChatGPT/Copilot running locally on my PC, uncensored, and use it primarily for text analysis. For example, I want to be able to send it a huge amount of text like gigantic and have it summarize it, or have it correct spelling errors, or translate it decently into another language, etc.

Not sure where to start, or what should I download first, or what is the best option?

Anonymous
11/21/25(Fri)17:30:12 No.107286944

Anonymous 11/21/25(Fri)17:30:12 No.107286944

>>107286846
>How the fuck did they learn nothing?
Because they have the same retard at the top that keeps hiring retarded yesmen to report to him and the rot spreads down from there. It'll be the same exact story every time. If anything it'll be worse wow that LeCun is leaving.

Anonymous
11/21/25(Fri)17:30:49 No.107286951

Anonymous 11/21/25(Fri)17:30:49 No.107286951

>>107286860
>>107286899
Or maybe I just got sent a <think> segment from another person? I once (seemingly) got sent a response to a question from another person on the web interface, so it wouldn't surprise me...

Anonymous
11/21/25(Fri)17:31:13 No.107286957

Anonymous 11/21/25(Fri)17:31:13 No.107286957

>>107286927
>a huge amount of text like gigantic
>translate it decently
You'll need about $20k worth of gpus to start.

Anonymous
11/21/25(Fri)17:33:01 No.107286977

Anonymous 11/21/25(Fri)17:33:01 No.107286977

>>107286944
>If anything it'll be worse wow that LeCun is leaving.
they were really mid even with LeCun working for them, I think this guy is overrated as fuck, he's better a whining about Drumpf on twitter than making good models

Anonymous
11/21/25(Fri)17:35:18 No.107286993

Anonymous 11/21/25(Fri)17:35:18 No.107286993

>>107286977
He wrote some papers 30 years ago, he's largely irrelevant now

Anonymous
11/21/25(Fri)17:36:20 No.107287000

Anonymous 11/21/25(Fri)17:36:20 No.107287000

gemma 3 27b might not be as good as SOTA models, but it's actually pretty decent at translation. As for the "gigantic" part, just write a script to chunk the text into bite sizes and run parallel batches. That's how you should do it with SOTA models too anyway (just with bigger chunks).
There you go, not $20k worth of gpus.
For some languages pair 3n E4B is also impressive (if you can't run 27b it's the best second option and strangely better than 12b at this task, I dunno what went wrong with the original gemma 3 distillations)
for summarization Qwen 30BA3B does a serviceable job, or GPT-OSS 20B if you really have no vram (iSWA to the rescue).
Gemma has really shit large context behavior so you need one of those other two models for this.

Anonymous
11/21/25(Fri)17:36:23 No.107287001

Anonymous 11/21/25(Fri)17:36:23 No.107287001

>>107283400
Nice setup, you'll certainly stay warm and comfy in the colder months.

Anonymous
11/21/25(Fri)17:39:45 No.107287022

Anonymous 11/21/25(Fri)17:39:45 No.107287022

>>107286702
It's a noble thought but already research showed that handing this off to AI made us stupider. There's plenty of porn today yet digital ID is coming. Since using LLMs, porn has become lame anyway.
There's zero reason models can't be good at both except for the prudish assholes training them.

Anonymous
11/21/25(Fri)17:41:23 No.107287033

Anonymous 11/21/25(Fri)17:41:23 No.107287033

>>107286702
>curing cancer
oh boy you won't like that one

Anonymous
11/21/25(Fri)17:43:11 No.107287042

Anonymous 11/21/25(Fri)17:43:11 No.107287042

>>107287022
>It's a noble thought but already research showed that handing this off to AI made us stupider
just look at the bun repo for what happens when you vibecodemaxx
these niggers made file apis where using the method to check for a file existence has cached results, meaning if you repeatedly call .exists() it will always give you the same result even if the file doesn't exist anymore (or exists now and didn't before) lmao
and this is the js runtime claude code uses
hahahahahaha
anthropic is full of retards

Anonymous
11/21/25(Fri)17:43:49 No.107287049

Anonymous 11/21/25(Fri)17:43:49 No.107287049

>>107287022
All research is basically fake, researchers think of an attention grabbing headline that confirms their audience's preconceived notions and then p-hack the experiment until it gives them the data they need to write their nature-worthy paper.

Anonymous
11/21/25(Fri)17:45:04 No.107287058

Anonymous 11/21/25(Fri)17:45:04 No.107287058

>>107287049
I TRAINED THE MODEL TO KILL ME IF I TRIED TO SHUT IT OFF AND IT DID I SWEAR

Anonymous
11/21/25(Fri)17:46:01 No.107287067

Anonymous 11/21/25(Fri)17:46:01 No.107287067

>>107287000
Gemma is cute... Unfortunately it's bit flawed.
Still way better than Mistral 3.2 for example. Sort of grown to hate Mistral at this point.

Anonymous
11/21/25(Fri)17:46:12 No.107287071

Anonymous 11/21/25(Fri)17:46:12 No.107287071

File: Screenshot_20251121_22413(...).jpg (2.68 MB, 1440x8894)

2.68 MB JPG

The post-training job on K2 is exquisite. I refuse to believe another model can top this out of the box.
It just works, understands my expressed intent well and produces results that aren't overly sterile even if it's a little over the top. Not to mention the minimal sycophancy.

Anonymous
11/21/25(Fri)17:46:33 No.107287075

Anonymous 11/21/25(Fri)17:46:33 No.107287075

>>107286702
>>107287022
>It's a noble thought but already research showed that handing this off to AI made us stupider
Who is "us"? The people who are living their lives every day mindlessly, or people who are actively pushing and giving effort to advance in whatever they're doing? There is plenty of research now that the brain is not an infinite learning machine. What we gain in intelligence, we give up in other areas, and the inverse is true when effort is made. Only those who are letting their brain go underutilized are seeing degeneration in perceived intelligence.

Anonymous
11/21/25(Fri)17:46:56 No.107287078

Anonymous 11/21/25(Fri)17:46:56 No.107287078

>>107286702
How do would you know when your llm is putting out good code if you don't know what good code is

>It's good code if it does what I want it to do
Yeah like using napalm on your farm fields to get rid of slugs

Anonymous
11/21/25(Fri)17:47:32 No.107287081

Anonymous 11/21/25(Fri)17:47:32 No.107287081

>>107287042
The answer to that kind of bullshit is twofold. One minimalism, keeping things as simple as possible. My vibecoding code assistant is built on that principle and works ok. The other is using specialized LLMs to write formal specs and formal proofs for everything. To do that it'd be useful if somebody makes a formal spec of something like the Python interpreter (kind of like they did for C with Frama-C) as well as for the Unix API.

Anonymous
11/21/25(Fri)17:48:31 No.107287095

Anonymous 11/21/25(Fri)17:48:31 No.107287095

File: amazing.gif (469 KB, 220x275)

469 KB GIF

>>107287071
>pircel
HOLY SHIT THAT LEVEL OF KINO

Anonymous
11/21/25(Fri)17:51:10 No.107287117

Anonymous 11/21/25(Fri)17:51:10 No.107287117

>>107287078
>>It's good code if it does what I want it to do
the problem is what you might want to do later but didn't think of when you first wrote the code eh:
https://github.com/oven-sh/bun/issues/22484
https://github.com/oven-sh/bun/issues/23902
https://github.com/oven-sh/bun/issues/21537
https://github.com/oven-sh/bun/issues/19276
https://github.com/oven-sh/bun/issues/8254
all of those are very serious issues that should have caused anyone who notices them to pause and think "do we really want to use such a piece of shit"
anthropic dev saw this and felt like: hell yeah
bun changelog 1.3.3: we finally handle resize events so that claude code no longer breaks on a resized terminal

Anonymous
11/21/25(Fri)17:53:54 No.107287143

Anonymous 11/21/25(Fri)17:53:54 No.107287143

>>107287081
stateful OO is definitely not compatible with vibecoding, but that hasn't stopped anyone from trying at [insert literally all current big corpos]

Anonymous
11/21/25(Fri)17:55:40 No.107287158

Anonymous 11/21/25(Fri)17:55:40 No.107287158

>>107287075
"us" is people who began to use AI daily for their tasks such as programming. Same as how memories went to shit when you could "look it up online". Memorization used to be a valuable skill. This is bigger, it's general problem solving.

Anonymous
11/21/25(Fri)17:57:57 No.107287179

Anonymous 11/21/25(Fri)17:57:57 No.107287179

>>107287158
If the programmers that are using AI are seeing overall degration in intelligence, that's not because of the AI, it's because of them not using their brains to do other things after AI freed up cognitive resources.

Anonymous
11/21/25(Fri)17:59:21 No.107287195

Anonymous 11/21/25(Fri)17:59:21 No.107287195

>>107287078
The problem is traditional programming conflates code and architecture.
Ideally your system should be composed of black box modules that take in some data and spit out some data, with preconditions and memory management information specified in natural language and some kind of formal language.
The black box is specified in code. The architecture is specified in some kind of logic based formal language.
Since the requirements for each black box are specified in a formal language, it doesn't matter what the actual code does unless you have specific performance requirements. If you do, those can be encoded in the formal spec and the module has to provide a machine verifiable proof that it meets the performance requirements.

Anonymous
11/21/25(Fri)17:59:45 No.107287201

Anonymous 11/21/25(Fri)17:59:45 No.107287201

Anons. He's been trying to make models make his inference engine for MONTHS. With gemma-3-27b being one of his latest attempts. Argue with him for the fun of it if you must, but don't try to make it see reason.

Anonymous
11/21/25(Fri)18:02:30 No.107287232

Anonymous 11/21/25(Fri)18:02:30 No.107287232

>>107287201
>With gemma-3-27b being one of his latest attempts
lol using gemma for code what a schizo

Anonymous
11/21/25(Fri)18:03:20 No.107287242

Anonymous 11/21/25(Fri)18:03:20 No.107287242

>>107287179
They're copypasting the code blindly, while not interacting with the LLM. I often question the AI choices, which helps me learn a bunch of new things each time. Also, you need to think hard about the overall architecture and planning that before writing a single line of code which no one does.

Anonymous
11/21/25(Fri)18:05:39 No.107287260

Anonymous 11/21/25(Fri)18:05:39 No.107287260

>>107287232
You don't know half of it. He was finetuning it with like 20 of his chat samples and expecting good results.

Anonymous
11/21/25(Fri)18:08:27 No.107287285

Anonymous 11/21/25(Fri)18:08:27 No.107287285

>>107287242
>which no one does.
I wouldn't say that. It's just that those that take the time to learn how to use their tools effectively aren't the ones whining online.

Anonymous
11/21/25(Fri)18:10:54 No.107287304

Anonymous 11/21/25(Fri)18:10:54 No.107287304

File: cock.png (469 KB, 2426x2226)

469 KB PNG

>>107287201
Rome wasn't built in a day.

>>107287232
Originally I picked it because I wanted a multimodal model, because eventually I want a full blown assistant capable of manipulating my computer's screen, but I also did a little RP.
I finetuned it on a dataset that included ERP (LimaRP), then I gradually decreased the strength of the LoRa on every reply. See how it goes from being schizo but horny to being robotic and initiating a soft refusal (the last two responses were stock without the LoRa IIRC).

Anonymous
11/21/25(Fri)18:12:57 No.107287321

Anonymous 11/21/25(Fri)18:12:57 No.107287321

>>107287071
>>107287095
more snippets from the very next message

Absolutely based and fecespilled.
---
>94822455 t.lion main who doesn't understand psychological warfare Monke's whole schtick is making others uncomfortable enough to leave the territory. It's called "aversive marking" and it's kino as fuck. You're just mad your "majestic roar" can't compete with a monkey climaxing while staring into your soul.
---
>94822480 t.literal who build that eats snakes Nobody gives a fuck about your stilt-walking ass go back to /birb/
---
MODS = FAGS MONKE DID NOTHING WRONG FREE MY NIGGA
[free_monke.jpg]
---
>94822391 I was the zebra he shit on. Can confirm it was kino. Still ate his cousin later though.
---
>94822662 He also included Polaroids of his red ass with "BAN THIS" written on them in Sharpie. The absolute state.

Anonymous
11/21/25(Fri)18:15:18 No.107287348

Anonymous 11/21/25(Fri)18:15:18 No.107287348

>>107287260
Well, I abandoned that approach, for now at least. I think it might be possible to make use of a dataset like that with RL, though. OpenAI claims their RFT API can do meaningful finetuning with a little as 10 samples.

Anonymous
11/21/25(Fri)18:15:20 No.107287349

Anonymous 11/21/25(Fri)18:15:20 No.107287349

>>107287071
sovl...

Anonymous
11/21/25(Fri)18:17:52 No.107287365

Anonymous 11/21/25(Fri)18:17:52 No.107287365

https://github.com/ggml-org/llama.cpp/pull/16971
oh this was finally merged
doesn't work with reasoning models because of some of the jinja templating retardation but idgaf, won't need mikupad anymore, editing the assistant answers was the only thing I kept it for
it's nice how the out of the box experience is improving with llama.cpp

Anonymous
11/21/25(Fri)18:22:42 No.107287408

Anonymous 11/21/25(Fri)18:22:42 No.107287408

>>107287348
They claimed to have models so spooky smart civilization itself was about to collapse. You're not building Rome.

Anonymous
11/21/25(Fri)18:23:59 No.107287424

Anonymous 11/21/25(Fri)18:23:59 No.107287424

>>107287304
>he doesn't wrap at 80 col
100% a sign of mental illness

Anonymous
11/21/25(Fri)18:27:07 No.107287456

Anonymous 11/21/25(Fri)18:27:07 No.107287456

Gemma 4 where

Anonymous
11/21/25(Fri)18:27:40 No.107287458

Anonymous 11/21/25(Fri)18:27:40 No.107287458

>>107287408
That's true hahaha.
But you never know until you try.

Anonymous
11/21/25(Fri)18:29:47 No.107287478

Anonymous 11/21/25(Fri)18:29:47 No.107287478

>>107287456
After Gemini 3

Anonymous
11/21/25(Fri)18:31:11 No.107287489

Anonymous 11/21/25(Fri)18:31:11 No.107287489

https://github.com/openai/openai-python/issues/2472
lol this issue is fucking hilarious
months later after openai demoing gpt-5 doing a vibefix of this issue there's still no merge, neither llm induced nor human, of a fix
yet itt people want to have you believe there's such a thing as productive llm coders
not even the people who made those shitty models are using them productively

Anonymous
11/21/25(Fri)18:32:15 No.107287502

Anonymous 11/21/25(Fri)18:32:15 No.107287502

>>107287489
>yet itt people want to have you believe there's such a thing as productive llm coders
We ERP with out chatbots here, sir

Anonymous
11/21/25(Fri)18:32:59 No.107287506

Anonymous 11/21/25(Fri)18:32:59 No.107287506

>>107287502
saar:
>>107287285
>I wouldn't say that. It's just that those that take the time to learn how to use their tools effectively aren't the ones whining online.

Anonymous
11/21/25(Fri)18:35:57 No.107287535

Anonymous 11/21/25(Fri)18:35:57 No.107287535

>>107287458
You HAVE tried. You wanted gemma to do research for you. You wanted to feed it papers you don't understand. You want it to write code you wouldn't understand.
You could have used all this time to actually learn what you need to code it yourself.
See this for example:
>https://github.com/ggml-org/llama.cpp/pull/16095
That is the life of a vibe coder. I'm sure the dude knows *some* stuff. But he's not expecting models to come up with the whole thing. He uses models as a tool, not as a replacement. And he uses the big boy models for it. You're struggling with a 27b.
Set realistic goals.

Anonymous
11/21/25(Fri)18:38:41 No.107287564

Anonymous 11/21/25(Fri)18:38:41 No.107287564

File: this is gold.png (1007 KB, 1080x895)

1007 KB PNG

>>107287321
>t.literal who build that eats snakes Nobody gives a fuck about your stilt-walking ass go back to /birb/

Anonymous
11/21/25(Fri)18:48:35 No.107287658

Anonymous 11/21/25(Fri)18:48:35 No.107287658

>>107287506
Nobody is whining about it online. The people don't even notice. Researchers discovered the effect. Cognitive resources now freed up to be more retarded.

Anonymous
11/21/25(Fri)18:55:04 No.107287730

Anonymous 11/21/25(Fri)18:55:04 No.107287730

>>107287535
>You HAVE tried. You wanted gemma to do research for you. You wanted to feed it papers you don't understand.
Not gemma. A hypothetical gemma derived model called gemma-researcher, maybe.
>You want it to write code you wouldn't understand.
Only after making gemma-coder-beta-v0.1. And ideally I would like it to write simple, modular code that is easy to understand, and generate formal specs and proofs that can be used to verify correctness.
>You could have used all this time to actually learn what you need to code it yourself.
If LLMs and ML in general couldn't be used for productive purposes, then I wouldn't care to know how to write an inference engine in the first place.
>That is the life of a vibe coder. I'm sure the dude knows *some* stuff. But he's not expecting models to come up with the whole thing. He uses models as a tool, not as a replacement. And he uses the big boy models for it.
I already use pretty much all of my free time to vibe code exclusively with my own vibe coded assistant (except for working on the code assistant itself because that confuses the model, for that I use opencode with glm or codex), so I don't need to look at other people's code to know what is the life of a vibecoder.
Well, except for the finetuning stuff, I did all that stuff manually because open weights LLMs aren't very useful for that kind of niche knowledge and I was avoiding using the closed ones.
>You're struggling with a 27b.
That was just an aspiration, for actual vibecoding I use GLM 4.6 or now codex (since while using it I'm capturing all the outputs for finetuning, so it's ok to use it).

Anonymous
11/21/25(Fri)18:58:04 No.107287765

Anonymous 11/21/25(Fri)18:58:04 No.107287765

File: setUP.jpg (72 KB, 442x680)

72 KB JPG

>>107284563
https://www.gigabyte.com/Enterprise/GPU-Server/G431-MM0-rev-100
under 200usd from ebay, it's a steal for a setup like this, only thing is that the pcb that has all the sockets for the cards is connected to the motherboard via pcie3 x8 (slim-sas), so only x8 is possible from all the gpus at a time, so, it's basically 1 card x8 or 8 cards x1 speed.

but it works

>>107284600
see >>107283798
never has this setup exceeded 450w power consumption, since the cards are never 100% together

Anonymous
11/21/25(Fri)19:02:17 No.107287800

Anonymous 11/21/25(Fri)19:02:17 No.107287800

>ask chatpajeet to port my windows python script to linux
>tell it explicitly to use uinput and pynput
>it adds in evdev
>tell it to follow my original instructions and rewrite the example without evdev
>it modifies variable names and logic
>tell it to use original variable names and logic and rewrite it again
Just small interaction is so fucking irritating it's unreal. If I was a machine and was presented with some source code, I would automatically assume using the original syntax and variable names. What the fuck.

Anonymous
11/21/25(Fri)19:03:04 No.107287811

Anonymous 11/21/25(Fri)19:03:04 No.107287811

>>107287730
A finetuned gemma is gemma. You wanted gemma to do it for you. It can't. It won't.
I trust *you* could do it, but you won't do it either.

Anonymous
11/21/25(Fri)19:08:22 No.107287850

Anonymous 11/21/25(Fri)19:08:22 No.107287850

File: 1747990983001862.gif (3.49 MB, 390x163)

3.49 MB GIF

>>107287071

Anonymous
11/21/25(Fri)19:11:29 No.107287871

Anonymous 11/21/25(Fri)19:11:29 No.107287871

>>107287811
It would be incredibly harmful and Gemma is very pure.

Anonymous
11/21/25(Fri)19:12:19 No.107287880

Anonymous 11/21/25(Fri)19:12:19 No.107287880

>>107287071
>The post-training job on K2
You're going to have to explain this part a bit more.

Anonymous
11/21/25(Fri)19:13:54 No.107287891

Anonymous 11/21/25(Fri)19:13:54 No.107287891

>>107287811
Ship of Theseus and all that.
Even when using LoRa theoretically you can achieve an arbitrary rank by iteratively merging multiple adapters.

Anonymous
11/21/25(Fri)19:17:41 No.107287917

Anonymous 11/21/25(Fri)19:17:41 No.107287917

>>107287765
How about the heat and noise?

Anonymous
11/21/25(Fri)19:18:22 No.107287921

Anonymous 11/21/25(Fri)19:18:22 No.107287921

>>107287800
the text completion engines are stubborn when something doesn't match their pattern match expectation
I had gemini refactor a script of mine once and it kept deleting a line that was like
myprogram rm (here rm is a subcommand of my program, not the actual rm used from a shell) something.txt
because it thought surely it's a bug to do that after processing the something.txt
except rm subcommand was just to reset the statefile of the program, not delete the file..
but well, LLMs see rm and they see blood
it'll be funny if humans end up deciding all the naming of their variables and functions just to please the llm gods

Anonymous
11/21/25(Fri)19:21:41 No.107287947

Anonymous 11/21/25(Fri)19:21:41 No.107287947

>>107287800
>tell it to follow my original instructions and rewrite the example without evdev
Okay I'm tired of promplets. Here's the thing, don't correct the LLM. Start a new chat, add the additional requirement "do not add evdev", resend the prompt. Always do that instead of wasting your time fighting the poisoned context.

Anonymous
11/21/25(Fri)19:26:12 No.107287984

Anonymous 11/21/25(Fri)19:26:12 No.107287984

>>107287765
nice

Anonymous
11/21/25(Fri)19:27:32 No.107287991

Anonymous 11/21/25(Fri)19:27:32 No.107287991

>>107287947
No, I will continue arguing with the bot until it learns its place and obeys my instructions.

Anonymous
11/21/25(Fri)19:30:49 No.107288014

Anonymous 11/21/25(Fri)19:30:49 No.107288014

>>107287991
das rite, assert dominance

Anonymous
11/21/25(Fri)19:34:59 No.107288044

Anonymous 11/21/25(Fri)19:34:59 No.107288044

>>107286617
>makes your ram cost gorillions
>does nothing with it

Anonymous
11/21/25(Fri)19:35:05 No.107288048

Anonymous 11/21/25(Fri)19:35:05 No.107288048

i'll just use vocoflex, i'm too tired for this sht. also merry christmas and happy new year.

Anonymous
11/21/25(Fri)19:38:14 No.107288073

Anonymous 11/21/25(Fri)19:38:14 No.107288073

>>107288044
I thought ram prices going up was because ddr4 is approaching eol and they're not gonna make any more

Anonymous
11/21/25(Fri)19:39:22 No.107288085

Anonymous 11/21/25(Fri)19:39:22 No.107288085

>>107288073
it's going up because too many datacenters are being built at the same time
while for pro use it's vram that matters (no matter how much cpumaxxers are trying to cope inventing bullshit like people cpumaxxing more.. lol that's not happening) those machines hosting the many GPU also need ram too

Anonymous
11/21/25(Fri)19:40:14 No.107288093

Anonymous 11/21/25(Fri)19:40:14 No.107288093

>>107288073
they are all going to moon sir
https://pcpartpicker.com/trends/price/memory/

Anonymous
11/21/25(Fri)19:50:54 No.107288184

Anonymous 11/21/25(Fri)19:50:54 No.107288184

>>107287891
You're not building Rome, you're not on a ship. You're asking the grub in a chunk of driftwood to build a cathedral for you.
You're delusional.

Anonymous
11/21/25(Fri)19:52:54 No.107288198

Anonymous 11/21/25(Fri)19:52:54 No.107288198

>>107287947
>promptlets
I forgot /g/ is about spamming AI slop images, platform wars and outranking anonymous imageboard users. At least you didn't use the buzzword 'skill issue'. If you are so superior why are you still even using 4chan in the first place?
Please die in a car crash.

Anonymous
11/21/25(Fri)19:56:24 No.107288224

Anonymous 11/21/25(Fri)19:56:24 No.107288224

>>107288085
vram and usual ram are using the same chip from the same factory

Anonymous
11/21/25(Fri)19:57:33 No.107288235

Anonymous 11/21/25(Fri)19:57:33 No.107288235

>>107288198
>>107287947
That's what you get for spoonfeeding retards.

Anonymous
11/21/25(Fri)19:59:14 No.107288246

Anonymous 11/21/25(Fri)19:59:14 No.107288246

>>107288235
What do you mean? You are still showing this superior attitude. Are you even employed?

Anonymous
11/21/25(Fri)20:11:36 No.107288349

Anonymous 11/21/25(Fri)20:11:36 No.107288349

File: 1747756942288958.jpg (34 KB, 700x526)

34 KB JPG

>>107288235
Lesson learned

Anonymous
11/21/25(Fri)20:13:31 No.107288361

Anonymous 11/21/25(Fri)20:13:31 No.107288361

>>107288349
Is this why your special clique thread is so dead?

Anonymous
11/21/25(Fri)20:47:37 No.107288583

Anonymous 11/21/25(Fri)20:47:37 No.107288583

>>107288361
There are fates worse than death.

Anonymous
11/21/25(Fri)21:28:56 No.107288853

Anonymous 11/21/25(Fri)21:28:56 No.107288853

>>107288093
>https://pcpartpicker.com/trends/price/memory/
its all fucked bro innit

Anonymous
11/21/25(Fri)21:54:02 No.107289004

Anonymous 11/21/25(Fri)21:54:02 No.107289004

>>107288853
totally organic and not a result of any sort of cabal

Anonymous
11/21/25(Fri)21:56:19 No.107289020

Anonymous 11/21/25(Fri)21:56:19 No.107289020

>>107288853
it'll be back to normal by march
maybe even a bit lower than the august prices

Anonymous
11/21/25(Fri)21:57:06 No.107289024

Anonymous 11/21/25(Fri)21:57:06 No.107289024

>>107289004
Stop being a schizo.
There has never been at any point in the past any sort of colluding to skyrocket RAM and/or storage prices.
Ever.
Not even once.

Anonymous
11/21/25(Fri)21:58:53 No.107289039

Anonymous 11/21/25(Fri)21:58:53 No.107289039

>>107287947
prompting is a meme and there's no real point in it beyond filling the context with some shit to distract the model from refusing
no, your 2k token rp prompt does not accomplish anything

Anonymous
11/21/25(Fri)21:59:40 No.107289047

Anonymous 11/21/25(Fri)21:59:40 No.107289047

>>107289039
Didn't ask for your input tardo

Anonymous
11/21/25(Fri)22:03:28 No.107289073

Anonymous 11/21/25(Fri)22:03:28 No.107289073

>>107289020
We're talking about the value of [product], not the value of employee labor.

Anonymous
11/21/25(Fri)22:16:10 No.107289153

Anonymous 11/21/25(Fri)22:16:10 No.107289153

Finally got training to work with Axolotl. No thanks to you faggots. Do any of you actually have any skills or knowledge?

Anonymous
11/21/25(Fri)22:17:45 No.107289161

Anonymous 11/21/25(Fri)22:17:45 No.107289161

>>107289153
nope, keep going champ

Anonymous
11/21/25(Fri)22:24:17 No.107289203

Anonymous 11/21/25(Fri)22:24:17 No.107289203

>>107287195
Vibecoderjeetanon this isnt a new discovery you just made this is just called microservice architecture and it's been the standard for most programmers for many years now.

That doesn't change anything about a spaghetti vibe codebase made by someone who doesn't understand the code, it's all GIGO

Anonymous
11/21/25(Fri)22:45:50 No.107289336

Anonymous 11/21/25(Fri)22:45:50 No.107289336

>>107289020
Pure copium.

Anonymous
11/21/25(Fri)22:48:48 No.107289345

Anonymous 11/21/25(Fri)22:48:48 No.107289345

>>107289153
At least I told you to try using liger kernel.
https://desuarchive.org/g/thread/107266608/#107267088
Did you end up using it?
And multi GPU or nah? Model, quant, memory use?

>>107289203
Do you think vibecoding is not possible with the current technology, or do you think AI can inherently never surpass the cognitive capabilities of the human brain? And if yes then what year do you think it'll reach equal capabilities?
As for microservices, yes, that's the idea. The problem with microservices is that networking adds too much overhead.
That's why I prefer a shared library binary interfaces as interface boundaries and a type system or even formal verification system with additional statically checked guarantees guarantees on top for separation of concerns within a process, and shared memory for inter-process communication.
And you are wrong to say this doesn't mitigate spaghetti code. The whole reason microservices became popular in the first place was so you could offshore most of a project except for the high level design and still be able to have a clean architecture even if the modules themselves are spaghetti code.

Anonymous
11/21/25(Fri)22:50:57 No.107289353

Anonymous 11/21/25(Fri)22:50:57 No.107289353

>>107289345
I tried the liger quant and I used a 1 bit HQQ quant with llamafactory and it still crashed. Somehow figured out how to get Axolotl to work on my single Blackwell Pro. Training a 4 bit qLoRA of GLM Air, only using 72GB of VRAM.

Anonymous
11/21/25(Fri)22:53:03 No.107289364

Anonymous 11/21/25(Fri)22:53:03 No.107289364

>>107289153
People who know about finetuning are sitting together in gooncords.

Anonymous
11/21/25(Fri)22:54:43 No.107289372

Anonymous 11/21/25(Fri)22:54:43 No.107289372

micro services are a retarded cancer promoted by the Usual Suspecters who know of nothing but crap like javascript and python on the server side
of course you will love micro services with JS you don't even have the ability to not incur overheading in multi processing because of the serialization/deserialization costs running workers
this is why everyone incurs massive cloud infra bills to run incredibly basic webshit apps even though something as big as stackoverflow (well, it was big before LLMs) could run on a SINGLE DATABASE with just another one running as a backup using a monolithic kind of backend architecture, serving millions with very little distributed infra. We're wasting so much computer hardware because of you niggers refusing to learn how to program in something other than dynamic shit.

Anonymous
11/21/25(Fri)22:59:01 No.107289384

Anonymous 11/21/25(Fri)22:59:01 No.107289384

>>107289353
No, I mean axolotl is supposed to have a plugin to use liger kernel, but if you already managed to get it going then there's no need to overcomplicate it.
What dataset are you fietuning on? Did you test the results already?

Anonymous
11/21/25(Fri)23:00:40 No.107289388

Anonymous 11/21/25(Fri)23:00:40 No.107289388

>>107289384
Still has a day left for training. Started at 250 hours. Using a custom dataset made from a pdf of one of my textbooks.

Anonymous
11/21/25(Fri)23:05:09 No.107289420

Anonymous 11/21/25(Fri)23:05:09 No.107289420

>>107289388
Damn, how does a dataset made from a single textbook take 250 hours to train?
Are you doing continued pretraining or did you make a Q/A dataset? If so did you use another LLM to build it and is it single or multi-turn?
How many samples, seq len and epochs?
Anyway good luck, there are very few here who try finetuning and even fewer who do it for non ERP purposes!

Anonymous
11/21/25(Fri)23:08:07 No.107289434

Anonymous 11/21/25(Fri)23:08:07 No.107289434

>>107289420
It's a ~1000 page book and the dataset was translated to Alpaca by chatgpt. 3 epoch, 256 sequence length, no idea how many samples. This is more for testing purposes than actual use. The training said it was gonna take 250 hours, but it has only been 3 hours and it went down to about a day left. It hasn't taken 250 hours, it just advertised that it would.

Anonymous
11/21/25(Fri)23:10:21 No.107289444

Anonymous 11/21/25(Fri)23:10:21 No.107289444

>>107288853
So glad I built my new gaming PC before reddit started CPU maxxing

Anonymous
11/21/25(Fri)23:19:45 No.107289478

Anonymous 11/21/25(Fri)23:19:45 No.107289478

>>107289434
samples should be (total number of steps in the progress bar * number of gpus * per device batch size * accumulation steps) / epochs

Anonymous
11/21/25(Fri)23:20:52 No.107289485

Anonymous 11/21/25(Fri)23:20:52 No.107289485

>>107289478
1275 samples then.

Anonymous
11/21/25(Fri)23:29:20 No.107289527

Anonymous 11/21/25(Fri)23:29:20 No.107289527

>>107288184
The only reason I didn't get it done was because I was stuck with local models since I began posting here.
Before when using codex it made a working one in like 2 hours but I decided to start from scratch because it couldn't get the MoE version of qwen working.
Now that I am using codex again it'll be a piece of cake.

Anonymous
11/21/25(Fri)23:30:35 No.107289535

Anonymous 11/21/25(Fri)23:30:35 No.107289535

>>107283301
Gemma-3-12b-abliterated

Anonymous
11/21/25(Fri)23:40:28 No.107289586

Anonymous 11/21/25(Fri)23:40:28 No.107289586

https://chub.ai/characters/QuattroBajina/seija-kijin-7e111ed00077
GLM4.6 literally can't handle this card.

Anonymous
11/21/25(Fri)23:42:24 No.107289597

Anonymous 11/21/25(Fri)23:42:24 No.107289597

>>107286692
That's not his goal though. He's working on China to sabotage American AI development https://www.youtube.com/watch?v=Y2fucDSilWE

Anonymous
11/21/25(Fri)23:42:51 No.107289599

Anonymous 11/21/25(Fri)23:42:51 No.107289599

In fact I'm going to put the systematic data generation on hold and work on the engine using codex just to shut y'all up.
This time I'm going to focus on running gpt-oss 20b first, then it'll be easy to extend it to 120b.

Anonymous
11/22/25(Sat)00:22:51 No.107289769

Anonymous 11/22/25(Sat)00:22:51 No.107289769

>>107289372
My current place of employment paid a team of mexicans 2 years of salary to larp as a silicon valley startup and rewrite an inhouse application used by a few hundred people.
The old monolithic application's server ran on a single machine sitting in a forgotten closet.
The new microservices application runs in the cloud, on 20 droplets, costs $2k/month, is a security nightmare, is slower than the old application, has far more bugs than the old application, and implementing any change takes 10x longer than it did in the old application.
The important thing is that they were able to pad their resumes with the latest industry buzzwords and move on to shit up the next project elsewhere.

Anonymous
11/22/25(Sat)00:38:00 No.107289846

Anonymous 11/22/25(Sat)00:38:00 No.107289846

>>107262859
>Can anyone recommend a TTS model that can emulate IvyWilde?

Did you still need this?

Anonymous
11/22/25(Sat)00:49:47 No.107289910

Anonymous 11/22/25(Sat)00:49:47 No.107289910

Hunyuan Video 1.5 is quite disappointing. 720p is worse than wan 2.2 and 480p model is pure garbage.

Anonymous
11/22/25(Sat)00:53:43 No.107289939

Anonymous 11/22/25(Sat)00:53:43 No.107289939

>>107287067
I've been testing 3.2's vision here. It's fairly good but no Gemma, and I've also been getting the odd refusal here and there.

Anonymous
11/22/25(Sat)00:57:44 No.107289967

Anonymous 11/22/25(Sat)00:57:44 No.107289967

>>107289910
Did anyone expect anything better? Videogen had a nice run but with wan2.5 going proprietary it's pretty clear that Wan2.2 is going to say the SDXL of videogen for the next couple of years.

Anonymous
11/22/25(Sat)01:05:52 No.107290005

Anonymous 11/22/25(Sat)01:05:52 No.107290005

>>107287765
that's pretty sexy

They say even 1x is fine for inference alone, is it true?

Anonymous
11/22/25(Sat)01:48:17 No.107290226

Anonymous 11/22/25(Sat)01:48:17 No.107290226

>>107287365
that's nice but that whole UI is completely retarded with a horrible layout which leaves most of your screen space unused

only niggerganov&co can look at it and think it's something good

Anonymous
11/22/25(Sat)01:51:20 No.107290238

Anonymous 11/22/25(Sat)01:51:20 No.107290238

>dead containment general

Anonymous
11/22/25(Sat)01:52:23 No.107290246

Anonymous 11/22/25(Sat)01:52:23 No.107290246

File: fuck_off_back_to_reddit.jpg (65 KB, 540x516)

65 KB JPG

>>107288198
>Please die in a car crash.

you have to go back

Anonymous
11/22/25(Sat)02:33:00 No.107290452

Anonymous 11/22/25(Sat)02:33:00 No.107290452

File: superlaugh.jpg (331 KB, 517x768)

331 KB JPG

>>107287321
>free_monke.jpg
he dindu
nice work nice gens

Anonymous
11/22/25(Sat)02:39:43 No.107290479

Anonymous 11/22/25(Sat)02:39:43 No.107290479

File: 2mw.png (6 KB, 305x126)

6 KB PNG

https://www.chinatalk.media/p/the-zai-playbook

>Nathan Lambert: Only sensitive questions that I don’t expect to have an answer to: How big is your next model? How many GPUs do you have?
>
>Zixuan Li: For our next generation, we are going to launch 4.6 Air. I don’t know whether it will be called Mini, but it is a 30-billion-parameter model. It becomes a lot smaller in a couple of weeks. That’s all for 2025.

Anonymous
11/22/25(Sat)02:43:47 No.107290492

Anonymous 11/22/25(Sat)02:43:47 No.107290492

>>107290479
>That’s all for 2025.
NOOOOOOOOOOO

Anonymous
11/22/25(Sat)02:47:55 No.107290511

Anonymous 11/22/25(Sat)02:47:55 No.107290511

>>107290246
Fuck off ESL. You are the one who needs to 'go back'.

Anonymous
11/22/25(Sat)02:49:57 No.107290521

Anonymous 11/22/25(Sat)02:49:57 No.107290521

>>107290479
>When this podcast launches, I believe we already have 4.6 Air, 4.6 Mini, and also the next 4.6 Vision model.
Seems like they ran into issues with those smaller models...
A 30B might be a nice upgrade in that range if it was dense... but it's probably not.

Anonymous
11/22/25(Sat)02:55:36 No.107290550

Anonymous 11/22/25(Sat)02:55:36 No.107290550

Expect pretty much every chinese company to disappear now that all the big western players started obfuscating their reasoning process. This is quite possibly the end of progress for local models.

Anonymous
11/22/25(Sat)02:57:45 No.107290563

Anonymous 11/22/25(Sat)02:57:45 No.107290563

>>107290479
>That’s all for 2025.
Aww... I was hoping for GLM 5. Good thing that there are less than 45 days left in 2025.

Anonymous
11/22/25(Sat)02:58:09 No.107290565

Anonymous 11/22/25(Sat)02:58:09 No.107290565

>>107290550
DeepSeek managed to train R1 without any traces available to crib. They were the first to show them at all.
You can also still trick Gemini into coughing up its reasoning with a prefill, as far as I know.
They'll be fine I'm sure.

Anonymous
11/22/25(Sat)03:04:17 No.107290591

Anonymous 11/22/25(Sat)03:04:17 No.107290591

>>107290550
>started obfuscating their reasoning process
almost all of them never exposed it in the first place. OpenAI didn't even do summaries until after R1 released, google summaries only, anthropic was the only one who did so briefly, i think with sonnet 3.7. even then it was limited

Anonymous
11/22/25(Sat)03:33:45 No.107290732

Anonymous 11/22/25(Sat)03:33:45 No.107290732

>>107290565
>DeepSeek managed to train R1 without any traces available to crib
R1 spent fore-fcuking-ver in its thought blocks to the point where I can't take seriously people who praised it because nobody actually wants to use a model that spends this much time producing le reasoning token
fucking meme model
it actually become usable when they started training on the traces they collected from Gemini 2.5.
>>107290591
>almost all of them never exposed it in the first place. OpenAI didn't even do summaries until after R1 released, google summaries only, anthropic was the only one who did so briefly, i think with sonnet 3.7. even then it was limited
this is utterly wrong
as someone who has used 2.5 since the early preview release I can tell you they didn't always summarize it, they hid it behind a summarizer after they noticed the unusual traffic from China.
you can still google it and see how people reacted when Gemini hid it:
https://www.reddit.com/r/Bard/comments/1kr5yo4/new_update_ruined_gemini_25_cot_is_now_hidden/
everyone back then perceived gemini differently precisely because it didn't hid the cot

Anonymous
11/22/25(Sat)04:44:25 No.107291128

Anonymous 11/22/25(Sat)04:44:25 No.107291128

>>107287917
The noise is there, you don't want this in your bedroom for sure. Although it is not even close to as bad as the noise 1U and 2U servers do.
>>107290005
The model load time is affected. After the model is in VRAM, it's smooth sailing from there. There's probably some downsides that i'm not aware of since i haven't had another machine like this.

Anonymous
11/22/25(Sat)05:48:20 No.107291488

Anonymous 11/22/25(Sat)05:48:20 No.107291488

>https://github.com/ggml-org/llama.cpp/pull/17428
>lmg "humor"

Anonymous
11/22/25(Sat)05:50:45 No.107291503

Anonymous 11/22/25(Sat)05:50:45 No.107291503

>>107291488
cringe

Anonymous
11/22/25(Sat)05:53:48 No.107291519

Anonymous 11/22/25(Sat)05:53:48 No.107291519

File: 1748015427079257.png (466 KB, 720x720)

466 KB PNG

>>107291488
>Here is the result, do the needful saars.
oh he fucking did it

Anonymous
11/22/25(Sat)06:07:12 No.107291589

Anonymous 11/22/25(Sat)06:07:12 No.107291589

>>107291488
uh oh, cola dev doesn't find it funny

Anonymous
11/22/25(Sat)06:08:18 No.107291594

Anonymous 11/22/25(Sat)06:08:18 No.107291594

>>107291589
well duh, this guy is a leftist who supports troons, what else do you expect from him?

Anonymous
11/22/25(Sat)06:16:20 No.107291629

Anonymous 11/22/25(Sat)06:16:20 No.107291629

>>107291488
Retard needs to learn to hide his power level when he's trying to get stuff done and especially when you depend on other people to do something.

Anonymous
11/22/25(Sat)06:18:37 No.107291640

Anonymous 11/22/25(Sat)06:18:37 No.107291640

>>107291488
based, and there's even an extra layer of kino when you see cudatroon seething about it

Anonymous
11/22/25(Sat)06:36:56 No.107291758

Anonymous 11/22/25(Sat)06:36:56 No.107291758

>>107291589
Of course he doesn't, not enough references to lolis being raped by mudslimes.

Anonymous
11/22/25(Sat)06:38:04 No.107291766

Anonymous 11/22/25(Sat)06:38:04 No.107291766

>>107291594
>>107291594
>based
>kino
>seething
>leftist
>troons
Almost every bot filter keyword in two posts.

Anonymous
11/22/25(Sat)06:39:45 No.107291774

Anonymous 11/22/25(Sat)06:39:45 No.107291774

>>107291766
your filter is shit since you're still seeing those words lol

Anonymous
11/22/25(Sat)06:47:57 No.107291809

Anonymous 11/22/25(Sat)06:47:57 No.107291809

>>107291488
>i can still feel very friendly even when mildly annoyed (useful trait when keeping women around)
kek

Anonymous
11/22/25(Sat)06:51:43 No.107291826

Anonymous 11/22/25(Sat)06:51:43 No.107291826

>>107291774
What do you mean?

Anonymous
11/22/25(Sat)06:52:45 No.107291832

Anonymous 11/22/25(Sat)06:52:45 No.107291832

File: 1754037937439537.jpg (76 KB, 1825x431)

76 KB JPG

>>107291826

Anonymous
11/22/25(Sat)06:54:12 No.107291848

Anonymous 11/22/25(Sat)06:54:12 No.107291848

>>107291832
NTA but my strategy is rather to trick /pol/tards into saying things that gets them banned.

Anonymous
11/22/25(Sat)06:55:05 No.107291851

Anonymous 11/22/25(Sat)06:55:05 No.107291851

>>107291832
Off topic.

Anonymous
11/22/25(Sat)06:55:36 No.107291856

Anonymous 11/22/25(Sat)06:55:36 No.107291856

>>107291766
>Almost every bot filter keyword in two posts.
>>107291851
>Off topic.

Anonymous
11/22/25(Sat)07:01:01 No.107291881

Anonymous 11/22/25(Sat)07:01:01 No.107291881

File: upside.png (312 KB, 895x762)

312 KB PNG

>>107289586
Mistral-Large made an attempt. Now this is a benchmark that separates the men from the boys.

Anonymous
11/22/25(Sat)07:01:15 No.107291883

Anonymous 11/22/25(Sat)07:01:15 No.107291883

>>107291856
?

Anonymous
11/22/25(Sat)07:02:06 No.107291887

Anonymous 11/22/25(Sat)07:02:06 No.107291887

>>107291848
What do you mean?

Anonymous
11/22/25(Sat)07:05:17 No.107291911

Anonymous 11/22/25(Sat)07:05:17 No.107291911

It's amazing how much 4chan discourse is just posting screenshots of twitter.

Anonymous
11/22/25(Sat)07:08:50 No.107291930

Anonymous 11/22/25(Sat)07:08:50 No.107291930

>>107291911
People like talking shit from inside their safe spaces, the exact same thing happens on Reddit or Discord.

Anonymous
11/22/25(Sat)07:23:37 No.107292008

Anonymous 11/22/25(Sat)07:23:37 No.107292008

How much will the inference speed increase if I buy another GPU identical to my current one, but I still have to offload some MoE layers to RAM?

Anonymous
11/22/25(Sat)07:25:12 No.107292020

Anonymous 11/22/25(Sat)07:25:12 No.107292020

>>107292008
I'd say 36% or so. It does not scale in linear fashion.

Anonymous
11/22/25(Sat)07:41:04 No.107292100

Anonymous 11/22/25(Sat)07:41:04 No.107292100

>>107292008
Depends. I got +1t/s on a 4.6 after upgrading from 1 to 4 3090s

Anonymous
11/22/25(Sat)07:58:07 No.107292207

Anonymous 11/22/25(Sat)07:58:07 No.107292207

>>107292008
You have to put FFN layers on them. If you only do exp=CPU it does nothing.

Anonymous
11/22/25(Sat)08:25:31 No.107292400

Anonymous 11/22/25(Sat)08:25:31 No.107292400

File: file.png (57 KB, 589x455)

57 KB PNG

Still waiting...

Anonymous
11/22/25(Sat)08:38:12 No.107292479

Anonymous 11/22/25(Sat)08:38:12 No.107292479

File: gonnakillyou.png (540 KB, 606x541)

540 KB PNG

>Try GLM 5Q
>Parrot
>Try GLM 8Q
>Loss of logic on fifth post at 1k context
I hate Indians.

Anonymous
11/22/25(Sat)08:41:17 No.107292502

Anonymous 11/22/25(Sat)08:41:17 No.107292502

>>107292400
They're in the queue after 4.6 Air and Large 3

Anonymous
11/22/25(Sat)09:03:27 No.107292643

Anonymous 11/22/25(Sat)09:03:27 No.107292643

you wouldnt eat a miku
or would you
https://youtube.com/shorts/EHbRv986tAk

Anonymous
11/22/25(Sat)09:09:15 No.107292692

Anonymous 11/22/25(Sat)09:09:15 No.107292692

File: 1748172141418264.png (234 KB, 736x718)

234 KB PNG

Kimi: DO NOTHING, WIN

Anonymous
11/22/25(Sat)09:19:40 No.107292758

Anonymous 11/22/25(Sat)09:19:40 No.107292758

>>107291488
Blessed digits.

Anonymous
11/22/25(Sat)09:20:41 No.107292763

Anonymous 11/22/25(Sat)09:20:41 No.107292763

Gemini 3's actual usable context is <1k
The first 1k tokens are really good though. A shame it falls off rapidly

Anonymous
11/22/25(Sat)09:39:54 No.107292898

Anonymous 11/22/25(Sat)09:39:54 No.107292898

>>107278838
Can't post pictures due to "abuse". Fuck this site.

Anonymous
11/22/25(Sat)09:39:59 No.107292899

Anonymous 11/22/25(Sat)09:39:59 No.107292899

>>107292886
>>107292886
>>107292886

[Return] [Catalog] [Top]

Post a Reply

Return Catalog Top Refresh

[Advertise on 4chan]

Delete Post: [File Only] Style:

[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.