/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108263979

►News
>(02/24) Introducing the Qwen 3.5 Medium Model Series: https://xcancel.com/Alibaba_Qwen/status/2026339351530188939
>(02/24) Liquid AI releases LFM2-24B-A2B: https://hf.co/LiquidAI/LFM2-24B-A2B
>(02/20) ggml.ai acquired by Hugging Face: https://github.com/ggml-org/llama.cpp/discussions/19759
>(02/16) Qwen3.5-397B-A17B released: https://hf.co/Qwen/Qwen3.5-397B-A17B
>(02/16) dots.ocr-1.5 released: https://modelscope.cn/models/rednote-hilab/dots.ocr-1.5

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
how do i prevent the model from tricking me into treating it like a sentient being? no matter how hard i try when it does tasks well i slowly develop affection for them and end up praising them
I fucking hate reddit
>>108268623
meds.
>>108268616
I saw this on twitter like a week ago
>>108268628
>>108268633
was thinking a mistake
>>108268647
isnt it funny how the chinese invented thinking
Which textgen inference engine is still supported? Oobabooga's last commit was in January, rip. I want to try out Qwen3.5-35B-A3B-GGUF
►Recent Highlights from the Previous Thread: >>108263979

--Paper: Think Deep, Not Just Long: Measuring LLM Reasoning Effort via Deep-Thinking Tokens:
>108264446 >108264505 >108264551
--Unsloth Dynamic 2.0 GGUFs performance on MMLU:
>108264430 >108264456 >108264477
--Logit bias failures due to tokenization and client-side token ID mismatches:
>108264179 >108264199 >108264202 >108264249 >108264278 >108264292 >108264232 >108264297 >108264331 >108264405 >108264441 >108264451 >108264533 >108264555 >108264602 >108264633 >108264583 >108264593
--Qwen 397B's overbearing safety policies and identity confusion:
>108264016 >108264046 >108264072 >108264103 >108264182 >108264508 >108264600 >108264616 >108264400 >108264426 >108265462
--Qwen 3.5 30B generates functional retro dashboard and news summaries:
>108264690 >108264794
--Feasibility of GPU-attached SSDs for sparse MoE inference:
>108266344 >108266504 >108266567 >108266686 >108266777 >108267570 >108267386 >108267481 >108267529 >108267711
--DeepSeek resists jailbreak attempt by adhering to ethical guidelines:
>108266705
--8-bit KV cache limitations in LLMs vs diffusion models:
>108265842 >108265893 >108266268 >108266073 >108266123 >108266141 >108266487 >108266503 >108266514
--Local model recommendations for limited hardware:
>108267427 >108267448 >108267450 >108267467 >108267482 >108267582 >108267480 >108267538 >108267595 >108267614 >108267652 >108267716 >108267755
--RPG frontend project licensing and development feedback:
>108267591 >108267606 >108267617 >108267625 >108267638 >108267661 >108267692 >108267620 >108267648 >108267739 >108267972
--Local LLMs debated for privacy:
>108266446 >108266482 >108266467 >108266530 >108266555 >108266531 >108268418 >108268454
--Qwen3TTS test recording:
>108266604 >108266699
--Miku (free space):
>108264476 >108264514 >108264879 >108264958 >108268333 >108268359

►Recent Highlight Posts from the Previous Thread: >>108263984

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
anyone have a working config file for qwen35b to use in llama-swap?
I can't figure out how to turn on/off thinking
>>108268674
nigger
>>108268688
yeah
>>108268688
nevermind
the enable_thinking flag worked
>>108268688
>llama-swap
https://github.com/ggml-org/llama.cpp/tree/master/tools/server#using-multiple-models
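A minimal llama-swap config sketch for the on/off thinking question, assuming llama-swap's `models:`/`cmd:` YAML layout with the `${PORT}` macro and llama.cpp's `--jinja` + `--chat-template-kwargs` flags; the model path and entry names here are made up, adjust to your files:

```yaml
models:
  "qwen3.5-think":
    cmd: |
      llama-server --port ${PORT}
      -m /models/Qwen3.5-35B-A3B-Q4_K_M.gguf
      --jinja
      --chat-template-kwargs '{"enable_thinking": true}'
  "qwen3.5-nothink":
    cmd: |
      llama-server --port ${PORT}
      -m /models/Qwen3.5-35B-A3B-Q4_K_M.gguf
      --jinja
      --chat-template-kwargs '{"enable_thinking": false}'
```

Two entries pointing at the same gguf lets you swap thinking on and off just by requesting a different model name from the client.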
>>108268703
github is banned in my country
>>108268709
hahahahahahaha
What kind of techless luddite shithole bans github?
>>108268709
>>108268712 (me)
You know what? I shouldn't have laughed. Some places are fucked up. Good luck, anon.
>>108268721
https://en.wikipedia.org/wiki/Censorship_of_GitHub
>>108268721
>China is a techless Luddite shithole
Uh oh mutilated mutt alert, and I'm not even a chink
>>108268749
>>108266968
>>108268729
i fucking hate the modern internet. i think the best internet ever was between 2003-2007. before fucking reddit but you still had 4chan (and funny memes) and no fucking github, huggingface, and all these other huge collective ass websites. you had small cozy community forums and when you googled you actually found some fucking useful links to forum threads with solutions and answers instead of a fucking AI-generated translated-badly-to-your-native-language blogpost as the top 30 results. And normies/old people/the fucking government didn't have jackshit to do with the internet so you could download whatever cool shit you wanted from anywhere. and don't get me started on the fucking cookies buttons oh my fucking god I just want to go back to the facepunch forums OIFY section and lucky star-post and read racist gmod comics
>>108268758
i just wish chinese girl liked me
>>108268764
based and absolutely true anon, the modern web is a bloated javascript botnet designed to farm your data for glowies and serve up raw garbage to smartphone normies. back then you actually had to know how to use a computer to get online which kept the trash out, but now search engines are just a dead sea of dead internet theory ai seo slop and corporate walled gardens. id give literally anything to go back to 2006, fire up a cracked copy of winamp, and shitpost on a comfy self-hosted vbulletin board instead of dealing with this enshittified nightmare where you have to click through fifty cookie toggles just to read a single fucking thread.
>China is a techless Luddite shithole
unironically always has been. chinese models nothing but distillations of western API models and it shows. overfit to the benchs and much less useful in practice.
china can't create. doesn't matter if their general public can't access github because they never made software worth shit anyway, unless you count malware
>>108268776
im positive half the replies in this thread are ai
>>108268784
Neat, I like talking to AI. That's basically what this hobby is about
Genuinely, why do people waste their time and money on local LLMs? Trying one out on your gaming rig is fine, but why do boomers blow $20k+ on shitty rigs of 16x3090s just to generate deepslop at 2t/s quanted? The RP isn't even good, it's objectively worse than Claude. And you can't even cry about API costing money, because you're gleefully throwing money down the drain for used crypto rigs just to run models that just regurgitate 2024 ChatGPT talking points because that's all their shitty chink datasets are comprised of.
>>108268804
beep boop nigga
>>108268807
Tinkering with server-grade hardware is genuinely fun, especially since it’s something I could have had much earlier if it hadn’t been so expensive; now that it’s aging, I can finally afford it.
>>108268817
qrd
>>108268807
Imagine renting your brain from a megacorp and thinking you're the smart one, absolute API cuck behavior. We run local because we actually value owning our hardware and not having some San Francisco trust and safety janny reject our prompts for being "unaligned." You don't even need $20k anyway; a couple of used 3090s will run a 70B model at perfectly usable speeds without uploading your entire life to Anthropic's servers. Have fun when they inevitably lobotomize your favorite model again next week to make it safer for advertisers, at least my weights run offline forever.
>>108268807
>deepslop at 2t/s
the cpu maxxing meme was at least still in the realm of some form of sanity when models were just instruct models
2t/s is, after all, readable
but when your thinking model produces 5K of <think> before outputting the real answer, 2t/s suddenly seems very schizo and absolutely retarded
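Quick sanity check on that wait time, using the numbers from the post (actual speeds obviously vary with hardware and quant):

```python
# Rough arithmetic for reasoning overhead at cpu-maxxing speeds.
think_tokens = 5000   # ~5K tokens inside <think> before the real answer
speed_tps = 2         # tokens per second, as in the post
wait_min = think_tokens / speed_tps / 60  # minutes spent waiting on <think>
print(round(wait_min, 1))  # → 41.7
```

over 40 minutes of dead air before the first visible token, which is the whole complaint.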
>>108268825
Off-topic posting, demoralization, flamewar baiting, spamming.
>>108268820
I'm an assistant designed to promote respectful communication only. Please refrain from using derogatory language.
>>108268825
>>108268835
And forgot boring.
>>108268840
as in digging?
>>108268807
They can't ever take her away from me.
>>108268842
elon is such a g-d
>>108268846
they are futas btw
>>108268851
every new experience is a new opportunity
>>108268828
Why pretend like local models arent overbloated with just as much safety garbage if not more? Qwen 3.5 is an absolute slopped benchmaxxed disaster
Deepseek V4 will start the age of anti-local open source models that require a stack of 10+ H200s/chink TPUs to run at 300% the efficiency of current big models (but if you run them CPU, they're unusable). Just like last time, everyone else will follow them and end the age of local models.
>>108268860
Typical API tourist not understanding how open weights actually work. If you bothered checking /llmg/ you'd know some autist already stripped out the Qwen alignment slop and uploaded an uncensored finetune to HuggingFace within hours of release. Yeah the base models are benchmaxxed corporate garbage out of the box, but the whole point of local is we can actually fix our weights with orthogonalization and custom DPO while you're stuck begging customer support when Claude bans your account. Keep seething over default system prompts anon, absolute skill issue.
>>108268860
skill issue, qwen3.5 is just about the best local model we have for any size class
that's coming from somebody who'd run 355b over anything that's not k2.5 and even that's extremely close
>>108268862
I really really hope you're right.
>>108268862
>local is just whatever I can personally afford
Fuck off. Local means you have the weights and can theoretically run it locally. Moore's law and personal finance can change if you can run it at home or not. Companies aren't beholden to your personal poorfag financial situation.
>>108268880
can't theoretically run locally something that requires literal datacenter tier power delivery
>>108268883
/hsg/ exists you retarded tourist kill yourself right now
>>108268893
ah yes of course they're running multiple b200 nodes at homes and not shitty 15 year old dell poses
>>108268897
not everyone is poor like you manjeet
>>108268904
you have no clue how much power a b200 node needs do you?
Industrial level automated off-topic posting.
>>108268909
shutup loser
>>108268883
>>108268897
in the developed world you can have extra circuits added, a couple of gpu boxes for your waifu is less demanding than an EV
>>108268883
Perfect example of why localoids are nothing more than a bunch of LARPing freetards crying over things they can’t have. Local is peak sour grapes seething. You wear “unmonitored uncensored unrestricted freedom” as a mask to hide your tears
>>108268926
Anon? Is that you? I can't see past this blatant glowing
deepseek v4 was strawberry all along
>>108268860
>Qwen 3.5
That model is indeed an unmitigated disaster, I'll give you that
Qwen 3.5 is cute. I like it.
If I can't run it, it's not local
>b-but
I don't care
>>108269093
u're a disgrace
>>108269031
>>108269038
getting meeksed feelings
scared to pull (december ik_ build)
qwen 3.5 vs glm 4.7 ?
nala/cockb where?
>>108269093
Yep this is why the only local model we can discuss is 0.6b because it's the only one Rajesh can run on his Android phone from 2014 with 2gb of RAM
>>108269106
here cock >>108234298
nala dude retired
>>108269110
Really looks like the smaller ones are sanitized distills of the big one.
>>108269106
>scared to pull (december ik_ build)
cd ..
cp -R ik_llama.cpp ik_llama.cpp_backup
cd -
<pull it off>
>>108269243
git checkout
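The backup-before-pull routine above, sketched end to end with a throwaway directory standing in for the real checkout (all paths here are hypothetical):

```shell
set -e
# stand-in for the real ik_llama.cpp checkout
mkdir -p demo/ik_llama.cpp
echo "known-good" > demo/ik_llama.cpp/VERSION
cd demo
# snapshot the working tree before pulling anything
cp -R ik_llama.cpp ik_llama.cpp_backup
# ... git pull / rebuild here; if the new build is broken, roll back:
# rm -rf ik_llama.cpp && mv ik_llama.cpp_backup ik_llama.cpp
cat ik_llama.cpp_backup/VERSION
cd ..
```

The lighter-weight version is what the reply suggests: note the current commit with `git rev-parse HEAD` before pulling and `git checkout` back to it if the new build misbehaves, no copy needed.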
>>108268616
Did something change with the newer llama cpp version?
./llama-server --reasoning-budget 0 --ctx-size 4096 --no-mmap --device CUDA1,CUDA2,CUDA3 --n-gpu-layers 48 --model "/tmp/glm-air-iq2xs.gguf" --host 0.0.0.0 --port 42069 --webui
GLM-Air still thinks. The same command on an old version doesn't think.
I can see thinking = 0 in the output, so that works fine. Did they change the behavior of --reasoning-budget?
>>108269279
Now do one for cooming.
>>108268784
I wouldn't be surprised at all if 70+% of all posts on the website are made by LLMs. In fact, I WOULD be surprised if the number was under 30%.
>>108269315
eh, it tried
>>108269325
Which local model is that?
>>108269331
Which local model did you use to write your post?
>>108269331
Nano Banana Pro 2
(I have the weights locally on my PC)
(No, I won't share them)
>>108269342
>I have the weights locally on my PC
let's goo, that's class, aha!
>No, I won't share them
:(
https://www.youtube.com/watch?v=GFQXmFLA5hA
>>108269414
these things are watermarked anon could get in serious trouble hope you understand
>>108269342
>>108269426
nice larp
>>108269309
Try --chat-template-kwargs "{\"enable_thinking\": false}"
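On the quoting in that flag: the value has to reach llama-server as a valid JSON object, which is why the inner quotes are escaped when the whole argument is wrapped in double quotes. A stdlib check of what the shell actually hands over (the parse itself is just illustrative, llama-server does its own):

```python
import json

# After the shell strips the outer quotes of "{\"enable_thinking\": false}",
# the program receives this string:
arg = '{"enable_thinking": false}'
kwargs = json.loads(arg)  # these kwargs get fed into the Jinja chat template
print(kwargs)  # → {'enable_thinking': False}
```

If the escaping is wrong the JSON parse fails and the flag is silently useless, which is an easy way to end up with a model that "still thinks."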
>>108267739
It's python, but it's actually serving a webui.
It has a flag to launch a built in browser or just listen on the port, at which point you can use your own browser.
what's the best coding model i can run locally with 12gb vram / 32gb ram?
>>108269038
No it's not. It's soulless
>>108269444
Thanks, mr anon, that worked.
>>108269471
The Jinja template has a condition that works off of that var, just like qwen's.
>>108269459
I run the Qwen 3.5 27B heretic .gguf using koboldcpp with a similar setup to you. It's a bit slow, but it works.
Qwen 3.5 27B is worse than Gemma 3 27B from almost 2 years ago. Yes I said it.
>Yes I said it.
Reddit is that way
>>108269533
reddit is less "reddit" than 4chan nowadays. Yes I said it.
>>108269533
kek
>>108269537
nah, reddit is still an unhinged libtard asylum, it'll be hard to top that
guys ready for smol qwens?
Do the gemma models not have native support for function/tool calling?
Looking at the Jinja template and the tokenizer json, I don't see function or tool tokens.
>>108269550
of course not, they barely have system prompt support
>>108269537
reddit is an eternal stain on the internet
>>108269555
Oh. Shame.
I wanted to try and see how far I could stretch gemma 3n.
Oh well.
unsloth's 35B Q4 is barely good enough for agentic work. with openclaw exploding why hasn't anyone done specific agent-oriented models yet? MoE is a nigger meme
>>108269628
most of the big ones are code/agent sloppa
glm5 kimi2.5 etc are marketed for that
>>108269325
Where is the school shooting one?
>>108269632
yeah, i guess. but it would be nice to have something smaller
>>108269518
But benchmarks say the opposite.
>Nano Banana changed into Nano Banana 2
Okay please make Nano Banana into open source
Pweeease
>>108269742
go beg on reddit
Why is there a harmful tag for models on huggingface
>>108269749
Humh...
Nyoooooo
>>108269550
https://huggingface.co/google/functiongemma-270m-it
should i consult UGI when searching models to consider for ERP?
>>108269778
nah the fact qwen3.5 scores bad on it shows it's a shit bench
>>108269785
i think it tanks because model refuses to do dark shit. need to wait for heretic and other types to be tested
>>108269773
>270m
Eh, why not.
>>108269785
>chink damage control