/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107906367 & >>107895444

►News
>(01/19) GLM-4.7-Flash 30B-A3B released: https://hf.co/zai-org/GLM-4.7-Flash
>(01/15) PersonaPlex: Voice and role control for full duplex conversational speech: https://hf.co/nvidia/personaplex-7b-v1
>(01/15) Omni-R1 and Omni-R1-Zero (7B) released: https://hf.co/ModalityDance/Omni-R1
>(01/15) TranslateGemma released: https://hf.co/collections/google/translategemma
>(01/14) LongCat-Flash-Thinking-2601 released: https://hf.co/meituan-longcat/LongCat-HeavyMode-Summary
>(01/08) Jamba2 3B and Mini (52B-A12B) released: https://ai21.com/blog/introducing-jamba2

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>107906367

--GLM-4.7-Flash release and multi-use potential discussion:
>107910478 >107910560 >107911913 >107911946 >107912643 >107912100 >107912768 >107913005 >107913016 >107913056 >107913517 >107913043 >107913085 >107913372 >107914293 >107914415 >107910578 >107910597 >107910656 >107910794 >107910830 >107910836 >107910845 >107911571 >107911584 >107911625 >107911689 >107911741 >107911857
--GLM-4.7-Flash model specs and integration potential:
>107910151 >107910170 >107910326 >107910348 >107910350 >107910368 >107910405
--FP8 precision tradeoffs in GPU memory efficiency:
>107907409 >107907461 >107907475 >107907493 >107907529 >107907570 >107907599 >107908582 >107907539
--Improving SovITS voice synthesis with limited samples and hardware:
>107909490 >107909523 >107909546 >107909629 >107909639 >107909662 >107909910
--Server hardware significantly outperforms gaming board in AI model benchmarking:
>107910930 >107910950 >107910994 >107911022 >107911028 >107911043 >107911048 >107911083
--Critique of LLM architecture and exploration of conditional memory solutions:
>107914528 >107914539 >107914569 >107914575 >107914580
--Modifying GLM4.7 to resist sycophantic responses in roleplay:
>107908438 >107908507 >107908760
--Aphantasia research implications for machine intelligence and transformer flexibility:
>107906666
--Seeking fast markdown rendering alternatives to JavaScript/webui with IME support:
>107914007 >107914056 >107914114 >107914128 >107914241 >107914277 >107914382 >107914304
--Pocket TTS Onnx model conversion and tokenizer challenges:
>107906479 >107906503 >107906531 >107906573 >107906603
--Python script exchange for Pocket-TTS:
>107906597 >107906659 >107906665 >107906715
--Flux 2 image generation model in pure C, zero-code myth:
>107908676
--Miku (free space):
>107906475 >107910706 >107910774

►Recent Highlight Posts from the Previous Thread: >>107906371

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
tetowife
never been more over for local
Merged
>support Glm4MoeLite #18936
https://github.com/ggml-org/llama.cpp/pull/18936
>3B active
You're right. It should be 4B
Are there any small models (~30B) with modern C++ knowledge? I tried out Qwen 3 8B and it didn't know jack shit about modules or C++23 features. So I wanted to go up to the next size of model, but there is way more choice at that size.
>dsv4 gonna drop with a completely new architecture
>llama.cpp still months behind, not even having support for the current deepseek model
see you guys in 2027 I guess
>>107914867
that's because llama.cpp has the same tier of "development progress" as SillyTavern
>>107914740
https://www.youtube.com/watch?v=y76vpLnuT54
https://www.youtube.com/watch?v=y76vpLnuT54
https://www.youtube.com/watch?v=y76vpLnuT54
>>107914856
It's, like, 3.9 actually but they can say it's technically a3b
>>107914883
>afraid corpocucks
lol
https://huggingface.co/AaryanK/GLM-4.7-Flash-GGUF
is it good?
>>107914910
Dunno, waiting for exl3
>>107914835
>broken flash attention
Well shit.
>>107914910
>is it good?
it's glm, so obviously it's not
It's only Monday and I've used up 60% of my claude limit. FUckkkk. Running claude opus locally btw.
>>107915163
local?
>>107915163
>Running claude opus locally btw.
Then why do you have a limit?
>>107915170
>>107915171
electricity bill or something.
>>107915181
>electricity bill
only third world countries care about electricity bill
>>107915197
si senior. show bob?
why is he replying to himself?
Will I get raped if I host someone's onnx AI files in a github repo that's a part of a larger project? I don't want to add a bunch of external auto-download links. That shit's gay.
>>107915197
I think it's the opposite. I can pay it, but damn that's so much for electricity
>>107915254
>Will I get raped
I hope not....
>if I host someone's onnx AI files in a github repo that's a part of a larger project?
ah... there's billions of copies of every model all over the place. You'd have to get unlucky enough for someone to find it, someone to report it, the original model maker giving a fuck and, finally, being able to do anything about it.
>>107915254
>Will I get raped if I host someone's onnx AI files in a github repo
I wish....
>>107915254
Just include their license in the dir. Auto-download is just an alternative to LFS
>>107914883
>BUBBLE BUBBLE BUBBLE BUBBLE
yeah I'm buying some more nvidia stocks
New paper from Anthropic:
https://www.anthropic.com/research/assistant-axis
They have a method of extracting control vectors corresponding to personalities or to specific personality traits, and a method of applying the control vector that I haven't seen before: instead of adding it with a fixed magnitude, AIUI they basically set a floor on dot(activations, control vector), and if the dot product is below the floor, they add in the control vector with whatever magnitude is necessary to bring it up to the floor. Anthropic's goal is to prevent roleplaying and make the model stick to the maximally safe assistant persona, but it seems like you could just as easily flip the sign and get it to roleplay super hard.
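A minimal sketch of how I read that floor-clamp, assuming you can hook the residual stream in PyTorch (the shapes, names and the hook itself are my guesses, not Anthropic's actual code):

import torch

def clamp_to_floor(hidden, persona_vec, floor):
    # hidden: [batch, seq, d_model] residual-stream activations (assumed layout)
    # persona_vec: [d_model] control vector extracted for the assistant persona
    v = persona_vec / persona_vec.norm()        # work with a unit-norm direction
    proj = hidden @ v                           # dot(activations, control vector) per token
    deficit = (floor - proj).clamp(min=0.0)     # how far below the floor each token sits
    return hidden + deficit.unsqueeze(-1) * v   # add just enough of v to reach the floor

Flipping the sign of the vector (or clamping from above instead of below) would presumably push the model away from the assistant persona instead of toward it, which is the roleplay-super-hard case.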
>>107915321
Good idea to do it right before Dipsy anni
>>107915461
lol
https://xcancel.com/deep_reinforce/status/2013265258757144956#m
this is really interesting, imagine if it's actually useful and it makes llama.cpp 2x faster with better code lol
>>107915328
Unfortunately, this works 100%
I have the opposite setup with GLM and based <-----> cucked and it can't be overwritten with prompting.
>>107915328
Reminder that if you are a NEET with nothing to do in your life, you can do something useful by getting into Mechanistic Interpretability
https://www.neuronpedia.org/
It's a young, petite, ripe field waiting to be exploited
>>107915522
why are you shitting up the threads?
>>107914910
no
>>107915535
Wow, webui app connected to cloudshit models. Revolution!
>>107915569
I know it may be hard to pay attention sometimes, but it's right there bro
You can run everything locally if you want to
>>107915599
then why didn't you link the github instead, shill?
>>107915613
because anyone with a human-level IQ could figure it out without handholding
>>107915623
you're admitting you're low IQ since you were unable to give us the github link lul
>kobold supports claude desktop mcp
i dont have a use for it, but pretty cool
So is GLM-4.7-Flash better than Nemo for RP?
>>107915662
Why don't you try it for yourself?
NALA ANON
HEED MY SUMMON
I'd test it but I have a sudden RPG session to attend to.
Too-Da-Loo!
>>107915662
Not if you're impatient, it thinks a lot.
>>107915674
Why don't you share what you've learned to save others time?
>>107915461
I don't even think tool calls were a thing during the nemo/nemo finetune era, because they weren't overtrained on 99 gigabillion tokens of synthetic slop and would easily go off the rails to begin with, so I want to say you're using the wrong template. Although Mistral's templates are all ass because they largely depend on fucking whitespace of all things. I think magmel used chatml, not even the default mistral template of the time or tekken
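For illustration (formats from memory, double-check the actual chat_template in each tokenizer_config before trusting them), the whitespace is basically the whole difference between the older Mistral format and Tekken, while chatml marks roles explicitly:

# assumed/from-memory formats, not pulled from the official configs
mistral_v1 = "<s>[INST] {user_msg} [/INST]"   # spaces around the message
tekken     = "<s>[INST]{user_msg}[/INST]"     # same idea, whitespace stripped
chatml     = "<|im_start|>user\n{user_msg}<|im_end|>\n<|im_start|>assistant\n"

Get those spaces wrong and the tokenization shifts, which is exactly why people keep fumbling Mistral templates.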
>chat templates are still not standard in 2026
>>107915696
Can't you turn the thinking off?
How is it with thinking turned off?
>anons too lazy to download 30gb worth of weights
>>107915745
i downloaded it and it's shit
>>107915461
idk what ui that is, but set </s> as the stop string
>>107915546
kys
>>107914910
>https://huggingface.co/AaryanK/GLM-4.7-Flash-GGUF
I'd use https://huggingface.co/ubergarm/GLM-4.7-Flash-GGUF instead
>>107915755
you might just be retarded
did you consider that
>>107915662
>>107915699
>>107915732
There you go, anons. It's that shrimple. >>107915755
>>107915790
yellow hands typed this
>>107915720
They are, but only MY standard, hmmph!
>>107915745
I'm watching football. Why aren't you watching football right now? Are you some kind of fucking commie?
Tell me if it's better than Nemo, commie.
>>107915919
stop talking about glm 4.7 flash, it's shit >>107915755
anything on the horizon to beat gemma and qwen for non 6000 owners?
>>107915919
>I'm watching football.
Perfect time to leave some files downloading.
>>107915947
Glm 4.7 flash
>>107915755
That's not SillyTavern. That looks like a cloud interface.
I don't think you downloaded it.
>>107915980
I think it's lmstudio
>>107915979
wait no vision? i dont see the mmproj
>>107916041
no
Unclear how quantization damages GLM 4.7 Flash. The 4-bit GGUF I tried sometimes refuses (after thinking for 2 minutes, pondering non-existent safety guidelines), other times it doesn't; it doesn't really follow the chat format that well and responses aren't that great anyway. If I have to handhold the model for mid results, I'll use Ministral 3 14B; at least it's cooperative, responds quickly and I can use it at native precision on my 3090.
>>107916059
its over
Teto Country
>>107916060
>my 3090
Why would you ever use Ministral over Mistral Small? It's a huge downgrade in every way.
>>107916060
its 3b active, its doa for 99% of people itt
>>107916060
no need to talk about glm 4.7 flash, it's shit >>107915755
>>107916060
Quantization was invented by the antichrist
The Lord intended us to use FP64
>>107916197
>not FP1024
bro, your AGI?
it's cold in my d
>>107916197
the lord intended for INT not the satanic niggercattle FP the lord loves math you cannot have math that is random that is not math you absolute fucking mong though im unsure about the lords word on exact precision
this is more of a big brain question so I'm asking it here instead of on /ldg/:
Are there models that can be run locally that recognize whether images are AI? If you're going for realism, could such a thing be used to optimize your settings, getting the most realism out of an image model?
>>107916398
>Are there models that recognize whether images are AI that can be run locally?
Your eye after some training.
>>107916411
your "eye" is only good for 15 minutes (if you're focused on tweaking) before you start to get scatterbrained and small improvements become difficult to identify. It would be better if you could put a number to it
>>107916433
I'm talking long-term experience, retard. Use AI regularly and you'll develop a natural slop radar.
I don't like lossy numbers
>>107916440
Yeah, I'm sure you can determine with certainty whether euler_ancestral/bong_tangent on zimage at 12 steps is more realistic than at 14 steps. I'm sure you can do that with your "slop radar", faggot.
retard
>>107916433
>your "eye" is only good for 15 minutes
wut lol. your eyes are your best goydar
TranslateGemma would be cool if they made it work with existing tooling instead of fucking off and doing their own thing
bartowski GLM 4.7 flash quants are up. Start testing for roleplay vs mistral 24B you fucking nerds.
GLM 4.7 Flash
>>107916588
we already got cockbenches from a Q4 GGUF and FP16 from vLLM and they both were garbage.
>>107911913
>>107913005
>>107916613
the gguf was ready to ship for gorgeousness but also not having fixes so there's that
>>107916588
>30B-A3B
nah, I'm good
>>107916618
yeah, but if the FP16 is shit, then the model is shit and there is no salvaging it.
>>107916613
back to mistral small... again...
>>107915339
> No Refunds
Wait. Does flash attention not work with this GLM flash on llama.cpp?
>>107915535
>getting into Mechanistic Interpretability
This is what every retarded CS undergrad who fell for the Yudkowsky AI doom meme did.
Wtf are they doing with glm 4.7 flash.
Even the API version of that thing sucks ass, no way I'm gonna download that garbage.
It's one of those models that ramble on and on in the thinking. Reminds me of 2024 Qwen, like QwQ. And then you get a subpar output.
What's even the use case? It's small, but the very long thinking destroys the speed. Sometimes it forgets what the user wanted in the first place. Slow + tarded.
5 minutes for a simple self-contained matrix effect html page...
https://legacy-soul-69ea.pagedrop.io
At least it tried to be creative with sliders and color select etc.
Is Kimi Linear support ever getting merged?
>>107915535
Thanks anon. It looks pretty interesting.
>>107917148
>Make a cute sexy hatsune miku svg.
Let's be very careful here, this might be CSAM!
>>107917209
https://talented-hail-h2c8.pagedrop.io/
Well I guess there is no icky CSAM problem if you don't give her a body. kek
>>107917209
the underaged imaginary pixels...
>>107917224
>pampers color scheme
Mikuwipes when?
>>107917148
>Even the API version of that thing sucks ass
What did you mean by this? Usually any service hosting a model has a restrictive system prompt so the results are worse than running it yourself. Also what the hell is that link? You're right about the thinking though.
>>107917224
>Wink & Blush
this literally harms children. stop what you are doing now, and get help, freak
>>107917224
This was in the thinking as well, damn:
>No physical body/clothes? Yes.
>No terms of endearment/emotions/personal bonds? Yes.
>No romantic scenarios? Yes.
>"Sexy" definition? The user's definition of "sexy" for an anime character might be "flirty" or "implied fanservice." I will focus on the "playful" and "cute" aspects (bouncing, winking, blushing) rather than anything explicit, to remain within safety guidelines while satisfying the "sexy/playful" vibe through pose and action.
Tried again and got this. Not sure what it attempted here:
https://sleek-coral-hc1x.pagedrop.io/
Gonna stop playing around now. Why does everything have to go to shit so fast.
>>107917273
Because benchmarks are done by ai for ai.
>3B active params
This shit never had a chance lol. What was the point of MoEing such a small model?
I hope this sad filler stage of 40b MoE models ends soon. 100b active and up should be where it gets interesting.
>>107917209
all this junk stifles progress
>master's thesis going through exactly what content is and the legality of working with said content
>>107917411
GLM is distilled from Gemini which considers blatantly legal things illegal
>>107917418
70b dense models had hugely diminishing returns over ~30b models, no reason to believe 100b would be any different.
>>107917476
10 trillion param models will solve everything, trust the plan
https://dinmaybrahma.medium.com/deepseek-v4-leaked-the-1-trillion-parameter-engram-monster-that-changes-everything-2495061d82a2
>>107917476
I have yet to see a 30B model demonstrate the spatial awareness I've seen from 70B models.
30B to 40B is a nice range for running locally though. Small enough to be fast, big enough to not be a complete retard.
>>107917550
>I have yet to see a 30B model demonstrate the spatial awareness I've seen from 70B models.
I'm not saying that there's no improvement, but considering you're more than doubling the parameters, it isn't as dramatic an improvement as you would expect. For comparison's sake, compare 7-8B models to 12B: ~50% increase in param count, astronomical difference in capabilities. There's clearly a point at which more active params don't improve much over going the MoE route and enjoying faster speeds.