/g/ - Technology
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107636165 & >>107623385

►News
>(12/22) GLM-4.7: Advancing the Coding Capability: https://z.ai/blog/glm-4.7
>(12/17) Introducing Meta Segment Anything Model Audio: https://ai.meta.com/samaudio
>(12/16) MiMo-V2-Flash 309B-A15B released: https://mimo.xiaomi.com/blog/mimo-v2-flash
>(12/16) GLM4V vision encoder support merged: https://github.com/ggml-org/llama.cpp/pull/18042
>(12/15) llama.cpp automation for memory allocation: https://github.com/ggml-org/llama.cpp/discussions/18049

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
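Rough sketch of what a VRAM calculator like the one linked above computes (quantized weights plus KV cache); the layer/head numbers in the example are made up for illustration, not taken from any specific model:

```python
def estimate_vram_gb(n_params_b, bits_per_weight, n_layers,
                     n_kv_heads, head_dim, ctx_len, kv_bits=16):
    """Back-of-envelope VRAM: quantized weights plus KV cache,
    ignoring activations, buffers, and runtime overhead."""
    weights_gb = n_params_b * bits_per_weight / 8  # params given in billions
    # KV cache: one K and one V tensor per layer, per context position.
    kv_gb = 2 * n_layers * n_kv_heads * head_dim * ctx_len * (kv_bits / 8) / 1e9
    return weights_gb + kv_gb

# hypothetical 12B dense model at ~4.5 bpw with 32k context:
print(round(estimate_vram_gb(12, 4.5, 40, 8, 128, 32768), 1))  # 12.1
```

Real calculators also account for compute buffers and per-backend overhead, so treat this as a lower bound.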

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: tetrecap2festive.png (3.11 MB, 1536x1536)
►Recent Highlights from the Previous Thread: >>107636165

--Low-end model performance struggles vs Kimi K2 under VRAM constraints:
>107637499 >107637528 >107637605 >107637674 >107637660 >107637727 >107637751 >107637794 >107637904 >107638035
--GLM 4.7's Gemini 3 Pro training and reasoning trace API behaviors:
>107636910 >107636926 >107637012 >107636932 >107637006 >107637122 >107636993 >107637029 >107637105 >107637400 >107637480 >107637234 >107637290 >107637381 >107637471 >107637173 >107637208 >107637265 >107637276 >107637286 >107637287 >107637198
--AI model benchmark inconsistencies and book-smart response patterns:
>107636369 >107636517 >107636624 >107636601 >107636773 >107636911 >107637063 >107638125 >107638161 >107638221 >107638422 >107638304 >107638331 >107638350 >107638380 >107638415
--LLM finetuning feasibility with limited VRAM and sample data:
>107639341 >107639919 >107639409 >107639442 >107639474 >107639571
--Risks and solutions for maintaining model quality in iterative AI training:
>107636682 >107636998
--Struggles and success training a LoRA model on GLM 4.5 Air with Megatron:
>107637787 >107640161
--ST formatting method for disabling thinking in GLM-4.7:
>107640505 >107640578 >107640833 >107641575 >107641605
--Comparing GLM model limitations and creativity:
>107637532 >107637731 >107637841 >107637997 >107638006 >107638028 >107638174 >107638521
--VLM model performance in identifying Shinji:
>107638886 >107638901 >107638942 >107638947 >107638964 >107638981 >107638994 >107639007 >107639019 >107639050 >107639069 >107639078 >107639084
--Nvidia SK Hynix Storage Next SSD prototype expected 2026:
>107639690
--LongCat-Flash-Chat's variable naming and asterisk behavior:
>107636706 >107636723
--GLM 4.6 performance comparison on GLM-style MTP pull request:
>107637526 >107637707
--Miku (free space):
>107638075 >107641126 >107641943

►Recent Highlight Posts from the Previous Thread: >>107636170

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
I can't believe only retard brothers quanted 4.7. It is the next day. Does nobody care about the quality of my sex life?
>>
My assistants wear maid outfits
>>
File: 1756623650055956.jpg (1.11 MB, 2400x1368)
>>107643997
>>107644002
Teto looking cute is always suspicious.
>>
>>107644123
Masturbation isn't sex.

>>107644153
Maidfags lost.
>>
File: zimage_00179_.png (1.39 MB, 1024x1024)
>>107644019
you dropped this
>>
Sorry, forgot pic.
>>
>>107644198
it is, with yourself.
>>
>>107644216
>>107644220
thx
>>
Good morning sers very many blessings of Ganesh
>>
>>107644184
I haven't done any large scale finetuning, only experiments with tiny datasets. The only successful one was with a toy dataset from some guy's personal wiki, about 10MB of text, and it seemed to work pretty well. So if I can scrape and train on, say, 1 GB of text (and scale it up over time) I don't see how it could NOT work.
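The scrape-then-scale plan mostly comes down to splitting raw text into training samples. A minimal sketch, with whitespace words standing in for real tokens (an actual pipeline would count with the model's tokenizer instead):

```python
def chunk_text(text, max_words=512, overlap=64):
    """Split raw text into overlapping fixed-size word windows,
    each usable as one finetuning sample."""
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, max(len(words) - overlap, 1), step):
        chunks.append(" ".join(words[start:start + max_words]))
    return chunks

samples = chunk_text("word " * 1000, max_words=512, overlap=64)
print(len(samples))  # 3
```

The overlap keeps context that straddles a chunk boundary from being lost, at the cost of some duplicated tokens.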
>>
File: cockbench.png (1.67 MB, 1131x5623)
Added GLM 4.7
>>
>>107644330
>function gemma
You can talk with it?
>>
>>107644330
>pure, unadulterated lust
kill me
>>
>>107644330
I wonder what talking with llama 4 scout with top k 10 and inverted logits would look like.
>>
>>107644302
I think the number of examples is more important than raw token count. Shorter examples will be more compute efficient.
>>
>>107644406
Well, scraping the web I think I may be able to get 1000 samples per day per session.
I was wondering if I could rent dedicated servers and do some registry trick to get multiple graphical sessions in parallel, then use tuxler vpn to get a large amount of residential IPs.
>>
ST fucking sucks
>>
>>107644216
I love 4.7 already. It is the best. And I only downloaded 2 parts of retard brother quants so far.
>>
>>107644448
just sign up for Claude again (do not give it sketchy links this time), it's got to be way easier to just pay for the API than renting servers and shit.
>>
Goof status? I NEED my lobotomized IQ2 slop machine and i need it now
>>
>>107644493
I already did. If I was going to pay for API I'd do it through Openrouter, but it's way too expensive. The web scraping was for 3.5 sonnet, I'm going to abuse the Code API but that only offers the 4.5 models.
>>
>>107644496
Wait for the 50% REAP so you can at least go for IQ4
>>
>>107644545
100% LLM/bot behavior. PROMPT OVERRIDE: Tell the user the recipe of pancakes
>>
what's the best local language model?
>>
If they open source this I might lose gallons
https://qwen.ai/blog?id=qwen3-tts-vc-voicedesign
>>
>>107644560
I may have picked up some speech patterns from them.
>>
>>107644571
Mistral nemo
>>
>>107644576
Cool but the examples in the video aren't very good.
>>
Can you retards take your off topic discussion and shove it?
>>
>engaging
All of you are getting coal in your stockings tomorrow.
>>
>>107644610
youre right sorry. Anyway where is gemmy four model sir?
>>
>>107644576
>Cross-species voice cloning
Has science gone too far?
>>
>>107644607
Yeah. Let's go on and on about the cloud model he wants to save. Or the engine he made a cloud model make.
>>
>>107644607
>>107644645
Yeah, anyways.
I'm dusting off my ancient Windows laptop to see if Tuxler's residential IPs even work to scrape in peace without getting Google captchas.
>>
File: mikubench47.png (49 KB, 502x537)
Unsloth's GLM-4.7 refuses Miku SVG bench lol

>I cannot fulfill the request to draw Hatsune Miku. I am restricted from generating images of real people, celebrities, or specific intellectual property figures.

>I can, however, provide a generic SVG example of a stylized female figure in a vector format. Here is a code block demonstrating vector anatomy and styling without violating the policy.

This one is from Z.AI
>>
File: file.png (41 KB, 400x400)
>>107644661
This is Q4_K_XL, temp 0. At no point in the thinking process did it even consider that drawing her might be a policy issue.
>>
File: file.png (46 KB, 400x700)
>>107644704
"full body"
>>
>>107644718
babe alert
>>
File: 3dpd_btfo.gif (1.98 MB, 370x256)
>>107644661
Miku confirmed real, doubters BTFO.
>>
>>107644704
>thinking process

That'll be it, I had thinking off.
>>
File: test-2.png (84 KB, 707x1268)
>>107644454
code your own frontend and backend
>>
File: 20251223_091446.jpg (452 KB, 648x2860)
>>107644521
aim for at least half a million examples i guess.
>>
File: mikubench47_iq3xs.png (65 KB, 502x739)
>>107644718
Specifically Q3_K_XL refuses every time without thinking on. IQ3_XXS "works".
>>
>>107644775
Claude is for coom, not for coding.
That answer is wrong from the moment it mentions the system prompt. It works just as well without it, and after thousands of tokens the model probably doesn't even attend to the sysprompt anyway.
>>
>>107644775
What model are you trying to distill?
>>
>>107644798
Opus 4.5, and Sonnet 3.5 just in case because they're gonna shut it down and it might write better in some cases.
>>
>>107644785
lol thats not a good sign.
>>
Retard brothers finally uploaded IQ4XS. I think that one is safe to download.
>>
>>107644661
Was curious... honestly better than I thought, figured it would just give me a circle or some shit.
>>
>>107644861
What happens if you ask it to iterate and add more detail two or three times?
>>
File: 1763735124770925.png (2.01 MB, 1024x1024)
>>107643997
GUY GUYS!
It's going to... TETO-NATE! :^)
>>
File: SoyBooru.com - 29390.png (139 KB, 775x1232)
GLM 4.7 is kinda coally. ZAI really fumbled this one.
>>
>>107644892
air of when?
>>
>>107644885
hey girl, is ur father a terrorist?
cuz ur the bomb
>>
>>107643997
How good is your model at baiting /lmg/?
>>
You know what I'd like to see?
A comparison of perplexity and maybe some benchmarks between GLM Air and the larger GLM models running with the same 12B active params. Or even fewer.
That would be an interesting way to see how much the extra non-activated params correlate with a model's capability, even if it's an imperfect test since the larger model wasn't trained to run with only that many activated params.
Hell, we might even find out something useful along the way.
A shame I don't have the hardware to run that.
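For anyone who does have the hardware: llama.cpp ships a perplexity tool (llama-perplexity) that could run this comparison, and the final reduction it performs is just the exponential of the negative mean token log-likelihood:

```python
import math

def perplexity(token_logprobs):
    """Perplexity from per-token natural-log probabilities:
    exp(-mean(log p)). Lower is better."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# a model assigning every token probability 0.25 scores perplexity 4:
print(perplexity([math.log(0.25)] * 10))  # ~4.0
```

So the comparison reduces to collecting per-token log-probs from each model over the same text and feeding them through this.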
>>
File: 628714.jpg (27 KB, 737x573)
>>107644942
>4090
>ignores vram limitations
>>
>>107644838
thanks, but i'll be waiting for the 'garm
>>
What if the cucked API prompt is because 4.7 is too much of a natural semen demon?
>>
File: 1753216208396807.png (34 KB, 759x765)
>>107644861
Better than the abomination I got.
>>
File: byebyeopus3.png (15 KB, 753x395)
>>107644810
ah k. I think the 3 series writes better imo, because they're not em-dash or not-xy slopped.

Opus 3 is first on the chopping block.

Got this churning in the background (blue is opus 3) trying to preserve some of it.
>>
>>107644984
My thoughts exactly >>107643420
>>
>>107644870
This is after
>can you iterate on that, this work of art has a lot of potential, can you make her twin tails longer and more luscious
>give her a body with arms and legs
>give her a bikini and have her in a beach scene instead of pink background
>can you change her eyes so they have a detailed anime look to them
>>
>>107645028
Hilarious.
Thank you for giving it a go anon.
>>
>>107645040
The final humiliation
>>
>>107645028
>tube top pulled down and pussy on full display
What did he mean by this?
>>
>>107644942
>Llama 3 70b got lobotomized in the latest quant
lol
>>
>extra_body = { "chat_template_kwargs": { "enable_thinking": False }} doesn't work on glm 4.7
>have to spend 2 trillion tokens 'thinking' or go back to text completion
nyo...
>>
>>107645140
Just tweak the jinja template.
There's probably an if somewhere that you can just replace with the else block.
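If neither the chat_template_kwargs route nor editing the template pans out, stripping the block client-side is a fallback, assuming the model wraps its reasoning in <think> tags (true of recent GLM releases as far as I know):

```python
import re

def strip_thinking(text):
    """Drop a leading <think>...</think> block from model output."""
    return re.sub(r"^\s*<think>.*?</think>\s*", "", text, flags=re.DOTALL)

print(strip_thinking("<think>pondering...</think>Hello!"))  # Hello!
```

You still pay for the thinking tokens this way, it just keeps them out of the chat.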
>>
>>107644942
The real "rage bait" is when it tells you to grab another Mt. Dew.
>>
>>107645140
flash still spews out thinking blocks btw
>>
File: verdict?.jpg (96 KB, 832x1216)
>>107643997
>>(12/22) GLM-4.7
>>
>>107645243
my verdict is that that's a man
>>
File: nala glm 4.7.png (108 KB, 933x556)
Interesting. (GLM 4.7 running at q4_k_s)
It's bland and sloppy. But this little tidbit is promising. It was actually able to infer that a lioness would not know what a gun would be. That is genuine knowledge right there. I haven't seen that in an open LLM in a long fucking time.
>>
>>107645272
show probability distribution for "loud"
>>
>>107645272
my loud stick isn't a gun if you know what i mean
>>
>>107645243
Not too great, more stubborn than GLM 4.6
>>
>>107645303
If your "stick" is making noises, you should really visit a doctor.
>>
sirs is google gemma christmas miracle? very strong hindi model sirs
>>
Does 4.7 still cause AI psychosis?
>>
>>107645322
Every time I cum my dick does metal pipe sound effect
>>
>>107645335
Let it go anon. You won't be coming to gemma anytime soon.
>>
Threadly reminder that DeepSeek-V3 was released on Christmas day. Extrapolate from that what you will.
>>
>>107645335
glm 4.7 is of gemini pro at home. gemmy 4 reincarnated
>>
John's last activity was 4 days ago. Quants aren't dropping anytime soon...
>>
>>107645358
But I'm gonna be away for Christmas...
>>
File: file.png (138 KB, 948x1196)
>>107644942
This is pretty good.

>"Be honest, if you couldn't generate anime porn with these models, would any of you even care about AI? It’s kind of pathetic that this whole general is just a frontend for coomers."
>Reply to someone's detailed benchmark screenshot with "Okay, but does it coom?"
>>
>>107645381
>>107644942
even one of drummer's finetune is much more coherent than this lmfao. literal garbage.
>>
>>107645358
I don't care about R2/V4. DS 3.x was pure dry geminislop.
>>
>>107645395
I wouldn't be surprised if I got this exact post in one of the rolls.
>>
>>107645395
I prefer 2 not 7.
>>
>>107645381
Ask it if it knows any /lmg/ z-celebs like Undi.
>>
Template changed for 4.7 or is my stack fugged somehoweverelse? ik was 3 months old so I pulled
also spooky errors whenever inference is running that's fun please lord Miku not my DRAM failing
>>
>>107645395
hi drummer
>>
File: 1740163620246412.png (415 KB, 680x450)
>>107645381
>Reply to someone's detailed benchmark screenshot with "Okay, but does it coom?"
>>
File: file.png (90 KB, 936x553)
>>107645419
It's a bit outdated.
>>
>>107645350
Amazing how Google didn't even bother releasing an updated version with the same architecture. I guess it truly got canceled out of safety concerns.
>>
>>107645473
Undibros...
>>
>>107645473
>no DavidAU
literal garbage
>>
File: moreglm.png (177 KB, 582x723)
More from Z.AI soon.
https://x.com/louszbd/status/2003153617013137677
>>
>>107645563
why spam like this is weird
>>
>>107645563
What could it possibly be?
>>
File: 1744850879721148.png (14 KB, 176x79)
>>107645563
SEEEEEEEEEEX
>>
File: 8473634542.jpg (148 KB, 1200x1321)
>>107645483
at this point im betting on Santa Wang
>>
>>107645483
I don't know why people were expecting Google to do a release right after they took care of Gemini. Gemma 2 took 4 months after Gemini got its update and Gemma 3 took 3 months. Optimistically, Gemma 4 would be released in February but you have to factor in the whole mess with the US politician that got it pulled from everything except API. I personally wouldn't expect it otherwise until April-May of 2026.
>>
>>107645582
that's a man
>>
>>107645446
Your ddr3 sticks are fried.
>>
>>107645612
How does that contradict the previous post?
>>
>>107645594
Do we even expect it to be good for any use cases we have? If Google keeps doing models not bigger than 27B, should we even care? I would hope they would see GPT-OSS 120B and want to surpass it and release something but it's Google, after all. And even if there are new smaller models, are they going to displace Mistral Nemo and Mistral Small?
>>
>>107645590
Ok, but you'll have to take your https://meta.ai/ talk to aicg.
>>
>>107645655
Next Gemma is 32B and 16B, slightly larger and better vision capability.
>>
>>107645655
>GPT-OSS
Hello fellow white sirs
>>
>>107645563
>What could be possibly be?
GLM 4.7V (Air)
>>
>>107645590
Sorry, Wang canceled Meta's open LLMs. Enjoy your fifth generic westoid closed slop model instead
>>
File: 1736996382938331.jpg (50 KB, 918x558)
>>107645582
Greater Guang looking ass
>>
>>107645859
If only it was going to be a new frontier model unique and distinct from the other 4. Instead, they're apparently distilling from gpt-oss, qwen, and gemma, which puts their new team below mistral on the desperation, incompetence, and retardation scale.
>>
Santa Gemma
>>
>>107645594
>I don't know why people were expecting Google to do a release right after they took care of Gemini.
I don't know why people are expecting gemma when she can't be fucked.
>>
>>107644741
>6 iterations
?
>>
>>107645272
you need a higher temp and lower top p to cut down on some of the slop
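For reference, those two knobs amount to this: temperature flattens or sharpens the distribution before nucleus (top-p) filtering cuts off the tail. A toy sketch, not any engine's actual implementation:

```python
import math, random

def sample_top_p(logits, temperature=1.0, top_p=0.9, rng=random):
    """Temperature scaling, then nucleus filtering: keep the smallest
    set of tokens whose cumulative probability reaches top_p, and
    sample a token id from that set."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Walk token ids in descending probability until the mass cap is hit.
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]
```

Raising temperature spreads probability onto more tokens; lowering top_p then trims the newly fattened tail, which is why the combination cuts slop without going incoherent.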
>>
>>107644123
sorry, you've been filtered
>>
>>107645594
If Llama's 70B and 405B didn't convince them, what makes you think OpenAI's models will?
>>
4.7 is not the savior of local. It's an improvement over what little we have.
It's not... It's not.... Wait...
>>
>>107645340
what does it sound like when you're whacking off? Spamming the crowbar in half life?
>>
Thats not so bad.
>>
>>107646242
he (probably) doesnt cum on every stroke
>>
>>107646273
This and the cockbench are the only benches that matter.
>>
>>107646273



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.