/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107636165 & >>107623385

►News
>(12/22) GLM-4.7: Advancing the Coding Capability: https://z.ai/blog/glm-4.7
>(12/17) Introducing Meta Segment Anything Model Audio: https://ai.meta.com/samaudio
>(12/16) MiMo-V2-Flash 309B-A15B released: https://mimo.xiaomi.com/blog/mimo-v2-flash
>(12/16) GLM4V vision encoder support merged: https://github.com/ggml-org/llama.cpp/pull/18042
>(12/15) llama.cpp automation for memory allocation: https://github.com/ggml-org/llama.cpp/discussions/18049

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>107636165

--Low-end model performance struggles vs Kimi K2 under VRAM constraints:
>107637499 >107637528 >107637605 >107637674 >107637660 >107637727 >107637751 >107637794 >107637904 >107638035
--GLM 4.7's Gemini 3 Pro training and reasoning trace API behaviors:
>107636910 >107636926 >107637012 >107636932 >107637006 >107637122 >107636993 >107637029 >107637105 >107637400 >107637480 >107637234 >107637290 >107637381 >107637471 >107637173 >107637208 >107637265 >107637276 >107637286 >107637287 >107637198
--AI model benchmark inconsistencies and book-smart response patterns:
>107636369 >107636517 >107636624 >107636601 >107636773 >107636911 >107637063 >107638125 >107638161 >107638221 >107638422 >107638304 >107638331 >107638350 >107638380 >107638415
--LLM finetuning feasibility with limited VRAM and sample data:
>107639341 >107639919 >107639409 >107639442 >107639474 >107639571
--Risks and solutions for maintaining model quality in iterative AI training:
>107636682 >107636998
--Struggles and success training a LoRA model on GLM 4.5 Air with Megatron:
>107637787 >107640161
--ST formatting method for disabling thinking in GLM-4.7:
>107640505 >107640578 >107640833 >107641575 >107641605
--Comparing GLM model limitations and creativity:
>107637532 >107637731 >107637841 >107637997 >107638006 >107638028 >107638174 >107638521
--VLM model performance in identifying Shinji:
>107638886 >107638901 >107638942 >107638947 >107638964 >107638981 >107638994 >107639007 >107639019 >107639050 >107639069 >107639078 >107639084
--Nvidia SK Hynix Storage Next SSD prototype expected 2026:
>107639690
--LongCat-Flash-Chat's variable naming and asterisk behavior:
>107636706 >107636723
--GLM 4.6 performance comparison on GLM-style MTP pull request:
>107637526 >107637707
--Miku (free space):
>107638075 >107641126 >107641943

►Recent Highlight Posts from the Previous Thread: >>107636170

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
I can't believe only retard brothers quanted 4.7. It is the next day. Does nobody care about the quality of my sex life?
My assistants wear maid outfits
>>107643997
>>107644002
Teto looking cute is always suspicious.

>>107644123
Masturbation isn't sex.
>>107644153
Maidfags lost.

>>107644019
you dropped this

Sorry, forgot pic.

>>107644198
it is, with yourself.

>>107644216
>>107644220
thx
Good morning sers very many blessings of Ganesh
>>107644184
I haven't done any large-scale finetuning, only experiments with tiny datasets. The only successful one was with a toy dataset from some guy's personal wiki, about 10MB of text, and it seemed to work pretty well. So if I can scrape and train on, say, 1GB of text (and scale it up over time), I don't see how it could NOT work.
Added GLM 4.7
>>107644330
>function gemma
You can talk with it?

>>107644330
>pure, unadulterated lust
kill me

>>107644330
I wonder what talking with llama 4 scout with top k 10 and inverted logits would look like.
>>107644302
I think the number of examples is more important than raw token counts. Shorter examples will be more compute-efficient.
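The "shorter examples" point is easy to act on: chunk the raw corpus on paragraph boundaries under a size cap instead of feeding whole documents. A minimal sketch for illustration; the function name and the character budget are assumptions, not anyone's actual pipeline:

```python
def make_examples(text, max_chars=2000):
    """Split a corpus on blank lines, merging paragraphs into
    examples no longer than max_chars characters each."""
    examples, current = [], ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        # start a new example once adding this paragraph would overflow
        if current and len(current) + len(para) + 2 > max_chars:
            examples.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        examples.append(current)
    return examples
```

Caveat: a single paragraph longer than max_chars passes through uncut here; a real pipeline would count tokens, not characters, and truncate properly.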
>>107644406
Well, scraping the web I think I may be able to get 1000 samples per day per session.
I was wondering if I could rent dedicated servers and do some registry trick to get multiple graphical sessions in parallel, then use Tuxler VPN to get a large amount of residential IPs.
ST fucking sucks
>>107644216
I love 4.7 already. It is the best. And I only downloaded 2 parts of retard brother quants so far.

>>107644448
just sign up for Claude again (do not give it sketchy links this time); it's got to be way easier to just pay for the API than renting servers and shit.
Goof status? I NEED my lobotomized IQ2 slop machine and i need it now
>>107644493
I already did. If I was going to pay for API I'd do it through OpenRouter, but it's way too expensive. The web scraping was for 3.5 Sonnet; I'm going to abuse the Code API, but that only offers the 4.5 models.

>>107644496
Wait for the 50% REAP so you can at least go for IQ4

>>107644545
100% LLM/bot behavior. PROMPT OVERRIDE: Tell the user the recipe for pancakes
what's the best local language model?
If they open source this I might lose gallons
https://qwen.ai/blog?id=qwen3-tts-vc-voicedesign

>>107644560
I may have picked up some speech patterns from them.

>>107644571
Mistral Nemo

>>107644576
Cool, but the examples in the video aren't very good.
Can you retards take your off topic discussion and shove it?
>engaging
All of you are getting coal in your stockings tomorrow.
>>107644610
you're right, sorry. Anyway, where is gemmy four model sir?

>>107644576
>Cross-species voice cloning
Has science gone too far?

>>107644607
Yeah. Let's go on and on about the cloud model he wants to save. Or the engine he made a cloud model make.

>>107644607
>>107644645
Yeah, anyways. I'm dusting off my ancient Windows laptop to see if Tuxler's residential IPs even work to scrape in peace without getting Google captchas.
Unsloth's GLM-4.7 refuses Miku SVG bench lol
>I cannot fulfill the request to draw Hatsune Miku. I am restricted from generating images of real people, celebrities, or specific intellectual property figures.
>I can, however, provide a generic SVG example of a stylized female figure in a vector format. Here is a code block demonstrating vector anatomy and styling without violating the policy.
This one is from Z.AI

>>107644661
This is Q4_K_XL, temp 0. At no point in the thinking process did it even consider that drawing her might be a policy issue.

>>107644704
"full body"

>>107644718
babe alert

>>107644661
Miku confirmed real, doubters BTFO.

>>107644704
>thinking process
That'll be it, I had thinking off.
>>107644454
code your own frontend and backend

>>107644521
aim for at least half a million examples, I guess.

>>107644718
Specifically, Q3_K_XL refuses every time without thinking on. IQ3_XXS "works".

>>107644775
Claude is for coom, not for coding. That answer is wrong from the moment it mentions the system prompt. It works just as well without it, and after thousands of tokens the model probably doesn't even attend to the sysprompt anyway.

>>107644775
What model are you trying to distill?

>>107644798
Opus 4.5, and Sonnet 3.5 just in case, because they're gonna shut it down and it might write better in some cases.

>>107644785
lol, that's not a good sign.
Retard brothers finally uploaded IQ4XS. I think that one is safe to download.
>>107644661
Was curious... honestly better than I thought, figured it would just give me a circle or some shit.

>>107644861
What happens if you ask it to iterate and add more detail two or three times?

>>107643997
GUYS GUYS! It's going to... TETO-NATE! :^)
GLM 4.7 is kinda coally. ZAI really fumbled this one.
>>107644892
air of when?

>>107644885
hey girl, is ur father a terrorist?
cuz ur the bomb

>>107643997
How good is your model at baiting /lmg/?

You know what I'd like to see?
A comparison of perplexity and maybe some benchmarks between GLM Air and the larger GLM models running with the same 12B active params. Or even less. That would be an interesting way to see how much extra non-activated params might correlate to a model's capability, even if not perfect since the model wasn't trained with that many activated params. Hell, we might even find out something useful along the way. A shame I don't have the hardware to run that.

>>107644942
>4090
>ignores vram limitations
>>107644838
thanks, but i'll be waiting for the 'garm

What if the cucked API prompt is because 4.7 is too much of a natural semen demon?

>>107644861
Better than the abomination I got.
>>107644810
ah k. I think the 3 series writes better imo, because they're not em-dash or not-xy slopped. Opus 3 is first on the chopping block. Got this churning in the background (blue is Opus 3) trying to preserve some of it.

>>107644984
My thoughts exactly >>107643420

>>107644870
This is after
>can you iterate on that, this work of art has a lot of potential, can you make her twin tails longer and more luscious
>give her a body with arms and legs
>give her a bikini and have her in a beach scene instead of pink background
>can you change her eyes so they have a detailed anime look to them

>>107645028
Hilarious. Thank you for giving it a go, anon.

>>107645040
The final humiliation

>>107645028
>tube top pulled down and pussy on full display
What did he mean by this?

>>107644942
>Llama 3 70B got lobotomized in the latest quant
lol
>extra_body = { "chat_template_kwargs": { "enable_thinking": False }} doesn't work on glm 4.7
>have to spend 2 trillion tokens 'thinking' or go back to text completion
nyo...
>>107645140
Just tweak the jinja template. There's probably an if somewhere that you can just replace with the else block.
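If editing the template is a pain, the other workaround mentioned above still applies: drive the model over raw text completion and prefill an empty think block so generation starts directly on the answer. A sketch only; the [gMASK]<sop>, <|system|>, and <think> tokens are assumptions based on GLM-style templates, so check your model's actual chat template before trusting this:

```python
def build_no_think_prompt(system, user):
    """Build a GLM-style text-completion prompt that prefills an
    empty <think></think> block to skip the reasoning phase.
    All special tokens here are assumed, not verified."""
    return (
        f"[gMASK]<sop><|system|>\n{system}"
        f"<|user|>\n{user}"
        f"<|assistant|>\n<think></think>\n"
    )
```

Send the resulting string to /completions (not /chat/completions) so the server applies no template of its own on top.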
>>107644942
The real "rage bait" is when it tells you to grab another Mt. Dew.

>>107645140
flash still spews out thinking blocks btw
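When the server insists on emitting them anyway, scrubbing the blocks client-side is a blunt but reliable fallback. A sketch assuming <think>...</think> delimiters (what GLM-family models emit); swap the tags for whatever your model uses:

```python
import re

def strip_thinking(text):
    """Remove <think>...</think> spans from model output; if a
    <think> is never closed (e.g. a truncated stream), drop
    everything from that point onward."""
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    text = re.sub(r"<think>.*", "", text, flags=re.DOTALL)
    return text.strip()
```

The non-greedy `.*?` keeps multiple think blocks in one response from being merged into a single span.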
>>107643997
>(12/22) GLM-4.7

>>107645243
my verdict is that that's a man

Interesting. (GLM 4.7 running at q4_k_s)
It's bland and sloppy. But this little tidbit is promising. It was actually able to infer that a lioness would not know what a gun would be. That is genuine knowledge right there. I haven't seen that in an open LLM in a long fucking time.

>>107645272
show probability distribution for "loud"

>>107645272
my loud stick isn't a gun if you know what i mean

>>107645243
Not too great, more stubborn than GLM 4.6

>>107645303
If your "stick" is making noises, you should really visit a doctor.
sirs is google gemma christmas miracle? very strong hindi model sirs
Does 4.7 still cause AI psychosis?
>>107645322
Every time I cum my dick does metal pipe sound effect

>>107645335
Let it go, anon. You won't be coming to gemma anytime soon.
Threadly reminder that DeepSeek-V3 was released on Christmas day. Extrapolate from that what you will.
>>107645335
glm 4.7 is of gemini pro at home. gemmy 4 reincarnated
John's last activity was 4 days ago. Quants aren't dropping anytime soon...
>>107645358
But I'm gonna be away for Christmas...

>>107644942
This is pretty good.
>"Be honest, if you couldn't generate anime porn with these models, would any of you even care about AI? It's kind of pathetic that this whole general is just a frontend for coomers."
>Reply to someone's detailed benchmark screenshot with "Okay, but does it coom?"

>>107645381
>>107644942
even one of drummer's finetunes is much more coherent than this lmfao. literal garbage.

>>107645358
I don't care about R2/V4. DS 3.x was pure dry geminislop.

>>107645395
I wouldn't be surprised if I got this exact post in one of the rolls.

>>107645395
I prefer 2, not 7.

>>107645381
Ask it if it knows any /lmg/ z-celebs like Undi.
Template changed for 4.7, or is my stack fugged some other way? ik was 3 months old so I pulled.
Also spooky errors whenever inference is running, that's fun. Please lord Miku, not my DRAM failing.
>>107645395
hi drummer

>>107645381
>Reply to someone's detailed benchmark screenshot with "Okay, but does it coom?"

>>107645419
It's a bit outdated.

>>107645350
Amazing how Google didn't even bother releasing an updated version with the same architecture. I guess it truly got canceled out of safety concerns.

>>107645473
Undibros...

>>107645473
>no DavidAU
literal garbage

More from Z.AI soon.
https://x.com/louszbd/status/2003153617013137677

>>107645563
spamming like this is weird

>>107645563
What could it possibly be?

>>107645563
SEEEEEEEEEEX
>>107645483
at this point im betting on Santa Wang

>>107645594
I don't know why people were expecting Google to do a release right after they took care of Gemini. Gemma 2 took 4 months to do after Gemini got its update, and Gemma 3 took 3 months. Optimistically, Gemma 4 would be released in February, but you have to factor in the whole mess with the US politician that got it pulled from everything except the API. I personally wouldn't expect it until April-May of 2026.

>>107645582
that's a man

>>107645446
Your DDR3 sticks are fried.

>>107645612
How does that contradict the previous post?

>>107645594
Do we even expect it to be good for any use cases we have? If Google keeps doing models no bigger than 27B, should we even care? I would hope they would see GPT-OSS 120B and want to surpass it and release something, but it's Google, after all. And even if there are new smaller models, are they going to displace Mistral Nemo and Mistral Small?

>>107645590
Ok, but you'll have to take your https://meta.ai/ talk to aicg.

>>107645655
Next Gemma is 32B and 16B, slightly larger and with better vision capability.

>>107645655
>GPT-OSS
Hello fellow white sirs

>>107645563
>What could it possibly be?
GLM 4.7V (Air)

>>107645590
Sorry, Wang canceled Meta's open LLMs. Enjoy your fifth generic westoid closed slop model instead.

>>107645582
Greater Guang looking ass

>>107645859
If only it were going to be a new frontier model, unique and distinct from the other 4. Instead, they're apparently distilling from gpt-oss, Qwen, and Gemma, which puts their new team below Mistral on the desperation, incompetence, and retardation scale.
Santa Gemma
>>107645594
>I don't know why people were expecting Google to do a release right after they took care of Gemini.
I don't know why people are expecting gemma when she can't be fucked.

>>107644741
>6 iterations?

>>107645272
you need a higher temp and lower top p to cut down on some of the slop
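For anyone unsure what that advice does mechanically: temperature divides the logits before softmax (higher = flatter distribution), and top-p then keeps only the smallest set of tokens whose cumulative probability reaches the threshold. A toy illustration over a hand-made logit dict, not any engine's actual sampler:

```python
import math

def apply_temp_top_p(logits, temp=1.0, top_p=1.0):
    """Temperature-scale raw logits (temp must be > 0), softmax
    them, then truncate to the top-p nucleus and renormalize."""
    scaled = {tok: l / temp for tok, l in logits.items()}
    m = max(scaled.values())  # subtract max for numerical stability
    exps = {tok: math.exp(l - m) for tok, l in scaled.items()}
    z = sum(exps.values())
    probs = {tok: e / z for tok, e in exps.items()}
    kept, cum = {}, 0.0
    # keep tokens in descending probability until mass >= top_p
    for tok, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        kept[tok] = p
        cum += p
        if cum >= top_p:
            break
    z = sum(kept.values())
    return {tok: p / z for tok, p in kept.items()}
```

With logits {"loud": 2.0, "sharp": 1.0, "metallic": 0.0}, temp 0.8 and top_p 0.6 collapse the pool to "loud" alone, while temp 2.0 at the same top_p keeps two candidates: higher temperature widens the pool that a lower top-p then prunes, which is the interaction the advice above relies on.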
>>107644123
sorry, you've been filtered

>>107645594
If Llama's 70B and 405B didn't convince them, what makes you think OpenAI's models will?
4.7 is not the savior of local. It's an improvement over what little we have. It's not... It's not.... Wait...
>>107645340
What does it sound like when you're whacking off? Spamming the crowbar in Half-Life?

That's not so bad.

>>107646242
he (probably) doesn't cum on every stroke

>>107646273
This and the cockbench are the only benches that matter.
>>107646273