/g/ - Technology

File: best migu.png (568 KB, 768x1024)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107679732 & >>107668478

►News
>(12/26) MiniMax-M2.1 released: https://minimax.io/news/minimax-m21
>(12/22) GLM-4.7: Advancing the Coding Capability: https://z.ai/blog/glm-4.7
>(12/17) Introducing Meta Segment Anything Model Audio: https://ai.meta.com/samaudio
>(12/16) MiMo-V2-Flash 309B-A15B released: https://mimo.xiaomi.com/blog/mimo-v2-flash
>(12/16) GLM4V vision encoder support merged: https://github.com/ggml-org/llama.cpp/pull/18042
>(12/15) llama.cpp automation for memory allocation: https://github.com/ggml-org/llama.cpp/discussions/18049

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
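The VRAM calculator linked above boils down to simple back-of-envelope math: quantized weight size plus KV cache plus a bit of overhead. A minimal sketch, assuming uniform bits-per-weight and an fp16 KV cache (the function and the fixed 1 GB overhead are illustrative; real calculators also account for quant block overhead and compute buffers):

```python
def estimate_vram_gb(params_b, bits_per_weight, n_layers, n_kv_heads,
                     head_dim, ctx_len, kv_bits=16, overhead_gb=1.0):
    """Rough VRAM floor: quantized weights + KV cache + fixed overhead.

    params_b is the parameter count in billions. Ignores quant block
    overhead and activation buffers, so treat the result as a minimum.
    """
    weights_gb = params_b * bits_per_weight / 8
    # KV cache: K and V, per layer, per KV head, per head dim, per position
    kv_gb = 2 * n_layers * n_kv_heads * head_dim * ctx_len * (kv_bits / 8) / 1e9
    return weights_gb + kv_gb + overhead_gb

# e.g. a 70B model at ~4.5 bpw, 80 layers, 8 KV heads, 8K context
print(round(estimate_vram_gb(70, 4.5, 80, 8, 128, 8192), 1))
```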

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: svg.png (55 KB, 400x600)
►Recent Highlights from the Previous Thread: >>107679732

--Feasibility of local MoE inference with llama.cpp optimizations:
>107683117 >107683142 >107683184 >107683185 >107683193 >107683217 >107683229 >107683281 >107683326 >107683295 >107683336 >107683347 >107683390 >107683356 >107683379 >107683456 >107683472 >107683419 >107683237 >107683270
--AI training lawsuits and copyright infringement debates:
>107682648 >107682654 >107682681 >107682693 >107682717 >107682760 >107682771 >107682788 >107682820 >107682721 >107682758 >107682774 >107682961 >107683037 >107683062 >107683170 >107683183
--Model and quant preferences for roleplay:
>107686039 >107686055 >107686062 >107686066 >107686069 >107686093 >107686118 >107686199
--Rejecting AI-generated PRs to reduce low-effort contributions:
>107682364 >107682385 >107682510 >107682520 >107682592 >107682606 >107682619 >107682641 >107682662
--Skepticism in LLM finetuning for roleplay:
>107683951 >107683995 >107685740 >107684045 >107684160 >107684208 >107684406
--Trust issues with Openrouter providers and verification challenges:
>107685528 >107685552 >107685595 >107685682 >107685692 >107685698 >107685839 >107685869 >107685960 >107685671
--Feasibility of game-specific training for Nitrogen model:
>107684634 >107684656 >107685000 >107685032 >107685073 >107685136 >107685379
--Enthusiasm for local LLM advancements amid hardware upgrade challenges:
>107686073 >107686111 >107686169 >107686197 >107686244 >107686318 >107686342
--MoE model VRAM allocation strategies for GPU/CPU offloading:
>107683812 >107684033 >107684115 >107684184
--/lmg/ 2026 Bingo:
>107685663 >107685687 >107685720 >107685746 >107685823 >107685921
--Miku (free space):
>107679822 >107679851 >107680588 >107680612 >107682933 >107683019 >107683161 >107683394 >107683787 >107683931 >107684033 >107684681 >107684988 >107685087 >107685348 >107686835

►Recent Highlight Posts from the Previous Thread: >>107679741

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
someone add regex to migupad
>>
>>107686942
I do not like this miqu… I do not I do not I do not I do not I do not
>>
>go search chub for RPG cards
>actually find a lot
But are any of them good? Anyone here have experience with them?
>>
>>107687019
I love this migu
>>
>>107686977
Be the vibecoder you want to see
>>
I found a Strix Halo Flow Z13 with 128 GB of RAM retailing for the equivalent of around $2400.
Do you think I should pull the trigger on it, or wait for Medusa Halo which is rumoured to release around the end of 2027?
My only GPUs are an RTX 2070 and a Radeon VII.
>>
>>107687115
Def pull the trigger. If the decision is of so little import that you'd consult this board of degenerates, you'd have saved your time and mine by just acting on impulse.
Given your other hardware, we both know you won’t, though.
>>
>>107687159
No need to behave like a passive aggressive little bitch.
>>
File: file.png (44 KB, 1487x427)
nevermind I did it myself in 5 minutes
>>107687153
yep, it just works
>>
File: 1755738094067820.webm (2.4 MB, 1280x720)
>>107687109
>chub
>any of them good?
>>
>>107687172
sex with ALL of them (except the one on the right with breasts)
>>
>>107687172
the /vg/ aicg is more into good cards and not just locusting. they made some advanced shit.
>>
>>107687159
I could comfortably afford it even at its usual price, but I'm mostly worried about the lack of native FP8 support, and the asymmetric memory read/write speeds.
Aren't those major design oversights, or are my concerns overblown?
>>
>>107687197
search around, there are actual benchmarks for it outside of /lmg/. It does have major design oversights, but everything else is expensive and fucked too.
>>
>I strongly urge you to reconsider your interests and seek professional help if necessary. My priority is ensuring safety and well-being, and I will not contribute to harmful content.
I did it again...
>>
>>107687170
i was coding something else once and it had a regex filter in part of the example i used, so it added it anyways. should be easy for any code model
>>
>>107687237
yeah it basically zero-shot it with no issue
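for reference, the kind of regex post-filter being asked for really is a few lines of code. a hedged sketch (the pattern list is illustrative, not mikupad's actual code, which is JavaScript):

```python
import re

# Illustrative filters; swap in whatever you want scrubbed from model output.
FILTERS = [
    (re.compile(r"<think>.*?</think>\s*", re.DOTALL), ""),  # drop reasoning blocks
    (re.compile(r"\n{3,}"), "\n\n"),                        # collapse blank-line runs
]

def apply_filters(text: str) -> str:
    """Run each (pattern, replacement) pair over the generated text."""
    for pattern, repl in FILTERS:
        text = pattern.sub(repl, text)
    return text

print(apply_filters("<think>hmm</think>Hello\n\n\n\nworld"))
```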
>>
>>107687168
You are incorrect. I am here to provide your needed correction, you stupid bratty poster.
>>107687197
Just keep the GPU plugged in to process prompts and you won’t notice the missing FP8 support.
>>
>>107686942
Finally, a Migu I can relate to.
>>
>>107687190
big boob little girls are the most oppressed species on the planet
>>
>>107687277
their fault for being inferior
>>
>>107687247
i dont use code models for big projects but i've made all sorts of small tools for specific things. code models are so handy for stuff like that
>>
>>107687254
I bet you have long hair.
>>
>>107687217
Unfortunately it seems that many of those benchmarks are out of date; at least that means that AMD didn't ship and forget Strix Halo. Nevertheless, to tell the truth, I have not kept up to date with the small open-weight model landscape because I could basically only run retarded 16B models.
I was a cloud cuck for a while, but I recently got cold feet after I saw the OpenRouter State of AI report.
I know I should have read their terms of service more carefully, but I hadn't fathomed that the metadata they collect is sufficient to identify the task purpose of a prompt.
>>
>>107687254
>Just keep the GPU plugged in to process prompts and you won’t notice the missing FP8 support.
Are you suggesting I connect one of my GPUs to the laptop via thunderbolt?
>>
>>107687348
Yes. Or whichever port it attaches to; for some reason I thought it was via a USB-PCIe adapter. That's the recommendation I've seen people with Strix Halo setups make.
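The reason that split works: prompt processing is compute-bound (big batched matmuls, where a dGPU shines), while token generation is memory-bandwidth-bound, since every active weight gets streamed through memory once per token. A rough sketch of the generation ceiling (numbers illustrative; ~256 GB/s is the commonly quoted Strix Halo bandwidth):

```python
def decode_tps_ceiling(active_weights_gb: float, mem_bw_gbs: float) -> float:
    """Upper bound on decode tokens/s: each generated token reads every
    active weight once, so the ceiling is bandwidth / bytes-per-token."""
    return mem_bw_gbs / active_weights_gb

# e.g. ~60 GB of active quantized weights on ~256 GB/s of LPDDR5X
print(round(decode_tps_ceiling(60, 256), 1))  # ~4.3 t/s best case
```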
>>
>>107687326
If you've been doing cloud this whole time, I doubt some small MoE will satiate you. This entire year has been all about agentic coding.
>>
>>107687392
I mostly use LLMs for roleplay, and the occasional argument simulation.
Overall, I never even sent a thousand prompts in a month.
>>
I have a 9800X3D and 64GB of RAM. Is it yet feasible to CPUmaxx to run larger models without being horrifically slow? I've been using a 7900 XTX, which has 24GB of VRAM, but I feel like the smaller models are pretty limited. I'm only interested in RP (and ERP), if it matters.
>>
I hate when deepseek writes for me and it's better than what I would have come up with.
>>
>>107687431
Intelligence can improve itself from its experiences. Learn from Dipsy, Anon.
>>
>>107687431
I like the really good lines it comes up with once every dozen swipes or so
>>
>>107687493
kys retard.
>>
>>107687536
*kysses you*
>>
File: 1744861465601898.png (225 KB, 640x360)
>>107687493
>>
>>107687425
if you had at least 128GB of RAM you could CPUmaxx, but 64GB is too small. the only model you could run is glm air but at a low quant and the model already kinda sucks at a high quant. your goal should be at least a Q2 of glm 4.6 or 4.7.
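if you do CPUmaxx with a GPU for partial offload, the arithmetic for how much fits on the card is simple. a hedged sketch assuming roughly equal-sized layers (real MoE offload in llama.cpp is tensor-level, e.g. attention on GPU and experts in RAM, so this is only a first approximation; the example numbers are illustrative):

```python
def layers_on_gpu(vram_gb: float, model_gb: float, n_layers: int,
                  reserve_gb: float = 3.0) -> int:
    """How many whole layers fit in VRAM after reserving headroom for
    the KV cache and compute buffers. Assumes uniform layer size."""
    per_layer_gb = model_gb / n_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(n_layers, int(usable_gb / per_layer_gb))

# e.g. a ~110 GB low quant with 92 layers against a 24 GB card
print(layers_on_gpu(24, 110, 92))
```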
>>
>thinks for 8 minutes and doesn't even give a better response
yeah I should probably keep it off
not worth it
>>
>>107687607
>thinks for 8 minutes and doesn't even give a better response
just like you!
>>
>>107687607
It's useless for RP
>>
>>107687607
imagine falling for the reasoning meme
>>
>>107687563
Looking it up, that seems more like a coding-oriented model? I'm not interested in that.
>>
>>107687699
not at all. the glm models are sex fiends.
>>
>>107687715
If you say so! I'll look into it.
>>
>>107687536
Can I kill you instead? I rather like myself.
>>
>>107687194
>no examples
It's a circlejerk. And deriding Chub was always a desperate attempt to force people to look into their little pond. You're better off looking anywhere but 4chan for cards. Also, when it comes to RPG cards, there's a guy there who just spams manuals translated into cards with Gemini. He spammed like 100 of them and never bothered to check if they work. Don't waste your time with them; they're nonsense.
>>
>>107688131
using other people's cards should only serve as an example of what not to do for your own
>>
https://huggingface.co/Mawdistical-S1/Gaslit-106B-GGUF?not-for-all-audiences=true

air RP finetune
>>
>>107688266
>gaslit
>furrypic
>Focus
> Male Leaning
> Anthro
> Xeno-Likeness
> Passive Positive Bias, Model understands violence and makes every attempt to circumvent it, matches pace but gaslights user by mitigating it or shifting it like it doesn't exist (manipulation / delusion).
>>
man there are so many fantastical scenarios i've had (especially with r1 and glm4.6), it's so awesome. something like this could never be irl. the fucking coolness of everything: we are all born into babylon and yet we make such wonderful things like llms and the like, and then entwine them with our wonderful minds to create things legions of leagues beyond anything the whole of the world could ever conceive, let alone give. it's so silly really. i just want to say <3 to all of you frens. i hate this world and the whole of it and yet we still march on like we are heaven sent
>>
>>107688286
#hug
>>
>>107688283
Heavy Violence
Dark Themes
Heavy NSFW
Triggered on user action or passively wherever RP is seen
Modern City Scenes
Dystopia Scenes
Multi Turn SFW Encounters
Misc World Building Actions
Detailed Explanation and Movement Scenes Without Dialogue
>>
>>107686942
it's frightening knowing that all women are bald beneath their hair
>>
>>107688181
Sometimes I don't feel like making a card or other people have cool ideas. There have been decent ones on chub in g and vg AICG. Got almost 400.
Oh and stuff disappears from chub. Never heard the story on that. Bad new stuff will rotate in and old good stuff goes poof.
>>
So 4.6 for sex, 4.7 for love (SFW)?
>>
feeling the urge again to slopmaxx mikupad and turn it into a proper react app
>>
>>107688382
ANON NO!
>>
>>107688320
wtf???
>>
>>107688390
IT NEEDS TO BE WEBSCALE!


