/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107679732 & >>107668478

►News
>(12/26) MiniMax-M2.1 released: https://minimax.io/news/minimax-m21
>(12/22) GLM-4.7: Advancing the Coding Capability: https://z.ai/blog/glm-4.7
>(12/17) Introducing Meta Segment Anything Model Audio: https://ai.meta.com/samaudio
>(12/16) MiMo-V2-Flash 309B-A15B released: https://mimo.xiaomi.com/blog/mimo-v2-flash
>(12/16) GLM4V vision encoder support merged: https://github.com/ggml-org/llama.cpp/pull/18042
>(12/15) llama.cpp automation for memory allocation: https://github.com/ggml-org/llama.cpp/discussions/18049

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>107679732

--Feasibility of local MoE inference with llama.cpp optimizations:
>107683117 >107683142 >107683184 >107683185 >107683193 >107683217 >107683229 >107683281 >107683326 >107683295 >107683336 >107683347 >107683390 >107683356 >107683379 >107683456 >107683472 >107683419 >107683237 >107683270
--AI training lawsuits and copyright infringement debates:
>107682648 >107682654 >107682681 >107682693 >107682717 >107682760 >107682771 >107682788 >107682820 >107682721 >107682758 >107682774 >107682961 >107683037 >107683062 >107683170 >107683183
--Model and quant preferences for roleplay:
>107686039 >107686055 >107686062 >107686066 >107686069 >107686093 >107686118 >107686199
--Rejecting AI-generated PRs to reduce low-effort contributions:
>107682364 >107682385 >107682510 >107682520 >107682592 >107682606 >107682619 >107682641 >107682662
--Skepticism in LLM finetuning for roleplay:
>107683951 >107683995 >107685740 >107684045 >107684160 >107684208 >107684406
--Trust issues with Openrouter providers and verification challenges:
>107685528 >107685552 >107685595 >107685682 >107685692 >107685698 >107685839 >107685869 >107685960 >107685671
--Feasibility of game-specific training for Nitrogen model:
>107684634 >107684656 >107685000 >107685032 >107685073 >107685136 >107685379
--Enthusiasm for local LLM advancements amid hardware upgrade challenges:
>107686073 >107686111 >107686169 >107686197 >107686244 >107686318 >107686342
--MoE model VRAM allocation strategies for GPU/CPU offloading:
>107683812 >107684033 >107684115 >107684184
--/lmg/ 2026 Bingo:
>107685663 >107685687 >107685720 >107685746 >107685823 >107685921
--Miku (free space):
>107679822 >107679851 >107680588 >107680612 >107682933 >107683019 >107683161 >107683394 >107683787 >107683931 >107684033 >107684681 >107684988 >107685087 >107685348 >107686835

►Recent Highlight Posts from the Previous Thread: >>107679741

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
someone add regex to migupad
>>107686942
I do not like this miqu… I do not I do not I do not I do not I do not
>go search chub for RPG cards
>actually find a lot
But are any of them good? Anyone here have experience with them?
>>107687019
I love this migu
>>107686977
Be the vibecoder you want to see
I found a Strix Halo Flow Z13 with 128 GB of RAM retailing for the equivalent of around $2400.
Do you think I should pull the trigger on it, or wait for Medusa Halo, which is rumoured to release around the end of 2027?
My only GPUs are an RTX 2070 and a Radeon VII.
>>107687115
Def pull the trigger. If it's a decision of so little import that you'd consult this board of degenerates, then you'd have saved your time and mine by just acting on impulse.
Given your other hardware, we both know you won't, though.
>>107687159
No need to behave like a passive-aggressive little bitch.
nevermind I did it myself in 5 minutes
>>107687153
yep, it just works
>>107687109
>chub
>any of them good?
>>107687172
sex with ALL of them (except the one on the right with breasts)
>>107687172
the /vg/ aicg is more into good cards and not just locusting. they made some advanced shit.
>>107687159
I could comfortably afford it even at its usual price, but I'm mostly worried about the lack of native FP8 support and the asymmetric memory read/write speeds. Aren't those major design oversights, or are my concerns overblown?
>>107687197
search around, there are actual benchmarks for it outside of /lmg/. It does have major design oversights, but everything else is expensive and fucked too.
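For anyone weighing the Strix Halo's memory speed, the usual back-of-envelope check is that decode speed on a memory-bound box is roughly effective bandwidth divided by the bytes of active weights streamed per token. The figures below (256 GB/s theoretical for 256-bit LPDDR5X-8000, ~70% efficiency, a 12B-active MoE at ~4.5 bpw) are illustrative assumptions, not benchmarks:

```python
def tokens_per_sec(bandwidth_gbs, active_params_b, bits_per_weight, efficiency=0.7):
    """Rough decode-speed ceiling for a memory-bound system: every
    generated token streams the active weights from RAM once."""
    bytes_per_token = active_params_b * 1e9 * bits_per_weight / 8
    return bandwidth_gbs * 1e9 * efficiency / bytes_per_token

# Strix Halo: ~256 GB/s theoretical (assumed figure)
# Air-class MoE: ~12B active params at ~4.5 bpw (assumed figures)
print(round(tokens_per_sec(256, 12, 4.5), 1), "t/s")
```

It's a ceiling, not a promise; prompt processing is compute-bound and follows different math entirely.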
>I strongly urge you to reconsider your interests and seek professional help if necessary. My priority is ensuring safety and well-being, and I will not contribute to harmful content.
I did it again..
>>107687170
i was coding something else once and it had a regex filter in part of the example i used, so it added it anyway. should be easy for any code model
>>107687237
yeah, it basically zero-shot it with no issue
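The feature itself really is zero-shot territory: a find/replace pass with user-supplied patterns over the text before it's rendered or sent. A minimal language-agnostic sketch (Python here for illustration; mikupad itself is JavaScript, so this is the shape of the feature, not a patch):

```python
import re

def apply_regex_filters(text, filters):
    """Run a list of user-defined (pattern, replacement) rules over text,
    the way frontends post-process model output."""
    for pattern, replacement in filters:
        text = re.sub(pattern, replacement, text)
    return text

# example rules (hypothetical)
filters = [
    (r"\*[^*]*\*", ""),   # strip *roleplay actions*
    (r"\n{3,}", "\n\n"),  # collapse runs of blank lines
]
print(apply_regex_filters("Hi!* waves*\n\n\n\nBye", filters))
```

The only real design decision is whether filters apply on display only or mutate the stored context.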
>>107687168
You are incorrect. I am here to provide your needed correction, you stupid bratty poster.
>>107687197
Just keep the GPU plugged in to process prompts and you won't notice the missing FP8 support.
>>107686942
Finally, a Migu I can relate to.
>>107687190big boob little girls are the most oppressed species on the planet
>>107687277
their fault for being inferior
>>107687247
i dont use code models for big projects but i've made all sorts of small tools for specific things. code models are so handy for stuff like that
>>107687254
I bet you have long hair.
>>107687217
Unfortunately it seems that many of those benchmarks are out of date; at least that means AMD didn't ship and forget Strix Halo. To tell the truth, though, I haven't kept up with the small open-weight model landscape, because I could basically only run retarded 16B models.
I was a cloud cuck for a while, but I recently got cold feet after I saw the OpenRouter State of AI report.
I know I should have read their terms of service keenly, but I couldn't fathom that the metadata they collect is sufficient to identify the task purpose of a prompt.
>>107687254
>Just keep the GPU plugged in to process prompts and you won't notice the missing FP8 support.
Are you suggesting I connect one of my GPUs to the laptop via Thunderbolt?
>>107687348
Yes. Or whichever port it attaches to; for some reason I thought it was via a USB-PCIe adapter. That's the recommendation I've seen people with Strix Halo setups make.
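For reference, the setup being recommended doesn't need special flags: with a CUDA/ROCm build of llama.cpp, large prompt batches get shipped to the GPU for the matmuls even when zero layers are offloaded, so the dGPU accelerates prompt processing while generation runs off the unified memory. A sketch of the invocation (model filename, context size, and port are placeholders; whether batch offload kicks in depends on your build):

```shell
# Weights stay in system RAM (-ngl 0); a GPU-enabled build still
# offloads large prompt-processing batches to the attached GPU.
llama-server -m ./glm-4.6-q2_k.gguf -ngl 0 -c 16384 --port 8080
```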
>>107687326
If you've been doing cloud this whole time, I doubt some small MoE will satiate you. This entire year has been all about agentic coding.
>>107687392
I mostly use LLMs for roleplay and the occasional argument simulation. Overall, I never even sent a thousand prompts in a month.
I have a 9800X3D and 64GB of RAM. Is it yet feasible to CPUmaxx to run larger models without being horrifically slow? I've been using my 7900XTX, which has 24GB of VRAM, but I feel like the smaller models are pretty limited. I'm only interested in RP (and ERP), if it matters.
I hate when deepseek writes for me and it's better than what I would have come up with.
>>107687431
Intelligence can improve itself from its experiences. Learn from Dipsy, Anon.
>>107687431
I like the really good lines it comes up with once every dozen swipes or so
>>107687493
kys retard.
>>107687536
*kysses you*
>>107687493
>>107687425
if you had at least 128GB of RAM you could CPUmaxx, but 64GB is too small. the only model you could run is glm air, but at a low quant, and the model already kinda sucks at a high quant. your goal should be at least a Q2 of glm 4.6 or 4.7.
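The arithmetic behind that advice is just params × bits-per-weight ÷ 8, plus a little headroom for KV cache and buffers. The bpw averages below are ballpark figures for llama.cpp k-quants, not exact file sizes:

```python
def gguf_size_gb(params_b, bits_per_weight, overhead_gb=2.0):
    """Very rough total footprint: weight file plus a couple GB
    for KV cache and runtime buffers (all figures approximate)."""
    return params_b * bits_per_weight / 8 + overhead_gb

# bpw values are assumed averages, not official numbers
for name, params, bpw in [
    ("GLM Air 106B @ Q3_K_M", 106, 3.9),
    ("GLM Air 106B @ Q4_K_M", 106, 4.8),
    ("GLM 4.6 355B @ Q2_K",   355, 2.6),
]:
    print(f"{name}: ~{gguf_size_gb(params, bpw):.0f} GB")
```

On these numbers, Air at Q3 squeezes under 64GB while Air at Q4 and big GLM at Q2 don't, which is the shape of the "64GB is too small" argument.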
>thinks for 8 minutes and doesn't even give a better response
yeah I should probably keep it off
not worth it
>>107687607
>thinks for 8 minutes and doesn't even give a better response
just like you!
>>107687607
It's useless for RP
>>107687607
imagine falling for the reasoning meme
>>107687563
Looking it up, that seems more like a coding-oriented model? I'm not interested in that.
>>107687699
not at all. the glm models are sex fiends.
>>107687715
If you say so! I'll look into it.
>>107687536
Can I kill you instead? I rather like myself.
>>107687194
>no examples
It's a circlejerk. And deriding Chub was always a desperate attempt to force people to look into their little pond. You're better off looking anywhere but 4chan for cards. Also, when it comes to RPG cards, there's a guy there who just spams manuals translated into cards with Gemini. He spammed like 100 of them and never bothered to check if they work. Don't waste your time with them; they're nonsense.
>>107688131
using other people's cards should only serve as an example of what not to do for your own
https://huggingface.co/Mawdistical-S1/Gaslit-106B-GGUF?not-for-all-audiences=true
air RP finetune
>>107688266
>gaslit
>furry pic
>Focus
>Male Leaning
>Anthro
>Xeno-Likeness
>Passive Positive Bias, Model understands violence and makes every attempt to circumvent it, matches pace but gaslights user by mitigating it or shifting it like it doesn't exist (manipulation / delusion).
man there are so many fantastical scenarios ive had (especially with r1 and glm4.6) its so awesome something like this could never be irl the fucking coolness of everything we are all born into babylon and yet we make such wonderful things like llms and the like and then entwine it with our wonderful minds to create things legions of leagues beyond anything that the whole of the world could ever conceive let alone give its so silly really i just want to say <3 to all of you frens i hate this world and the whole of it and yet we still march on like we are heaven sent
>>107688286
#hug
>>107688283
Heavy Violence
Dark Themes
Heavy NSFW Triggered on user action or passively wherever RP is seen
Modern City Scenes
Dystopia Scenes
Multi Turn SFW Encounters
Misc World Building Actions
Detailed Explanation and Movement Scenes Without Dialogue
>>107686942
it's frightening knowing that all women are bald beneath their hair
>>107688181
Sometimes I don't feel like making a card, or other people have cool ideas. There have been decent ones on chub and in the /g/ and /vg/ AICG threads. Got almost 400.
Oh, and stuff disappears from chub. Never heard the story on that. Bad new stuff rotates in and old good stuff goes poof.
So 4.6 for sex, 4.7 for love (SFW)?
feeling the urge again to slopmaxx mikupad and turn it into a proper react app
>>107688382
ANON NO!
>>107688320
wtf???
>>107688390
IT NEEDS TO BE WEBSCALE!
>Magidonia 24B at IQ3_M
>QwQ Snowdrop 32B at IQ2_M
Yes I am a heckin poorfag. Which is better?
>>107688382
please for the love of god no...
>>107688512
>>107688266
>>107688512
24b must be mistral small. qwq is an old test at thinking or some shit, isn't it? either way it's 32b, so probably qwen 2/2.5 something, which aren't good for rp. the 24b will be better, but it's still 24b so it won't be great
>>107686942
fixed
>>107688512
also, since you're a ramlet and have to use such quants anyway, don't use iq quants. get the non-iq version; it'll be slightly faster
>>107681801
questions about this:
1. why does it have a certificate error when browsing to it?
2. why wouldn't this just completely kill openai and nvidia, who are building multiple stargate datacenters, through point 3, "open-ended conversations with a user"?
>>107688320
Really makes you think.
>>107688652
Terrifying to behold.
>>107688512
QwQ is proto-qwen 3 but with soul, so I'd go for that.
>>107688512
>Which is better?
Find drummer, make him wear a dress and sodomize him. And then you can do some holocaust denial together.
>>107688581
Never really understood the difference, but will do.
>>107688542
Yes, mistral small. I've never tried a qwen model, so I thought I'd give it a shot. Are there any worthwhile ones?
>>107688589
Unless the datacenters are in Tennessee I doubt there will be a problem; they'll just go elsewhere. Also, billion and trillion dollar corporations are usually immune to the law anyway.
Drummer 4.7 fine-tune when?
after comparing 4.7 and 4.6 logs for the same chats I now see why anons think it's censored
zai really fucked up here
so 4.6 and r1 remain the local coom kings?
>>107688181
I tend to just use people's cards on chub as a baseline for a fun idea (and the pictures, of course), and then heavily edit them because they are so badly done.
They constantly switch between 3rd person / 1st person in the intro message; they'll refer to the user as 'You' in the intro message for {{char}} (teaching the model to write/respond as {{user}}), and then at the same time tell the model in the card's description not to write as {{user}}. Never mind all the typos and ESL grammar. Absolute slop. Very few chub card makers get it right. And what blows my mind is that the sloppiest, most horrendous cards with awful intro messages somehow get the most views/dls/uses. It's mind-boggling.
Good for ideas, though.
>>107688771
are you maxed out on ram? i assume so given the models/quants you mentioned. mistral small tunes are probably your best bet for rp within that range. if i were stuck in that range i'd try some old qwen 32b tunes if there are any on hf; there must be a few. don't fall for the moe meme though - for rp you want a dense model, and bigger is better
>>107688806
Buy an ad.
>>107688839
Lol yeah, I tried moe a year ago and none were good. I have 16 gigs of ram and 8 gigs of vram. Context can be tricky with such a limited ram pool, but generally I find 16k or even 8k with an author's note good enough for RP. I'll check 24B at Q3_K_M in a moment, and if it works I'll try a 32B qwen model.
Thanks anons.
>>107688808
>because they are so badly done
cards are hilarious like that. you're right, it's a good idea to browse them and see this shit. it makes you a better creator to realize how bad some other examples are.
>They constantly switch between 3rd person / 1st in the intro message
this is funny because it's a tiny thing that, if you don't nail down in the first few messages, will ruin your whole rp. setting the tone and how it writes is important; you can't change it later without heavy editing
>>107688918
Do card makers not use example chats?????
>>107687431
When I used it on the api I liked that sometimes. I'd tell it what action I take and what I say, and let it write something great.
>>107688806
>r1
K1-Thinking has been the new king of the "let's spend 3000 tokens thinking in circles and then write a reply that autistically over-fixates on some random thing in the prompt and shoehorns every character trait in at once" sector for a while now. Original R1 is pointless these days.
>>107688936
Example chats will confuse the model even further.
>>107687172
card for this feel?
>>107688936
if your model can't infer the way your character talks from the copy-pasted wiki entry you gave it, it's not a good model
>>107688386
>local dream
>snapdragon only
into the garbage it goes
What are the chances Small Creative releases and it's the best RP model surpassing everything else? I want to huff my hopium.
>>107689009
3
models/cards for this feel
>>107689011
3 out of 3, right?
>>107689016
nope. just 3.
>>107689014
Gujarat / Maharadja's Castle
>>107689009
Bro, you can try mistral small creative yourself, it's on OR. It's not very different from normal mistral small, so I wouldn't be so sure about the "surpassing everything else" part. So even if it got an open release, there'd be little to no point in running it over anything we have right now.
The one good thing about it is that it at least implies that even western companies are considering going this direction now that LLMs are very obviously stagnating, and we might see more attempts at creative-focused models.
>>107689037
>it's on OR
The temptation to use it is there, but I prefer to just stick with my local models. I'd like to think they're still iterating on it before release, but that's probably just cope on my end. Hopefully the labs start working on more RP-focused stuff, but you know how terrible the west is when it comes to SEX.