/g/ - Technology






File: Miku-09.jpg (131 KB, 512x768)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106919198 & >>106904820

►News
>(10/17) LlamaBarn released for Mac: https://github.com/ggml-org/LlamaBarn
>(10/14) Qwen3-VL 4B and 8B released: https://hf.co/Qwen/Qwen3-VL-8B-Thinking
>(10/11) koboldcpp-1.100.1 prebuilt released with Wan video generation support: https://github.com/LostRuins/koboldcpp/releases/tag/v1.100.1
>(10/10) KAT-Dev-72B-Exp released: https://hf.co/Kwaipilot/KAT-Dev-72B-Exp
>(10/09) RND1: Simple, Scalable AR-to-Diffusion Conversion: https://radicalnumerics.ai/blog/rnd1

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: miguu.jpg (74 KB, 600x648)
►Recent Highlights from the Previous Thread: >>106919198

--Paper: Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity:
>106921354 >106921377 >106921488
--Paper: REAP the Experts: Why Pruning Prevails for One-Shot MoE compression:
>106926865 >106926930 >106926935 >106926946 >106926973 >106926999
--Cost-performance analysis of AMD 3950X 128GB vs custom server for LLM/home server/gaming:
>106919401 >106919472 >106919477 >106920242
--Synthetic data and conversational CoT dataset generation for LLM training:
>106919615 >106919833
--glm-chan model behavior and prompt optimization challenges:
>106919852 >106919884 >106919974 >106920107
--Defining uncensored models through role adaptability vs unpredictable behavior:
>106919886 >106920057 >106920564 >106920631 >106920777
--Limitations and workarounds for training LoRA on quantized models:
>106920664 >106920700 >106920848 >106921079 >106921407
--Sparse model scaling advantages over dense architectures:
>106920856 >106920874 >106920885 >106920916 >106920998 >106921046 >106921100 >106921142 >106921007
--Adding Metal4 tensor support to llama.cpp:
>106920993
--Proprietary GGUF format criticisms:
>106921215 >106923524 >106923584 >106923681 >106923793
--Struggles with AWQ model conversion and vLLM optimization:
>106922104 >106922122 >106922147
--AI/ML education vs practical skills and networking for job prospects:
>106922370 >106922549 >106922690 >106922736
--Valve devs improve Vulkan for llama.cpp AI:
>106930141
--LlamaBarn project announcement and real platform inquiry:
>106928231 >106928236
--Designing a multi-agent AI RPG with state management and narrative consistency:
>106930493 >106930613 >106931198 >106930663
--Challenges in RAG systems for base knowledge integration:
>106931465 >106931513
--Miku (free space):
>106924924 >106930166 >106930227 >106930335 >106930569

►Recent Highlight Posts from the Previous Thread: >>106919206

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>>106931370
Thanks, but I usually use other people's checkpoints, and with them I don't see much difference between temperatures. The quality is better than I remember, though.
>>
>>106931562
>const static std::string pattern_moe_all = "blk\\.\\d+\\.ffn_(up|down|gate)_(ch|)exps";
Okay, I shouldn't need to set -ot myself at all then.
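For reference, a quick sketch of what that pattern catches, using Python's re with made-up tensor names (real GGUF names vary by architecture):

import re

# The pattern quoted above, with the C++ string escaping removed. The (ch|)
# alternation also covers the fused "_chexps" variants.
pattern_moe_all = re.compile(r"blk\.\d+\.ffn_(up|down|gate)_(ch|)exps")

# Hypothetical tensor names, purely for illustration.
names = [
    "blk.0.ffn_up_exps.weight",
    "blk.17.ffn_gate_chexps.weight",
    "blk.3.ffn_down.weight",   # dense FFN tensor, not an expert tensor
    "blk.5.attn_q.weight",
]

for name in names:
    print(name, "->", bool(pattern_moe_all.search(name)))
# blk.0.ffn_up_exps.weight -> True
# blk.17.ffn_gate_chexps.weight -> True
# blk.3.ffn_down.weight -> False
# blk.5.attn_q.weight -> False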
>>
https://github.com/ggml-org/llama.cpp/pull/16653
Posting again since it went up right at the end of the last thread, but proper automatic GPU memory allocation is finally coming.
>>
Elsa Hitler Margret
>>
Instead of using zram, zswap is the more performant option these days. Feels snappier. Even if you have a shit ton of RAM, zswap is still useful because it stabilizes system paging.
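If you want to check what zswap is currently doing on your box, the knobs live under sysfs; a minimal read-only sketch (Linux only, assumes zswap is built into your kernel):

from pathlib import Path

# Dump the current zswap parameters exposed by the kernel module.
params = Path("/sys/module/zswap/parameters")
for p in sorted(params.iterdir()):
    print(p.name, "=", p.read_text().strip())

# Persistent enablement is usually done with zswap.enabled=1 on the kernel
# command line rather than by poking these files at runtime.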
>>
File: rip.png (87 KB, 556x512)
bros, hf is after /ourguy/
>>
>>106931969
good riddance
>>
File: file.png (30 KB, 519x134)
>>106931980
>t.
>>
>>106931969
always were
>>
>>106931969
I lost like 40GB of private storage lately. Seems like they're trying to free as much space as they can
>>
>>106932013
b-but when it was discussed lately anons said it was le nothingburger and nothing would change?
>>
>>106931969
what about the guy that had like 20k merges
>>
>>106932026
Well, SaaS is always trash, nothing new here. I'm saving for a 20+TB drive now.
>>
File: file.png (11 KB, 211x148)
>>106932061
>*26K quants
thank you very much, and he seems fine, though he uploads to the team mradermacher account.
I wonder how much space they use.
For comparison, picrel: davidau has ~300 models and drummer is only approaching 200.
>>
>>106931969
Scammer lost. People won.
>>
If I were him I would only keep the latest few tunes on HF and park the older stuff somewhere else.
>>
>>106932168
but you're not him, and never will be
>>
Thankfully the userbase has developed enough to realize slop tunes are all placebo, and it's entirely a skill issue, or being too lazy to prefill a prude model.
>>
>>106932185
What do you mean?
>>
>>106932201
>skill issue
Only if earning money is the skill we are talking about.
>>
>>106932201
true, just use Gemma with the response you want pre-written by you in the sys prompt, it works 99% of the time better than nemo!
>>
>>106932210
Despite richfags constantly dunking on vramlets in the thread, they never post side-by-sides of the supposed retard vramlet model and their patrician richfag model, because they know in their heart of hearts that for the purpose of ERP you really don't need that many parameters.
>>
>>106932235
4B ought to be enough, if only they stopped trying to shove the entire internet in there.
>>
I am Drummer.
>>
File: imatrix.png (970 KB, 3110x1315)
>These quants provide best in class perplexity for the given memory footprint.
What am I missing?
IQ quants seem to be the meme I suspected them to be. All other inference params are identical; min_p=0.04 is the only sampler.
>>
File: file.png (1.05 MB, 871x796)
>>106932235
>for the purpose of ERP you really don't need that many parameters
>>
>>106931969
Open source work?
>>
>>106932258
No you're not.
>>
>>106932258
You only suck penises like me. But you aren't me.
>>
>>106932264
Well anon, come on then: post your favorite card with vramlet Nemo or Gemma and then with GLM. I'm sure we'll be able to see $3000 worth of prose improvement.
>>
>>106932290
Nemo and Gemma gave me ED. Glm-chan gave me PE.
>>
>>106932264
Air when? They're scammier than the drummer at this point.
>>
David won?
>>
>>106932106
damn, would really like to see the exact storage usage numbers
>>
>>106932340
schizotunes bros.... WE WON!!!
>>
genuine advice to drummer: make an AGPL llama.cpp fork with LoRA support, then upload only LoRAs.
i doubt you did a FFT of GLM Air, right? and for models that bartowski made quants of, you could delete your own quants to save space, just keep the original models.. i'd like you to publicly announce what you're gonna do before you start deleting models so we can archive some of your stuff, maybe. at least i know i'd like to.
good luck drumdrum, i still like trying your sloptunes no matter what anons say.
also instead of paying $200/month you could rent a seedbox and host models there or something..
>>
>>106932373
>schizobabble
Try running that through an LLM next time zoomie
>>
>>106932373
Question : >>106932363
>>
>>106931969
>open source work
The only thing he ever did was fill up their hard drives with shit models and there wasn't even anything open source about it.
>>
>>106931969
drummer, start an OF. I'll support you. show off that bussy while you do those 'toons baby
>>
>>106932373
Great advice. He should totally do that.
>>
>>106932395
He's retarded. LoRAs have always worked, but Drummer and the mouthbreathers that use his models probably wouldn't know how to load a LoRA. He also can't simply take llama.cpp and relicense it.
>>
>>106932433
loras are a pain in the ass to use with quanted shit
>>
>>106932433
I do remember LoRA not working in some specific circumstances (multi GPU?), but yeah.
As far as I know, people could release their LoRAs instead of just the final merged model.
I don't know how LoRA interacts with quantization, however: whether there's something specific you need to do for a given quant, or if it only works with the unquantized model in GGUF format, etc.
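For what it's worth, here's a rough sketch of what loading a separate LoRA over a quantized GGUF can look like with llama-cpp-python; the paths are placeholders, the adapter is assumed to already be converted to llama.cpp's GGUF LoRA format, and how well it behaves on top of a heavily quantized base is exactly the open question above:

from llama_cpp import Llama

# Quantized base model plus a separately shipped LoRA adapter, applied at load time.
llm = Llama(
    model_path="models/base-Q4_K_M.gguf",   # placeholder path to a quantized base
    lora_path="loras/my-tune-lora.gguf",    # placeholder path to a GGUF LoRA adapter
    n_gpu_layers=-1,
    n_ctx=8192,
)

out = llm("### Instruction: say hi\n### Response:", max_tokens=32)
print(out["choices"][0]["text"])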
>>
What's a lora?
>>
File: muah.jpg (1.08 MB, 3840x2160)
>>106931969
What does El Drummer actually do?
My assumption was that he does model merging / raw fucking around with the tensor data for no good reason. But if he's actually tuning model weights and people enjoy them, then respect.
>>
>106932201
this is who is now pushing for lora bs by the by
>>
>>106932492
He does tune, but it's all pretty half-assed and not very interesting. Most attempts are big flops that contribute absolutely nothing.
>>
>>106932459
It's like qlora but without quantization
>>
>>106932492
he does indeed do tunes, david is the one that's mainly merges
>>
>>106932501
>>106932509
We've moved beyond entertaining the concept of somehow merging trained model weights, right? huzzah
>>
>>106932509
I love that David exists.
Where else would you get
>DavidAU/Qwen3-MOE-2x8B-TNG-Deckard-Beta-16B
>>
>>106932577
Did anyone actually try any of these turds? Does David actually do anything to the weights, or does he just slap that shit together in mergekit and call it a day?
>>
>>106932577
I mean, look at this shit
>This is MOE model config of TWO "DND" (double neuron density) 8B models.
>The first model is trained on the TNG/Star Trek Universe (2 datasets) via Unsloth.
>The second model is trained on the Deckard/PK Dick Universe (5 datasets) via Unsloth.
>Both models use a BASE of Jan V1 4B + Brainstorm 40x (4B+ 40x => 8B parameters.)
>The MOE - mixture of experts - config is 2x8B - 16B parameters. With compression this creates a model of 13B - all the power of 16B in 13B package.
>This MOE drastically upscales the BOTH expert models in this MOE model.
>This model can also be used for Role play, Star Treking, Science Fiction, writing, story generation etc etc.
>The BASE model is (a 4B model + Brainstorm 40x adapter):
This is amazing.
>>
>>106932589
that's not the point, it's drummer that needs to be stopped, david is a wholesome bean.
>>
>>106932589
I've tried a couple and they were all, without fail, schizo out of the box.
Or exactly like Llama 3 8B with high temp, but taking 2x to 4x the memory.
>>
>>106932601
They're both retards wasting space and compute.
>>
>>106932601
if beans could be schizophrenic....
>>
File: file.png (149 KB, 603x920)
>>106932603
what class of model did you try doe?
>>
>>106932601
drummer bought an ad on 4chan. He is /ourguy/ regardless of any other factor.
>>
>>106931969
>scammer is no longer able to waste bandwidth advertising recycled toys over and over again
based
>>
>>106932616
I love the schizo ass model cards.
Seriously, it's pure ML voodoo.
It's great.
>>
>>106932631
Forgot the image.
>>
>>106932631
I assume you did consult the required reading material, right? https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters
>>
>>106932642
Of course.
Although you also have to keep the caveats of the individual models in mind, such as
>https://huggingface.co/DavidAU/Llama-3.2-8X3B-MOE-Dark-Champion-Instruct-uncensored-abliterated-18.4B-GGUF
>>
>>106932623
fuck off
>>
>>106932623
I forgot about that. Maybe drummer isn't a terrorist, he's just misguided.
>>
>>106932681
it'll be okay Icy-Helicopter-kyun do not worries
>>
Getting the MI50 to work on windows was such a pain in the ass.
>>
>>106932749
Do tell.
Just drivers? Some sort of incompatibility?
RocM issues?
>>
>>106932759
nta but amd never bothered with official windows support for the instinct cards
>>
File: cursed-nagatoro.png (3.43 MB, 1920x1782)
>>106932749
On Linux it worked out of the box.
>>
>>106932749
im flabbergasted you even got it to work on windows at all
>>
>>106932759
>>106932987
I had to flash the Radeon Pro VII vbios and use the Boot Camp drivers, because the normal drivers wouldn't work for some reason even though they are available on the AMD page.
Also, for some reason the flashing tool refused to work on Windows, so I had to flash it on Linux.
For anyone interested in replacing the original vbios with a better one, here is the page I used: https://gist.github.com/evilJazz/14a4c82a67f2c52a6bb5f9cea02f5e13
>>
>>106933044
that's a lot of fucking around for having it gimped by windows anyway
>>
>>106931969
Why make this public so soon? I don't think "Thank you HF for giving my popular coomer tunes special treatment!" is good PR for HF.
>>
>>106933044
Cool shit.
Thank you for sharing the link too.
>>
>>106932492
>What does El Drummer actually do?
His biggest success was making a model's thinking process push the model into safe-refusal mode. I think it was Nemo, so he made an unsafe model safe.
>>
>>106932749
What is this mindset? You are buying some used server gpus and are like "NONONO IT MUST WORK ON MY W10 MACHINE".
>>
>>106932593
>double neuron density
Wow. I think he actually isn't a shyster like drummer. He is just that sovlful.
>>
>>106932623
suck his dick and get HIV
>>
Is it david_AU cause he is golden?
>>
>NEVER
[4031, 3848]
>never
[37593]
>EXCLUSIVELY
[3337, 38953, 3166, 50309]
>exclusively
[327, 4256, 3210]
fuck you zuck
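This is also why single-token bans are fiddly: the cased and leading-space variants all tokenize to different IDs. A quick sketch for inspecting them with transformers (the model name is just an example):

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

# Compare how cased and space-prefixed variants split into token IDs.
for word in ["NEVER", "never", " NEVER", " never"]:
    ids = tok.encode(word, add_special_tokens=False)
    print(repr(word), "->", ids, [tok.decode([i]) for i in ids])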
>>
>>106931567
Okay, I know there is stuff you can do to make text models send prompts to SD based on the contents of the chat.
How do you do that?

I've got Forge SD WebUI up and running.
I haven't done anything with locally hosting LLMs yet, only messed around with Janitor and Venus, but I can probably pretty easily get KoboldCPP+SillyTavern+Mistral up and running.
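The usual route is SillyTavern's Image Generation extension pointed at Forge's API (start Forge with --api) and a prompt that asks the LLM to describe the scene as tags. A rough sketch of the same plumbing done by hand, assuming default ports and an OpenAI-compatible endpoint from KoboldCpp or llama-server:

import base64, requests

LLM_URL = "http://127.0.0.1:5001/v1/chat/completions"   # KoboldCpp default port
SD_URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"       # Forge/A1111 API, needs --api

chat_excerpt = "The two of them step out onto a rainy neon-lit street at night."

# 1) Ask the text model to turn the latest scene into a tag-style SD prompt.
r = requests.post(LLM_URL, json={
    "messages": [{"role": "user", "content":
        "Write a single comma-separated Stable Diffusion prompt for this scene, "
        "tags only, no prose:\n" + chat_excerpt}],
    "max_tokens": 120,
})
sd_prompt = r.json()["choices"][0]["message"]["content"].strip()

# 2) Send that prompt to Forge's txt2img endpoint and save the first image.
img = requests.post(SD_URL, json={
    "prompt": sd_prompt,
    "negative_prompt": "lowres, blurry",
    "steps": 25, "width": 512, "height": 768,
}).json()

with open("scene.png", "wb") as f:
    f.write(base64.b64decode(img["images"][0]))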
>>
>>106933185
get tokened, faggot
>>
>>106932428
I'm honestly surprised more people don't have OFs starring AI starlets.

The image technology is there.
The video technology is there.
The voice technology is there if you want to go that far.
The text generation technology is there.

Seems like Digital Pimp is a major career possibility.
>>
>>106933141
I'm traveling for some months so I'm lending my pc to my normie cousin because he wants to play some racing games and other stuff.
>>
Can any anons that use ik_llama.cpp sanity check me on my llama-bench.exe setup?

I can run llama-server without issue, but I can't seem to get bench to load. Using an IQ2 GLM-4.6 on 128+24GB.

I'm mainly getting "main: error: failed to load model 'model_path'"

Here is the PS script I'm using to run the bench - I've got -ngl 1 just to test, as my issues started when I tried to load any of the model onto the GPU.


# Change to the directory this script is in
Set-Location -Path $PSScriptRoot

# === Full path to your GLM-4.6 model ===
$MODEL = "G:\LLM\Models\GLM-4.6-IQ2_KL\GLM-4.6-IQ2_KL-00001-of-00003.gguf"

# === Launch llama-bench with recommended GLM-4.6 settings ===
& .\llama-bench.exe `
-m "$MODEL" `
-mmp 0 `
-ngl 1 `
-fa 1 `
-fmoe 1 `
-ctk q4_0 -ctv q4_0 `
-ot exps=CPU `
-t 20

Pause

>>
>>106933415
>-m "$MODEL" `
lol, lmao even. use your brain
>>
>>106933437
It's not very big ok

I have "$MODEL" on my scripts for loading it in llama-server.exe and it doesn't give me a hard time.
>>
>>106931969
The only decent model he put out was Unslopnemo 3.0 (4.0 is braindead; 4.1 is okay but its writing style is less fun). Everything else I tried is either mega slop or just bad.
>>
downloading ling to see if it can replace kimi sex
>>
>>106933415
I've got it working now.

For some reason, only ~13 of my 24GB of VRAM is used during these benches. Is that normal, or should I be looking to fully saturate that?

$MODEL = "G:\LLM\Models\GLM-4.6-smol-IQ2_KS\GLM-4.6-smol-IQ2_KS-00001-of-00003.gguf"

# === Launch llama-bench with recommended GLM-4.6 settings ===
& .\llama-bench.exe `
-m $MODEL `
-mmp 0 `
-ngl 999 `
-p 128,512 `
-n 128,512 `
-b 4096 `
-ub 4096 `
-fa 1 `
-fmoe 1 `
-ctk q8_0 -ctv q8_0 `
-ot exps=CPU `
-t 20

Pause
>>
>>106933766
>For some reason, only ~13 of my 24GB of VRAM is used
> -ot exps=CPU `
You can write an enormous -ot expression or you can wait for >>106931647
>>
Why didn't drummer do his own mememark? Big corpos lie about mememarks all the time. Why not make some fake bars himself?
>>
>>106933766
Do all of those arguments work on llama-bench?
I can't remember what it was exactly, but some stuff that worked in llama-server wasn't implemented in llama-bench IIRC.
Maybe it's the cache quantization, I dunno.
>>
>>106933790
Stop bullying Drummer... He might be a bit simple, but he doesn't mean any harm to anyone.
>>
>>106933802
if you run llama-bench.exe -h it'll give you a list of what it accepts.

>>106933782
What would the enormous -ot expression look like? All I've ever seen is exps=CPU and ".ffn_.*_exps.=CPU"
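Nobody posted one, but it's just a comma-separated list of regex=device pairs, so you can generate it. A sketch that keeps the first few layers' experts in VRAM and pushes the rest to CPU (layer count and device name are assumptions for your setup; check how your build resolves overlapping patterns):

# Build a long --override-tensor (-ot) argument programmatically.
gpu_layers = 8   # how many layers' expert tensors to keep on the GPU; the rest go to CPU

gpu_part = "|".join(str(i) for i in range(gpu_layers))
ot = f"blk\\.({gpu_part})\\.ffn_.*_exps\\.=CUDA0," + "exps=CPU"
print(f'-ot "{ot}"')
# -ot "blk\.(0|1|2|3|4|5|6|7)\.ffn_.*_exps\.=CUDA0,exps=CPU"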
>>
Why isn't there an LFM2-VL 2.6B yet? Also abliterated/uncensored?

Need to caption 300k images; all the LLMs suck and I still have to stick with Florence-2...
>>
>>106933822
That is Undi and Davidau.
>>
>>106933185
>token banning in 2025
we have string bans now gramps
>>
File: screenshot0240.png (1.99 MB, 2202x1238)
>>106932264
i agree with this image
>>
>>106934099
I can't read those bent letters
>>
>>106934183
>t. qwen vl
>>
File: niggers cant read this.jpg (66 KB, 1080x817)
>>106934183
are you black?
>>
>>106934190
>t. moniqwen
>>
>>106934211
They can't?
>>
>>106934216
They can barely read normal words, and let's not talk about reading comprehension...
>>
Minimum specs to run GLM 4.6?
>>
>>106934263
24GB VRAM + 128GB RAM (maybe 96?)
>>
>>106934263
128gb ram + 24gb vram
>>
>>106932235
Because it's not about the quality of the output, it's about how much you need to reroll and tard-wrangle a small model until you get what you want, whereas a large model just gets it. The quality of a small model's output may even be better in a side-by-side comparison; the difference is that with a small model you struggle to make it output what you have in mind, while with a large model you're balls deep in actual RP.
>>
>>106934269
>>106934271
>32GB 5090 with 64GB RAM
So close and yet so far.
>>
>>106934283
ram is cheap though
>>
>>106934277
You can reroll a small model a hundred times by the time your offloaded large model finishes its first gen.
>>
File: file.png (250 KB, 1303x760)
>>106934288
was*
>>
GLM 4.6/Deepsex for the first 20k tokens or so followed by Qwen 235B/22A thinking up to 60k context is the KINO setup for long-form lorebook RP, prove me wrong.
>>
>>106934288
I'm unfortunately a 2 slotkek.
>>
>>106934316
2x64gb sticks exist
>>
>>106934305
Does it actually get better and not worse with 20k prefill?
...
...
I actually never tried Qwen above 10k tokens, and I was using it for at least two months. With glm-chan I'm hitting 10k almost every day. Can't just be the tokenizer, right? Weird.
>>
What are the differences between GLM 4.6 and 4.5 for practical use? I don't give a shit about benchmark faggotry.
>>
>>106934288
NTA but my cpu/mobo doesn't boot if I try to use more than 2 32gb ram sticks
>>
>>106934295
Waiting is fine, reading garbage output ruins immersion
>>
>>106934341
From what I noticed, the 2507 thinking version at Q8 does keep the same syntax as the previous context. I also use a user prefill regarding the paragraph formatting so it doesn't devolve into one-word sentences, and it seems to hold together.
I'm using it because it has the best high-context performance out of all local models outside of Deepsex v3.2 at the moment (which isn't even implemented yet), while still being pretty damn fast even at high context. Again, it's meh when used at 0 context due to its quirks and lack of world knowledge, but with 20k+ of context filled in, it's acceptable. Give it a try if you have the RAM.
>>
>>106934381
Have you tried updating your bios?
>>
MTP support soon inshallah
>>
>>106934635
And Qwen 80b!
>>
File: 1704599385920673.png (354 KB, 488x651)
can any anons recommend the current best general knowledge model? something encyclopedic on science, medicine, history, coding.
I don't care about roleplay or artistic output, just something to answer my inane comments. I am currently using gemini 3 27b.
>>
>>106934656
In general, the larger the model, the more knowledge.
So something like kimi I guess.
Of course, Gemma models for example know a lot more than any other model in their weight range, but they are relatively small.
>>
>>106934635
Be the change you want to see bro
>>
>>106934635
Vibe coders will save us.
>>
>>106934381
Pretty sure for AMD there were a bunch of BIOS updates in June or earlier that enabled 64GB DIMM support for many vendors.
>>
>>106934381
>>106934707
Disregard that, I thought you were trying to put 64GB sticks in there.
>>
>>106931969
Based. /lmg/ shills BTFO.
>>
File: 1521254484420.jpg (625 KB, 2048x1365)
>>106934669
thanks anon, sorry I made typo with gemini, I was indeed using gemma 3 27b. I'll try kimi
>>
>>106934721
>cheetah
They want to be our pets SO BAD.
>>
>>106933658
the fuck kind of rig do you have?
>>
>>106934635
https://voca.ro/1915MlAOFtMx
>>
https://www.tomshardware.com/tech-industry/jensen-huang-says-nvidia-china-market-share-has-fallen-to-zero
>Jensen says Nvidia’s China AI GPU market share has plummeted from 95% to zero
lol, lmao even?
>>
File: 1751552205845476.png (210 KB, 498x529)
>>106931567
Question for anons who RP with LLMs: do you typically set a specific max output tokens setting, or do you usually stick with whatever default your inference engine/webui uses? Sometimes I'll enter a prompt and the output from the "person" I'm role-playing with (I typically have a system prompt that tells the LLM to act as a specific person or persona) is only a sentence or two, and other times it outputs an entire paragraph. Which output length I get seems to be completely random. Sometimes I'll do a particular prompt and it shits out a paragraph or two worth of text; I'll restart the engine and input that exact prompt, and this time it'll only be a couple of sentences. Is it better to set your own max token output? I don't RP with it that often, so I'm not really sure what counts as "too much", "too little", "good", or "bad".
>>
>>106934774
>Which output length it does seems to be completely random
If the model's output ends before the token limit, then it's done saying what it meant to say. If it reaches the token limit, the reply will get truncated.
>Is it better to set your own Max token output?
It may truncate the reply. But you can.
>I'm not really sure what counts as "too much", "too little", "good", or " bad"
Whatever you prefer. You can nudge the model by just instructing it to give short or long replies. Results may vary.

The model has no idea what the token limit is, nor does it know how many tokens it generated already. It generates tokens until "it's done" (by generating an EOS token). The token limit is just a setting for the inference program or the client, not the model.
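Concretely, the client-side limit maps to n_predict in a raw llama-server /completion request; a minimal sketch (default port, and response field names vary a bit between versions):

import requests

resp = requests.post("http://127.0.0.1:8080/completion", json={
    "prompt": "Write a short scene in a tavern.\n",
    "n_predict": 200,      # hard cap on generated tokens for this request
    "temperature": 0.8,
}).json()

print(resp["content"])
# Stop-related fields in the response show whether generation ended on EOS
# or hit the n_predict cap (e.g. stopped_eos / stopped_limit on older builds).
print({k: v for k, v in resp.items() if "stop" in k or "truncated" in k})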
>>
>>106934774
It isn't what you think it is. All it does is reduce the model’s context size by max output tokens, so the response will fit within the context
>>
>>106934851
>All it does is reduce the model’s context size by max output tokens
No. It stops generating once the output limit is reached in the current gen request. It's equivalent to the n_predict setting in a gen request to llama-server.
>so the response will fit within the context
No. It's to prevent run-away generation or to just generate in chunks with [continue]. The reply will not necessarily fit in the context as it can get truncated.
>>
Llama.cpp is refusing to load a finetune converted to gguf from an axolotl checkpoint, which according to Grok is because the rank of the lora_a is 256 while the rank of the lora_b is 128. The rank of the lora was supposed to be 128. Any ideas?
>>
>>106934886
>The rank of the lora was supposed to be 128.
Well? Is it 128? 100% sure?
>>
>>106934774
I set the output length to the maximum supported length when using instruct mode.
The model will generate as much text as it deems necessary. There's no point in cutting it short, especially when reasoning is enabled.

For text completion though I'll put the gen limit at 512 tokens, since text completion will just keep generating text until you stop it.
512 tokens is enough to write a paragraph or two, and gives me a chance to make edits or steer the model before continuing.
>>
>>106934774
For me, 550 is about as much as I allow for non-thinking models, for a more book-like experience with three paragraphs.
For reasoning models you have to set it higher, because the reasoning process uses those tokens.
Are you using SillyTavern?
>>
File: 2025 dram market.png (88 KB, 1280x720)
dam
>>
>>106934875
Yes. It's what happens in any practical situation
https://github.com/SillyTavern/SillyTavern/blob/74c158bd2e98b8b4dc54d2bb0d088c5a5e918826/public/script.js#L5084
If you set a 4K max response with 16K context, you are only getting 12K tokens for your prompt and chat history
>The reply will not necessarily fit in the context
Wrong. Prompt + response can't be longer than the context
>to just generate in chunks with [continue]
The sole reason for generating in chunks is to provide the model with as much context as possible at the start of a reply
>>
>>106934774
You need to tell the model to keep its replies under 200 tokens unless asked to provide a long answer, for example.
I'll keep output length at infinite, it doesn't do that much.
>>
>>106935020
>The reply will not necessarily fit in the context
>Prompt + response can't be longer than the context
Yes. Meant to say "The reply will not necessarily fit in the token limit".
>All it does is reduce the model’s context size
It does not reduce the context size. It's set at launch. But it does reduce the gen limit so that it doesn't go over the context size.
>so the response will fit within the context
The *generated tokens* will fit in the context, not necessarily the entire reply. That cannot be guaranteed.
>>
>>106934898
When I downloaded another finetune from HF it had the same error so I think there must be some issue with the trainer or Grok was just wrong.
>>
>>106935099
Show your llama-server output where you get the error. Post the fucking models you tried, at least. Can you load other models?
>>
>>106935091
It reduces the available context size for the prompt and chat history. At this point, I refuse to participate in the nitpicking contest
>>
>>106935162
It cannot reduce the context size. That's set at launch. It reduces the gen limit so that it doesn't go over the context size.
>>
>>106935129
Ok, gimme a minute.
>>
>>106935187
You're either retarded or can't read
>>
>>106935187
Ring Attention exists.
>>
File: G3jKisDWsAAHAWW.jpg (506 KB, 2048x1536)
>>
>>106931969
this guy is such an e-begging piece of shit. his troon tunes, all of them, are worse than the originals
>>106931997
i mean, the drummer sucks, but this is the kind of corporate bootlicking you only see on r*ddit. literally the worst humans on the planet
>>
>>106932264
this
>>
>>106935564
>when she talks while deepthroating your dick as you fuck her ass
>>
>>106933196

You don't need Forge; KoboldCpp has image generation support too and the same models work in it. It even supports models Forge doesn't, like WAN, Qwen Image and Kontext.
>>
>>106934211
to be fair, that's very bad cursive.

also, it's anyone under the age of 30.
>>
>>106934945
>lpddr
>ai
>>
I wish there was a balance between GLM-4.6 and K2-0905. It feels like GLM-4.6 is a bit too clean and K2-0905 has too much of a slop tendency. If they were blended together it would be the perfect model.



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.