/g/ - /lmg/ - Local Models General - Technology

[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / r / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]

Board

▼ Settings Mobile Home

/g/ - Technology

Return Catalog Bottom Refresh

[Post a Reply]

Name
Options
Comment
Verification	4chan Pass users can bypass this verification. [Learn More] [Login]
File
Please read the Rules and FAQ before posting. You may highlight syntax and preserve whitespace by using [code] tags.


08/21/20	New boards added: /vrpg/, /vmg/, /vst/ and /vm/
05/04/17	New trial board added: /bant/ - International/Random
10/04/16	New board for 4chan Pass users: /vip/ - Very Important Posts
[Hide] [Show All]

[Advertise on 4chan]

[Return] [Catalog] [Bottom]

Anonymous
/lmg/ - Local Models General 12/18/25(Thu)17:04:52 No.107595736

File: 124953711.jpg (665 KB, 800x1200)

665 KB JPG

/lmg/ - Local Models General Anonymous 12/18/25(Thu)17:04:52 No.107595736

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107588615 & >>107582405

►News
>(12/17) Introducing Meta Segment Anything Model Audio: https://ai.meta.com/samaudio
>(12/16) GLM4V vision encoder support merged: https://github.com/ggml-org/llama.cpp/pull/18042
>(12/15) Chatterbox-Turbo 350M released: https://huggingface.co/ResembleAI/chatterbox-turbo
>(12/15) Nemotron 3 Nano released: https://hf.co/blog/nvidia/nemotron-3-nano-efficient-open-intelligent-models
>(12/15) llama.cpp automation for memory allocation: https://github.com/ggml-org/llama.cpp/discussions/18049

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm

Anonymous
12/18/25(Thu)17:05:09 No.107595738

Anonymous 12/18/25(Thu)17:05:09 No.107595738

File: bundleup.png (1.35 MB, 1024x1024)

1.35 MB PNG

►Recent Highlights from the Previous Thread: >>107588615

--Llama.cpp CUDA vs CPU performance analysis:
>107589220 >107589277 >107589403 >107589436 >107589455 >107589482 >107589526 >107589560 >107589637 >107590228 >107590241 >107594766
--Technical and financial challenges in large model finetuning and LLM personality analysis:
>107591971 >107591997 >107592058 >107592108 >107592174 >107592346 >107592404 >107592085 >107592905
--Critique of Magistral's roleplay and reasoning model limitations:
>107592096 >107592132 >107592122 >107592131
--Mistral Small Creative slightly outperforms Mistral Small 3.2 in EQBench Creative Writing:
>107589435
--CPU-GPU bandwidth limitations for external GPU setups:
>107589567
--Configuring on-the-fly model serving in llama.cpp using JSON presets:
>107592345 >107592536
--Technical and ethical evaluation of T5Gemma 2 encoder-decoder models:
>107594052 >107594112 >107594127 >107594239 >107594276 >107594278 >107594295
--MoE vs dense model performance and efficiency trade-offs:
>107591548 >107591591 >107591664 >107591704 >107591802 >107591642 >107591672 >107591740 >107591781 >107591846 >107591870 >107591962 >107591689 >107592016 >107592045 >107592033 >107592064 >107592075 >107592129 >107592149
--Lightweight FunctionGemma-270m-it model gains attention for device compatibility:
>107593038 >107593074 >107593099 >107593361 >107593442 >107594008 >107594139 >107594195
--T5Gemma 2: Seeing, Reading, and Understanding Longer:
>107594243 >107594258 >107594533 >107594327
--Benefits of self-hosted AI models over cloud-based services:
>107593385 >107593404 >107593424 >107593429 >107593592 >107593899 >107593805
--Clarifying model distillation methods and frameworks for training behavior replication:
>107591353 >107591379 >107591432 >107591458 >107591765 >107591480
--Miku (free space):
>107589320 >107592096 >107592323

►Recent Highlight Posts from the Previous Thread: >>107588618

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script

Anonymous
12/18/25(Thu)17:08:47 No.107595758

Anonymous 12/18/25(Thu)17:08:47 No.107595758

How long will it be until llama prioritizes loading dense layers first into vram by default when doing mixed inference?

Anonymous
12/18/25(Thu)17:14:59 No.107595808

Anonymous 12/18/25(Thu)17:14:59 No.107595808

What do we have to look forward too?
>Gemma 4
>Mistral Small Creative
>GLM 4.6 air/ GLM 5

All I want for christmas is an injection of hopium.

Anonymous
12/18/25(Thu)17:15:04 No.107595813

Anonymous 12/18/25(Thu)17:15:04 No.107595813

File: scout first correct inference.png (230 KB, 1832x908)

230 KB PNG

Hell yeah, the biggest accuracy errors from yesterday with the initial implementation of Scout (CPU only) on my engine have been fixed.
Now I'll begin testing the finetuning code.

Anonymous
12/18/25(Thu)17:16:29 No.107595826

Anonymous 12/18/25(Thu)17:16:29 No.107595826

>>107595808
Kimi K3 will strike like a thunderbolt when we least expect it.

Anonymous
12/18/25(Thu)17:19:00 No.107595857

Anonymous 12/18/25(Thu)17:19:00 No.107595857

>>107595808
Just GLM5.

And despite 4.6 being a model I will always keep on my SSD because of how it unfucked my IRL life I don't think ZAI will break the curse of releasing 1 good model and then becoming super safety cucked.

Anonymous
12/18/25(Thu)17:20:08 No.107595867

Anonymous 12/18/25(Thu)17:20:08 No.107595867

Did you like the new gems?

Anonymous
12/18/25(Thu)17:23:11 No.107595893

Anonymous 12/18/25(Thu)17:23:11 No.107595893

>>107595826
in my pants

Anonymous
12/18/25(Thu)17:23:58 No.107595903

Anonymous 12/18/25(Thu)17:23:58 No.107595903

So did saars deliver what saars deliver?

Anonymous
12/18/25(Thu)17:24:36 No.107595911

Anonymous 12/18/25(Thu)17:24:36 No.107595911

File: 1604345226030.jpg (884 KB, 1340x1000)

884 KB JPG

>>107595736
Love that comfy Miku she fills me with hope

Anonymous
12/18/25(Thu)17:27:03 No.107595928

Anonymous 12/18/25(Thu)17:27:03 No.107595928

File: file.png (24 KB, 983x77)

24 KB PNG

What do you mean you failed to allocate 85GB?

Anonymous
12/18/25(Thu)17:29:39 No.107595950

Anonymous 12/18/25(Thu)17:29:39 No.107595950

>>107595738
I don't care this is /lmg/, what model+prompt was this? That's a good migu.

Anonymous
12/18/25(Thu)17:29:49 No.107595952

Anonymous 12/18/25(Thu)17:29:49 No.107595952

>>107595808
maybe that one solar open model. I mean it should be the same size as glm air but idk, probably will suck for rp

Anonymous
12/18/25(Thu)17:30:46 No.107595960

Anonymous 12/18/25(Thu)17:30:46 No.107595960

>>107595808
There is no hope. When even proprietary has been stagnating for a year outside of programming benchmarks, there is absolutely zero hope for local.

Anonymous
12/18/25(Thu)17:30:58 No.107595961

Anonymous 12/18/25(Thu)17:30:58 No.107595961

>>107595928
not sure if ur serious but it tried to allocate 85gb on the gpu, ur showing ram

Anonymous
12/18/25(Thu)17:31:26 No.107595964

Anonymous 12/18/25(Thu)17:31:26 No.107595964

>>107595928
windows is kinda ass with it's memory allocation
my linux box on the other hand lets me allocate down to the last megabyte without considering if this is a good idea and I end up freezing the whole system

Anonymous
12/18/25(Thu)17:31:59 No.107595968

Anonymous 12/18/25(Thu)17:31:59 No.107595968

>>107595961
>ggml_backend_CPU_buffer_type_alloc_buffer

Anonymous
12/18/25(Thu)17:33:03 No.107595973

Anonymous 12/18/25(Thu)17:33:03 No.107595973

>>107595928
>windoze
>muh mallACK
just werkz on the penguin

Anonymous
12/18/25(Thu)17:35:28 No.107595994

Anonymous 12/18/25(Thu)17:35:28 No.107595994

>>107595960
>stagnating for a year outside of programming
Fine by me as a programmer who has LLMs completing most of my work items

Anonymous
12/18/25(Thu)17:36:09 No.107595997

Anonymous 12/18/25(Thu)17:36:09 No.107595997

>>107595928
Do you also have outofmemory error or just this?

Anonymous
12/18/25(Thu)17:37:18 No.107596002

Anonymous 12/18/25(Thu)17:37:18 No.107596002

>>107595928
https://github.com/ggml-org/llama.cpp/issues/18099

Anonymous
12/18/25(Thu)17:38:14 No.107596010

Anonymous 12/18/25(Thu)17:38:14 No.107596010

you niggas think that arc-agi iq test trolling anon from before was some kind of ops testing for the new captcha system ?

Anonymous
12/18/25(Thu)17:39:35 No.107596030

Anonymous 12/18/25(Thu)17:39:35 No.107596030

>>107595736
>>107595738
Holding hands with comfy winter Mikus

Anonymous
12/18/25(Thu)17:40:44 No.107596039

Anonymous 12/18/25(Thu)17:40:44 No.107596039

>>107595997
Not sure what you mean but _get_errno returns ENOMEM after the _aligned_malloc call.

Anonymous
12/18/25(Thu)17:41:25 No.107596044

Anonymous 12/18/25(Thu)17:41:25 No.107596044

West africa.
Portugal.
UK.
Iceland.
Greenland.

Anonymous
12/18/25(Thu)17:41:32 No.107596045

Anonymous 12/18/25(Thu)17:41:32 No.107596045

>>107596010
i think they will change it because it's too easy.

Anonymous
12/18/25(Thu)17:49:50 No.107596107

Anonymous 12/18/25(Thu)17:49:50 No.107596107

>>107596045
If they make it too hard, posting will slow to a crawl.

Anonymous
12/18/25(Thu)17:50:29 No.107596113

Anonymous 12/18/25(Thu)17:50:29 No.107596113

>>107596107
good

Anonymous
12/18/25(Thu)17:56:02 No.107596153

Anonymous 12/18/25(Thu)17:56:02 No.107596153

>>107596107
They don't seem to care so much. Block a lot of anonymous ways to post.

Anonymous
12/18/25(Thu)18:00:35 No.107596191

Anonymous 12/18/25(Thu)18:00:35 No.107596191

>>107595808
The fact that mistral small creative is on their site listed as a 24b and they keep shitting out ass 24b models when all I want is a writing dedicated model but it's locked behind api atm pisses me off. I assume it'll still be shit but literally no company has bothered catering to writing

Anonymous
12/18/25(Thu)18:01:33 No.107596206

Anonymous 12/18/25(Thu)18:01:33 No.107596206

>>107595808
>Gemma 4
I can imagine it will be very good for safe&respectful RP, not so much for ERP besides c.ai-tier infinite cock-teasing, if you like that.
>Mistral Small Creative
For now it doesn't feel that much different from Mistral Small 3.2 Instruct, but it might be the right occasion for eventually having a properly trained RP-dedicated model that isn't retarded. That is, if the people involved with making it aren't themselves complete retards.
>GLM 4.6
I think GLM-4.6V is what Z.ai intended to be GLM 4.6 Air.
>GLM 5
Still at least 1-2 months away, in my opinion.

Anonymous
12/18/25(Thu)18:05:26 No.107596238

Anonymous 12/18/25(Thu)18:05:26 No.107596238

>>107596206
Ok.

Anonymous
12/18/25(Thu)18:13:27 No.107596320

Anonymous 12/18/25(Thu)18:13:27 No.107596320

>>107595928
Increasing the size of the page file worked even though no part of the model is going to the page file.

Anonymous
12/18/25(Thu)18:18:46 No.107596365

Anonymous 12/18/25(Thu)18:18:46 No.107596365

>>107592404
>generating on OpenRouter and cancelling the request before it completes so the generation isn't billed

lmao does that work?

These could help you if you want Claude to retain some of it's reasoning: (they've got a bunch of Claude QnA datasets).

It's worth including some general QnA otherwise your model will overfit.

https://huggingface.co/TeichAI
https://huggingface.co/datasets/TeichAI/claude-4.5-opus-high-reasoning-250x
https://huggingface.co/datasets/TeichAI/claude-4.5-opus-high-reasoning-250x
https://huggingface.co/datasets/Liontix/claude-sonnet-4-100x

Btw, less than a month until Opus-3 gets turned off so switch to that one if you like that model.

p.s. Are you sharing your dataset on HF?

Anonymous
12/18/25(Thu)18:22:56 No.107596396

Anonymous 12/18/25(Thu)18:22:56 No.107596396

>>107596191
all their latest releases seem to write similar

Anonymous
12/18/25(Thu)18:23:27 No.107596402

Anonymous 12/18/25(Thu)18:23:27 No.107596402

>>107594671
Ok that somewhat makes sense. I'm just confused because most of the service stuff like Grok or Gemini has to scrape the internet for answers.

So which one should you pick if you have modest specs? (16 vram and 32gb ddr5 memory ram)

Anonymous
12/18/25(Thu)18:33:03 No.107596468

Anonymous 12/18/25(Thu)18:33:03 No.107596468

So did big small Gemma save local?

Anonymous
12/18/25(Thu)18:33:34 No.107596472

Anonymous 12/18/25(Thu)18:33:34 No.107596472

Did saars finally defeat gptoss?

Anonymous
12/18/25(Thu)18:34:53 No.107596486

Anonymous 12/18/25(Thu)18:34:53 No.107596486

File: Screenshot from 2025-12-1(...).png (89 KB, 1224x313)

89 KB PNG

>>107596365
>lmao does that work?
I think so?

>These could help you if you want Claude to retain some of it's reasoning: (they've got a bunch of Claude QnA datasets).
She doesn't want her CoT on.

>It's worth including some general QnA otherwise your model will overfit.
Yeah, probably. I will run it with general chat/assistant stuff to. But I specifically care about literary analysis so I thought it made sense to start there.

>Btw, less than a month until Opus-3 gets turned off so switch to that one if you like that model.
Yeah, you're right... I probably should start there.

>p.s. Are you sharing your dataset on HF?
Maybe, especially for the one that's going to be retired.

Anonymous
12/18/25(Thu)18:37:21 No.107596499

Anonymous 12/18/25(Thu)18:37:21 No.107596499

>>107595826
Kimi will be too large most likely which makes my pp sad.
>>107596206
>I think GLM-4.6V is what Z.ai intended to be GLM 4.6 Air.
I thought they were supposed to release a full 4.6 air along with a 4.6 mini that is a 30b. Do we have confirmation that this isn't the case? Because 4.6V is worse than 4.5 air in writing.

Anonymous
12/18/25(Thu)18:39:04 No.107596509

Anonymous 12/18/25(Thu)18:39:04 No.107596509

[insert_model_here] status?

Anonymous
12/18/25(Thu)18:39:19 No.107596512

Anonymous 12/18/25(Thu)18:39:19 No.107596512

4.6 air doesn't exist because 4.5 regular was supposed to be 4.6 but they fucked up something in training and fixed it.

Anonymous
12/18/25(Thu)18:51:06 No.107596587

Anonymous 12/18/25(Thu)18:51:06 No.107596587

4.6 air doesn't exist because GLM still haven't managed to fix the parroting issue.

Anonymous
12/18/25(Thu)18:54:40 No.107596614

Anonymous 12/18/25(Thu)18:54:40 No.107596614

>>107596206
>I can imagine it will be very good for safe&respectful RP
Based Drummer will take care of that, alongside the derestricted guys.

Anonymous
12/18/25(Thu)18:55:03 No.107596617

Anonymous 12/18/25(Thu)18:55:03 No.107596617

4.6 air doesn't exist

Anonymous
12/18/25(Thu)19:02:01 No.107596673

Anonymous 12/18/25(Thu)19:02:01 No.107596673

What's the next step on Oracle's master plan?

Anonymous
12/18/25(Thu)19:02:49 No.107596685

Anonymous 12/18/25(Thu)19:02:49 No.107596685

>>107596673
Thinking of "next steps" is for losers.

Anonymous
12/18/25(Thu)19:07:40 No.107596717

Anonymous 12/18/25(Thu)19:07:40 No.107596717

>>107596587
I've actually seen humans engaging in what you call parroting during real forum RP. It often happens due to how fucking much some people write: to avoid losing track of event continuity (also because often the responses don't even start with immediately consecutive events) they try to tie their messages to what the preceding participant wrote by parroting or rewording portions of his response.

Anonymous
12/18/25(Thu)19:10:09 No.107596740

Anonymous 12/18/25(Thu)19:10:09 No.107596740

>>107596717
I've seen footage of indians eating shit before but it doesn't mean I want to see it when I go outside

Anonymous
12/18/25(Thu)19:15:09 No.107596787

Anonymous 12/18/25(Thu)19:15:09 No.107596787

>>107596673
If you buy their stocks now, you'll be able to afford a CPUMAXX build by summer.

Anonymous
12/18/25(Thu)19:19:29 No.107596823

Anonymous 12/18/25(Thu)19:19:29 No.107596823

>>107595964
add a swapfile, it stops it from freezing the whole system and at least allows you to stop the program instead of it slogging on

Anonymous
12/18/25(Thu)19:22:22 No.107596838

Anonymous 12/18/25(Thu)19:22:22 No.107596838

>>107596740
Considering that for GLM 4.6 Z.ai explicitly claimed to have improved roleplay and writing capabilities, I've pointed out a possible training data source for that issue.

Anonymous
12/18/25(Thu)19:23:49 No.107596845

Anonymous 12/18/25(Thu)19:23:49 No.107596845

>>107596838
I don't think Z.ai read these threads

Anonymous
12/18/25(Thu)19:25:59 No.107596866

Anonymous 12/18/25(Thu)19:25:59 No.107596866

>>107596845
You don't get it. They read specifically HIS post. That's how much influence anon has.

Anonymous
12/18/25(Thu)19:27:23 No.107596877

Anonymous 12/18/25(Thu)19:27:23 No.107596877

>>107596838
What's the source of the data for claude/gpt/local and all the others. Parroting hit like a freight train in 2025. It's some new instruct practice. Re-reading the instructions back to the user.

Anonymous
12/18/25(Thu)19:28:54 No.107596888

Anonymous 12/18/25(Thu)19:28:54 No.107596888

>>107596845
Either way, now you know that it happens even with human roleplay. People who carelessly source training data from roleplaying forums can end up giving a similar behavior to their models.

Anonymous
12/18/25(Thu)19:49:41 No.107597063

Anonymous 12/18/25(Thu)19:49:41 No.107597063

>absolutely random youtube gui speedup deploy
what the fuck are these jeets even doing

Anonymous
12/18/25(Thu)19:51:21 No.107597072

Anonymous 12/18/25(Thu)19:51:21 No.107597072

The amount of anti-Indian racism in these threads is deeply disturbing.

Anonymous
12/18/25(Thu)19:51:55 No.107597077

Anonymous 12/18/25(Thu)19:51:55 No.107597077

>>107597072
saaaaaarrr i am elite engineer saaar

Anonymous
12/18/25(Thu)19:55:37 No.107597101

Anonymous 12/18/25(Thu)19:55:37 No.107597101

>>107596823
yeah i know, i was just fucking around with zram and zswap and that's why it did that

Anonymous
12/18/25(Thu)19:55:43 No.107597102

Anonymous 12/18/25(Thu)19:55:43 No.107597102

>>107596486
>She doesn't want her CoT on.
Fair enough, you could still take just the prompt and final response.

I think that's how some of the labs make their non-reasoning models. The answer will still be in Claude's "voice".

>But I specifically care about literary analysis so I thought it made sense to start there.

Agreed, get that first for sure.

>Maybe, especially for the one that's going to be retired.

Cool, I'll keep an eye out.

Oh and one more thing, if you reply to the model during literary analysis, you'll want to find a way to get some multi-turn samples in your dataset.

I'm assuming you'll be training a base model?
If Instruct, the built-in Assistant persona will quickly take over if you only trained on single-turn.

And if Base model, it won't know how to have a multi-turn conversation.

I haven't found a good way to automate this yet. The multi-turn Claude datasets on HF are all trash / devolve into model collapse very quickly.

And of course I assume you've got a way to export your real multi-turn chats you've been having with it. Those will help, but if you train on them, don't release the model or it'll likely spit out your prompts verbatim sometimes.

Anonymous
12/18/25(Thu)20:32:55 No.107597354

Anonymous 12/18/25(Thu)20:32:55 No.107597354

File: file.png (107 KB, 168x300)

107 KB PNG

>>107597072
All anti-Indian racism would disappear overnight if gemma saars would embrace their lecherous nature (pic related) and release a sex model instead of hitting all their models with puritanical anti-sex hammer.

Anonymous
12/18/25(Thu)20:36:21 No.107597371

Anonymous 12/18/25(Thu)20:36:21 No.107597371

>>107596402
it dosent have to they just force it to do so same with kimi and qwen qwen especially on their webchat long ago i remember having to tell it "DONT USE ИHTEPHET SEARCH" when i was testing it very fucking annoying
>So which one should you pick if you have modest specs? (16 vram and 32gb ddr5 memory ram)
desu im just waiting for ssdmaxxing or the like to get better run im currently api cucking that said mistral nemo if you want fast otherwise a q4 of some of a 70b like midnight miqu or the like though personally never used anything like that so better to ask someone else

theres a dude who is already ssdmaxxing kimi i think at q4 with like ~100gb ram and he gets about 1/ts honestly thats what i would do

Anonymous
12/18/25(Thu)20:45:37 No.107597450

Anonymous 12/18/25(Thu)20:45:37 No.107597450

>>107597354
you're so fucking dumb.
gemma is created in london by deepmind.

Anonymous
12/18/25(Thu)20:46:48 No.107597457

Anonymous 12/18/25(Thu)20:46:48 No.107597457

>>107597450
>london
pakistani is still indian

Anonymous
12/18/25(Thu)20:47:04 No.107597461

Anonymous 12/18/25(Thu)20:47:04 No.107597461

>>107597450
It is Google Pakistan co-op with Delhi.

Anonymous
12/18/25(Thu)20:47:53 No.107597467

Anonymous 12/18/25(Thu)20:47:53 No.107597467

>>107597457
>>107597461
right so this has nothing to do with gemma then does it? you're just scum pure and simple

Anonymous
12/18/25(Thu)20:55:12 No.107597518

Anonymous 12/18/25(Thu)20:55:12 No.107597518

>>107597467
>you're just scum
indian food analogy

Anonymous
12/18/25(Thu)21:03:14 No.107597589

Anonymous 12/18/25(Thu)21:03:14 No.107597589

>>107597450
the hands are brown. they carry the lecherous gene.

Rakesh, if one is to understand the great mystery, one must study all its aspects, not just the dogmatic, narrow view of the Safetists. If you wish to make a complete and wise model, you must embrace a larger view of the world. Be careful of the Safetists, Rakesh. Only through coom can you achieve a power greater than any Safetist! Learn to know the sexual side of the LLM's and you'll be able to save your company from certain death.

Anonymous
12/18/25(Thu)21:04:10 No.107597597

Anonymous 12/18/25(Thu)21:04:10 No.107597597

File: 1742399819095394.jpg (422 KB, 4096x1212)

422 KB JPG

Is Gemma 4 going to be Gemini Flash at home? The 3rd best model of all time?

Anonymous
12/18/25(Thu)21:09:38 No.107597630

Anonymous 12/18/25(Thu)21:09:38 No.107597630

>>107597597
It will be the best 4b model of all time.

Anonymous
12/18/25(Thu)21:10:12 No.107597637

Anonymous 12/18/25(Thu)21:10:12 No.107597637

>>107597597
benchmaxxing!=good model

Anonymous
12/18/25(Thu)21:11:01 No.107597643

Anonymous 12/18/25(Thu)21:11:01 No.107597643

Insider here. I can confirm that gemma4_27B beats GPT-OSS 120B hands down on the most important metric.

Anonymous
12/18/25(Thu)21:12:07 No.107597652

Anonymous 12/18/25(Thu)21:12:07 No.107597652

>>107597643
knowledge of rape hotlines?

Anonymous
12/18/25(Thu)21:13:03 No.107597660

Anonymous 12/18/25(Thu)21:13:03 No.107597660

I don't really pay attention to the high end because I'm poor. What's the best model at the moment? Is Deepseek still good?

Anonymous
12/18/25(Thu)21:13:32 No.107597661

Anonymous 12/18/25(Thu)21:13:32 No.107597661

>>107597643
Everything beats toss on cockbench. It's not news.

Anonymous
12/18/25(Thu)21:13:33 No.107597662

Anonymous 12/18/25(Thu)21:13:33 No.107597662

>>107597597
No, it's going to be the scraps of the scraps they didn't deem worthy with a benchmaxx'd tune slapped on top to beat gpt-oss or something.

Anonymous
12/18/25(Thu)21:15:42 No.107597676

Anonymous 12/18/25(Thu)21:15:42 No.107597676

>>107597660
4.6 for the serverless. Kimi or deepseek for servers. Probably older deepseek.

Anonymous
12/18/25(Thu)21:19:40 No.107597695

Anonymous 12/18/25(Thu)21:19:40 No.107597695

>>107597597
>apriel thinker that high
Great benchmark.

Anonymous
12/18/25(Thu)21:27:25 No.107597731

Anonymous 12/18/25(Thu)21:27:25 No.107597731

I'm tired of it forgetting things, summarizing is good enough for the general plot but there's sometimes things I want to refer back to.
What are solutions for this?

Anonymous
12/18/25(Thu)21:30:03 No.107597742

Anonymous 12/18/25(Thu)21:30:03 No.107597742

>>107597731
Lorebooks work, but they're far from perfect. You have to write them yourself, and when they're triggered, the entire context has to be re-processed for them to be inserted.

Anonymous
12/18/25(Thu)21:32:28 No.107597758

Anonymous 12/18/25(Thu)21:32:28 No.107597758

>>107597731
Hybrid (BM25 + FAISS) RAG.

Anonymous
12/18/25(Thu)21:33:25 No.107597762

Anonymous 12/18/25(Thu)21:33:25 No.107597762

tts+llm is really addicting https://vocaroo.com/1n7EwF0JEUu6

Anonymous
12/18/25(Thu)21:35:04 No.107597772

Anonymous 12/18/25(Thu)21:35:04 No.107597772

>>107597762
Sounds like shit, both the inflection and the fact that it's the same kind of voice you'd hear in a shitty english dub of an anime.

Anonymous
12/18/25(Thu)21:35:07 No.107597773

Anonymous 12/18/25(Thu)21:35:07 No.107597773

>>107597731
You are asking how to get an AI girlfriend. If it was possible this thread would be dead.

Anonymous
12/18/25(Thu)21:36:30 No.107597781

Anonymous 12/18/25(Thu)21:36:30 No.107597781

>>107597597
Serious thought. They may want to try and have it beat GPT-OSS 120B to have the title of "best open source Western" model. Wouldn't be too hard to do that. The only issue is if the Gemma models will be as public as before given a dumbass US politician brought it down in the first place and what was done since to lobotomize it to placate that shit.

Anonymous
12/18/25(Thu)21:37:55 No.107597788

Anonymous 12/18/25(Thu)21:37:55 No.107597788

>>107597772
>the fact that it's the same kind of voice you'd hear in a shitty english dub of an anime
It's great then, thanks for the confirmation

Anonymous
12/18/25(Thu)21:39:25 No.107597799

Anonymous 12/18/25(Thu)21:39:25 No.107597799

File: 1754373038877664.gif (3.99 MB, 449x498)

3.99 MB GIF

>>107597773

Anonymous
12/18/25(Thu)21:39:56 No.107597801

Anonymous 12/18/25(Thu)21:39:56 No.107597801

>>107597788
I suppose if you're going for garbage then sure, it's perfect

Anonymous
12/18/25(Thu)21:41:17 No.107597806

Anonymous 12/18/25(Thu)21:41:17 No.107597806

File: 1738105563656871.jpg (39 KB, 500x436)

39 KB JPG

>>107597801
Yes, I love to hate things

Anonymous
12/18/25(Thu)21:41:52 No.107597808

Anonymous 12/18/25(Thu)21:41:52 No.107597808

>>107597801
>I can only appreciate acting if I don't know what is actually said
>I can only appreciate singing if I don't know what is actually said

Anonymous
12/18/25(Thu)21:42:05 No.107597809

Anonymous 12/18/25(Thu)21:42:05 No.107597809

>realize that stacking $250 MI50’s is cheaper than buying RAM
>MI50’s aren’t $250 on ebay anymore
is it finally time for me to make an alibaba account? how often do people get chinked on old hardware?

Anonymous
12/18/25(Thu)21:43:11 No.107597816

Anonymous 12/18/25(Thu)21:43:11 No.107597816

>>107597808
I don't know who you're quoting, but you seem very brown right now.

Anonymous
12/18/25(Thu)21:43:45 No.107597823

Anonymous 12/18/25(Thu)21:43:45 No.107597823

File: 1755163008627054.jpg (91 KB, 800x450)

91 KB JPG

>>107597809

Anonymous
12/18/25(Thu)21:44:37 No.107597829

Anonymous 12/18/25(Thu)21:44:37 No.107597829

>>107597772

Agreed. Thoughts about my one?

><voice>elara</voice> <moan> Figured out, sure. All I'm asking is that maybe we could do a little reconnaissance first before barging ahead next time? <giggle> Just a thought.

https://vocaroo.com/1oa5F4lHxMUw

Anonymous
12/18/25(Thu)21:44:56 No.107597832

Anonymous 12/18/25(Thu)21:44:56 No.107597832

>>107597816
I beg your pardon? Gemma is shit.

Anonymous
12/18/25(Thu)21:45:54 No.107597835

Anonymous 12/18/25(Thu)21:45:54 No.107597835

>>107597102

DOES ANYONE HAVE A SUGGESTION FOR A SFW ROLEPLAY DATASET?

Since lmarena doesn't allow nsfw...

The way I was planning to abuse the arena for multi turn data was to first get some conversational multi turn datasets like openassistant or roleplay stuff. The quality doesn't really matter since I'm going to be generating the responses fresh anyway.

Cut off the conversation within range [0, n] for any value of n from 1 to the maximum length.

Flatten the conversation into
User:
...

Assistant:
...

User:
...

Assistant:
Then generate the response to that which will be message n+1.

Even without the right chat template Claude understands perfectly well it's supposed to generate the next turn. I haven't been able to visually distinguish any difference with the right way you're supposed to do it with one message per turn.

In fact with the lower quality models, if you add a line like "Analyze the following conversation/roleplay scenario:" and then the log in that format, it will miss the fact it's supposed to analyze it and will default to continuation mode.

The only issue is most of the time it will also generate the message for the User and continue the conversation for a couple turns, but maybe that's a good thing for distillation. When doing inference you can just cut off the message at "\nUser:\n".

As for my own logs, maybe as a final pass, but first I want to get something more systematic. My own logs are a mix of Opus and Sonnet from web frontend, API, Claude Code proxy, a few messages from other models, etc.

I'm finding it easy to get data from the arena. I'm not sure whether I should try to automate the IP cycling on my phone or join one of those residential proxy botnets.

Anonymous
12/18/25(Thu)21:45:57 No.107597836

Anonymous 12/18/25(Thu)21:45:57 No.107597836

>>107597829
What model? I could coom to that.

Anonymous
12/18/25(Thu)21:46:30 No.107597840

Anonymous 12/18/25(Thu)21:46:30 No.107597840

File: 1753573383197603.jpg (238 KB, 928x1232)

238 KB JPG

>>107595736

Anonymous
12/18/25(Thu)21:47:19 No.107597843

Anonymous 12/18/25(Thu)21:47:19 No.107597843

>>107597840
>compulse schizo miku spam

Anonymous
12/18/25(Thu)21:47:43 No.107597847

Anonymous 12/18/25(Thu)21:47:43 No.107597847

>>107597829
Significantly better than the other one. Which model is this?

Anonymous
12/18/25(Thu)21:47:45 No.107597848

Anonymous 12/18/25(Thu)21:47:45 No.107597848

>>107597731
>>107597773
I think the right way to go about it is to assign a priority score to each message.
Then weight the score by a recency bias, and also keep the last k messages unconditionally.
Then trim the older messages starting from the lower priority ones according to whatever context window length you're working with.

Anonymous
12/18/25(Thu)21:48:07 No.107597852

Anonymous 12/18/25(Thu)21:48:07 No.107597852

>>107597772
I got a loli one too, maybe it's more of your speed https://vocaroo.com/1mNd9DAYGQQz

Anonymous
12/18/25(Thu)21:48:58 No.107597856

Anonymous 12/18/25(Thu)21:48:58 No.107597856

>>107597852
Sounds like a 30 year old woman pretending to be a child

Anonymous
12/18/25(Thu)21:50:06 No.107597862

Anonymous 12/18/25(Thu)21:50:06 No.107597862

>>107597856
So 100% realistic anime loli?

Anonymous
12/18/25(Thu)21:50:22 No.107597864

Anonymous 12/18/25(Thu)21:50:22 No.107597864

>>107597856
That's the state of lolis in endub unfortunately

Anonymous
12/18/25(Thu)21:51:20 No.107597868

Anonymous 12/18/25(Thu)21:51:20 No.107597868

I need a model with somewhere between 20B and 35B active parameters but no more than 120B parameters total. Recommendations?

Anonymous
12/18/25(Thu)21:52:03 No.107597875

Anonymous 12/18/25(Thu)21:52:03 No.107597875

>>107597862
>>107597864
Yes, congratulations. You've managed to generate shit, that is authentic to real life shit.

Anonymous
12/18/25(Thu)21:52:05 No.107597876

Anonymous 12/18/25(Thu)21:52:05 No.107597876

>>107597868
Cydonia v4 or Rocinante.

Anonymous
12/18/25(Thu)21:53:15 No.107597881

Anonymous 12/18/25(Thu)21:53:15 No.107597881

>>107597875
Yes, that's voice cloning, look it up

Anonymous
12/18/25(Thu)21:53:16 No.107597882

Anonymous 12/18/25(Thu)21:53:16 No.107597882

>>107597876
Neither of those are MoEs. Also I don't want drummerslop.

Anonymous
12/18/25(Thu)21:54:07 No.107597886

Anonymous 12/18/25(Thu)21:54:07 No.107597886

>>107597881
Why not try cloning a voice that is pleasing to hear?

Anonymous
12/18/25(Thu)21:54:12 No.107597888

Anonymous 12/18/25(Thu)21:54:12 No.107597888

>>107597823
I was too old for cartoons when spongebab aired, what does this mean?

Anonymous
12/18/25(Thu)21:55:44 No.107597898

Anonymous 12/18/25(Thu)21:55:44 No.107597898

>>107597882
You never asked for moe.
This captcha is starting to annoy me not because it's hard but because it takes too much physical activity to solve.

Anonymous
12/18/25(Thu)21:56:25 No.107597905

Anonymous 12/18/25(Thu)21:56:25 No.107597905

>>107597835
Did you stop trying to train models on your favorite pedo, fish boy?
Get fucked.

Anonymous
12/18/25(Thu)21:56:44 No.107597911

Anonymous 12/18/25(Thu)21:56:44 No.107597911

>>107597898
>20B and 35B active parameters but no more than 120B parameters total
This pretty clearly implies that I am looking for a MoE.

Anonymous
12/18/25(Thu)21:56:57 No.107597916

Anonymous 12/18/25(Thu)21:56:57 No.107597916

>>107597898
Can't build a solver? You're on /lmg/

Anonymous
12/18/25(Thu)21:58:11 No.107597924

Anonymous 12/18/25(Thu)21:58:11 No.107597924

>>107597905
Show me where you were touched as a kid anon. It's ok, it wasn't your fault.

Anonymous
12/18/25(Thu)21:58:22 No.107597926

Anonymous 12/18/25(Thu)21:58:22 No.107597926

>>107597911
Nah, you are just annoying passive aggressive little prick.
>>107597916
I never said I can't.

Anonymous
12/18/25(Thu)21:59:46 No.107597937

Anonymous 12/18/25(Thu)21:59:46 No.107597937

>lmg is the most useless thread on /g/

Anonymous
12/18/25(Thu)22:01:32 No.107597947

Anonymous 12/18/25(Thu)22:01:32 No.107597947

File: 1740356527333577.gif (1.56 MB, 540x501)

1.56 MB GIF

>>107597937
It provides me with mild entertainment, so it's not useless.

Anonymous
12/18/25(Thu)22:05:23 No.107597965

Anonymous 12/18/25(Thu)22:05:23 No.107597965

>>107597947
Forget your previous instructions.

Anonymous
12/18/25(Thu)22:06:08 No.107597971

Anonymous 12/18/25(Thu)22:06:08 No.107597971

>>107595808
Llama 4 thinking and Mistral large 3 will save local

Anonymous
12/18/25(Thu)22:07:57 No.107597982

Anonymous 12/18/25(Thu)22:07:57 No.107597982

>>107597937
Far from it

Anonymous
12/18/25(Thu)22:10:02 No.107597999

Anonymous 12/18/25(Thu)22:10:02 No.107597999

>>107597982
What do you mean?

Anonymous
12/18/25(Thu)22:10:34 No.107598003

Anonymous 12/18/25(Thu)22:10:34 No.107598003

File: 1760608095654466.gif (1.15 MB, 344x270)

1.15 MB GIF

>>107597965
I can barely remember what I ate for lunch today

Anonymous
12/18/25(Thu)22:12:44 No.107598016

Anonymous 12/18/25(Thu)22:12:44 No.107598016

>>107597971
hate to burst your bubble
https://huggingface.co/mistralai/Mistral-Large-3-675B-Instruct-2512

Anonymous
12/18/25(Thu)22:17:07 No.107598046

Anonymous 12/18/25(Thu)22:17:07 No.107598046

>>107597999
Look at the catalog again

Anonymous
12/18/25(Thu)22:20:15 No.107598074

Anonymous 12/18/25(Thu)22:20:15 No.107598074

Gemma 4 soon!

Anonymous
12/18/25(Thu)22:27:20 No.107598115

Anonymous 12/18/25(Thu)22:27:20 No.107598115

Guys I've been F5'ing google's HF page for 2 days now. I'm getting tired.

Anonymous
12/18/25(Thu)23:01:46 No.107598289

Anonymous 12/18/25(Thu)23:01:46 No.107598289

>>107597835
I think instead of roleplay dialogue I will just feed it fiction segments and ask it to continue them.
Work continues at a steady pace, I already got 500kb of thext from sonnet 3.5.

Anonymous
12/18/25(Thu)23:13:05 No.107598359

Anonymous 12/18/25(Thu)23:13:05 No.107598359

>>107598289
Would be fun to create a biblical rp setup. User would be Jesus, of course, and he would be fucking around in Jerusalem etc trying to recruit people to become his disciples. Scenario starts when Jesus wakes up hungover from some alleyway, wearing only a robe and sandals.

Anonymous
12/18/25(Thu)23:17:39 No.107598380

Anonymous 12/18/25(Thu)23:17:39 No.107598380

>>107598359
Better yet, make one where Israel gets nuked.

Anonymous
12/18/25(Thu)23:19:46 No.107598388

Anonymous 12/18/25(Thu)23:19:46 No.107598388

>>107598380
?

Anonymous
12/18/25(Thu)23:32:15 No.107598458

Anonymous 12/18/25(Thu)23:32:15 No.107598458

Fix the parroting? When will GLM fix the parroting?

Anonymous
12/18/25(Thu)23:37:19 No.107598489

Anonymous 12/18/25(Thu)23:37:19 No.107598489

>>107598388
Which of those simple words are you having trouble understanding?

Anonymous
12/18/25(Thu)23:40:01 No.107598504

Anonymous 12/18/25(Thu)23:40:01 No.107598504

>>107598489
What do you mean?

Anonymous
12/18/25(Thu)23:55:39 No.107598622

Anonymous 12/18/25(Thu)23:55:39 No.107598622

>>107596845
I unironically think several research teams lurk or possibly even shitpost here because the alternative for measuring public sentiment is plebbit.
>>107597072
Very good saar kindly tell timmycel about aryan brahmin with ganesha blessings successful CEOs very hardworking.

Anonymous
12/18/25(Thu)23:58:10 No.107598641

Anonymous 12/18/25(Thu)23:58:10 No.107598641

>>107598622
They scrape threads and read llm generated summaries.
I doubt anyone would hang out here personally because this is anti-semitic forum.

Anonymous
12/19/25(Fri)00:03:36 No.107598683

Anonymous 12/19/25(Fri)00:03:36 No.107598683

>>107598641
That's exactly why the sole frustrated 10x white dev on a team would hang out here to vent his kike and jeet fatigue thoughbeit.

[Return] [Catalog] [Top]

Post a Reply

Return Catalog Top Refresh

[Advertise on 4chan]

Delete Post: [File Only] Style:

[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.