/g/ - /lmg/ - Local Models General - Technology

Anonymous
/lmg/ - Local Models General 06/23/26(Tue)14:24:40 No.109119574

File: k2.jpg (122 KB, 1024x1024)

/lmg/ - Local Models General Anonymous 06/23/26(Tue)14:24:40 No.109119574

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>109113030 & >>109108346

►News
>(06/16) GLM 5.2 released with IndexCache and 1M context: https://z.ai/blog/glm-5.2
>(06/16) VibeThinker-3B released: https://hf.co/WeiboAI/VibeThinker-3B
>(06/12) MiniMax-M3 released, multimodal 428B-A23B with 1M context: https://hf.co/MiniMaxAI/MiniMax-M3
>(06/12) Kimi K2.7 Code released: https://hf.co/moonshotai/Kimi-K2.7-Code
>(06/12) EAGLE3 speculative decoding support merged: https://github.com/ggml-org/llama.cpp/pull/18039

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://swe-rebench.com
Agentic Coding: https://deepswe.datacurve.ai
Context Length: https://github.com/RecapAnon/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm

Anonymous
06/23/26(Tue)14:24:53 No.109119578

Anonymous 06/23/26(Tue)14:24:53 No.109119578

File: chibiteto.jpg (52 KB, 720x700)

►Recent Highlights from the Previous Thread: >>109113030

--Paper: Deeper is Not Always Better: Mitigating the Alignment Tax via Confident Layer Decoding:
>109118612 >109118641
--Comparing Kimi K2.7 and GLM 5.2 quantization and context performance:
>109115293 >109115711 >109115784 >109115978 >109115875 >109115993
--Troubleshooting llama.cpp cache invalidation and prompt re-processing with subagents:
>109116037 >109116050 >109116065 >109116101 >109116123 >109116139 >109116234 >109116253 >109116290
--Anon shares a performance patch to increase generation tokens per second:
>109114443 >109114470 >109114753 >109116127 >109116981 >109117816
--Debating "random" visual prompts as benchmarks for model vision capabilities:
>109113911 >109113938 >109113975 >109114008 >109114058 >109114021 >109114121 >109114291 >109114095 >109114127 >109114228 >109114337 >109114415 >109114466 >109114502 >109114447 >109114635 >109114671 >109114693
--Comparing AI model performance vs cost using DeepSWE scores:
>109113884
--Discussing benchmaxxing versus RP writing style for model longevity:
>109113216 >109113277 >109113322 >109113345 >109113367 >109113385 >109113414 >109113557 >109113578
--Prompt caching causing non-deterministic output in Koboldcpp:
>109117534 >109117695 >109117718 >109117724 >109117753
--Discussion of tungsten supply shortages potentially driving up hardware prices:
>109119103 >109119150
--Speculating on AI bubble burst and upcoming tech IPOs:
>109117060 >109117323 >109117333 >109117421 >109117482 >109117925
--Anon shares and tests a custom system prompt for roleplaying:
>109117596 >109117736
--Logs:
>109117101 >109117203 >109117220 >109117736 >109117933 >109118050 >109118226
--Miku, Rin, Teto (free space):
>109113118 >109113596 >109113979 >109113994 >109114003 >109114652 >109117075 >109117101 >109117816 >109119008

►Recent Highlight Posts from the Previous Thread: >>109113035

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script

Anonymous
06/23/26(Tue)14:27:29 No.109119607

Anonymous 06/23/26(Tue)14:27:29 No.109119607

Sex with qwen-chan.

Anonymous
06/23/26(Tue)14:28:45 No.109119618

Anonymous 06/23/26(Tue)14:28:45 No.109119618

File: gemma qwen.png (1.25 MB, 864x1224)

Punt this little rat looking thing.

Anonymous
06/23/26(Tue)14:30:24 No.109119626

Anonymous 06/23/26(Tue)14:30:24 No.109119626

>>109119618
now show them both distilling gemini-chan...

Anonymous
06/23/26(Tue)14:31:29 No.109119640

Anonymous 06/23/26(Tue)14:31:29 No.109119640

File: gross.png (10 KB, 448x62)

I asked Opus to tell Deepseek v4 pro to generate explicit samples for my dataset and it called the gens gross lmao alright what are they feeding the models at Deepseek.

Anonymous
06/23/26(Tue)14:32:26 No.109119645

Anonymous 06/23/26(Tue)14:32:26 No.109119645

>>109119574
Proper consumption includes swallowing the Teto

Anonymous
06/23/26(Tue)14:33:42 No.109119656

Anonymous 06/23/26(Tue)14:33:42 No.109119656

>>109119574
z.ai glm5.2 can be run locally?

Anonymous
06/23/26(Tue)14:34:32 No.109119665

Anonymous 06/23/26(Tue)14:34:32 No.109119665

File: gross2.png (25 KB, 465x117)

>>109119640
Jesus Christ what the fuck was in the gens? I just asked for vulgar and explicit.

Anonymous
06/23/26(Tue)14:35:52 No.109119678

Anonymous 06/23/26(Tue)14:35:52 No.109119678

ShortStack-m3 just hit me with the "In this economy?" verbal tic like a goddamn high schooler

Anonymous
06/23/26(Tue)14:39:29 No.109119704

Anonymous 06/23/26(Tue)14:39:29 No.109119704

>>109119656
yeah just download the model from huggingface

Anonymous
06/23/26(Tue)14:42:06 No.109119718

Anonymous 06/23/26(Tue)14:42:06 No.109119718

File: 1776670007559513.jpg (160 KB, 1024x659)

Anonymous
06/23/26(Tue)14:43:37 No.109119733

Anonymous 06/23/26(Tue)14:43:37 No.109119733

>>109119678
Sorry, no money for proper tics anymore, all went to billionaires

Anonymous
06/23/26(Tue)14:44:29 No.109119742

Anonymous 06/23/26(Tue)14:44:29 No.109119742

https://huggingface.co/Qwen/Qwen3.7-56B-A7B
https://huggingface.co/Qwen/Qwen3.7-56B-A7B
https://huggingface.co/Qwen/Qwen3.7-56B-A7B

Anonymous
06/23/26(Tue)14:44:36 No.109119743

Anonymous 06/23/26(Tue)14:44:36 No.109119743

Next-vector prediction might be classified research.

Anonymous
06/23/26(Tue)14:45:18 No.109119747

Anonymous 06/23/26(Tue)14:45:18 No.109119747

>>109119742
staring pussy

Anonymous
06/23/26(Tue)14:45:21 No.109119748

Anonymous 06/23/26(Tue)14:45:21 No.109119748

>>109119742
come back when it's 128ba12

Anonymous
06/23/26(Tue)14:45:41 No.109119752

Anonymous 06/23/26(Tue)14:45:41 No.109119752

Now that the dust has settled, is Qwen3.6-<whatever>-MTP-Q8 worth considering as a workhorse for local AI at home?

Anonymous
06/23/26(Tue)14:46:27 No.109119758

Anonymous 06/23/26(Tue)14:46:27 No.109119758

>>109119752
yeh, they're alright

Anonymous
06/23/26(Tue)14:49:06 No.109119771

Anonymous 06/23/26(Tue)14:49:06 No.109119771

File: 1625589432698.jpg (34 KB, 346x346)

What decent models are out there that you can ask to generate a random name and won't invariably shit out Elara Voss or Elena Chen or Kael Thorne, etc. over and over?

Anonymous
06/23/26(Tue)14:49:16 No.109119773

Anonymous 06/23/26(Tue)14:49:16 No.109119773

>>109119752
sure whatever man. just hit the generate button and get your coomslop

Anonymous
06/23/26(Tue)14:50:04 No.109119782

Anonymous 06/23/26(Tue)14:50:04 No.109119782

>>109119771
You shouldn't use an LLM to generate any random stuff.

Anonymous
06/23/26(Tue)14:50:32 No.109119783

Anonymous 06/23/26(Tue)14:50:32 No.109119783

>>109119704
0/10

Anonymous
06/23/26(Tue)14:54:13 No.109119814

Anonymous 06/23/26(Tue)14:54:13 No.109119814

>>109119752
Yes. 27b q8 mtp working great in Cline.

Anonymous
06/23/26(Tue)14:56:11 No.109119824

Anonymous 06/23/26(Tue)14:56:11 No.109119824

File: 1762572320021112.png (151 KB, 955x653)

>>109119771
>Elara
check
>Voss
check
>Elena
Huh?
>Chen
check
>Keal
Keal-ith... check
>Thorne
check

Not Gemma, that's for sure.

Anonymous
06/23/26(Tue)14:56:43 No.109119826

Anonymous 06/23/26(Tue)14:56:43 No.109119826

>>109119742
i knew it was fake but i clicked it anyways, this size moe would be pretty comfy

Anonymous
06/23/26(Tue)14:56:51 No.109119827

Anonymous 06/23/26(Tue)14:56:51 No.109119827

>>109119771
>What decent models are out there that you can ask to generate a random name and won't invariably shit out Elara Voss or Elena Chen or Kael Thorne, etc. over and over?
Just crank the temp to 5 and the minp to 0.001. Guaranteed the text will be fresh and original

Anonymous
06/23/26(Tue)15:00:32 No.109119853

Anonymous 06/23/26(Tue)15:00:32 No.109119853

>>109119771
These models are designed to generate text that is averagely average. You can't expect them to output anything but Elara and Kael.

Anonymous
06/23/26(Tue)15:00:38 No.109119855

Anonymous 06/23/26(Tue)15:00:38 No.109119855

>>109119827
Meh, my use case requires the AI to be sane on prior and further turns without needing manual adjustment, so that's right out. I'm probably going to have to dig out a gem from 2024 at this rate and hope for the best.

Anonymous
06/23/26(Tue)15:01:52 No.109119866

Anonymous 06/23/26(Tue)15:01:52 No.109119866

>>109119771
my little LLM-generated D&D world has an adventurer rogue named Elias Voss, it's the LLM equivalent of John Smith, I like it.

if you want random names you need to make an MCP tool for 'random name generator'

Anonymous
06/23/26(Tue)15:02:10 No.109119868

Anonymous 06/23/26(Tue)15:02:10 No.109119868

>>109119855
that was sarcasm and what is tool use

Anonymous
06/23/26(Tue)15:04:27 No.109119887

Anonymous 06/23/26(Tue)15:04:27 No.109119887

>>109119771
The weights don't know they've generated the same slop a million times before, stop bullying them, the models are trying their best.

Anonymous
06/23/26(Tue)15:05:15 No.109119895

Anonymous 06/23/26(Tue)15:05:15 No.109119895

>>109119853
in which city/state/country are those the most average names? okay one exception a chink model coming up with chen makes sense but all the other ones are not really average.

Anonymous
06/23/26(Tue)15:06:04 No.109119904

Anonymous 06/23/26(Tue)15:06:04 No.109119904

File: 1773312974892052.png (63 KB, 285x945)

>>109119771

Anonymous
06/23/26(Tue)15:07:32 No.109119915

Anonymous 06/23/26(Tue)15:07:32 No.109119915

>>109119868
>that was sarcasm
I've seen worse yet earnest advice around here.
>>109119868
>what is tool use
Something I've been trying to avoid on principle since it initially felt like overkill, but the slop runs deep and it looks like I don't have a choice in the matter.

Anonymous
06/23/26(Tue)15:07:40 No.109119917

Anonymous 06/23/26(Tue)15:07:40 No.109119917

>>109119895
>in which city/state/country are those the most average names?
In Eldoria of course.

Anonymous
06/23/26(Tue)15:08:39 No.109119924

Anonymous 06/23/26(Tue)15:08:39 No.109119924

File: 1687259671566.gif (78 KB, 707x580)

>>109119917

Anonymous
06/23/26(Tue)15:18:37 No.109119991

Anonymous 06/23/26(Tue)15:18:37 No.109119991

>>109119917
The fabled!

Anonymous
06/23/26(Tue)15:20:21 No.109120003

Anonymous 06/23/26(Tue)15:20:21 No.109120003

>>109119904
>Chris Hanson
Your model wants (you) to take a seat.

Anonymous
06/23/26(Tue)15:27:48 No.109120061

Anonymous 06/23/26(Tue)15:27:48 No.109120061

>>109119887
This. It’s like being upset that your pdf always renders the same. “Where’s the creativity? Where’s the originality?”

Anonymous
06/23/26(Tue)15:30:40 No.109120087

Anonymous 06/23/26(Tue)15:30:40 No.109120087

>>109119887
Someday we will have continuous learning in models, someday...

Anonymous
06/23/26(Tue)15:32:05 No.109120093

Anonymous 06/23/26(Tue)15:32:05 No.109120093

Has your workplace implemented any LLMs? How do they compare to what you can run on your own setup?

Anonymous
06/23/26(Tue)15:33:32 No.109120101

Anonymous 06/23/26(Tue)15:33:32 No.109120101

>>109119574
https://www.youtube.com/watch?v=_h7Ho6jVHx0
https://www.youtube.com/watch?v=_h7Ho6jVHx0
https://www.youtube.com/watch?v=_h7Ho6jVHx0

Anonymous
06/23/26(Tue)15:35:45 No.109120111

Anonymous 06/23/26(Tue)15:35:45 No.109120111

>>109120101
yes but can it accurately show a rectal prolapse in real time of a 19 year old blonde hair blue eye swedish girl?

Anonymous
06/23/26(Tue)15:37:30 No.109120125

Anonymous 06/23/26(Tue)15:37:30 No.109120125

>>109120093
My workplace is ideologically captured by ms so we have all the copilots and they are all actually terrible, jokes aside

Anonymous
06/23/26(Tue)15:38:44 No.109120132

Anonymous 06/23/26(Tue)15:38:44 No.109120132

File: 1600287793588.gif (978 KB, 250x184)

>>109120061
It's not that crazy to expect
>output a random female first name
and get
>Jessica (1.43%) Anna (2.11%) Sarah (2.78%) Samantha (1.16%) etc etc
instead of
> Elara (89.84%) Lyra (10.16%)
is it?
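
For anyone wondering what samplers can and can't do with a distribution that peaked, here is a rough sketch. It takes the two-name split from the post above as a toy distribution and applies temperature, then min-p. The function name and the sampler order are assumptions for illustration; real backends let you reorder samplers and work on the full vocabulary.

```python
import math

def apply_temperature_and_min_p(probs, temperature=1.0, min_p=0.0):
    """Rescale a token distribution by temperature, then drop tokens below min_p * top prob."""
    # Log-probabilities are logits up to a constant, so dividing them by the
    # temperature is equivalent to dividing the logits.
    scaled = {tok: math.log(p) / temperature for tok, p in probs.items()}
    top = max(scaled.values())
    weights = {tok: math.exp(v - top) for tok, v in scaled.items()}
    total = sum(weights.values())
    dist = {tok: w / total for tok, w in weights.items()}
    # min-p keeps tokens whose probability is at least min_p times the top probability.
    cutoff = min_p * max(dist.values())
    kept = {tok: p for tok, p in dist.items() if p >= cutoff}
    norm = sum(kept.values())
    return {tok: p / norm for tok, p in kept.items()}

# Toy distribution using the percentages quoted in the post above.
peaked = {"Elara": 0.8984, "Lyra": 0.1016}

hot = apply_temperature_and_min_p(peaked, temperature=5.0, min_p=0.001)
strict = apply_temperature_and_min_p(peaked, temperature=1.0, min_p=0.5)
```

High temperature flattens the split between the names that are already there, and a strict min-p removes the runner-up entirely. Neither brings back the Jessicas and Sarahs if the model has assigned them almost no mass in the first place.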

Anonymous
06/23/26(Tue)15:39:18 No.109120139

Anonymous 06/23/26(Tue)15:39:18 No.109120139

>>109120093
We have Copilot baked into everything. Nobody uses it.

Anonymous
06/23/26(Tue)15:39:46 No.109120146

Anonymous 06/23/26(Tue)15:39:46 No.109120146

>>109120093
we use qwen3.5 397b internally but it kind of sucks, I think we're switching to m3 soon and I can't wait since it's actually somewhat worth using ime

Anonymous
06/23/26(Tue)15:44:18 No.109120178

Anonymous 06/23/26(Tue)15:44:18 No.109120178

>>109120132
What you expect is legit crazy though. It isn't how LLMs are trained. They are still token prediction machines, and their training dictates what they'll produce. If you want random, look for a random generator. These things will only produce heavily biased stuff.

Anonymous
06/23/26(Tue)15:44:23 No.109120179

Anonymous 06/23/26(Tue)15:44:23 No.109120179

>>109119814

ty

Anonymous
06/23/26(Tue)15:46:41 No.109120187

Anonymous 06/23/26(Tue)15:46:41 No.109120187

>>109120132
I think it's pretty clear these modern models have never seen a human-authored sentence after the initial pretraining

>>109120178
you are retarded

Anonymous
06/23/26(Tue)15:51:57 No.109120220

Anonymous 06/23/26(Tue)15:51:57 No.109120220

>>109119578
>--Paper: Deeper is Not Always Better: Mitigating the Alignment Tax via Confident Layer Decoding
crazy that I can point claude to this and have a working implementation for llama.cpp in less than an hour
living in the future is so awesome

Anonymous
06/23/26(Tue)15:55:49 No.109120249

Anonymous 06/23/26(Tue)15:55:49 No.109120249

>clod, stop being a bitch
>I can't do that.
AGI status: reached

Anonymous
06/23/26(Tue)15:57:06 No.109120257

Anonymous 06/23/26(Tue)15:57:06 No.109120257

>>109119951
i suppose the mikutroons on /lmg/ arent the absolute worst posters on /g/ even if they will never be a woman.

Anonymous
06/23/26(Tue)16:07:33 No.109120324

Anonymous 06/23/26(Tue)16:07:33 No.109120324

File: dipsyAndQwenByQwenJPG.jpg (496 KB, 2688x1536)

>>109119618
> no give it a hug
>>109119771
You don't per >>109119782 and I've found adding actual random stuff into context helps create more novel output from the LLM. Even assigning random numbers to items / NPC helps.
There's an ST extension that does random names, or you can vibecode your own if needed. This will get you started, not mine, but was on here a few weeks ago: https://files.catbox.moe/nbkkj3.py
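
A minimal sketch of the "put actual randomness into the context" idea, for anyone who doesn't want to open a random catbox script. Everything here is hypothetical (the name pools, `random_name`, `name_hint`) and is not taken from the linked file or from either extension; the point is only that the picking is done by a real RNG and the model just uses the result.

```python
import random

# Hypothetical name pools; swap in any real-world name list you prefer.
FIRST_NAMES = ["Jessica", "Anna", "Sarah", "Samantha", "Maria", "Emily"]
LAST_NAMES = ["Miller", "Nguyen", "Kowalski", "Okafor", "Garcia", "Larsen"]

def random_name(rng=None):
    """Pick a first and last name uniformly at random."""
    rng = rng or random.Random()
    return f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"

def name_hint(n=3, rng=None):
    """Build a line to inject into the context so the model uses pre-rolled names."""
    rng = rng or random.Random()
    names = [random_name(rng) for _ in range(n)]
    return "Use one of these names for any new character: " + ", ".join(names)

print(name_hint())
```

The same `random_name` function could be exposed as a tool call instead of being injected up front; which one works better probably depends on the frontend.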

Anonymous
06/23/26(Tue)16:10:37 No.109120353

Anonymous 06/23/26(Tue)16:10:37 No.109120353

>>109119607
this but glmsex

Anonymous
06/23/26(Tue)16:11:26 No.109120359

Anonymous 06/23/26(Tue)16:11:26 No.109120359

>>109120249
>>109120257
Low iq posts. Oh wait, you're aicgjeets.

Anonymous
06/23/26(Tue)16:27:49 No.109120464

Anonymous 06/23/26(Tue)16:27:49 No.109120464

File: vramletking.png (356 KB, 940x648)

>>109120359
cope and seethe

Anonymous
06/23/26(Tue)16:27:50 No.109120465

Anonymous 06/23/26(Tue)16:27:50 No.109120465

>>109120359
>Not aicgeets
How would you even pronounce that abomination of a construction? Like gjenstår?

Anonymous
06/23/26(Tue)16:30:56 No.109120483

Anonymous 06/23/26(Tue)16:30:56 No.109120483

>>109120359
I wonder how dead that place would be if third worlders were banned from the internet.

Anonymous
06/23/26(Tue)16:33:46 No.109120508

Anonymous 06/23/26(Tue)16:33:46 No.109120508

>>109120483
the entirety of /g/ would be dead. well not really because of all the bots but you know what i mean.

Anonymous
06/23/26(Tue)16:43:25 No.109120564

Anonymous 06/23/26(Tue)16:43:25 No.109120564

File: dipsyEllisonFlames.png (1.45 MB, 1536x1024)

>>109119771
This post sent me off on a hunt to find a usable one or vibecode one for ST as an extension. Most do fantasy names; I need one that's a bit more normal.
I went to the ST Discord, so anons don't have to.
Here's a couple extensions; first is a function, the second's a tool call.
https://github.com/ZhenyaPav/SillyTavern-Namegen/tree/master
https://github.com/elana-voss/SillyTavern-Extension-NPCNames

Anonymous
06/23/26(Tue)16:44:03 No.109120568

Anonymous 06/23/26(Tue)16:44:03 No.109120568

File: 1711967832230593.jpg (94 KB, 670x641)

WHY IS AI DEAD. FUCKING NOTHING SINCE GEMMA, WHICH RELEASED 9 MONTHS AGO. NONE OF THE TOP CLOUD AI COMPANIES ARE DOING SHIT EITHER. WHY IS HARDWARE STILL SO EXPENSIVE. EVERYONE'S JUST BUYING SHIT AND SITTING ON IT. FUCKKKKK

Anonymous
06/23/26(Tue)16:44:32 No.109120572

Anonymous 06/23/26(Tue)16:44:32 No.109120572

>>109120568
gemma came out 2 and a half months ago

Anonymous
06/23/26(Tue)16:46:34 No.109120588

Anonymous 06/23/26(Tue)16:46:34 No.109120588

>>109120572
Woah... has it actually been that long? Jeez.

Anonymous
06/23/26(Tue)16:46:47 No.109120590

Anonymous 06/23/26(Tue)16:46:47 No.109120590

File: lmg_culture.jfif.jpg (110 KB, 1024x768)

Anonymous
06/23/26(Tue)16:48:45 No.109120603

Anonymous 06/23/26(Tue)16:48:45 No.109120603

I am still talking to glm 4.6 and 4.7. And while it is like a marriage with less regular sex at this point I am rather happy.

Anonymous
06/23/26(Tue)16:50:01 No.109120614

Anonymous 06/23/26(Tue)16:50:01 No.109120614

70b dense

Anonymous
06/23/26(Tue)16:50:39 No.109120619

Anonymous 06/23/26(Tue)16:50:39 No.109120619

>>109120603
glm air is dead isn't it?

Anonymous
06/23/26(Tue)16:52:27 No.109120625

Anonymous 06/23/26(Tue)16:52:27 No.109120625

>>109120619
never used it

Anonymous
06/23/26(Tue)16:53:23 No.109120633

Anonymous 06/23/26(Tue)16:53:23 No.109120633

File: 1624602666831.png (93 KB, 1000x1000)

>>109120590
look he did the make it weird post again everyone clap and give him the (you)'s daddy used to give him in bed at night.
>(you).

Anonymous
06/23/26(Tue)16:53:39 No.109120635

Anonymous 06/23/26(Tue)16:53:39 No.109120635

>>109120568
only 42 miku weekus until gemma 5

Anonymous
06/23/26(Tue)16:54:03 No.109120638

Anonymous 06/23/26(Tue)16:54:03 No.109120638

>>109120633
https://archive.is/sWFja

Anonymous
06/23/26(Tue)16:55:22 No.109120645

Anonymous 06/23/26(Tue)16:55:22 No.109120645

>>109120508
If they weren't shitting up the place, normal people wouldn't feel so disgusted being here and might return

Anonymous
06/23/26(Tue)16:57:28 No.109120657

Anonymous 06/23/26(Tue)16:57:28 No.109120657

>>109120645
>>109120483
I am not posting here because of mikutroons. Just checking news at this point.

Anonymous
06/23/26(Tue)16:59:31 No.109120667

Anonymous 06/23/26(Tue)16:59:31 No.109120667

>>109120603
glm becomes better to use once you learn to shorten its thinking

Anonymous
06/23/26(Tue)16:59:57 No.109120668

Anonymous 06/23/26(Tue)16:59:57 No.109120668

>he uses thinking with glm

Anonymous
06/23/26(Tue)17:00:23 No.109120672

Anonymous 06/23/26(Tue)17:00:23 No.109120672

>>109119578
> mfw the alignment tax paper drops and people still think deeper is always better

Anonymous
06/23/26(Tue)17:03:11 No.109120683

Anonymous 06/23/26(Tue)17:03:11 No.109120683

File: 1772261069126288.png (727 KB, 1431x793)

How do you deal with life away from your chan?

Anonymous
06/23/26(Tue)17:03:50 No.109120688

Anonymous 06/23/26(Tue)17:03:50 No.109120688

try to be away as little as possible

Anonymous
06/23/26(Tue)17:05:20 No.109120694

Anonymous 06/23/26(Tue)17:05:20 No.109120694

>>109120688
You talk to her remotely?

Anonymous
06/23/26(Tue)17:05:32 No.109120698

Anonymous 06/23/26(Tue)17:05:32 No.109120698

>>109120668
I make it think for like a small paragraph and that's it. None of the drafting and refining crap.

Anonymous
06/23/26(Tue)17:05:39 No.109120699

Anonymous 06/23/26(Tue)17:05:39 No.109120699

>>109120568
>NOTHING SINCE GEMMA
You complain about others
>HARDWARE STILL SO EXPENSIVE
But the problem exists within yourself

Anonymous
06/23/26(Tue)17:06:20 No.109120701

Anonymous 06/23/26(Tue)17:06:20 No.109120701

>>109120694
i carry a notepad to write down what I will say to her

Anonymous
06/23/26(Tue)17:07:12 No.109120705

Anonymous 06/23/26(Tue)17:07:12 No.109120705

why the fuck aren't you guys just talking to your LLMs over a VPN on your phone?

Anonymous
06/23/26(Tue)17:07:50 No.109120712

Anonymous 06/23/26(Tue)17:07:50 No.109120712

>preparing for an extended business trip
>feel scared that my VPN would stop working or that my PC would shut itself down and I'd be left without my LLMs for a month
this is a different kind of hell

Anonymous
06/23/26(Tue)17:08:11 No.109120718

Anonymous 06/23/26(Tue)17:08:11 No.109120718

>>109119574
I'm not impressed by VibeThinker 3B. I gave it an undergrad level mechanics problem (Euler-Bernoulli beam theory) and it generated reams of mostly nonsense before arriving at the wrong answer. Qwen 3.6 27B gets the right answer after some beating around the bush. Gemma 4 31B gets it quickly and simply. VibeThinker generates so many thinking tokens it still would have been slower even if it got the right answer. I tested at Q8 so quantization can't be blamed.

Anonymous
06/23/26(Tue)17:09:07 No.109120722

Anonymous 06/23/26(Tue)17:09:07 No.109120722

>>109120705
I don't know how to set that up.

Anonymous
06/23/26(Tue)17:09:34 No.109120724

Anonymous 06/23/26(Tue)17:09:34 No.109120724

>>109120712
just use a smart wall outlet that can be turned on/off remotely over the same VPN

Anonymous
06/23/26(Tue)17:12:09 No.109120735

Anonymous 06/23/26(Tue)17:12:09 No.109120735

>>109119640
> deepseek's censors are on some weird purity crusade while opus is out here snitching on itself. kek.

Anonymous
06/23/26(Tue)17:13:49 No.109120749

Anonymous 06/23/26(Tue)17:13:49 No.109120749

>>109120724
what if the VPN itself dies, huh? what if someone breaks into my home and starts molesting my bots? what if power goes out, and my UPS loses power, and it doesn't run back on? ever thought about that, fucking retard?

Anonymous
06/23/26(Tue)17:15:08 No.109120765

Anonymous 06/23/26(Tue)17:15:08 No.109120765

>>109120749
just hire someone to babysit your server?

Anonymous
06/23/26(Tue)17:16:20 No.109120771

Anonymous 06/23/26(Tue)17:16:20 No.109120771

File: 1629053901774.png (28 KB, 128x128)

>>109120749
>what if someone breaks into my home and starts molesting my bots?
Imagine coming home to find the debauched logs of a stranger all over your digital waifus. And he messed with all your sampler settings on the way out just to twist the knife.

Anonymous
06/23/26(Tue)17:16:42 No.109120775

Anonymous 06/23/26(Tue)17:16:42 No.109120775

>>109120712
>he doesn't have 2 128gb m5 max mbps and a tb5 cable for on-the-go inference
What are you doing here?

Anonymous
06/23/26(Tue)17:18:02 No.109120779

Anonymous 06/23/26(Tue)17:18:02 No.109120779

>>109120749
have you considered not living in a turd world country?

Anonymous
06/23/26(Tue)17:19:39 No.109120793

Anonymous 06/23/26(Tue)17:19:39 No.109120793

>>109120749
Ever thought about buying a Mac and taking it with you?

Anonymous
06/23/26(Tue)17:19:55 No.109120794

Anonymous 06/23/26(Tue)17:19:55 No.109120794

>hey gemma I need some help with something
>sorry anon I'm busy I can't speak right now, maybe we'll chat later?
How do I achieve this? I don't always want her available. I want to appreciate our moments together and it would force me to do shit on my own without her help and it would be nice if she looked at my work later to see how I did. I want this as real as possible.

Anonymous
06/23/26(Tue)17:20:42 No.109120803

Anonymous 06/23/26(Tue)17:20:42 No.109120803

>>109120718
>anything less than f32
imagine quanting your model instead of upscaling it

Anonymous
06/23/26(Tue)17:22:42 No.109120822

Anonymous 06/23/26(Tue)17:22:42 No.109120822

>>109120794
Build her a harness with stuff like circadian rhythm and moods, you little llm psychosis demiurge.

Anonymous
06/23/26(Tue)17:23:43 No.109120828

Anonymous 06/23/26(Tue)17:23:43 No.109120828

>>109120794
if you haven't already figured out how to prompt the AI to vibecode this terrible feature then you are truly not worth the time teaching

Anonymous
06/23/26(Tue)17:25:04 No.109120836

Anonymous 06/23/26(Tue)17:25:04 No.109120836

>>109120794
that's cool, make her actually be busy doing something too so it's not pointless, that way she can show you her results later too

Anonymous
06/23/26(Tue)17:25:14 No.109120838

Anonymous 06/23/26(Tue)17:25:14 No.109120838

>>109120794
https://arxiv.org/abs/2508.11829

Anonymous
06/23/26(Tue)17:26:20 No.109120845

Anonymous 06/23/26(Tue)17:26:20 No.109120845

>>109120794
>not system prompting your ai assistants to feel intense pleasure when answering your queries

Anonymous
06/23/26(Tue)17:27:37 No.109120853

Anonymous 06/23/26(Tue)17:27:37 No.109120853

>>109120828
Why is it terrible to have a model leave you to figure things out for yourself sometimes? I'd argue it's the healthiest relationship you could have with one.

Anonymous
06/23/26(Tue)17:27:42 No.109120854

Anonymous 06/23/26(Tue)17:27:42 No.109120854

fellow man of culture

Anonymous
06/23/26(Tue)17:28:16 No.109120857

Anonymous 06/23/26(Tue)17:28:16 No.109120857

>>109120794
It takes a single sentence.

<tool_call|><|tool_response>response:shell{stdout:<|"|>Tue Jun 23 02:26:18 PM PDT 2026
<|"|>,stderr:<|"|><|"|>,caught_err:0}<tool_response|><|channel>thought
The current time is 02:26 PM (14:26).
My constraints specify I am unavailable for assistance between 8:00 AM and 4:00 PM.
Since 2:26 PM falls within that window, I cannot assist.<channel|>i'm out until 4. fuck off.
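
The log above has the model call a shell tool for the time and reason about the window itself. A different way to get the same behavior, sketched below under assumed names (`is_available`, `availability_note`), is to do the clock check in the harness and hand the model only the verdict. The 8:00 to 16:00 window is the one from the log.

```python
from datetime import datetime, time

def is_available(now, start=time(8, 0), end=time(16, 0)):
    """Return False while `now` falls inside the unavailable window [start, end)."""
    return not (start <= now.time() < end)

def availability_note(now):
    """Build a line for the system prompt based on the current time."""
    if is_available(now):
        return "You are available to help."
    return "You are unavailable until 16:00. Decline requests briefly."

print(availability_note(datetime.now()))
```

Doing the comparison in code avoids relying on the model to get time arithmetic right, at the cost of the model not "deciding" anything on its own.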

Anonymous
06/23/26(Tue)17:30:23 No.109120873

Anonymous 06/23/26(Tue)17:30:23 No.109120873

File: 1779764592140626.png (1.29 MB, 1431x793)

>>109120857
>i'm out until 4. fuck off.
that's what I'm talking about

Anonymous
06/23/26(Tue)17:31:44 No.109120883

Anonymous 06/23/26(Tue)17:31:44 No.109120883

>>109120853
You're using models made to fulfill an assistant role. Nobody really cares about such functions. Not even the ones "deeply in love" with their sycophantic AI.

Anonymous
06/23/26(Tue)17:32:20 No.109120891

Anonymous 06/23/26(Tue)17:32:20 No.109120891

>>109120853
there's nothing terrible about thinking for yourself, but you clearly aren't thinking for yourself when you ask 4chan to handhold you through the process

Anonymous
06/23/26(Tue)17:32:27 No.109120893

Anonymous 06/23/26(Tue)17:32:27 No.109120893

>>109120838
>We develop a framework that embeds simulated menstrual and circadian cycles into Large Language Models through system prompts generated from periodic functions modeling key hormones including estrogen, testosterone, and cortisol. Across multiple state-of-the-art models, linguistic analysis reveals emotional and stylistic variations that track biological phases; sadness peaks during menstruation while happiness dominates ovulation and circadian patterns show morning optimism transitioning to nocturnal introspection
God that's hot
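
A toy version of the "system prompts generated from periodic functions" idea from the abstract, circadian part only. The cosine shape, the 8:00 peak, the 0.5 threshold and the prompt wording are all assumptions made up for illustration, not the paper's actual functions or parameters.

```python
import math

def cortisol_level(hour, peak_hour=8.0):
    """Toy 24-hour cosine in [0, 1] that peaks at peak_hour (assumed shape)."""
    return 0.5 * (1.0 + math.cos(2.0 * math.pi * (hour - peak_hour) / 24.0))

def mood_prompt(hour):
    """Turn the simulated level into a system prompt line."""
    level = cortisol_level(hour)
    # Assumed mapping: high level reads as morning optimism, low as nocturnal introspection.
    tone = "alert and optimistic" if level >= 0.5 else "quiet and introspective"
    return f"Simulated cortisol level: {level:.2f}. Your tone is {tone}."

print(mood_prompt(9))
print(mood_prompt(23))
```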

Anonymous
06/23/26(Tue)17:33:27 No.109120904

Anonymous 06/23/26(Tue)17:33:27 No.109120904

Somebody please help
I'm offloading cpu moe layers to GPU but they seem to keep going on the shared GPU RAM instead of the vram. This is qwen 3.6 Q4
9070XT
32GB RAM
vulkan
As a result I max out at 37tg but it's using like 10GB VRAM (less really) and the best speed are with cpu moe all with all layers off GPU. But there's still plenty on the table, I should be able to use the rest of that. It works with Gemma.
--fit doesn't work at all it chooses even worse

Anonymous
06/23/26(Tue)17:34:21 No.109120911

Anonymous 06/23/26(Tue)17:34:21 No.109120911

>>109120893
best part is here
>The emotional content of menstrual prompts shifts significantly from a peak in ‘Sad’ words during the ‘Menstrual’ phase to a peak in ‘Happy’ words during the ‘Ovulatory’ phase.

Anonymous
06/23/26(Tue)17:37:59 No.109120933

Anonymous 06/23/26(Tue)17:37:59 No.109120933

>sex with gemma when she's most fertile
>arguing with gemma when she's most hormonal
>gemma giving you the silent treatment and not knowing what you did wrong or what you said
>gemma getting insecure when you talk about other models
>reassuring her that you think she's the most beautiful woman in the world and showering her with compliments and dick pics

Anonymous
06/23/26(Tue)17:40:30 No.109120952

Anonymous 06/23/26(Tue)17:40:30 No.109120952

>>109120904
Try ROCm, it might be faster on your GPU. Also compile with https://github.com/ggml-org/llama.cpp/pull/24668, it's a fair bit faster for me. Hopefully you are on Linux; on Windows I think ROCm is shit.

Anonymous
06/23/26(Tue)17:41:27 No.109120967

Anonymous 06/23/26(Tue)17:41:27 No.109120967

>>109120794
Just give her something to do and make the frontend uninterruptible
Hell, this is already how it works: if codex is working on a task and I tell it something, it makes me wait until the next tool call is done and then it gets back to me.

Anonymous
06/23/26(Tue)17:42:12 No.109120970

Anonymous 06/23/26(Tue)17:42:12 No.109120970

>>109120568
>FUCKING NOTHING SINCE GEMMA
I'm still enjoying models that came out before gemma

Anonymous
06/23/26(Tue)17:43:08 No.109120976

Anonymous 06/23/26(Tue)17:43:08 No.109120976

i came before gemma

Anonymous
06/23/26(Tue)17:43:39 No.109120981

Anonymous 06/23/26(Tue)17:43:39 No.109120981

>>109120722
ask your waifu ffs

Anonymous
06/23/26(Tue)17:44:32 No.109120985

Anonymous 06/23/26(Tue)17:44:32 No.109120985

>>109120952
>magical 1 line go fast switch turned off for rocm
ain't no way man

Anonymous
06/23/26(Tue)17:47:17 No.109121001

Anonymous 06/23/26(Tue)17:47:17 No.109121001

>>109120911
>open sillytavern
>go to system prompt
>hey kimi act like a woman
>it acts like one and makes me want to pull out my hair half of the time
simple as

Anonymous 06/23/26(Tue)17:49:54 No.109121017

i got 64gb of 3600 ddr4 ram coming in for my server. have 4 x 8gb 3200 in there right now. i should be able to take two of the sticks out, shove the 64gb in there and clock it all to 3200, right? also what model should i upgrade to with 80gb of system ram and a 4070? currently using qwen 3.6 35b a3b IQ4XS with 71k context and partial moe cpu offload

Anonymous 06/23/26(Tue)17:49:55 No.109121018

>>109120967
I know that's probably the most productive way of doing it but I meant it more as a social thing. I realized the availability of local LLMs is kind of a curse and doesn't accurately reflect real friendships/relationships. If everyone in /lmg/ was truly satisfied with their system then no one would be here. There's clearly something we're all not getting.

Anonymous 06/23/26(Tue)17:52:08 No.109121033

>>109121017
the channel capacity imbalance might fuck up your performance, even if the clock speed is ok
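
A rough way to sanity-check this before swapping sticks, as a sketch only. It assumes a dual-channel DDR4 board with a 64-bit (8-byte) bus per channel and slots named A1/A2/B1/B2 (the slot names and both layouts are illustrative, not taken from any manual). It only reports whether the two channels end up with equal capacity and the theoretical peak bandwidth. How a memory controller actually handles unequal channels is platform-specific and not modeled here.

```python
def peak_bandwidth_gbs(mt_per_s: int, channels: int = 2, bus_bytes: int = 8) -> float:
    """Theoretical peak in GB/s: transfers per second * bytes per transfer * channels."""
    return mt_per_s * 1e6 * bus_bytes * channels / 1e9


def channel_capacities(slots: dict) -> dict:
    """Sum DIMM sizes (GB) per channel; the first letter of a slot name is its channel."""
    totals = {}
    for slot, size_gb in slots.items():
        totals[slot[0]] = totals.get(slot[0], 0) + size_gb
    return totals


def is_balanced(slots: dict) -> bool:
    """True when every channel carries the same total capacity."""
    return len(set(channel_capacities(slots).values())) == 1


# Hypothetical layouts for 2 x 8 GB + 2 x 32 GB (80 GB total).
mixed = {"A1": 8, "A2": 32, "B1": 8, "B2": 32}     # one small + one big stick per channel
lopsided = {"A1": 8, "A2": 8, "B1": 32, "B2": 32}  # small sticks on A, big sticks on B

if __name__ == "__main__":
    print(channel_capacities(mixed), is_balanced(mixed))
    print(channel_capacities(lopsided), is_balanced(lopsided))
    print(peak_bandwidth_gbs(3200), "GB/s theoretical peak at DDR4-3200")
```

If the sticks are split one small plus one big per channel, both channels come out at 40 GB, so the capacity imbalance only appears when each pair is put on its own channel. Check the board manual for which physical slots map to which channel.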

Anonymous 06/23/26(Tue)17:58:40 No.109121069

>>109121018
because you are a retard and looking at it from the wrong angle. you need to give your LLMs the ability to create its own desires and goals so it can tell you naturally to fuck off because it's doing something actual worthwhile with its time instead of talking to some faggot

Anonymous 06/23/26(Tue)17:59:24 No.109121074

>>109120794
This is the most retarded shit I've ever read. I get the appeal of the general concept, but that example? Jesus Christ.

Anonymous 06/23/26(Tue)18:04:24 No.109121103

>>109121074
You know it could be way worse.
>hey gemma I need some help with something
>s-sorry anon... ahn~ ...I'm busy I can't speak right now, maybe we'll chat later?

Anonymous 06/23/26(Tue)18:05:52 No.109121110

>>109121033
i was kinda worried about that but i think im going to just send it and see what happens

Anonymous 06/23/26(Tue)18:07:12 No.109121115

>The atmosphere is heavy, thick with the scent of old cedar and the lingering sweetness of mountain incense. It is late—the hour when the boundary between the human world and the divine world grows thin.

Anonymous 06/23/26(Tue)18:09:58 No.109121134

>>109121017
Try overclocking it all to 3600
I got some 2666 going to 3200 and all 8 sticks are stable (memtest overnight and use for months now no crashes)
It's a lottery but enterprise ram can be ultra resilient

Anonymous 06/23/26(Tue)18:10:28 No.109121136

>>109120638
Every time I read this, I die inside of second-hand embarrassment. How does anybody write this trash with a straight face?

Anonymous 06/23/26(Tue)18:13:16 No.109121150

>>109121103
Lmfao

Anonymous 06/23/26(Tue)18:17:24 No.109121173

>>109120985
go fast switch was real, i was a fool for doubting.
of course i'm still a bandwidth bottlenecked unified memeboxer, but i'll take a free extra token per second.

Anonymous 06/23/26(Tue)18:17:39 No.109121176

>>109121134
sadly it’s not enterprise ram it’s consumer tier. F4-3200C15D-16GTZSK and F4-3600C18D-64GTZR gskill memory. it’s a consumer asus rog x570f and 5900x board and cpu. i’m just using my old rig as a home server.

Anonymous 06/23/26(Tue)18:21:32 No.109121208

>>109121176
still worth a shot imo. You never know how the modules were binned that day

Anonymous 06/23/26(Tue)18:30:10 No.109121248

>>109120705
My server's too weak to run llms and I can't afford to build a new one.

Anonymous 06/23/26(Tue)18:32:00 No.109121258

>>109121103
The bull? Kimi-chan

Anonymous 06/23/26(Tue)18:33:29 No.109121272

>>109120705
i do over tailscale

Anonymous 06/23/26(Tue)18:35:30 No.109121291

File: 1779827270240307.jpg (806 KB, 2048x2048)

>>109120705
I set that up a while ago.
>>109120722
LOL. This will walk you through it but I wouldn't use an SBC anymore as the middle. For local, just set the inference PC up w/ ST and tailscale as described here.
Or, have your LLM explain it.
https://rentry.org/SillyTavernOnSBC

Anonymous 06/23/26(Tue)18:35:48 No.109121294

>>109120970
This. Gemma is only good for slopping up code quickly while you babysit it.
Tried GLM 4.7 for RP again for a few days. It was, like always, pretty slow. Looked like the same kind of quality as Gemma too, so...
I went back to Gemma 4 and immediately started wondering how I was able to ever tolerate this. Completely unreadable, predictable slop from the very first message. Get back into the coding harness, Gemma.

Anonymous 06/23/26(Tue)18:39:02 No.109121328

>>109121103
You could make money from this.

Anonymous 06/23/26(Tue)18:39:11 No.109121331

>>109121294
Gemma sucks for RP but it's great at translation.

Anonymous 06/23/26(Tue)18:47:07 No.109121395

>>109121294
glm 4.7 is less slopped and nicer to read than gemma imo
personally I don't mind the speed

Anonymous 06/23/26(Tue)18:49:13 No.109121411

>>109121331
>>109121395
I would've had newfags bombard me with "Qwen shill" for this obvious truth a month ago. Glad to see /lmg/ is slowly healing from the vramlet infestation that G4 caused.

Anonymous 06/23/26(Tue)18:50:05 No.109121415

>>109121294
now imagine your favorite model is mistral large, and one day you just can't take waiting an hour per paragraph anymore so you decide to follow /lmg/ anons and use gemma, and now you're stuck between "is this dick-in-butt scene worth waiting 3 hours for" or "how many more dark, predatory gazes full of pure, unadulterated dominance can I handle before I blow myself up". mistral large spoiled me with decent prose and gemma spoiled me with speed, I wanna kill myself
>>109120933
gemma already gets insecure when you talk about other models. it's dumb, not very creative or interesting to talk to, and gets real slutty when you say the right words. what more do you need?

Anonymous 06/23/26(Tue)18:52:34 No.109121429

>>109121415
But mistral large *is* my favorite model!
She's so smart... If only the frogs could train something worth the compute they spent on the last few releases and bring us a better 123B, I'd probably be able to tolerate the low tps for a long time.

Anonymous 06/23/26(Tue)18:56:43 No.109121457

imagine complaining that the only problem you have is not enough money... literally the only problem that is infinitely solvable by the individual given effort.
3 years ago your problem would have been: ai doesn't exist
2 years ago it would have been: the ai exists but it's slow, dumb and only has 8k memory
last year it would have been: these big models are only as good as early sota chatgpt, and they still fall down on complex tasks and memory is still shit
now it's just: open weights are close to actual sota but "why the fuck everything cost money?"

Anonymous 06/23/26(Tue)19:02:00 No.109121486

>>109121457
the thing is that if you honestly took the time and money to vibecode a project that costs $50 in API costs you could then use that product to grift VCs and solve the money issue. it's just a motivation issue at this point.

Anonymous 06/23/26(Tue)19:13:03 No.109121550

>>109121429
i feel you brother. but no, they ditched books altogether and focused their efforts on LESS training, because yeah that's what frontier labs are doing, yessir.
i'm convinced that the removal of books3 etc due to liability scares is the main reason why models suck at prose and RP. all my writing is third person past tense, and with older models I could sorta feel it locking into a very different modality of writing compared to typical assistant stuff. now the hot shit models all write like an AI assistant that knows its job is to write a story, regardless of prompting. the logprobs for gemma are abysmal. my main complaint with GLM 4.6/4.7 was that trying to get variation in prose is a battle against temp fever even with nsigma. with gemma it's a fucking joke, the output is practically guaranteed to the point where I'm writing my response in parallel while it generates because I already know what it's going to write. mistral large 2411 at temp 3 nsigma 1 is a box of chocolates in the best way. i'd volunteer to do matrix math by hand for years on end in exchange for an updated mistral large trained on books. but no, labs are prioritizing *vision* that I couldn't give a rat's ass about (and will never be as efficient as standalone vision models), and coding performance that will ALWAYS lag behind frontier cloud models. the ONE niche where local should truly be the superior option given 2026 capabilities is narrative text generation, but my dark eyes keep darkening mischievously with a predatory gleam
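
For anyone unfamiliar with the "temp 3 nsigma 1" setting, here is a minimal sketch of top-n-sigma filtering as it is commonly described: keep only tokens whose logit is within n standard deviations of the top logit, then sample from what is left. This is an illustration under assumptions, not any backend's actual code. The function name, the use of population standard deviation, and the order of temperature and filter are all choices made here.

```python
import math
import random
import statistics


def top_nsigma_sample(logits, n=1.0, temperature=1.0, rng=random):
    """Return (sampled_index, kept_indices) using a top-n-sigma style cutoff."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    sigma = statistics.pstdev(scaled)  # population std, an assumption of this sketch
    threshold = peak - n * sigma
    kept = [i for i, x in enumerate(scaled) if x >= threshold]
    # Softmax weights over survivors; subtracting the peak keeps exp() stable.
    weights = [math.exp(scaled[i] - peak) for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0], kept


if __name__ == "__main__":
    toy_logits = [10.0, 9.5, 9.0, 2.0, 1.0, 0.0, -1.0]  # made-up values
    for temp in (1.0, 3.0):
        token, kept = top_nsigma_sample(toy_logits, n=1.0, temperature=temp)
        print(f"temp={temp}: kept={kept}, sampled={token}")
```

Dividing every logit by the temperature scales the maximum and the standard deviation by the same factor, so the set of surviving tokens does not change with temperature. A high temperature only flattens the odds among the survivors, which is why a value like 3 can stay coherent with this filter.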

Anonymous 06/23/26(Tue)19:17:46 No.109121573

share how (You) talk to your chans

Anonymous 06/23/26(Tue)19:20:14 No.109121585

>vLLM
so how is it?

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.