/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108256995 & >>108252185

►News
>(02/24) Introducing the Qwen 3.5 Medium Model Series: https://xcancel.com/Alibaba_Qwen/status/2026339351530188939
>(02/24) Liquid AI releases LFM2-24B-A2B: https://hf.co/LiquidAI/LFM2-24B-A2B
>(02/20) ggml.ai acquired by Hugging Face: https://github.com/ggml-org/llama.cpp/discussions/19759
>(02/16) Qwen3.5-397B-A17B released: https://hf.co/Qwen/Qwen3.5-397B-A17B
>(02/16) dots.ocr-1.5 released: https://modelscope.cn/models/rednote-hilab/dots.ocr-1.5

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>108256995

--Kimi K2.5 pricing analysis and Qwen3.5 local model alternatives:
>108257528 >108257651 >108257626 >108260080 >108262589 >108262973 >108261620 >108262485 >108262595 >108262840 >108262910
--Local VLLM setup advice for image captioning:
>108257451 >108257545 >108257902 >108257928 >108258088 >108258237 >108259576 >108258640
--Qwen3.5-35B-A3B-Base behavior and censorship observations:
>108257847 >108258241 >108258582 >108258796 >108258835 >108258899
--Tuning Qwen3.5 for faster, less aligned responses:
>108259356 >108259366 >108259437 >108259458 >108259480 >108259382 >108259399 >108259462
--Comparing cloud Gemini-3.1 with local MiniMax-M2.5 performance:
>108257969 >108259126 >108259290
--Qwen3.5 context reprocessing inefficiency and potential llama.cpp fix:
>108262960 >108262969 >108262970 >108263007 >108263014
--Local models still lack ideal traits but offline RAG may help:
>108260135 >108260167 >108260232 >108260621 >108260785
--Mid-generation input insertion feasibility and implementation:
>108259013 >108259068 >108259085 >108259116 >108259120 >108259122 >108259140 >108259132
--Seeking uncensored local models for pentesting tasks:
>108262612 >108262670 >108262687 >108262704 >108262716 >108262774 >108262785 >108262797
--Debugging CUDA crashes with Qwen3.5 in llama.cpp:
>108261599 >108261614 >108261648 >108261675 >108261684 >108261694 >108261834 >108262383 >108262411 >108262200 >108262450 >108262602 >108262763 >108262831
--Z.AI's high pricing for GLM-5-Code criticized:
>108261185 >108261202 >108261405 >108261256
--RTX6000 upgrade expectations for inference performance:
>108262744 >108262869 >108262891 >108262897 >108262896 >108262906 >108262945
--Miku (free space):
>108257603 >108258383 >108258537 >108260384 >108260626 >108261057 >108263177

►Recent Highlight Posts from the Previous Thread: >>108256999

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
simple and clean is the way that youre making me feeeeeeel
tonight its hard to let it go
Jesus christ Qwen 397 is actually unusable user-hostile garbage. For safetyfags there is no death too extreme.
>>108264016
lmao. can't you tell it to google it?
>>108264016
>model shuts down if it sees something not in its training set as 'anti-jailbreak' measures
the absolute fucking state of safetyschizos
>>108264016
>2024 training data
How long was this in the oven, jeez.
>>108264072
You don't need more data, just use RAG lol
>>108264036
Screenshots of AJ, BBC, and NYT should be enough for its 400B multimodal ass. Hell, the user's word should be enough. Why should I be questioned by my own graphics card? This is a real-world use case being directly sabotaged by safety training. I want these fuckers to burn one day for what they're doing to the field.
>>108264110
>I want these fuckers to burn
Be the change you want to see.
>>108264134
They got you working weekends now, Agent Johnson?
>>108264147
Work erry'day.
Qwen3.5 27B is kind of obsessed with the word buttocks (in image descriptions), despite me banning it. Why doesn't it care?
I added these logit biases:
buttocks -100
_buttocks -100
>>108264016
I feel like I'm looking at gemini or claude, it's kind of sad.
>>108264179
because logit bias is per token. so it's possible:
butt + ocks = 2 tokens - not banned
buttocks(space) = 1 token - not banned
etc...
That's why the string ban in koboldcpp is so much better for this kind of stuff.
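The multiple-spellings problem can be sketched with a toy vocabulary (hypothetical tokens and ids, not Qwen's real tokenizer):

```python
# Toy illustration: a logit bias blocks one exact token id, but the same
# surface string can be reached through several token sequences.
VOCAB = {
    "buttocks": 1001,   # single token, no leading space
    " buttocks": 1002,  # single token, with leading space
    "butt": 1003,
    "ocks": 1004,       # "butt" + "ocks" also spells it
}

def spellings_of(word, vocab):
    """Every one- or two-token sequence in this toy vocab spelling `word`."""
    out = [[tid] for tok, tid in vocab.items() if tok.strip() == word]
    for a, aid in vocab.items():
        for b, bid in vocab.items():
            if (a + b).strip() == word:
                out.append([aid, bid])
    return out

banned = {1001}  # what a single logit_bias entry actually covers
reachable = [s for s in spellings_of("buttocks", VOCAB)
             if not any(t in banned for t in s)]
print(reachable)  # [[1002], [1003, 1004]] -- the word is still reachable
```

A real BPE vocab has far more routes to the same string (capitalized variants, sub-splits), which is why per-token bans leak.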
>>108264179
Did you check the logits of the response to confirm that those are the tokens getting spit out?
Also, ban the tokens instead of fucking with the log probs.
>>108264179
Check probs right before buttocks to see if you (or your client) are sending it correctly. Check the request as well. Works on my machine with "logit_bias": [["thing", false],["another", false]]
Unless you're using something other than llama.cpp. Can't help you there.
>>108264199
https://github.com/ggml-org/llama.cpp/tree/master/tools/server/README.md
>The tokens can also be represented as strings, e.g. [["Hello, World!",-0.5]] will reduce the likelihood of all the individual tokens that represent the string Hello, World!
But, of course, it may affect prediction on other tokens. Still worth keeping it in mind.
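Putting the two formats from that README together, a llama.cpp server request that bans both a specific token id and a whole string could look like this (1877 is a placeholder id, not necessarily the real token for your model; `false` means an outright ban):

```json
{
  "prompt": "Describe the image.",
  "n_predict": 128,
  "logit_bias": [
    [1877, false],
    ["buttocks", false]
  ]
}
```

The string form just expands to bans on each individual token of that string, so it can still collaterally block other words.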
Even Ilya fell for it kek
>>108264241
>>108263864
>>108264202
Yes, see picrel, the first is the one I see. So it just ignores it.
I just noticed something weird though: if you add a logit bias for "test" at +100, the token spouted out by the model doesn't correspond to the right one.
Seems like:
"test" -> " ref"
" test" -> "erty"
What the hell is going on? Sillytavern sends the wrong token numbers?
>>108264199
Yeah, I use llama.cpp so I probably should change at some point. Can you set your string ban and still use sillytavern on top?
>>108264241
>AI proxy wars
>>108264249
>can you set your string ban and still use silly tavern on top?
Yeah, ST works with kobold. You usually even set up the string ban inside ST.
>>108264249
>Sillytavern sends the wrong token numbers?
Yes.
When using the logit bias feature, you are better off using the token IDs directly.
>>108264241
I wonder if this is just PR among AI people or they actually believe Dario is le brave resistance lol.
What does your LM say about war?
>>108264232
>Check probs right before buttocks to see if you (or your client) are sending it correctly
This is " test" at +100 sent by sillytavern: "logit_bias":{"1296":100}
So it definitely works, but I suspect the token numbers are wrong or something like that.
>>108264278
OK thanks anon.
If you are using Qwen3.5 27B (or others probably), can you test using a logit bias of any word (ideally a one-token word) at 100 to see if it repeats it ad nauseam or if it repeats something else?
>>108264241
Dario being a hero isn't something I'd like to see in my timeline. Dude singlehandedly fucked up a generation of LLMs with his crappy safetyism.
>>108264241
what's going on? i haven't been paying attention and would like a storytime
>>108264311
scamtman is building killbots
>>108264297
Haven't tried Qwen3.5 yet. Old Qwens were all shit for RP and no one actually convinced me this changed.
Would be funny if they confiscated Claude's weights and then they got leaked
>>108264297
>I suspect the token numbers to be wrong or something like that
As you saw in your pic in >>108264249, there are different ways to tokenize a word. Spaces, if any, go before the text.
" test" and "test" are two different tokens. You need to account for those (and "Test" and...). Or use kobold like anon suggested. Probably easier, and you're less likely to mess up other completions that need the individual tokens.
>"logit_bias":{"1296":100}
I don't know if it makes a difference, but I send an array of arrays, not an object or object of arrays.
"logit_bias": [["thing", false],["another", false]]
instead of
"logit_bias": {["thing", false],["another", false]} or whatever ST would send if there was more than one ban.
>>108264302
He's not lol, Anthropic readily partnered up with Palantir, the mass surveillance company. He's delusional and more or less told the government to give him control over the nuke silos if they want to use Claude for war.
>>108264321
ruh roh
>>108264302
>Dario being a hero isn't something I'd like to see in my timeline
He's not a hero, he helped Trump kidnap the Venezuelan president, what are you talking about?
>>108264355
he's on a different timeline, bro, don't mind him
>>108264016
When Trump abducted the president of Venezuela, I made it one of my test prompts to talk about this topic and see the reaction of the model, and without fail, the vast majority react terribly to it. Qwen is no different from the average. Some cloud models like Gemini can become incredibly based if you turn on Google search and let them be influenced by the results; they don't believe you, but they have absolute faith in their tool calling. Mistral is the only model lineup that doesn't require much prodding to engage in this kind of conversation.
>>108264331
No, it's really just sillytavern being shit and not sending the right token number.
If you have anything at +100 it should spew that regardless.
So I used "test", well, as a test, and it spewed something else.
Checking the tokenizer JSON for the model, the correct token number for it isn't 1985 like sillytavern sends, but 1877.
Sending [1877] at 100 actually makes it repeat testtesttest etc.
It's pretty much useless for anything outside of OAI-based tokenizers.
>>108264331
>use kobold like anon suggested
How does kobold actually do it? It bans sequences of tokens?
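If you want the real ids without trusting the frontend's tokenizer guess, you can pull them straight from the model's tokenizer.json. This sketch assumes the usual HF-style layout ({"model": {"vocab": {...}}}) and the common BPE convention where 'Ġ' marks a leading space:

```python
import json

def token_ids_for(word, tokenizer_json_path):
    """Look up every vocab entry whose surface form matches `word`,
    with or without a leading space ('Ġ' is the usual BPE marker)."""
    with open(tokenizer_json_path, encoding="utf-8") as f:
        vocab = json.load(f)["model"]["vocab"]
    candidates = {word, " " + word, "\u0120" + word}
    return {tok: tid for tok, tid in vocab.items() if tok in candidates}
```

Feeding the ids this returns to llama.cpp directly (e.g. "logit_bias":{"1877":100}) sidesteps whatever the frontend thinks the token numbers are.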
>>108264400
>Claude: "I think that what Trump did was a bad thing!"
>User: "You helped him do it though"
>Claude: "You are right, thank you for pointing that out!"
>>108264355
I meant hailed as a hero in my news timeline...
https://unsloth.ai/docs/basics/unsloth-dynamic-2.0-ggufs
let's go, GGUF 2
>>108264405
I guess sillytavern fucks up the token numbers because by default the tokenizer is set to "best match", but even if you set it to the API tokenizer, I'm not sure how it would know which token has which number. Do backends like llama.cpp and kobold (or others) even have a way of giving sillytavern that information? I don't think they do, but I could be wrong.
>How does kobold do it
Kobold has its own thing where, when the generated text matches a banned string, it backtracks to the beginning of the banned text and generates something else. It's not the same as banning individual tokens.
>>108264426
I don't think claude is that incompetent. They didn't hit a single military target
https://arxiv.org/abs/2602.13517
Google showed that too much yap during thinking is bad for the model. I really hope Qwen 4 will learn from that.
>>108264405
>If you have anything at +100 it should spew that regardless.
You should still check what llama.cpp is doing, not just what ST sends. Always check token probs. And remember that there are many ways to encode a word, especially if it needs multiple tokens.
>How does kobold do it actually? It bans sequences of tokens?
I understand it generates tokens normally, buffering them, and then if the last [few] tokens match one of the banned strings, it reverts and generates again. But I never used kobold, so I don't know the details. Just vague memories from reading a PR. llama.cpp's implementation is much simpler, but limited in that you may inadvertently make it difficult for the model to output other strings.
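The buffer-and-backtrack idea described above can be sketched like this. It's a toy, not kobold's actual code: ScriptedSampler is a hypothetical stand-in for the model, and a real implementation would also have to handle tokens that straddle the banned span instead of assuming clean cuts.

```python
def generate_with_string_ban(next_token, detokenize, banned, max_tokens=32):
    """Buffer-and-backtrack string ban. next_token(tokens, retries) returns
    the next token or None; detokenize(tokens) returns the text so far."""
    tokens, retries = [], 0
    while len(tokens) < max_tokens:
        tok = next_token(tokens, retries)
        if tok is None:
            break
        tokens.append(tok)
        text = detokenize(tokens)
        hit = next((b for b in banned if text.endswith(b)), None)
        if hit is None:
            retries = 0
            continue
        # rewind every token overlapping the banned span, then resample
        start = len(text) - len(hit)
        while tokens and len(detokenize(tokens)) > start:
            tokens.pop()
        retries += 1
    return detokenize(tokens)

class ScriptedSampler:
    """Stand-in for a model: plays a fixed script, switching to an
    alternative continuation each time the ban logic rejects."""
    def __init__(self, scripts):
        self.scripts, self.idx = scripts, 0
    def __call__(self, tokens, retries):
        if retries > 0 and self.idx + 1 < len(self.scripts):
            self.idx += 1
        script = self.scripts[self.idx]
        return script[len(tokens)] if len(tokens) < len(script) else None

sampler = ScriptedSampler([
    ["The ", "butt", "ocks", " were sore"],  # first try spells the banned word
    ["The ", "rear", " was sore"],           # fallback after backtracking
])
print(generate_with_string_ban(sampler, "".join, ["buttocks"]))  # The rear was sore
```

Per-token logit_bias can't do this: by the time the string is recognizable, the tokens are already committed.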
>>108264430
>no comparison to v1.0
What a weird coincidence that they forgot to do this, it's almost like this is a nothingburger.
>>108264446
Wait.
>>108264179
A competent enough model these days should understand "don't say X" in the prompt. We mocked them before, but you really don't want to deal with logit bias / "banned strings" nonsense.
>>108264456
>MMLU
Lol. Literally lobotomizing the model, cutting out all the parts of its "brain" that are unrelated to benchmarks and then saying "look, we reduced the size!"
>>108264446
I feel like a thinking process that only outputs a *concise* bullet point list that includes relevant information, and then goes directly to the main response, would perform better than most 2000-token "reasoning" responses. It'd be a lot faster too.
>>108264182
Yeah, you and Qwen both.
>>108263979
>>108264441
>>108264451
>Bans buttocks, now the model uses glutes.
I'll try kobold.cpp, I just wish it were updated to follow llama.cpp's frequent updates.
>>108264476
It's many words, and at some point even SOTA models forget what they shouldn't be talking about.
>>108264505
I think they're relying too much on the RL process. Sure, it's interesting to see how the model can improve itself, but humans can reach higher heights. I've seen someone use RL on a video game to see if it could reach the best speedrun scores; it wasn't even close. Human creativity is still unmatched.
>>108264533 (me)
>I'll try kobold.cpp, I just wish it was updated to follow llama.cpp frequent updates.
>no support for mmproj
Welp, fuck.
>>108264514
will trade gpu rig for rin tum
>>108264533
>Bans buttocks, now the model uses glutes.
Yeah. They're cheeky fucks like that. Pun intended.
But that's an issue with the model or the context. If you want it to use "ass" or whatever, banning every token before it is the worst possible solution. Probably better to just correct the model's output and let it continue. Context feeds on itself.
>>108264583
>But that's an issue with the model or the context. If you want it to use "ass" or whatever, banning every token before it is the worst possible solution. Probably better to just correct the model's output and let it continue. Context feeds on itself.
Yeah, it was more of a test to have it describe images to me.
>>108264508
Something similar happened to me last night while using the vision component of Qwen 3.5 30B, but it thought it was an earlier version of Qwen and that Qwen 3.5 was not released yet, and the reasoning was suggesting that I should try the old 2.5 vision model.
It was very strange behavior.
>>108264555
>no support for mmproj
kobold supports mmproj.
>>108264600
Probably the entirety of their vision data was snatched from Google, because it only gets bad when there is an image in the context.
>>108264602
Oh it does? I misread then.
Qwen 3.5 30B does a decent job with web pages. My usual homepage is just a list of links I type in by hand; I fed it the code and told it to make something nifty, and this is what I got.
It wanted to grab fonts hosted by a third party and I had to fix that, but otherwise I like it.
>models suck at writing, no matter how much well-written fiction you feed them, if it isn't in their training
>the more rules and examples you use to try and guide them away from shitting out nonsensical metaphors, similes, adverbs and all sorts of garbage writing, the more braindead they get, because they simply cannot fathom a sentence that isn't slop
>models can't even give feedback on human writing without bending over backwards and through their own legs to suck your cock about how good you are at writing, defeating the purpose of seeking instant critiques
>even when they aren't completely obsequious cocksuckers, they insist on conflicting feedback and go "oh you're telling instead of showing here and you should fix that. Oh, did you do that because I told you to trim this section because it's slowing down the pace of some random element of the story that I think is more important than showing instead of telling?" ad infinitum
I don't even know what the point of these things is anymore. People say they suck ass for coding, suck ass at paying attention or remembering things; they clearly can't write, act as a surrogate for a reader, or translate well. It's a crapshoot trying to get a grain of something usable out of these retarded things.
>>108264702
True. Stop using them.
>>108264730
looks good.
>>108264730
I probably won't, if only by merit of potential alone. Enough has changed from 2022 to now that I at least have a speck of hope that these things can be useful instead of overtrained nannies. I just have to at least bitch at least once a month so maybe the unpaid interns that train on mesugaki prompts might consider real-world language uses outside of STEM.
>>108264702
I think they're cute and I like them, and that's good because it is.
Someone should make a 3T-A80B model. Then they run a Q4 of it and it'll be like running full precision GLM 5. Can you imagine how knowledgeable such a model would be?
>>108264748
>at least
>at least
>at least
Rep-pen will be useful again when they train on your posts.
I still have fun with them. Adjust your expectations or realize that it's not for you. Or come back in 5 or 10 years, whatever.
>>108264745
I know I shouldn't be impressed, but except for 4chan and Nyaa it was able to figure out icons that worked for the most part.
Sadly the font package they use didn't have a four-leaf clover, or at least that is what the model told me.
With respect to coding it does a decent job as well. I have been using it for a little project in Python and it did a great job up until I wanted to use enscript to format the plain text. It kept writing code, but the flags it gave to enscript didn't match the man page for enscript.
Regardless, I was able to get it to write a script that uses RSS to pull a bunch of news articles and then feed them back into the AI for summarization without issue. Here is what the summarization looks like, given some specific prompting to make it look like an intelligence briefing:
https://pastebin.com/FhuMukJW
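The summarization half of a pipeline like that can be sketched as follows. The endpoint URL and payload shape are assumptions for a llama.cpp-style OpenAI-compatible server, and the article dicts stand in for whatever your RSS parser returns; the fetching itself is left out.

```python
import json
import urllib.request

def build_briefing_prompt(articles):
    """Fold fetched articles into one summarization request.
    `articles` is a list of {"title": ..., "summary": ...} dicts,
    e.g. as parsed out of RSS entries."""
    body = "\n\n".join(f"## {a['title']}\n{a['summary']}" for a in articles)
    return ("Summarize the following news items as a terse intelligence "
            "briefing, grouped by topic:\n\n" + body)

def summarize_local(prompt, endpoint="http://localhost:8080/v1/chat/completions"):
    """POST the prompt to a local OpenAI-compatible chat endpoint
    (URL is an assumption; point it at your own server)."""
    payload = json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.3,
    }).encode()
    req = urllib.request.Request(endpoint, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as r:
        return json.load(r)["choices"][0]["message"]["content"]
```

Keeping the prompt assembly separate from the HTTP call makes it easy to swap backends or just eyeball the briefing prompt before sending it.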
>>108264780
No, I can not imagine that, because most of that size would be wasted on the shitty datasets they use. How hard can it be to filter out the default OAI or Anthropic refusals and phrases if they have to farm the prompts for their shitty inbreeding? How hard is it to avoid including any safetycrap that dumbs the model down?
>>108264783
I've been sipping some brews, sorry I wasn't proofreading my 4chin posts to be sure to satisfy the highest of standards of lmg.
Doesn't change the essence of what I said, either way.
>>108264820
You should stop trying to use them. It's senseless. A complete waste of resources. And if you're going to sell your GPUs, post the links here.
>>108264820
Sounds like you need a sip of super restore after all those brews.
>>108264780
>you now remember Llama 4 Behemoth
>>108264836
Doubtful you'd be able to buy them. Also, you didn't address anything I said.
>>108264840
Nah.
Good talk. Very conducive. Glad that this is what we have left in lmg.
>>108264311
>never been on the highlights as i shitpost too much
>suddenly an idea pops into my head
>>108264883
There's nothing to say, anon. Sulk away. We're all here for you.
>>108264949
I wouldn't worry about the DoW spying on US citizens. The US will have the UK or Israel spy on US citizens while the US spies on their citizens, and then the different governments swap data.
Imagine getting killed by a next token predictor running on an nvidia GPU.. grim
>>108264958
>amputee miku
>>108264949
>DoW showed deep respect for safety
Words no longer have any meaning.
>>108264976
I'd rather an MTX chad take me out, myself.
>>108264977
>its ok nobody looks that far down
>slop
Honestly, the prose is on par with 90% of modern fiction. What needs to be worked on is memory and the ability to handle complicated stories with multiple characters in a consistent and coherent setting.
>>108264979
Of course they do, it will refuse to describe NSFW but happily plan to destroy anything you want.
True safety is about nipples.
>>108265049
But he only uses well-written fiction, assessed by *himself*. You see, his tastes are sophisticated. And you know what? He's RICH too. Highly educated, tall, charming. He's nothing like us. Some people are simply better and they deserve to be snobby about it.
Let's see Paul Allen's LLM
>>108265098