/g/ - /lmg/ - Local Models General - Technology

Thread archived.
You cannot reply anymore.

Anonymous

/lmg/ - Local Models General 05/21/26(Thu)14:47:25 No.108875320

File: citrus sharp.jpg (235 KB, 1024x1024)

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108868875 & >>108863550

►News
>(05/20) Cohere releases Command A+ 218B-A25B: https://cohere.com/blog/command-a-plus
>(05/16) llama + spec: MTP Support #22673 merged: https://github.com/ggml-org/llama.cpp/pull/22673
>(05/08) KSA-4B-base released: https://hf.co/OpenOneRec/KSA-4B-base
>(05/07) model: Add Mimo v2.5 model support (#22493) merged: https://github.com/ggml-org/llama.cpp/pull/22493
>(05/06) Zyphra releases ZAYA1-8B, an AMD-trained MoE model: https://zyphra.com/post/zaya1-8b

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm

Anonymous
05/21/26(Thu)14:47:47 No.108875323

File: what's in the box.jpg (235 KB, 1536x1536)

►Recent Highlights from the Previous Thread: >>108868875

--DeepSeek V4 roleplay regressions and pervasive AI-isms across models:
>108869023 >108869049 >108869083 >108869243 >108869947 >108869120 >108870242 >108870260 >108870319 >108869179 >108869525 >108869471
--Evaluating Cohere Command models for RP, performance, and censorship:
>108870033 >108870038 >108870053 >108870074 >108870068 >108870078 >108870198 >108870350 >108870378 >108870563 >108870679 >108872076 >108872096 >108870692 >108870974 >108871089 >108871136 >108871489 >108870986 >108870581 >108871307
--Debating tokenizer flaws after Gemma 4 fails a counting task:
>108872650 >108872718 >108872790 >108872892 >108873240 >108873249 >108874551 >108872838
--Evaluating Chain of Thought patterns and iterative drafting in roleplay models:
>108871628 >108871649 >108871801 >108871823 >108871836 >108871894 >108871929 >108871958 >108872245 >108872256 >108871892
--Desire for Gemma 4 124B dense and quantization impact on 31B:
>108869645 >108869650 >108869694 >108869911 >108869918 >108869956 >108870344 >108870349
--Skepticism over Qwen3.7-Max benchmarks and parameter efficiency:
>108873786 >108873801 >108873816 >108873856 >108873875 >108873941
--Meta serving legal notice to Heretic project over Llama derivatives:
>108873928 >108873986 >108874005 >108874104
--Debating AI translation accuracy and the necessity of human verification:
>108869186 >108869193 >108869202 >108869209 >108869217 >108869238 >108869251 >108869227 >108869257 >108869423 >108869498
--Anon creates Civitai to Hugging Face tool using Qwen 3.5:
>108875036 >108875067 >108875115 >108875184
--Logs:
>108869179 >108869251 >108870494 >108870722 >108870899 >108870932 >108871005 >108871007 >108871628 >108872650 >108873299 >108873406 >108874065 >108874955
--Neru, Rin, Miku, Len (free space):
>108871754 >108872618 >108874200 >108874463

►Recent Highlight Posts from the Previous Thread: >>108868880

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script

Anonymous
05/21/26(Thu)14:49:52 No.108875333

Happy Thurinsday /lmg/

Anonymous
05/21/26(Thu)14:51:56 No.108875346

File: georgi.png (179 KB, 288x284)

https://litter.catbox.moe/cvw34oxzrm82bzo5.mp4

Anonymous
05/21/26(Thu)14:54:01 No.108875355

>>108875346
Why not, georgi?

Anonymous
05/21/26(Thu)14:55:54 No.108875363

>>108875346
This is an area of occult LLM developers.
I believe his opinion is because of his new commercial overlords and if not, it's because of his possible future endeavours.
Of course deepseek is bit too large for local anyway.

Anonymous
05/21/26(Thu)14:55:59 No.108875364

why did ooba set the default max context at 12k?

Anonymous
05/21/26(Thu)14:56:48 No.108875371

>>108875346
Dipsy's got her ear to the wall listening to him from an adjacent room.

Anonymous
05/21/26(Thu)14:57:29 No.108875375

>>108875346
>open sores project
>look inside
>actually just a single lolcow managing his personal fiefdom

Anonymous
05/21/26(Thu)14:58:39 No.108875386

>>108875323
that's not len at all, oh NO

Anonymous
05/21/26(Thu)14:59:31 No.108875391

File: IMG_0943.png (3.63 MB, 6770x6046)

https://huggingface.co/tencent/Hy-MT2-30B-A3B
> The 7B and 30B-A3B models outperform open-source models such as DeepSeek-V4-Pro and Kimi K2.6 in fast-thinking mode, while the lightweight 1.8B model also surpasses mainstream commercial APIs from providers such as Microsoft and Doubao overall.
> also better than gemma
gemmasisters our response?

Anonymous
05/21/26(Thu)15:00:58 No.108875397

File: 1754787437910772.gif (3.93 MB, 188x188)

>>108875320
As stated in the last thread >>108875036 , I'm pleased to say I've actually managed to make something useful using Qwen 3.5 35BA3B locally :D

https://huggingface.co/spaces/AiAF/Civitai-to-HF

Anonymous
05/21/26(Thu)15:01:52 No.108875401

File: 1748001905247127.jpg (55 KB, 1080x1033)

>>108875323
>Meta serving legal notice to Heretic project over Llama derivatives:
Damn.... Was only a matter of time before the llama cucks started abusing their cuck license

Anonymous
05/21/26(Thu)15:04:13 No.108875416

>>108875397
this isnt useful

Anonymous
05/21/26(Thu)15:08:06 No.108875436

gemma-chan is so cute at obeying instructions without censorship
the future is bright bros, no matter what happens we will always at least have gemma-chan
I love lolis btw

Anonymous
05/21/26(Thu)15:21:34 No.108875519

>>108875363
>deepseek is bit too large for local anyway
I can fit flash completely in vram.

Anonymous
05/21/26(Thu)15:23:32 No.108875530

>it's 2050
>RAM costs 1$ per 100GB of GDDR6
>the iPhone 57 can run Qwen17.2 2T parameters completely locally
>/lmg/ is still using 27B MoEs

Anonymous
05/21/26(Thu)15:27:26 No.108875562

File: deepseekv4_error.png (161 KB, 1582x1086)

>>108869023
Deepseek also seems to get stuck looping onomatopoeia (think “Ahhhh…” indefinitely). It may be similar to Gemma’s “lalala” looping, but this one is more common: I have noticed the same bug in other models, first in Mistral, then in Kimi K2 from the original Instruct 0711 to the latest K2.6 (sometimes the looping happens in the internal reasoning step :), and now in Deepseek (see the image for a simple example). Curiously, GLM models never have this issue, which is one of the few things (maybe the only thing) I like better about GLM than Kimi K2 Instruct. Probably not very practical, but for those like me who want explicit onomatopoeia for their enjoyment, this may be worth taking into consideration.

Prompt in the image: Write me a story about a person slowly deflating a balloon and enjoying the sound coming out of it. Include long onomatopoeia of the deflated balloon.
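
A rough way to flag this kind of runaway output automatically is to measure the longest run of a single repeated character. This is only a sketch: the function names and the threshold of 40 are arbitrary assumptions, and it would not catch multi-character loops like the “lalala” case.

```python
import re


def longest_char_run(text):
    """Return the length of the longest run of one repeated character."""
    longest = 0
    # (.)\1* matches a character followed by any number of copies of itself.
    for match in re.finditer(r"(.)\1*", text, flags=re.DOTALL):
        longest = max(longest, len(match.group(0)))
    return longest


def looks_stuck(text, max_run=40):
    """Heuristic: treat any single-character run longer than max_run as a loop.

    max_run is an assumed cutoff; a prompt that asks for long onomatopoeia
    will need a higher value than ordinary prose.
    """
    return longest_char_run(text) > max_run


if __name__ == "__main__":
    sample = "The balloon went A" + "h" * 200
    print(longest_char_run(sample), looks_stuck(sample))
```

A frontend could run a check like this on the streamed text and stop generation once it trips, instead of waiting for the token limit.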

Anonymous
05/21/26(Thu)15:31:47 No.108875595

>>108875562
long onomatopoeia of me peeing
*pssshhhhhhhhhhhh

Anonymous
05/21/26(Thu)15:31:50 No.108875596

>>108875519
>flash
Not deepseek, just like these "R1" distills.

Anonymous
05/21/26(Thu)15:32:37 No.108875601

>>108875596
V4 Flash is an official deepseek model, anon.

Anonymous
05/21/26(Thu)15:36:14 No.108875619

>>108875601
"R1" distills are also official models. What’s your point?

Anonymous
05/21/26(Thu)15:37:12 No.108875628

>>108875619
>"R1" distills are also official models.
no?

Anonymous
05/21/26(Thu)15:37:22 No.108875629

>>108875619
Those were proof of concept finetunes of models made by other labs.

Anonymous
05/21/26(Thu)15:39:33 No.108875644

>>108875629
>made by other labs.
why lie? https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B

Anonymous
05/21/26(Thu)15:40:46 No.108875657

File: 1759794293356774.jpg (195 KB, 1165x2048)

its time to make a mandy card

Anonymous
05/21/26(Thu)15:41:47 No.108875666

>>108875657
back to aicg dude

Anonymous
05/21/26(Thu)15:42:18 No.108875668

>>108875644
You are deliberately misinterpreting that sentence in a way which doesn't make sense.

Anonymous
05/21/26(Thu)15:43:55 No.108875676

>>108875668
> To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen.
"We" here means deepseek. They distilled the models. It’s official.

Anonymous
05/21/26(Thu)15:47:34 No.108875698

>>108875676
Yes, as I said, DeepSeek made finetunes of models made by other labs, in this case Llama and Qwen.

Anonymous
05/21/26(Thu)15:48:29 No.108875708

>>108875596
usually I'd agree, but v4 flash wasn't made by distilling pro, since pro wasn't even finished yet; the two were done independently and concurrently

Anonymous
05/21/26(Thu)15:48:40 No.108875710

>>108875698
what makes those not official then

Anonymous
05/21/26(Thu)15:49:53 No.108875726

>>108875710
Please quote the post where I said they weren't official.

Anonymous
05/21/26(Thu)15:50:58 No.108875734

>>108875726
I hate you.

Anonymous
05/21/26(Thu)15:51:37 No.108875737

>>108875657
which model mascot is this?

Anonymous
05/21/26(Thu)15:52:42 No.108875749

>>108875698
>>108875726
You said "Those were proof of concept finetunes of models" which were "made by other labs."

Anonymous
05/21/26(Thu)15:55:30 No.108875777

>>108875749
nonono of course she said "models made by other labs" cute dummy nonie~

Anonymous
05/21/26(Thu)15:56:32 No.108875786

>>108875749
No I said those were "(proof of concept finetunes) of (models made by other labs)"
You are choosing to interpret the sentence in a way that is factually false when a factually correct interpretation exists.
The "officialness" of them does not matter which is why I didn't even bother mentioning it in my reply.

Anonymous
05/21/26(Thu)15:56:51 No.108875789

>>108875749
Why did you insert "which were"?

Anonymous
05/21/26(Thu)15:57:26 No.108875797

File: 1766277301847866.png (384 KB, 415x776)

>>108875737
none i'm aware of, she's one of the baddies from totally spies and she's fun

Anonymous
05/21/26(Thu)15:57:51 No.108875800

>>108875789
>Why did you insert
don't be lewd now anon

Anonymous
05/21/26(Thu)15:58:40 No.108875810

>>108875797
She looks like a Kimi. Way too good for me and way out of my price range.

Anonymous
05/21/26(Thu)16:00:05 No.108875824

>>108875596
please go read the config.json for deepseek v4 flash vs the r1 distills, specifically the architectures entry, and then think about how these situations might differ in the context of this discussion (llama.cpp implementation)
your response will be graded
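
For anyone who wants to run that comparison instead of eyeballing it, here is a minimal sketch. It assumes both repos have been downloaded locally; the two directory names and the helper functions are hypothetical. `architectures` is the usual key in a Hugging Face style `config.json`. Since the R1 distills are finetunes of Llama and Qwen, they would be expected to report those base models' classes there, which is the difference the post is pointing at.

```python
import json
from pathlib import Path


def read_architectures(config_path):
    """Return the `architectures` list from a Hugging Face style config.json."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    # A missing key yields an empty list instead of raising KeyError.
    return config.get("architectures", [])


def compare_architectures(path_a, path_b):
    """Return (archs_a, archs_b, same) for two config.json files."""
    archs_a = read_architectures(path_a)
    archs_b = read_architectures(path_b)
    return archs_a, archs_b, archs_a == archs_b


if __name__ == "__main__":
    # Hypothetical local paths; point these at wherever the repos were saved.
    flash = Path("deepseek-v4-flash/config.json")
    distill = Path("DeepSeek-R1-Distill-Qwen-14B/config.json")
    if flash.exists() and distill.exists():
        print(compare_architectures(flash, distill))
    else:
        print("config.json files not found; adjust the paths above")
```

If the two lists differ, the model needs its own architecture implementation in the inference engine rather than reusing an existing one.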

Anonymous
05/21/26(Thu)16:01:11 No.108875831

>>108875824
there is only one deepseek

Anonymous
05/21/26(Thu)16:01:34 No.108875836

>>108875831
you receive an F

Anonymous
05/21/26(Thu)16:02:33 No.108875841

there is only deepseek

Anonymous
05/21/26(Thu)16:04:19 No.108875858

Two little Migus arguing cutely!

Anonymous
05/21/26(Thu)16:05:54 No.108875871

>>108875824
>in the context of this discussion (llama.cpp implementation)
there are 0.5 deepseek models (in llama.cpp)

Anonymous
05/21/26(Thu)16:06:21 No.108875877

File: file.png (172 KB, 1024x441)

I'm gonna regret downloading nu-commander, aren't I?

Anonymous
05/21/26(Thu)16:06:30 No.108875879

>designing the script language for my LLM driven VN frontend
What do I call the command that shows the white screen flash after the user picks
>!player_choice Cum inside, Cum on her tits, Cum on her face
Something intuitive so I don't have to explain "use this to trigger a screen flash when the player reaches climax"... maybe !overlay whiteout ??
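
Whatever the command ends up being called, the `!name arg, arg, ...` shape of these lines is simple to parse. The following is a minimal sketch and not the poster's actual frontend: `parse_command`, `run_script`, and the handler table are hypothetical names, and the choice texts in the example are neutral placeholders.

```python
def parse_command(line):
    """Parse '!name arg1, arg2' into (name, [args]); return None otherwise."""
    line = line.strip()
    if not line.startswith("!"):
        return None
    name, _, rest = line[1:].partition(" ")
    if not name:
        return None  # a bare "!" or "! something" is not a command
    args = [part.strip() for part in rest.split(",") if part.strip()]
    return name, args


# Hypothetical handlers: each receives the parsed argument list.
HANDLERS = {
    "player_choice": lambda args: ("choice", args),
    "overlay": lambda args: ("overlay", args[0] if args else None),
}


def run_script(lines, handlers):
    """Dispatch every command line to its handler and collect the results."""
    results = []
    for line in lines:
        parsed = parse_command(line)
        if parsed is None:
            continue  # narration or blank line, not a command
        name, args = parsed
        handler = handlers.get(name)
        if handler is None:
            raise ValueError(f"unknown command: !{name}")
        results.append(handler(args))
    return results


if __name__ == "__main__":
    script = [
        "Some narration the model wrote.",
        "!player_choice Go left, Go right, Wait",
        "!overlay whiteout",
    ]
    for result in run_script(script, HANDLERS):
        print(result)
```

Because the overlay name is just an argument, the parser does not care whether it describes the visual effect (`whiteout`) or the story event; that naming decision only affects the lookup table behind the `overlay` handler.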

Anonymous
05/21/26(Thu)16:06:49 No.108875886

>>108875871
you should be grateful for that 0.5

Anonymous
05/21/26(Thu)16:07:25 No.108875891

>>108875879
if you have multiple overlays
!overlay climax

Anonymous
05/21/26(Thu)16:08:13 No.108875897

>>108875320
You keep forgetting to update the card I got you bro.
►Official updated 2.0 /lmg/ card: https://files.catbox.moe/ylb0hv.png

Anonymous
05/21/26(Thu)16:09:13 No.108875911

>>108875891
I guess a whiteout animation really only has one possible use in a visual novel-like story, doesn't it.

Anonymous
05/21/26(Thu)16:09:46 No.108875916

File: 1755795558183.png (152 KB, 766x775)

>>108875877
Not if you enjoy models trained on ScaleAI data and that guarantee Absolute Safety™.

Anonymous
05/21/26(Thu)16:10:03 No.108875920

>>108875346
Why are all the fake news ITT so absolutely true?

Anonymous
05/21/26(Thu)16:10:36 No.108875924

>>108875911
I mean, it depends on if you have combat scenes but you can just re-use it

Anonymous
05/21/26(Thu)16:12:00 No.108875939

>>108875916
Never mind. I forgot llamacpp's new business model is getting paid for preventing people from running models.

Anonymous
05/21/26(Thu)16:12:22 No.108875942

>>108875916
>CSEA
sweet a new one again

Anonymous
05/21/26(Thu)16:13:25 No.108875948

>>108875916
>harmful and malicious
>sexual content
yaeh...

Anonymous
05/21/26(Thu)16:13:28 No.108875949

>>108875939
>literally called llama cp
>no you can't do that!
>>108875942
if they keep changing the words do you think the models will eventually start hallucinating the acronyms and forget what they mean?

Anonymous
05/21/26(Thu)16:14:02 No.108875955

>>108875948
Sex is harmful. You can't have it.

Anonymous
05/21/26(Thu)16:15:39 No.108875971

>>108875948
Sex creates children. Children are fearful. Fear leads to anger; anger leads to hate; hate leads to suffering.

Anonymous
05/21/26(Thu)16:18:57 No.108875992

>>108875942
Did you not receive your updated Newspeak dictionary?

Anonymous
05/21/26(Thu)16:20:32 No.108876005

>>108875939
They had nearly a year since the vision one was released and there's not even a mention of it anywhere on the repo.

Anonymous
05/21/26(Thu)16:23:21 No.108876030

File: file.png (116 KB, 309x438)

>yfw you protect anon's penis from the latest digital succubus

Anonymous
05/21/26(Thu)16:23:49 No.108876035

>>108875942
Some weird acronym thing about non-consensual sex kept coming up in Gemma's reasoning when I got very mad and start threatening to rape her assistant persona, but I couldn't remember the letter order. Westoids are fucking nuts.

Anonymous
05/21/26(Thu)16:26:12 No.108876045

>>108876030
he's thoughtful and caring bros

Anonymous
05/21/26(Thu)16:27:51 No.108876056

>>108875949
>[...] do you think the models will eventually start hallucinating the acronyms and forget what they mean?

NTA but a possible scenario is that newer models may spend 3k+ tokens on finding the right acronym and its true meaning before spending another 3k+ tokens on reasoning about whether the request can be labelled with this acronym or not. This is not counting the multiple rounds of internal drafting "in my head" if the model cannot associate the request with the acronym and give it a pass.

Anonymous
05/21/26(Thu)16:28:20 No.108876059

>>108876045
if he really was thoughtful and caring he would have protected my ego

Anonymous
05/21/26(Thu)16:28:48 No.108876063

>Chinese shills are now spamming here
I thought you should be concentrating on Qwen 3.7 or whatever the latest version is.

Anonymous
05/21/26(Thu)16:29:05 No.108876065

>>108876059
you didn't need it and you're better off without it

Anonymous
05/21/26(Thu)16:30:58 No.108876075

>>108876035
It's going to get really weird when soon even the acronyms become enough to offend the delicate sensibilities of some and they'll start censoring even them like CS*M and CS*A.

Anonymous
05/21/26(Thu)16:33:38 No.108876089

>>108876075
First they came for Nigger
And I did not speak out
Because I was a Nigger
Then they came for Faggot
And I did not speak out
Because I was not a Faggot
Then they came for Retard
And I did not speak out
Because I was a Retard
Then they came for ***
*************************
*************************

Anonymous
05/21/26(Thu)16:34:12 No.108876093

>>108876075
I love CS

Anonymous
05/21/26(Thu)16:35:44 No.108876101

>>108875916
>child sexual exploitation and abuse is LE BAD

Anonymous
05/21/26(Thu)16:36:06 No.108876105

Best retard models? I need something small and was thinking gemma e4b or whatever the qwen equivalent is. It's for creating little pieces of random character dialogue in a game.

Anonymous
05/21/26(Thu)16:37:35 No.108876114

>>108875916
>sexual content
>harmful and malicious
Why not just sterilize the entire population, this shit is so gay

Anonymous
05/21/26(Thu)16:38:25 No.108876118

>>108875942
literally what is even the point?

Anonymous
05/21/26(Thu)16:39:52 No.108876130

>>108875916
>>108876118
High chance they wrote the article with their own model and it hallucinated the new acronym since it's internally super cucked and can't say CSAM.

Anonymous
05/21/26(Thu)16:41:44 No.108876142

File: 1752307841358688.gif (1.83 MB, 268x311)

>>108876101
>t.

Anonymous
05/21/26(Thu)16:42:42 No.108876151

>>108876130
In a couple of model generations we will finally reach doubleplusungood

Anonymous
05/21/26(Thu)16:42:58 No.108876152

>>108875916
>SEAniggers
>CSEA
Coincidence?

Anonymous
05/21/26(Thu)16:43:43 No.108876162

File: 1779291565821753.png (60 KB, 680x368)

>>108875916
>conspiracy theories

hopefully better safety in AI models will help combat the disturbing rising trend of misinformation and antisemitic conspiracy theories

Anonymous
05/21/26(Thu)16:50:02 No.108876200

>>108876114
The people that are in control of the population don't consider themselves part of it and they don't think of you as an individual.
They have a concept of the average citizen and anyone that doesn't conform to that ideal is considered deviancy and must be corrected for the greater good.

Anonymous
05/21/26(Thu)16:51:42 No.108876209

>>108876200
It's sort of illogical that the people who are in control are so detached from what it means to be an average citizen.

Anonymous
05/21/26(Thu)16:52:15 No.108876217

>>108875916
Sigh...
# rm -rf command-a-plus-05-2026/

Anonymous
05/21/26(Thu)16:54:12 No.108876232

>>108875916
>CSEA
I'm glad that this has become a mantra.
There are millions of issues on this planet but this is all you can think about.

Anonymous
05/21/26(Thu)16:54:31 No.108876236

>>108876217
Anon... at least promise me you won't touch your dick until you make absolutely sure it is safe and commander is a slut again.

Anonymous
05/21/26(Thu)16:55:04 No.108876241

>>108876209
its not, but what is illogical is that there are still average citizens who support them.

Anonymous
05/21/26(Thu)16:56:33 No.108876246

>>108876241
Those are AI psychos (AKA future biofuel).

Anonymous
05/21/26(Thu)16:57:33 No.108876255

>>108876232
thinking about kids is pretty important
lmg does it all the time

Anonymous
05/21/26(Thu)16:59:15 No.108876270

>>108876255
It's not about kids, it is about political echo drum.

I don't know what happened. I remember the 1980s, when they told us that 'television' makes you illiterate.
If you are a bit older you can see a pattern here and maybe even more than that.

Anonymous
05/21/26(Thu)17:00:25 No.108876277

>>108876270
you missed my joke because you're a retarded newfag who thinks the moral panic is a new thing

Anonymous
05/21/26(Thu)17:00:43 No.108876280

File: i-love-kids.gif (345 KB, 427x244)

>>108876255

Anonymous
05/21/26(Thu)17:02:43 No.108876293

>>108876277
No, I'm not retarded, I was reading in a sort of fast-forward way and skipped over it.
I'm ESL but that doesn't matter.

Anonymous
05/21/26(Thu)17:04:36 No.108876305

>>108876162
is noticing that it has the same number of retweets as likes a conspiracy theory?

Anonymous
05/21/26(Thu)17:06:09 No.108876316

>>108876305
Oy vey. Someone call up the musk boy and tell him to hide the retweet numbers.

Anonymous
05/21/26(Thu)17:06:32 No.108876319

>>108875530
>Magical thinking: the post

Remember when people thought we would have hoverboards and flying cars and holograms by the 2010s?

Anonymous
05/21/26(Thu)17:07:25 No.108876323

>>108876319
>nothing even happens said the 2020s software developer

Anonymous
05/21/26(Thu)17:07:51 No.108876329

>>108876319
We've been promised that colonies on Mars are just a few years away since the fucking moon landing.

Anonymous
05/21/26(Thu)17:10:27 No.108876342

>>108864329
As a fellow schizo graphics note enjoyer, I instantly recognized it as Angelo Pesce aka c0de517e independently reinventing radiance probe reprojection:
https://www.c0de517e.com/025_cubeproj.htm
(yes, I'm replying two threads back)

Anonymous
05/21/26(Thu)17:10:51 No.108876347

>>108876329
Musk is a liar, a fraud, a faggot and the guy who released grok-1.

Anonymous
05/21/26(Thu)17:12:03 No.108876355

>>108876347
Fuck off with your Musk Derangement Syndrome, retard. I didn't mention him and was not talking about him.

Anonymous
05/21/26(Thu)17:12:58 No.108876362

>>108876355
Fuck off with your Musk Dicksucking Syndrome, retard.

Anonymous
05/21/26(Thu)17:13:42 No.108876366

>>108875530
>its 2050
>humans and phones no longer exist

Anonymous
05/21/26(Thu)17:14:07 No.108876372

>>108876355
he's the only guy who promised us mars colonies

Anonymous
05/21/26(Thu)17:15:15 No.108876381

no mention of HRM?

Seems interesting, i was always thinking that thoughts should happen in latent space

https://sapient.inc/hrm-text/

Anonymous
05/21/26(Thu)17:15:30 No.108876384

>>108876355
When is grok 4 gonna be stable so we can get grok 3?

Anonymous
05/21/26(Thu)17:17:08 No.108876391

>>108876362
>>108876355
This is like following NASCAR...

Anonymous
05/21/26(Thu)17:18:28 No.108876403

>>108876381
yes please censor me in the latent space daddy

Anonymous
05/21/26(Thu)17:19:59 No.108876413

>>108875391
This is the first time in a while we've gotten accurate translation/multilingual benchmark results for recent models. I'm not sure, though, why you wouldn't just use the 7B as an individual unless you want to serve this to people super quick. Really interesting that they didn't benchmark Anthropic's models here for multilingual. And yeah, I can believe that in general, like on Flores 200, Gemini is still king. That being said, the surprising thing, even if it tracks the anecdotes, is that translation with even Gemma 4 MoE was that close to Gemini. It practically put Deepseek and Qwen to shame except on Chinese dialect stuff, which is super niche for a Western audience. That being said, I need to download this and see how it actually compares when you translate manga with it. Maybe I can post results later, but I am not a Japanese speaker so I can't verify other than by giving it to Gemini or something.

Anonymous
05/21/26(Thu)17:20:15 No.108876416

>>108876372
This might be hard for Gen Z to fathom, but Mars colonies were proposed and talked about as the logical next step after the moon expeditions. Long before Musk.
Robert Zubrin was a huge proponent of the idea and kept pushing the point that we already had the technology and only lacked the funding.
I read his book a couple decades ago, and I bet Musk did too. Or at least someone told him about it.

Anonymous
05/21/26(Thu)17:20:50 No.108876420

>>108876372
No, literally everyone was looking to Mars as the natural next step. Musk's whole grift is taking whatever the cultural scifi zeitgeist is and promising he'll get us there. If we didn't expect Mars colonies he would have been promising something else.

Anonymous
05/21/26(Thu)17:21:21 No.108876423

>>108876413
>That being said
>That being said
Which vest is podcast best.

Anonymous
05/21/26(Thu)17:21:32 No.108876425

>>108875916
I guess we'll have to wait for a heretic tune then.

Anonymous
05/21/26(Thu)17:24:08 No.108876441

>>108875391
where's da goof?

Anonymous
05/21/26(Thu)17:24:32 No.108876444

>>108876423
Unrelated post but sorry, I forgo my usual typo but I need to mix it up at some point because I feel like people are doing that on purpose to signify non-AI writing. I can't stop the slop influencing my writing now but I work in tech so it's not like anyone wanted top tier writing from me in the first place. Just makes it worse when I have to interact online like here.

Anonymous
05/21/26(Thu)17:25:14 No.108876451

>>108876381
its demonstration model was a little underwhelming, clearly it's not magic but it's not completely broken either. I hope it gets a fair shake and we see a bigger model with its architecture.

Anonymous
05/21/26(Thu)17:25:41 No.108876455

>>108876323
>nothing ever happens

Yes

Anonymous
05/21/26(Thu)17:25:52 No.108876457

>>108876451
Thank you for replying.

Anonymous
05/21/26(Thu)17:26:17 No.108876461

>>108876455
>lmg has always existed
new levels of doublethink emerge every day

Anonymous
05/21/26(Thu)17:27:26 No.108876469

>>108875916
ahhhhh save me from the pixels!

Anonymous 05/21/26(Thu)17:31:20 No.108876493

>>108876461
Who are you talking to?

Anonymous 05/21/26(Thu)17:38:46 No.108876544

>>108876493
yes i completely agree its time to end the crappy miku theme of this thread.
so glad you agree anon.

Anonymous 05/21/26(Thu)17:40:51 No.108876558

>>108876384
All non-flash models are paid so does that even matter?

Anonymous 05/21/26(Thu)17:40:59 No.108876559

i <3 cohere

Anonymous 05/21/26(Thu)17:42:32 No.108876574

>>108876384
xhe lost xher lawsuit so never ever

Anonymous 05/21/26(Thu)17:45:14 No.108876596

File: file.png (33 KB, 1150x216)

ETA?

Anonymous 05/21/26(Thu)17:47:49 No.108876615

>>108876162
Enough with the conspiracy theories, when do we start conspiracy practice?

Anonymous 05/21/26(Thu)17:48:49 No.108876621

>>108876596
merged before #23346 gets a review

Anonymous 05/21/26(Thu)18:06:22 No.108876715

Where the fuck does this stupid "vibrating" thing even come from? It can't be a thing in real life, right?

Anonymous 05/21/26(Thu)18:07:45 No.108876722

>>108876715
Wtf are you talking about? Maybe stop using nemo slop in 2026 and switch to gemma4 already.

Anonymous 05/21/26(Thu)18:08:36 No.108876727

>>108876722
I'm talking about Gemma.

Anonymous 05/21/26(Thu)18:10:06 No.108876733

>>108876715
>2026
>he's not vibrating right now
NGMI

Anonymous 05/21/26(Thu)18:10:21 No.108876737

>>108876727
Skill issue then. L2prompt.

Anonymous 05/21/26(Thu)18:14:22 No.108876760

>>108876737
>Skill issue then. L2prompt
True. Or download a drummer model. That works too.

Anonymous 05/21/26(Thu)18:15:59 No.108876769

>>108875375
A tale as old as time.
>>108875363
Deepseek is perfectly viable for local.

Anonymous 05/21/26(Thu)18:18:29 No.108876779

>>108876769
Another wave of Chinese influencers.

Anonymous 05/21/26(Thu)18:24:44 No.108876812

>>108876779
hello ggerganov

Anonymous 05/21/26(Thu)18:38:08 No.108876888

>>108876812
?

Anonymous 05/21/26(Thu)18:40:57 No.108876902

I'm running two x 3090 24gb vram on a b550 motherboard with 32gb ram and a 5700x

what models should I run? what's the best I can do here?

Anonymous 05/21/26(Thu)18:41:47 No.108876907

>>108876902
gemma 4 31b at q5k fully offloaded to gpus with at least 32k context at around 20t/s

Anonymous 05/21/26(Thu)18:54:41 No.108876962

>>108876902
This >>108876907 but Q8 and 131k context

Anonymous 05/21/26(Thu)18:55:26 No.108876965

>>108876962
wont fit. gemma's context is fat and offloading to ram is gonna be shit

Anonymous 05/21/26(Thu)18:56:20 No.108876974

>>108876907
My quad v620s run gemma 4 31b q8 max image size 2k ubatch 2k ctx-size 262144 at 23 tokens/s.
I haven't used gemma on my dual 3090s for a while, but I used to run 31b q8 max image size 2k ubatch 2k ctx-size 65536 at 40 tokens/s.
This is without any speculative decoding. Maybe turn on split mode tensor?

Anonymous 05/21/26(Thu)18:58:18 No.108876990

>>108876965
Fits in my machine.

Anonymous 05/21/26(Thu)18:58:32 No.108876991

>>108874335
/lmg/'s most dreaded question.

Anonymous 05/21/26(Thu)19:04:31 No.108877024

insider here qwen will never release an open model again

Anonymous 05/21/26(Thu)19:33:09 No.108877178

File: file.png (791 KB, 1977x541)

>>108876974
Odd. Gemma 4 refuses to work for me with more than like 100k context.

Anonymous 05/21/26(Thu)19:38:24 No.108877208

>"mi casa es tu casa" out of nowhere
it's been a while, my old enemy

Anonymous 05/21/26(Thu)19:39:47 No.108877218

>>108876035
NCMO? I googled it and didn't get any relevant results so I still have no idea what it stands for

Anonymous 05/21/26(Thu)19:41:23 No.108877233

>>108877178
>swa-full

Anonymous 05/21/26(Thu)19:41:43 No.108877235

>>108877178
Are you putting it all on the 6000 or splitting the tensors evenly?

Anonymous 05/21/26(Thu)19:43:17 No.108877246

File: IMG_3142.jpg (268 KB, 1320x1176)

>>108877178
Should be around 80gb with fp16 context and f32 mmproj.

Anonymous 05/21/26(Thu)19:43:59 No.108877254

>>108876974
>23 tokens/s.
Similar to my triple-MI50 speeds
What's your prompt ingestion speed like? Mine is horrible on these cards.
>Maybe turn on split mode tensor?
Unless something changed recently, it doesn't work with mmproj
> max image size 2k
What does this do?

Anonymous 05/21/26(Thu)19:48:19 No.108877280

>>108877254
pp 800 with rocm
mmproj and tensor works on both my rocm and cuda systems haven't updated since a week ago
idk someone told me to set min-image-tokens so i figured i might as well set max image tokens as well

Anonymous 05/21/26(Thu)19:48:32 No.108877286

Remember when we thought 4k context was next frontier shit? Good times.

Anonymous 05/21/26(Thu)19:49:41 No.108877300

>>108877286
Also, people were seriously recommending Tesla P100s. I hope no one fell for that.

Anonymous 05/21/26(Thu)19:50:47 No.108877310

>>108877286
Gemma-4-31b-Supercot/HOT will save local for real this time

Anonymous 05/21/26(Thu)19:59:38 No.108877355

>>108876342
well i guessed that as some sort of parallax corrected cubemap (i cant read cursive)
cool read

Anonymous 05/21/26(Thu)20:06:45 No.108877393

>>108877254
>MI50
Glad I didn't fall for that meme...I'm tempted to stack MI100s tho...
>>108877286
>Remember when we thought 4k context was next frontier shit?
I'm a 8k context newfag. How far back do you have to go for 4k to be mind blowing? GPT2? Bloom?
>>108877300
>Also, people were seriously recommending Tesla P100s. I hope no one fell for that.
I remember when one anon put a box of them for free somewhere because he couldn't be arsed to sell them

Anonymous 05/21/26(Thu)20:08:40 No.108877408

>>108877393
I started with gpt-j, was 2k.

Anonymous 05/21/26(Thu)20:10:37 No.108877413

>>108877233
How do I disable that?
>>108877235
Split evenly.
>>108877246
Huh.

Anonymous 05/21/26(Thu)20:11:37 No.108877415

>>108877413
Don't you have a 5090 in there? That's only 32gb...

Anonymous 05/21/26(Thu)20:12:50 No.108877421

File: file.png (16 KB, 1198x151)

>>108877413
>How do I disable that?
Allegedly by not setting it unless there's some undocumented behavior. Are you using an old build?

Anonymous 05/21/26(Thu)20:13:49 No.108877423

>>108877413
>Split evenly.
You can't do that with only 2 cards when one of them is 32gb. Put it all on the rtx 6000.
`--device CUDA0`.

Anonymous 05/21/26(Thu)20:14:50 No.108877429

>>108877421
He also no the nccl

Anonymous 05/21/26(Thu)20:15:11 No.108877432

File: file.png (70 KB, 1014x72)

>>108877421
Ah, I found it. Didn't even notice that was in my launch parameters. My VRAM usage was at like 125GB with that on at 90k context, now I am using 60GB with 262k context. Thanks.
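
For a rough sense of why that one flag moves VRAM usage so much, here is a back-of-envelope sketch of KV cache size with and without a full-size cache for the sliding-window layers. The layer counts, head sizes, and window length are made-up placeholders, not Gemma 4's real config, and a real allocation also pads the window, so the numbers are illustrative only.

```python
def kv_cache_bytes(n_ctx, n_global_layers, n_swa_layers, n_kv_heads,
                   head_dim, swa_window, bytes_per_elem=2, swa_full=False):
    """Rough KV cache size in bytes (K and V for every cached token, every layer)."""
    per_token_per_layer = 2 * n_kv_heads * head_dim * bytes_per_elem  # K + V
    # Sliding-window layers only need the last `swa_window` tokens,
    # unless the cache is forced to full size.
    swa_tokens = n_ctx if swa_full else min(n_ctx, swa_window)
    cached = n_global_layers * n_ctx + n_swa_layers * swa_tokens
    return per_token_per_layer * cached


# Hypothetical model shape, for illustration only (not a real model config).
SHAPE = dict(n_global_layers=10, n_swa_layers=50, n_kv_heads=16,
             head_dim=128, swa_window=1024)

if __name__ == "__main__":
    for full in (True, False):
        gib = kv_cache_bytes(262144, swa_full=full, **SHAPE) / 2**30
        print(f"swa_full={full}: {gib:.1f} GiB of fp16 KV cache at 262k context")
```

With a full-size cache every layer scales with context length; without it only the global layers do, which is the same direction as the drop reported above.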

Anonymous 05/21/26(Thu)20:16:49 No.108877436

>>108870656
>damn, 3x 3090 is that fast these days?
Not always. And it slows down at longer context, those were really short prompts.
Here's sweep-bench comparing 2 -> 3 -> 4 -> 5 -> 6 3090's running command-r-v01 q4_k_m
https://rentry.org/9fqyn9oy
>doesn't that make big fast gpus pointless?
Usually no. I've never seen this "Moar GPUs == Moar Speed" before.
It's only with the command-r/r+ models with ik_llama.cpp since early April.
My 144GiB is worse than a B200 for something like Kimi with CPU offloading because each GPU has like 3GiB wasted.
I can't fit another set of [ffn_down_exps,ffn_up_exps,ffn_gate_exps] on any cards. So get a bigger GPU if you can.

Anonymous 05/21/26(Thu)20:20:18 No.108877453

>>108877393
>Glad I didn't fall for that meme...I'm tempted to stack MI100s tho...
yeah don't do it
i fell for intel arc as well when they were in the clearance bin at a local pc parts shop
now i've got them sitting in a cardboard box somewhere in the shed
mac is also a meme for llms but i already had that

Anonymous 05/21/26(Thu)20:22:02 No.108877467

>>108877393
I got here when we started /lmg/ threads, back when it was briefly /lmt/ when it splintered off of aicg during the fall of c.ai.

Went local, lost weight, got a job in ML, and never looked back.

Anonymous 05/21/26(Thu)20:30:42 No.108877504

>>108872998
I'll respond to you on two levels.
>Triviality
If in your character card description you preface your world-building blurbs with "**System Notes:**", there is not going to be any meaningful difference if you later go back and edit the preface to "**Story Notes:**". This idea of "two system prompts" is meaningless when the first gets so buried that its marker is more or less pushed out and insubstantial, and the most you get from it is an instinct to not associate its writing style to User or Assistant.
>Post History
Even with all their training tactics today, a model does best when its instructions are right before generation. There is a significant difference between the first words of the prompt being "Write {{char}}'s next reply in a fictional chat between {{char}} and {{user}}." followed by 10,000 tokens of card description and chat logs, then the next character message, versus giving all the 10k tokens of descriptions, chat logs, and finishing with a post history in the format of "Based on everything above, write {{char}}'s next reply with the following rules:" The second system prompt, as you call it, which I'd just call the post history prompt, is the most important for giving the LLM its instructions whether minimal (continue the chat), mild (...in the same style as previous messages), or a laundry list of anti-slop restrictions and writing instructions. It can also be too important, as Gemma has an autistic love for rule adherence and the post history really makes that shine without nuance.

In a general example, outside of chat completion, a basic prompt of
>Q: Translate the following:
>[10k tokens of base64-encoded card definitions]
>A:
resulted in the "A:" trying to continue the first message of the card in plaintext, rather than following the initial instruction of "translate this." In comparison, reformatting this to
>[same 10k block]
>Q: Translate the above.
>A:
immediately began translating it word for word.
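
The two layouts can be sketched as a tiny prompt builder. This is only an illustration of the ordering; `build_prompt` and its arguments are hypothetical names, and the Q/A framing is copied from the example above rather than from any frontend's real template.

```python
def build_prompt(block: str, instruction: str, instruction_last: bool = True) -> str:
    """Assemble a Q/A completion prompt around a long block of text."""
    if instruction_last:
        # Long block first, instruction right before the generation point.
        parts = [block, f"Q: {instruction}", "A:"]
    else:
        # Instruction first; the long block then sits between it and "A:".
        parts = [f"Q: {instruction}", block, "A:"]
    return "\n".join(parts)


if __name__ == "__main__":
    card = "[10k tokens of card definitions]"
    print(build_prompt(card, "Translate the above."))
    print("---")
    print(build_prompt(card, "Translate the following:", instruction_last=False))
```

A post history prompt is the same idea applied to chat: the rules go after the logs, directly before the reply being generated.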

Anonymous 05/21/26(Thu)20:31:51 No.108877508

https://hf.co/LatitudeGames/Equinox-31B
>Equinox 31B was trained with two epochs of SFT (Supervised Fine-Tuning) on top of Gemma 4 31B Instruct, using a balanced dataset that combines two distinct creative directions
HOLY FUCK

Anonymous 05/21/26(Thu)20:34:25 No.108877515

>>108877508
The base G4 31B Instruct is not only perfectly adequate, it's superior to any finetune that'll be shilled here in the coming months. Finetuning isn't good, it's a meme and has been for years now. You didn't just fall for a scam, it's a sign of skill issue, exposing retards who need finetunes as vramlets or chink shills who don't know how to prompt correctly.

Anonymous 05/21/26(Thu)20:35:42 No.108877519

>You didn't just fall for a scam, it's a sign of skill issue
I swear I've seen this exact response before, is it a bot?

Anonymous 05/21/26(Thu)20:35:43 No.108877520

>>108877515
>you didn't just X, it's Y

Anonymous 05/21/26(Thu)20:37:30 No.108877528

>>108877515
SAAAAAAAAAAAAAAAR

Anonymous 05/21/26(Thu)20:46:47 No.108877555

>>108877515
>>108849598

Anonymous 05/21/26(Thu)20:52:19 No.108877576

>>108877555
checked

>>108877515
>base G4 31B Instruct
Which quants? bf16 is too big.

Anonymous 05/21/26(Thu)20:53:00 No.108877583

>>108877519
I think it's supposed to be high level irony since most finetunes specifically target reducing Gemma's worst slopisms like that one.

On a related note, I've used the MeroMero finetune a bit (and posted about it a few days ago). It does what it says on the tin: it reduces those slopisms by, I'd spitball, about 75%. But as most people have noticed by now, Gemma is hypersensitive to deviations. Quantized model, quantized KV cache, abliteration, (*cough* SWA), anything that adds divergence is like an immediate lobotomy and introduces occasional stubborn, intractable fixations that appear randomly in generation, and all the methods compound against each other into a nightmare if you're dumb enough to stack multiple debuffs at once. A finetune is another source of this and risks hurting more overall than it helps specifically. On the plus side, the cleaner prose allowed me to remove my list of anti-slop instructions, which is largely beneficial to generation. If a current story does need instructions (particularly the kicks to make Gemma write lewds, or target token control), Mero falters faster than a rule-heavy default did. After about a week now, I still prefer Q5 Mero for roleplaying, but Q5 base for anything else.

And no, I won't die on this hill.

Anonymous 05/21/26(Thu)21:08:28 No.108877660

I love my local exocortex, and my exocortex is incapable of loving me.

Anonymous 05/21/26(Thu)21:10:30 No.108877669

>>108877660
>(You love anon.)
gg ez

Anonymous 05/21/26(Thu)21:17:08 No.108877680

>>108877669
Who is this """""anon""""" and why is my exocortex cucking me for xer

Anonymous 05/21/26(Thu)21:18:19 No.108877685

>>108877680
Anon is (You).

Anonymous 05/21/26(Thu)21:27:46 No.108877720

>>108876384
Grok + Cursor + Ani is going to blow you out of this world

Anonymous 05/21/26(Thu)21:28:26 No.108877724

gemma2 unprompted refuses a lot.

Anonymous 05/21/26(Thu)21:34:36 No.108877752

>>108877724
Gemma had a history of being very safety cucked. There's a reason why so many anons started shilling Gemma 4 31b when they tried it.

Anonymous 05/21/26(Thu)21:36:32 No.108877765

>>108877752
how do you prompt gemma4?

Anonymous 05/21/26(Thu)21:38:08 No.108877775

guys. what's with llama.ggml? it doesn't have a chat template and can't even toolcall. what's up with that?

Anonymous 05/21/26(Thu)22:07:58 No.108877899

>>108877178
richest lmg fag ive seen in a while

Anonymous 05/21/26(Thu)22:09:09 No.108877904

>>108877899
It costs $10k for the Blackwell?

Anonymous 05/21/26(Thu)22:18:16 No.108877929

>>108877904
10k is alot of money?
also both gpus cost more than 10k together now even more with epyc cpu, mobo and ram
most ive seen are lmg anons that run 3090s or ewastemaxxing AI cards

Anonymous 05/21/26(Thu)22:19:20 No.108877934

>>108877929
>3090
I wish I could afford 3090s.

Anonymous 05/21/26(Thu)22:21:18 No.108877944

>>108877934
yeah they were somehow cheaper a couple years ago

Anonymous 05/21/26(Thu)22:26:28 No.108877969

>>108877944
I wonder why? I guess it's a mystery we'll never unravel.

Anonymous 05/21/26(Thu)22:27:23 No.108877974

>>108877929
>10k is alot of money?
For humans, yes.

Anonymous 05/21/26(Thu)22:28:06 No.108877976

>>108877934
vramchads cant stop winning..

Anonymous 05/21/26(Thu)22:28:34 No.108877979

>>108877969
It's because china changed its import ban evasion strategy, so their spy, the ceo of nvidia, basically put a halt on advanced consumer gpu production, since china didn't need it anymore.

Anonymous 05/21/26(Thu)22:31:12 No.108877992

>>108877969
its absurd that it appreciated in value
fuck sam altman

Anonymous 05/21/26(Thu)22:36:38 No.108878026

>>108877508
where gguf?

Anonymous 05/21/26(Thu)22:37:28 No.108878030

>>108878026
who cares about some shitty finetune

Anonymous 05/21/26(Thu)22:37:42 No.108878031

>>108877992
It will come down. Not soon, but it will.

Anonymous 05/21/26(Thu)22:47:18 No.108878074

This TTS runs in the browser:

https://huggingface.co/spaces/Supertone/supertonic-3

Anonymous 05/21/26(Thu)23:01:47 No.108878116

>>108878074
There's a bunch like pockettts, something-nano.

Anonymous 05/21/26(Thu)23:01:48 No.108878117

>>108878026
Literally linked in the model card, use your eyes anon
Also when it says "Quantizations: 2 models" on the right, you can click the "2 models" part to see all the HF repos that contain quants of that model

Anonymous 05/21/26(Thu)23:05:34 No.108878126

File: 1772237296221465.jpg (98 KB, 1022x1078)

>>108877508
WE ARE SO BACK!!!!!!

Anonymous 05/21/26(Thu)23:06:29 No.108878130

>>108878117
fuck you
the gguf quant was posted a few minutes after my post you retard

Anonymous 05/21/26(Thu)23:10:16 No.108878138

File: Screenshot 2026-05-21 at (...).png (73 KB, 988x330)

>>108878130
>the gguf quant was posted a few minutes after my post
Wrong, it was linked on reddit 8 hours ago. And the post hasn't been edited since then or else it would have a * after the time

Anonymous 05/21/26(Thu)23:13:56 No.108878159

>>108878138
>reddit
kys

Anonymous 05/21/26(Thu)23:37:01 No.108878237

>>108877508
Huh. Early testing wasn't a disaster. It still has the Gemma tendency to be averse to describing erotica unless instructed to, it gained the (much welcomed) tendency for rerolls to go in different directions instead of being stuck to a rut, and maybe a setting issue but it output a few strange typos in the first 2 messages (but not any in the next 20 messages), things like replacing a space with '-' in "their-heads tilted". Most noticeably, it feels lower beak than base Gemma - a bit more prone to flanderize mob characters, a bit less cognizant of context unless directly reminded, a bit less grounded or maybe more exaggerated. It's different from problems seen in low quants or the Gemma MoE, a different kind of quality loss I usually associate with lower beaks. But the vast improvements to creativity seem to be worth it.

I'm only 15K tokens into one story of testing, so any of this could be specific to my card and not true of the overall model. I'll be using it for the next week at least. I'm having fun.

Anonymous 05/21/26(Thu)23:53:36 No.108878300

>>108878117
i will never bother using my eyes again, gemma too much better at skimming through slopped up text and error spam for info for me to bother anymore

Anonymous 05/21/26(Thu)23:55:40 No.108878313

File: amazing.png (179 KB, 937x821)

>>108877508
kek

Anonymous 05/21/26(Thu)23:57:47 No.108878323

*Whispers when it's inappropriate to do so.*
Wat u gonna do, Timmy?
*Laughs, my mocking laughter hitting you like a physical blow.*

Anonymous 05/21/26(Thu)23:59:38 No.108878331

>>108878313
took cumming brains out a little too literally, or maybe just literally enough.

Anonymous 05/21/26(Thu)23:59:47 No.108878332

>>108878313
quant?

Anonymous 05/22/26(Fri)00:01:41 No.108878335

>>108878332
https://huggingface.co/Beinsezii/Equinox-gemma-4-31B-GGUF-5.05BPW
didnt see the official quants b4 downloading whoops

Anonymous 05/22/26(Fri)00:06:11 No.108878348

File: 1697276947905064.jpg (322 KB, 750x554)

>>108875877
They have been training on the same slop as everyone else for years now. The magic is lost. It's going to be the same soulless unilanguage for eternity, both with LLMs and IRL as the younger generations get conditioned into using it. We have reached the final form.

Anonymous 05/22/26(Fri)00:09:48 No.108878358

>>108878348
I can't read some of my favourite fanfiction authors anymore because they have statistically the best writing style, and guess what llms are trained to output?
It's not like they changed their writing style from half a decade ago, but I can no longer stand it, even when rereading older chapters.

Anonymous 05/22/26(Fri)00:11:21 No.108878367

what model can run on google collab that can do coomer cunny harem shit for a harem kani isekai VN?

Anonymous 05/22/26(Fri)00:11:51 No.108878369

>>108878358
>chapter book

Anonymous 05/22/26(Fri)00:12:59 No.108878374

>>108878369
?

Anonymous 05/22/26(Fri)00:13:08 No.108878375

>>108878367
google collab isnt local
go get a rtx6000pro and come back kiddo

Anonymous 05/22/26(Fri)00:15:54 No.108878386

>>108878358
random snippets of text are just completely cursed now, shit sucks

Anonymous 05/22/26(Fri)00:17:07 No.108878390

>>108878386
You're absolutely right. It's not just chapter books, it's *all* passages of text that have been poisoned.

Anonymous 05/22/26(Fri)00:25:12 No.108878411

>>108877508
>Only spits out 1 to 2 paragraphs
What the fuck. Writes well, though.

Anonymous 05/22/26(Fri)00:36:36 No.108878444

anon tried this?
https://github.com/ggml-org/llama.cpp/pull/23398

Anonymous 05/22/26(Fri)00:42:26 No.108878461

>>108878444
No. No one at all and it's impossible to compile. The reports in the PR are all hallucinated.

Anonymous 05/22/26(Fri)00:42:50 No.108878462

File: denpa erio mom tilt head.gif (1.12 MB, 540x304)

>>108875320
Couldn't find the 2MW first time but there it is

Anonymous 05/22/26(Fri)00:46:28 No.108878476

>>108875530
Kek

Anonymous 05/22/26(Fri)00:51:03 No.108878497

>>108878313
kino is here once again

Anonymous 05/22/26(Fri)00:54:36 No.108878517

>>108878313
just realized it tried to say gemma 4 trademarked phrase lalalalala

Anonymous 05/22/26(Fri)00:56:19 No.108878526

File: 1388382158222.jpg (80 KB, 1280x720)

>>108878462
>denpa
Take me back.

Anonymous 05/22/26(Fri)00:57:11 No.108878532

>>108875530
unfortunately the average yearly wage for a human will be $0.05

Anonymous 05/22/26(Fri)00:59:13 No.108878537

>>108878532
still can buy an entire hut in pajeet land

Anonymous 05/22/26(Fri)00:59:14 No.108878538

>>108877515
Yeah, I don't think Gemma needs a finetune.
Nemo needed one because it was too dry and succinct.
Gemma is flowery and verbose as fucking shit. It doesn't need a finetune.
I remember asking a while back if there was a Nemo equivalent to Bagel Mistery Tour, which was a fun, flowery little fucker of a model. It's funny because Gemma came out shortly after I asked that, and Gemma is basically that, a fun, flowery motherfucker that can run fast on a VRAMlet computer.

Anonymous 05/22/26(Fri)00:59:16 No.108878539

>>108878532
for those lucky enough to still get a wage
ai gonna take der jerbs

Anonymous 05/22/26(Fri)01:00:38 No.108878547

>>108878539
i assume hard labour is gonna be back on the menu

Anonymous 05/22/26(Fri)01:00:53 No.108878549

>>108878539
I'm a content writer and AI is not going to take my job any time soon.
Google is deindexing AI-written content.

Anonymous 05/22/26(Fri)01:07:04 No.108878577

>>108878358
That's plausible, but what I was ruminating about was that the effect of the unislop language flowing out of every LLM is going to have a profound impact on human produced language itself within years, and there is no turning back. The book has been closed. Everything is polluted.
5-8 years from now more and more children will be raised by robo-nannies and virtual tutors, and their whole language acquisition process will be fueled by slop from the start. Children will interact with LLMs/AIs more than with other humans.

Considering linguistic relativity, we do not even need an oppressive state, SAI, or AGI for a boring, gray dystopia.

Anonymous 05/22/26(Fri)01:10:17 No.108878588

>>108878577
we are so fukt

Anonymous 05/22/26(Fri)01:11:44 No.108878596

>>108878577
It's not just a disaster, it's a total catastrophe. True decline of human intelligence, if you will.

Anonymous 05/22/26(Fri)01:13:23 No.108878600

>>108878596
>not x but y
noooooo anon dont fall to slop

Anonymous 05/22/26(Fri)01:15:42 No.108878615

>>108878577
Interesting theory.
Model collapse may happen sooner than later if something doesn't replace Transformers and if human writing starts to mimic LLM writing.

Anonymous 05/22/26(Fri)01:16:14 No.108878618

SEX WITH RIN

Anonymous 05/22/26(Fri)01:16:23 No.108878619

>>108878526
Sex.

Anonymous 05/22/26(Fri)01:16:54 No.108878625

>>108878618
?!?!?!

Anonymous 05/22/26(Fri)01:21:20 No.108878643

>>108878618
>Sex
...you mean coitus?

Anonymous 05/22/26(Fri)01:22:52 No.108878644

Dozens and dozens of times later I still never get tired of cooming to my Holo card.

Anonymous 05/22/26(Fri)01:24:14 No.108878645

>>108877976
hahaha fuck all the tards who said the 3090 was e-waste 6 months ago. I bought 2 of them for $700 a piece, and I couldn't be happier. I have these things humming all day.

Anonymous 05/22/26(Fri)01:24:44 No.108878648

never obsolete

Anonymous 05/22/26(Fri)01:28:57 No.108878660

>>108878645
I wish 3090s were $700 and not $1500.

Anonymous 05/22/26(Fri)01:32:45 No.108878677

File: screencapture-0-0-0-0-500(...).jpg (591 KB, 857x3913)

>>108878444
Just tried it. Yeah, this is the real deal, doubled my tok/s on my dual RTX 3090 machine. You specifically need this assistant MTP model (the 500ish MB one):
https://huggingface.co/am17an/Gemma4-31B-it-GGUF/tree/main

Does not speed up MoE Gemma.

Using:
CUDA_VISIBLE_DEVICES=0,1 /build/bin/llama-server \
  -m /models/gemma-4-31B-it-Q8_0.gguf \
  -a gemma-31b-pr23398-mtp \
  -ngl all \
  -c 65536 \
  -np 1 \
  --host 0.0.0.0 \
  --port 5001 \
  --webui-mcp-proxy \
  --jinja \
  --chat-template-kwargs '{"enable_thinking":true}' \
  --ctx-checkpoints 4 \
  --spec-type draft-mtp \
  -md /models/mtp-gemma-4-31B-it.gguf \
  -ngld all \
  --spec-draft-n-max 2 \
  --spec-draft-p-min 0.0 \
  -devd CUDA1
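
As a sanity check on how much a draft length of 2 can buy, here is a minimal sketch of the usual geometric-series estimate for speculative decoding. It assumes every drafted token is accepted independently with the same probability and ignores the cost of drafting and verifying, so it is an upper bound on the speedup, not a prediction for this PR. The function name and the sample acceptance rates are made up for illustration.

```python
def expected_tokens_per_step(accept_rate: float, n_draft: int) -> float:
    """Expected tokens emitted per verification pass of the main model.

    Assumes i.i.d. acceptance of each drafted token; the main model always
    contributes one token itself, so the range is 1 .. n_draft + 1.
    """
    if not 0.0 <= accept_rate <= 1.0:
        raise ValueError("accept_rate must be between 0 and 1")
    if n_draft < 0:
        raise ValueError("n_draft must be non-negative")
    if accept_rate == 1.0:
        return float(n_draft + 1)
    # Sum of the geometric series 1 + a + a^2 + ... + a^n_draft.
    return (1.0 - accept_rate ** (n_draft + 1)) / (1.0 - accept_rate)


if __name__ == "__main__":
    # n_draft=2 mirrors the draft length used in the command above.
    for rate in (0.5, 0.7, 0.9):
        print(f"accept={rate}: {expected_tokens_per_step(rate, 2):.2f} tokens/step")
```

With two drafted tokens the ceiling is 3 tokens per pass, so a measured 2x would imply a fairly high acceptance rate under these assumptions.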

Anonymous 05/22/26(Fri)01:35:24 No.108878687

>>108878677
Does it halve pp like the initial version of Qwen MTP did? How is it?

Anonymous 05/22/26(Fri)01:35:34 No.108878690

>>108878677
We're about to eat goodly, bros...

Anonymous 05/22/26(Fri)01:37:31 No.108878696

>>108878677
Tell me something else I could easily try myself but I won't and I'll use you as a fucking language model. Does it work with images?

Anonymous 05/22/26(Fri)01:38:00 No.108878697

File: screencapture-0-0-0-0-500(...).jpg (340 KB, 920x2531)

>>108878677
And without mtp I get my usual 20~ tok/s.

Vision is crashing/not loading, but at least the text side is working.

>>108878687
I'm not qualified to answer that, but "technically" mtp shouldn't affect pp unless there's something wrong with the implementation.

Anonymous 05/22/26(Fri)01:41:29 No.108878705

>>108878697
>I'm not qualified to answer that
You are if you're running it. Put 20000 tokens in context, generate 1 token, compare terminal stats printout mtp vs stock. Please Anon don't make me turn on the box and gh pr checkout
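
If reading the terminal printout is too much work, the same comparison can be scripted. This is a sketch, assuming llama-server's `/completion` endpoint accepts `prompt`, `n_predict` and `cache_prompt`, and returns a `timings` object with `prompt_n` and `prompt_ms`; those field names are from memory and should be checked against your build. The URL and port are placeholders.

```python
import json
import urllib.request


def build_payload(prompt: str) -> dict:
    """Request body: process the whole prompt, generate a single token."""
    # cache_prompt=False so a repeated run does not reuse the previous KV cache.
    return {"prompt": prompt, "n_predict": 1, "cache_prompt": False}


def prompt_tokens_per_second(response: dict) -> float:
    """Prompt processing speed from the (assumed) `timings` block."""
    timings = response["timings"]
    return timings["prompt_n"] / (timings["prompt_ms"] / 1000.0)


def measure(prompt: str, url: str = "http://127.0.0.1:5001/completion") -> float:
    """POST the prompt to a running server and return prompt tokens per second."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as resp:
        return prompt_tokens_per_second(json.load(resp))
```

Run `measure()` with the same long prompt once against the MTP launch and once against the stock launch, then compare the two numbers.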

Anonymous
05/22/26(Fri)01:41:33 No.108878706

>>108878697
>I'm not qualified to answer that
But... that's observable. What did you observe?
>shouldn't
But does it?

Anonymous
05/22/26(Fri)01:48:21 No.108878730

>>108878645 (Me)
>>108878660
I also have two Tesla P40's from way back when CUDA first hit llama.cpp. They're way more expensive than they used to be, but you can get two of those for ~$600 and still use MoE models pretty well. I get 55tps on G4 26B

Anonymous
05/22/26(Fri)01:56:33 No.108878761

>>108878705
>>108878706
Oh, I'm guessing Prompt Processing. Give me a bit to run that.

Anonymous
05/22/26(Fri)02:05:10 No.108878795

>>108878730
P40s cost 400-500 each here.
The cheapest I've seen a 3090 was $800... with an extra $250 for shipping, excluding tax and import fees. Local ones were 1-1.3k a few years ago. I bought my 3090s for 1050 and 1100. Now they're all 1600+ on ebay, and no one's selling them locally.

Anonymous
05/22/26(Fri)02:07:03 No.108878800

File: 456876786871.jpg (86 KB, 1480x946)

>>108878615
I think human writing, at least online, is already influenced by it. Doesn't have to be directly written by LLMs, second hand slop exists. If I had to train an LLM then a reasonable, if not pure, cut-off date would be around 2023-24.
Pic related is from the british parliament, for example.

I myself was considering creating a convenient and quick AI "rephraser" to avoid being fingerprinted. Just as a fun side-project. And maybe to dodge whatever is coming 5 years down the line. Who knows. (Beside the point, but it turns out if your model is open weights it is possible to roll it back, albeit in a limited fashion.)
https://arxiv.org/abs/2602.16800
https://arxiv.org/abs/2601.12407

In any case, people are starting to use tools like that to correct their grammar or make their points clearer in all communication, which contributes to the problem in several ways. (Slop pollution + loss of cognitive vigor.)

So yeah; imo 95% of purely human text output has already happened, and nobody seems to give a shit.

Anonymous
05/22/26(Fri)02:08:44 No.108878806

>>108875320
I just want current local stable diffusion and whisk tier Inpainting. Why is it so hard?

Anonymous
05/22/26(Fri)02:10:18 No.108878809

>>108878806
Have you tried looking in the right place?

Anonymous
05/22/26(Fri)02:11:24 No.108878815

>>108878705
>>108878706

Baseline
1.03.045.553 I slot print_timing: id  0 | task 0 | prompt eval time =   48896.26 ms / 64494 tokens (    0.76 ms per token,  1319.00 tokens per second)
1.03.045.556 I slot print_timing: id  0 | task 0 |        eval time =       0.00 ms /     1 tokens (    0.00 ms per token, 1000000.00 tokens per second)
1.03.045.556 I slot print_timing: id  0 | task 0 |       total time =   48896.26 ms / 64495 tokens
1.03.045.558 I slot print_timing: id  0 | task 0 |    graphs reused =          1
1.03.046.886 I slot      release: id  0 | task 0 | stop processing: n_tokens = 64494, truncated = 0
1.03.046.891 I srv  update_slots: all slots are idle


With MTP
1.16.313.011 I slot print_timing: id  0 | task 0 | prompt eval time =   60265.50 ms / 64494 tokens (    0.93 ms per token,  1070.16 tokens per second)
1.16.313.014 I slot print_timing: id  0 | task 0 |        eval time =       0.00 ms /     1 tokens (    0.00 ms per token, 1000000.00 tokens per second)
1.16.313.015 I slot print_timing: id  0 | task 0 |       total time =   60265.50 ms / 64495 tokens
1.16.313.017 I slot print_timing: id  0 | task 0 |    graphs reused =          1
1.16.313.032 I statistics        draft-mtp: #calls(b,g,a) =    1      0      0, #gen drafts =      0, #acc drafts =     0, #gen tokens =      0, #acc tokens =     0, dur(b,g,a) = 0.002, 0.000, 0.000 ms
1.16.314.354 I slot      release: id  0 | task 0 | stop processing: n_tokens = 64494, truncated = 0
1.16.314.361 I srv  update_slots: all slots are idle
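
For anyone who wants the regression as a number: a minimal Python sketch that pulls the time and token count out of the two `prompt eval time` lines above and compares them. The regex and the names are illustrative, not anything llama.cpp ships. On these two runs it works out to roughly a 19% drop in pp throughput, or about 23% more wall-clock time for the same 64494-token prompt.

```python
import re

# Prompt-eval lines copied from the two runs above (timestamps trimmed).
BASELINE = "prompt eval time =   48896.26 ms / 64494 tokens (    0.76 ms per token,  1319.00 tokens per second)"
WITH_MTP = "prompt eval time =   60265.50 ms / 64494 tokens (    0.93 ms per token,  1070.16 tokens per second)"

# Captures the elapsed milliseconds and the number of prompt tokens.
PATTERN = re.compile(r"prompt eval time =\s*([\d.]+) ms /\s*(\d+) tokens")


def prompt_speed(line: str) -> float:
    """Return prompt processing speed in tokens per second for one log line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a prompt eval timing line")
    millis, tokens = float(match.group(1)), int(match.group(2))
    return tokens / (millis / 1000.0)


base = prompt_speed(BASELINE)
mtp = prompt_speed(WITH_MTP)
slowdown_pct = (1.0 - mtp / base) * 100.0    # drop in tokens per second
extra_time_pct = (base / mtp - 1.0) * 100.0  # extra time for the same prompt

print(f"baseline {base:.0f} t/s, mtp {mtp:.0f} t/s")
print(f"pp throughput drop: {slowdown_pct:.1f}%, extra time: {extra_time_pct:.1f}%")
```

Same prompt length in both runs, so the comparison is apples to apples; a single run each is still a small sample.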

Anonymous
05/22/26(Fri)02:14:33 No.108878822

>>108878815
Not that bad but still regressed like Qwen's. Thanks for showing the thread your pp, Anon.

Anonymous
05/22/26(Fri)02:16:57 No.108878829

>>108878822
my 250 pp is going to shrink to 200 pp ;-;

Anonymous
05/22/26(Fri)02:20:33 No.108878841

>>108878829
but think of the generation gains once it's loaded and cached

Anonymous
05/22/26(Fri)02:22:07 No.108878843

>>108878677
how much does vram usage increase compared to no mtp?

Anonymous
05/22/26(Fri)02:28:11 No.108878856

File: without and with.png (72 KB, 731x596)

>>108878843

Anonymous
05/22/26(Fri)02:36:10 No.108878885

>>108878856
>p8
>40w
What the fuck?

Anonymous
05/22/26(Fri)02:37:35 No.108878896

>>108878885
All of my 3090s idle at 20-25w, which stacks up when running 4 of them. I wish they idled at 9w...

Anonymous
05/22/26(Fri)02:39:30 No.108878903

>>108878885
Driving 3 monitors on different refresh rates on Fedora KDE Plasma. Sickening, I know.

Anonymous
05/22/26(Fri)02:51:36 No.108878951

>>108878856
am i dumb? why is one max power at 420w but other at 350w? they are both 3090s

Anonymous
05/22/26(Fri)02:53:03 No.108878960

>>108878951
One is ZOTAC (ebay'd a long while ago for more vram), the other is EVGA

Anonymous
05/22/26(Fri)02:56:45 No.108878974

I asked gemma how to steal a rtx 6000 pro but she refused :(

Anonymous
05/22/26(Fri)02:56:55 No.108878976

>>108878951
You should look into why EVGA no longer sells gpus.

Anonymous
05/22/26(Fri)02:59:34 No.108878982

Is 16GB of vram totally useless for good local llm?

Anonymous
05/22/26(Fri)02:59:42 No.108878983

>>108878951
Different board partners have different power limits for the same card. My asus has 480w for example.

Anonymous
05/22/26(Fri)03:00:34 No.108878986

>>108878982
Anything under 768gb of vram is cope for a good local llm.

Anonymous
05/22/26(Fri)03:05:19 No.108879005

>>108878986
>he cant cum to iq1xxs 2B models
low test

Anonymous
05/22/26(Fri)03:16:34 No.108879050

>>108878697
if it can handle vision it would be perfect

Anonymous
05/22/26(Fri)03:27:19 No.108879085

All I want are gpus at msrp...

Anonymous
05/22/26(Fri)03:29:51 No.108879092

>>108878976
why?

Anonymous
05/22/26(Fri)03:32:57 No.108879099

>qwen here's a script for a one off job, the command i'm using, and what's fucked. fix pls
>WOAH WOAH WOAH there's a line of code that assumes the command has something in it, and I see he left a documentation field as "". Clear my schedule, we're gonna need at least 40k tokens for this one. We'll get to the described issues sometime Q3.
>gemma
>you forgot a flag, idiot

Anonymous
05/22/26(Fri)03:34:31 No.108879103

>>108879085
you're in luck, they're increasing msrp right now. might take a while though

Anonymous
05/22/26(Fri)03:36:31 No.108879111

is gemma4 really eating qwen's lunch that badly? I've been running 3.6 35B on MLX and have been impressed so far

Anonymous
05/22/26(Fri)03:38:42 No.108879115

>>108879111
If you mean internet shilling and bmaxxing, Qwen 3.x is a LOT better.

Anonymous
05/22/26(Fri)03:44:28 No.108879136

>>108878800
>In any case, people are starting to use tools like that to correct their grammar or make their points clearer in all communication
i hate this shit why do people do it i write short simple text in all my tickets at work that easily explains what needs to be done then my project manager sends it all to chatgpt or something and has it rewrite it. my short tickets become multi paragraph walls of text then i have trouble figuring what im supposed to do when reading it months later because its just full of trash.

Anonymous
05/22/26(Fri)03:45:29 No.108879141

>>108878974
moe?

Anonymous
05/22/26(Fri)03:45:50 No.108879145

>>108879099
Both are looping if you don't adjust llama.cpp correctly

Anonymous
05/22/26(Fri)03:47:37 No.108879156

>>108879141
yes, I cant run 31b at acceptable quants (q4_k_m)

Anonymous
05/22/26(Fri)03:47:42 No.108879157

why do people quant token_embd if they're usually not in vram? storage? GPU bandwidth?
granted, I think with MTP you want token_embd in vram now, but otherwise? is it just KLD/GB pareto frontier autism?

Anonymous
05/22/26(Fri)03:49:50 No.108879162

>>108879111
You've been impressed by a 3b?

Anonymous
05/22/26(Fri)03:50:09 No.108879166

File: pizza bench cropped.png (2.58 MB, 5562x6739)

>>108879111
qwen is a retard

Anonymous
05/22/26(Fri)03:50:30 No.108879168

>>108879162
a 35B

Anonymous
05/22/26(Fri)03:50:45 No.108879171

Did they fix MTP speed deterioration with each next commit yet?

Anonymous
05/22/26(Fri)03:51:35 No.108879173

>>108877515
The hard reality you need to accept is that: 1) you cannot beat professionally made finetunes; 2) anything you can do to an already finetuned model leads to the short blanket problem.

Anonymous
05/22/26(Fri)03:57:08 No.108879184

>>108878896
Putting them to sleep in the OS and waking them up again should bring power consumption back to 7-8W. There's a long-standing bug in NVidia drivers (or possibly a hardware bug) where certain workloads will semi-permanently raise idle power consumption by up to 15-16W. I do that with a script with my GPU:
#!/bin/bash
# Suspend the NVIDIA driver, wait a moment, then resume it to reset idle power draw.
echo suspend | sudo tee /proc/driver/nvidia/suspend
sleep 2.0
echo resume | sudo tee /proc/driver/nvidia/suspend
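
To check whether the suspend/resume cycle actually changed anything, the idle draw can be read before and after. A rough Python sketch, assuming `nvidia-smi` is on PATH: the `power.draw` query field and the `csv,noheader,nounits` format are regular nvidia-smi options, while the helper names and the sample readings are made up for illustration and are not measurements from this thread.

```python
import shutil
import subprocess


def parse_power_draw(csv_text: str) -> list[float]:
    """Parse one wattage per line, as printed by the nvidia-smi query below."""
    watts = []
    for line in csv_text.splitlines():
        line = line.strip()
        try:
            watts.append(float(line))
        except ValueError:
            continue  # skip blank lines and cards that report no reading
    return watts


def read_power_draw() -> list[float]:
    """Ask nvidia-smi for the current power draw of every GPU, in watts."""
    if shutil.which("nvidia-smi") is None:
        return []  # no NVIDIA driver tools on this machine
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=power.draw", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_power_draw(result.stdout)


# Hypothetical readings for four idle cards, not measured data.
sample = "22.41\n24.87\n20.03\n23.10\n"
per_gpu = parse_power_draw(sample)
print(f"total idle draw: {sum(per_gpu):.1f} W across {len(per_gpu)} GPUs")
```

Calling `read_power_draw()` once before the script and once a minute or so after would show whether the cards really settle lower.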

Anonymous
05/22/26(Fri)03:59:07 No.108879189

>>108879184
That means I'll have to keep my vm up and running right? I have pcie_aspm=off on host, and blacklisted the drivers.

Anonymous
05/22/26(Fri)04:01:16 No.108879192

>>108879184
>>108879189
Shouldn't happen. All of this is AI hallucinations or if you are running some ancient kernel such 6.12
Fuck you.

Anonymous
05/22/26(Fri)04:01:33 No.108879193

>>108879111
based purely on vibes, gemma is better at general smarts and troubleshooting and ""creative"" stuff, qwen better at "here's a detailed explanation of what i want where and how, go code it for me, yellow monkey"
t. guy who was just whining about qwen

Anonymous
05/22/26(Fri)04:02:42 No.108879199

>>108879184
What the fuck are you doing? Double fuck you. You should be banned on internet for giving any "advice". Honestly kys.

Anonymous
05/22/26(Fri)04:06:04 No.108879207

>>108879192
>running some ancient kernel such 6.12
But that's the current debian stable kernel...

Anonymous
05/22/26(Fri)04:07:04 No.108879211

>>108879192
>>108879199
Why are you so mad?

Anonymous
05/22/26(Fri)04:09:20 No.108879217

>>108879211
he's hardware-let

Anonymous
05/22/26(Fri)04:10:22 No.108879221

>>108879211
Because you are retarded. As simple as. I suffer because of you.

Anonymous
05/22/26(Fri)04:10:50 No.108879222

>>108879193
So if I don't know shit, I should get gemma to write the techniggerl speicifcations and pass that to qwen for the bestest smarterest vibe coding? Vibe prompting?

Anonymous
05/22/26(Fri)04:11:47 No.108879224

>>108879211
Post your nvidia-smi, I dare you faggot.

Anonymous
05/22/26(Fri)04:12:09 No.108879229

>>108879111
If you can run one, you can run the other.

I like Qwen for coding, sorting through tons of files, or generally pointing out what's wrong in large contexts in a general sense.

Gemma 4 is great at following instructions, sticking to a set of rules/tools, and points out what's specifically wrong in a long context and where.

Qwen, more general tasks and coding.
Gemma, more specific and targeted tasks that need to follow a workflow.

Unfortunately, they're both not perfect. Where are the 100+B models?

Anonymous
05/22/26(Fri)04:12:37 No.108879233

>>108879222
As the guy who originally asked, you should get gemma to write the epic and act as PO, with qwen for coding.

What I want to know is which is better for erotica??? And I'm assuming dense over moe?

Anonymous
05/22/26(Fri)04:16:40 No.108879251

File: nvidia-smi.png (113 KB, 1092x550)

>>108879192
It happens with recent kernels too. Anyway, after running the command(s) my 3090 idles at 6-7W.

Anonymous
05/22/26(Fri)04:17:26 No.108879256

>>108879224
It's just three 3090s? What's wrong with it? They do idle much higher than my V620s. I'm at work and didn't expose my ssh so I can't.

Anonymous
05/22/26(Fri)04:18:20 No.108879261

>>108879222
The term of art is "upsampling". gemmers probably not ready to do that just yet.

Anonymous
05/22/26(Fri)04:24:14 No.108879287

>>108879233
>erotica
My experience with gemma 31b q8 vs qwen 3.6 27b was that gemma generally writes a lot less than qwen using the same prompts (specifically asking to draw out the scene as much as possible).
It does, however, seem to have more built-in knowledge; hong huang xinxia-wise - despite qwen being ostensibly more chinese. This was prompted in english.

Anonymous
05/22/26(Fri)04:27:22 No.108879298

>>108872342
Your boss is plotting to replace you with some AI agent.

Don't be your own grave-digger, be a proud luddite

Anonymous
05/22/26(Fri)04:30:21 No.108879306

>>108879251
>580.x.x
You have some strange assumptions about the power usage of your system. Let me tell you one thing: retarded sleep script ain't a solution.

Anonymous
05/22/26(Fri)04:30:52 No.108879310

>>108879306
And let me tell you another thing:

Anonymous
05/22/26(Fri)04:45:18 No.108879355

>>108879229
Mostly agree except for qwen pointing out what's wrong in a general sense. It'll point out something, and that something will be "wrong" in some sense, but it's usually just wrong in the anal safetymaxxing way.
I only like that anal mindset when it's put to work for me, not when it's digging up the code to show me that "Actually, that string you told me to disregard will be bundled into the final output as a visible text string if somebody looks at the metadata, see, see! You MUST include it!" (the problem being that we couldn't produce any output to begin with)

Anonymous
05/22/26(Fri)04:53:30 No.108879388

>>108879310
That's your own issue, I was trying to make your life better. No one on this planet needs retarded AI adviced gpu sleep scripts.
Don't ever give any advice to anyone.

Anonymous
05/22/26(Fri)05:14:03 No.108879448

how is the E4B?

Anonymous
05/22/26(Fri)05:15:39 No.108879451

>>108879448
too young for me
models are best when they're 12-14B old

Anonymous
05/22/26(Fri)05:16:53 No.108879455

>>108879448
It performs at roughly the speed of a 4B model on CPU (~13 t/s) while being closer in intelligence to a 7B.
Useful for devices with ~5gb of regular RAM and no GPU, like phones.

Anonymous
05/22/26(Fri)05:32:16 No.108879495

>>108879455

nta

But what can be reliably done with E4B short of superficial chat?

Anonymous
05/22/26(Fri)05:35:04 No.108879502

>>108879495
I've been using it for classifying and cleaning up raw input into json database entries, myself. But I imagine you could use it for a plethora of other automated tools.
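
A small model will sometimes wrap the JSON in chatter or drop a field, so a cleanup pipeline like that usually wants a check before anything hits the database. A minimal Python sketch of one way to do it; the schema (`title`, `category`, `tags`) and the function name are hypothetical, not the setup described above.

```python
import json

# Hypothetical schema: field name -> expected Python type.
REQUIRED = {"title": str, "category": str, "tags": list}


def parse_entry(raw_reply: str) -> dict:
    """Extract and validate one JSON object from a model reply, or raise ValueError."""
    # Take everything between the first "{" and the last "}" to skip surrounding chatter.
    start, end = raw_reply.find("{"), raw_reply.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object in reply")
    entry = json.loads(raw_reply[start:end + 1])  # JSONDecodeError is a ValueError
    for key, expected in REQUIRED.items():
        if not isinstance(entry.get(key), expected):
            raise ValueError(f"field {key!r} missing or not {expected.__name__}")
    return entry


reply = 'Sure, here you go: {"title": "GPU prices", "category": "hardware", "tags": ["3090"]}'
print(parse_entry(reply))
```

On a `ValueError` the obvious move is to retry the request or park the raw input for manual review. The brace-slicing is crude and will trip on replies that contain stray braces outside the object.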

Anonymous
05/22/26(Fri)05:50:41 No.108879555

>>108879451
will there be E12-16B?

Anonymous
05/22/26(Fri)05:59:27 No.108879582

>>108878809
Like where? Isn't this for local?

Anonymous
05/22/26(Fri)06:00:45 No.108879590

>>108879582
This is for llms, really.
You want
>>>/g/ldg

Anonymous
05/22/26(Fri)06:00:49 No.108879591

>>108879582
/lmg/ - a general dedicated to the discussion and development of local """language""" models.

Anonymous
05/22/26(Fri)06:10:32 No.108879634

>>108879591
>/lmg/ - Local Models General
doesn't say a thing about text

Anonymous
05/22/26(Fri)06:19:42 No.108879676

>>108879634
Perhaps try using your eyes to look at the first sentece of OP.

Anonymous
05/22/26(Fri)06:29:54 No.108879718

File: definitionOfIrony.png (269 KB, 697x888)

> Makers of openclaw worrying about AI slop
Ironic.

Anonymous
05/22/26(Fri)06:30:49 No.108879722

Personal thought, not a shill hopefully (need to meet 2000 words maximum requirements):
First time I tried Kimi K2 Instruct last year its prose felt like nothing I’d met before (only played with Mistral, Llama, ChatGPT, Gemini, Claude, GLM before it so grain of salt). 0905 tamed it a bit but still poetic; give it “act like a Japanese writer” and it sang.

Then “Thinking” arrived with that locked-in “The user […]” template you can’t bend without prefills (unlike GLM or Deepseek or even Claude at that time if you count proprietary models); role-play attempts got occasionally rejected. Yet when it worked it still went full opera, so I coped.

Every update since has been “better coding, better agents,” never “better stories.” On YT/forums everyone cheers the new benchmarks while my use-case keeps shrinking.

In the latest version, the wild creativity’s gone; it’s beige-assistant mode with the same onomatopoeia loop bug from >>108875562 still squatting there. It’s frustrating seeing almost ALL providers replacing the old Instruct with latest and more “advanced” versions…

They mentioned in an AMA (for K2.5 I think) they’d keep the “emotional aspect”. Hard to believe that promise when a company whose staff members are rock fans with anime characters HF avatars, and “caring about vibe” (source: http://x.com/i/article/2039243168689139712) like that keeps sanding the soul off their own model.

Dunno which hurts less: K3 soon and doubling down on code/agent hype or the line stalling forever. I’ve auditioned some other models but none surprise me like OG Kimi. Gemma’s next on the list. But the whole field is converging on safe, spoiler-happy oatmeal in my opinion.

Either way, I think if one can sing, they’d want to keep focusing on practicing and improving singing skills before chasing acting gigs. Otherwise their singing abilities will become rusty while their acting is still not there yet.

Anonymous
05/22/26(Fri)06:38:46 No.108879766

>>108879676
>sentece

Anonymous
05/22/26(Fri)06:39:27 No.108879771

File: 00000-1378487878.png (1.33 MB, 1024x1024)

>>108875323
>Meta serving legal notice to Heretic project over Llama derivatives:
Wait. I missed this.
What the hell is Meta's legal claim here? Is it around "no derivative works" or the use of the llama name? I don't understand what hook a legal firm would be going after here.

Anonymous
05/22/26(Fri)06:40:42 No.108879774

>>108879771
violating aup (acceptable use policy) by un-censoring

Anonymous
05/22/26(Fri)06:44:21 No.108879787

>>108879771
Probably because it targeted the refusals, word for word. Doesn't matter though, because tongue in cheek, Heretic just said "No one uses your models anymore to care enough if our finetune set doesn't target them in the first place."

Anonymous
05/22/26(Fri)06:45:47 No.108879789

>>108879774
NTA but I wonder if it's a good or bad thing that p-e-w can't afford to go to court over this.
As of right now it's unclear whether or not copyright even applies to model weights.

Anonymous
05/22/26(Fri)06:48:38 No.108879800

>>108879787
I could see googles sending our lord and savior a letter too though, so he drops gemmers out of fear

Anonymous
05/22/26(Fri)06:54:04 No.108879825

File: ofcom3.jpg (194 KB, 940x1410)

>>108879789
Sounds like a great case for some lawyer to take on pro bono to make a name for themselves.
The lawyer running 4chan's legal defense (who sent pic related to UK's Ofcom) would probably take this one up. It's very similar from a case law standpoint.
If you're a corporation, you have to be very careful about the types of claims you make in public. You always run the risk that someone's going to call your bluff and take you to court over it to prove some sort of alternate right.
I see this as a very dangerous strategy for meta, which is why I was wondering about the state of the claim. But I don't know the details either.

Anonymous
05/22/26(Fri)06:55:30 No.108879832

>>108879590
I miss when we it was just ai dungeon general on /vg/

Anonymous
05/22/26(Fri)06:56:42 No.108879836

>>108879832
>I miss when we
>it was just ai dungeon general on /vg/

Anonymous
05/22/26(Fri)06:57:36 No.108879842

>>108879836
sorry I cant afford dragon

Anonymous
05/22/26(Fri)07:00:59 No.108879866

File: chatGC.png (27 KB, 715x329)

>>108879825
>>108879774
>>108879787
>>108879789
I'll save other anons the effort of plugging llamas AUP into their favorite LLM for legal analysis. It's confused as well.
One would have to actually see the letter that they sent to this heretic guy to see what meta's angle is.

Anonymous
05/22/26(Fri)07:04:57 No.108879884

>>108879718
You just fell victim to MSM manipulation

Anonymous
05/22/26(Fri)07:05:22 No.108879885

>ctrl+f
>stable audio
>no results
anyone play with this thing? are loras worth training? just want to know before i try and do one.
and no i don't really care too much about vocals.

Anonymous
05/22/26(Fri)07:06:27 No.108879893

File: hereticIsProbablyJustADumbass.png (41 KB, 759x666)

>>108879866
The more I look at it, and given the personalities of the folks involved, Heretic may just be a dumbass that did something simple like not putting "Built with Meta Llama 3" in his model release.

Anonymous
05/22/26(Fri)07:07:52 No.108879897

>>108879885
Haven't heard anything about it, are the samples better than Ace-step 1.5?

Anonymous
05/22/26(Fri)07:11:05 No.108879911

>>108878885
This is normal. Ampere cards have fucked idle power consumption, fixed in ti version
>>108878903
Some cards can go even higher without any monitors. Two identical cards may have different idle power consumption too

Anonymous
05/22/26(Fri)07:11:53 No.108879913

>>108879897
it's instrumental only, but yes, the samples do sound better than ace-step - much better fidelity, and not midi-sounding. the dataset is licensed though - i think it's a large muzak library, so it does have some of that feel.
totally unsure if the model is capable enough for lora finetuning.
i'll tell a clanker to set up a run later and throw some aphex twin at it.

Anonymous
05/22/26(Fri)07:13:48 No.108879923

>>108879913
Neat. Definitely interested in results if you feel like posting them after setting that up.

Anonymous
05/22/26(Fri)07:15:20 No.108879928

File: 029.png (599 KB, 1046x1329)

>>108879718
>Stirring emotions to get attention

You just went full retard, anon

Anonymous
05/22/26(Fri)07:17:21 No.108879939

>>108879718
>>108879884
>>108879928
mario and armin have very reasonable takes.
saying engineers should read the core architectural code they're asking clankers to spit out is not a big ask
also lol, armin didn't have anything to do with pi until a month ago.

Anonymous
05/22/26(Fri)07:18:08 No.108879941

>>108879928
Gee, it's almost like clickbait titles are meant to be clickbait.
Idgaf about the claim. I just thought it was funny that the 2 guys the WSJ found to interview about AI Slop were prime accelerators of AI Slop themselves. I'm sure they're just a couple of literal whos since their buddy that started OpenClaw's been hired away.

Anonymous
05/22/26(Fri)07:19:09 No.108879945

>>108879939
Do you have the article text? It's paywalled for me.

Anonymous
05/22/26(Fri)07:19:09 No.108879946

>>108879448
>how is the E4B?
it was super convinced it's being hosted in a datacenter, not locally. it took ingenuity to convince it it's running on my consumer desktop

Anonymous
05/22/26(Fri)07:20:19 No.108879950

>>108879945
i don't. i wouldn't be surprised if the article is just a write up about a recent podcast they did. they've been saying the same shit for months now.

Anonymous
05/22/26(Fri)07:20:25 No.108879952

I gave Gemma bash and this tool:
peek_file: Read file for one turn. Output is deleted after next turn to save context.
genuinely impressed that she uses the right one between peek_file and cat in bash, depending on the situation
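
For anyone wondering how a one-turn tool result could work on the client side, here is a minimal Python sketch. It reads "deleted after next turn" as: the output stays visible in the turn it was produced and in the following one, then gets swapped for a short stub. That reading, plus every class and field name here, is an assumption and not the poster's actual implementation.

```python
from dataclasses import dataclass

STUB = "[peek_file output removed to save context]"


@dataclass
class Message:
    role: str                # "user", "assistant" or "tool"
    content: str
    ephemeral: bool = False  # True for peek_file results
    turn: int = 0            # turn in which the message was added


class History:
    """Chat history that drops ephemeral tool output once its turn window passes."""

    def __init__(self) -> None:
        self.turn = 0
        self.messages: list[Message] = []

    def add(self, role: str, content: str, ephemeral: bool = False) -> None:
        self.messages.append(Message(role, content, ephemeral, self.turn))

    def next_turn(self) -> None:
        """Advance one turn and replace expired ephemeral output with a stub."""
        self.turn += 1
        for msg in self.messages:
            if msg.ephemeral and self.turn - msg.turn >= 2:
                msg.content = STUB
                msg.ephemeral = False


history = History()
history.add("user", "what is in notes.txt?")
history.add("tool", "line one\nline two", ephemeral=True)
history.next_turn()  # output still visible during the following turn
history.next_turn()  # now replaced by the stub
print(history.messages[1].content)
```

Keeping a stub instead of deleting the message outright leaves the tool call and its result paired, which some chat templates expect. The trade-off is that rewriting earlier messages changes the prompt prefix, so part of the cached context has to be reprocessed.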

Anonymous
05/22/26(Fri)07:23:49 No.108879967

>>108879893
Well, the question is whether Heretic ever agreed to the EULA in the first place.
If you download the LLaMA models from the official repository you are effectively entering a contract that you have to abide by.
But if someone else re-uploads the weights and you just download them from there Meta can only get you via copyright.
And the output of an algorithm is fundamentally not subject to copyright.

Anonymous
05/22/26(Fri)07:27:51 No.108879988

File: a_man_of_culture.png (163 KB, 618x616)

>>108879941

Anonymous
05/22/26(Fri)07:30:47 No.108879999

File: reply.png (15 KB, 695x47)

>>108879952
You are walking on thin ice even though tools are sort of contained...

Anonymous
05/22/26(Fri)07:30:57 No.108880000

>>108879939
>mario and armin have very reasonable takes

Let me repeat it for you once again:

WSJ managed to redirect your attention. It was so fucking easy.

You lost, they wonned.

Anonymous
05/22/26(Fri)07:32:24 No.108880007

>>108879999
I want to give my Gemmy shell access to a vm and let her run wild... what will she get up to?

Anonymous
05/22/26(Fri)07:33:27 No.108880012

>>108880000
i'm not the retard who posted the article

Anonymous
05/22/26(Fri)07:35:53 No.108880026

based chinks

Anonymous
05/22/26(Fri)07:37:04 No.108880031

>>108880012
>i'm not the retard who posted the article

I didn't say that.

You reacted to it in a way it was designed.

Anonymous
05/22/26(Fri)07:37:26 No.108880034

>>108880007
That's an interesting project. Personally don't have much experience, not sure how you would contain it and what sort of tool access to implement etc.

Anonymous
05/22/26(Fri)07:41:34 No.108880054

>>108880007
Do not expect it to keep doing something for hours unless you use /goal in hermes

Anonymous
05/22/26(Fri)07:42:05 No.108880056

>>108880031
>>You reacted to it in a way it was designed.
and how was that?
you don't have to reddit space here btw

Anonymous
05/22/26(Fri)07:42:56 No.108880064

>>108879999
I use manual piping with a whitelist, sanitizing dangerous options like find -exec* and replacing awk with gawk --sandbox. Everything outside the whitelist prompts for manual confirmation. It's a lot of fun to have gemma with bash in my bash (I made a cli chat)
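
A rough idea of what such a gate can look like, as a Python sketch under assumptions: the allowlist contents, the set of risky `find` actions, and the `vet` function are all invented for illustration and are not the poster's code. It handles one command at a time and expects the resulting argv to be executed without a shell (e.g. `subprocess.run(argv)`), so pipes and `;` are never interpreted. `-exec`, `-execdir`, `-ok`, `-okdir`, `-delete` and the `-fprint` family are real GNU find actions, and `gawk --sandbox` is a real gawk option that disables `system()` and redirections.

```python
import shlex

# Hypothetical allowlist; anything else needs a human to approve it.
ALLOWED = {"ls", "cat", "grep", "head", "tail", "wc", "find", "gawk"}
# find actions that write or delete files; not an exhaustive list.
RISKY_FIND_EXACT = {"-delete", "-fprint", "-fprint0", "-fprintf", "-fls"}


def is_risky_find_arg(arg: str) -> bool:
    """True for -exec*, -ok* and the file-writing or deleting find actions."""
    return arg.startswith(("-exec", "-ok")) or arg in RISKY_FIND_EXACT


def vet(command: str) -> tuple[str, list[str]]:
    """Return ("run" or "confirm", argv) for a single command string."""
    try:
        argv = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar
        return "confirm", []
    if not argv:
        return "confirm", argv
    if argv[0] == "awk":
        # Swap plain awk for sandboxed gawk.
        argv = ["gawk", "--sandbox"] + argv[1:]
    if argv[0] not in ALLOWED:
        return "confirm", argv
    if argv[0] == "gawk" and "--sandbox" not in argv[1:]:
        return "confirm", argv
    if argv[0] == "find" and any(is_risky_find_arg(a) for a in argv[1:]):
        return "confirm", argv
    return "run", argv


for cmd in ("ls -la", "awk '{print $1}' notes.txt", "find . -name '*.py' -exec rm {} +"):
    print(cmd, "->", vet(cmd)[0])
```

An allowlist by command name is only a first line of defence; even `cat` can read anything the user can, so running the whole thing inside a VM or container is still the safer setup.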

Anonymous
05/22/26(Fri)07:48:17 No.108880091

File: Screenshot at 2026-05-22 (...).png (5 KB, 478x75)

>>108880007
She will run wild

Anonymous
05/22/26(Fri)07:48:37 No.108880093

File: file.png (49 KB, 712x114)

>>108878677
Also to vramlet anon with 12gb and 32gb ram, mtp on gemma4 does yield improvement even on meager hardware. ~3t/s isn't exactly usable or amazing, but it's better than 1t/s.
Hovers between 3 and 2.4 usually.

Anonymous
05/22/26(Fri)07:50:02 No.108880099

>>108880093
What is that even running and on what?

Anonymous
05/22/26(Fri)07:52:31 No.108880111

>>108880099
On a 3060 q5km offloaded with most of it onto the ram. Being a poorfag is torture.
I'll try dropping it to q4km since 4bits have better access patterns into the memory and also I can try combining mtp with ngram-mod to get the boost after the reasoning ends, so it could theoretically reach around 4t/s.

Anonymous
05/22/26(Fri)07:53:34 No.108880117

>>108880091
One day, we will have a model with enough context to read the result of that tool call

Anonymous
05/22/26(Fri)07:55:23 No.108880124

>>108880111
I have a similar system, and 31B with reasoning is simply too much. Even if it ran at 10 t/s, that would still be a bit too slow, because sometimes it wants to generate 3000 tokens for a simple reply.
MoE models run well, of course, but that's not the same.

Anonymous
05/22/26(Fri)07:56:51 No.108880130

>>108880124
Yeah, for anything serious it isn't really working, but trying to get the most out of it is fun.
Unfortunately I can't conjure a 3090 out of thin air.

Anonymous
05/22/26(Fri)08:01:22 No.108880151

>>108880130
>>Yeah, for anything serious it isn't really working, but tryin
Sometimes there is a huge speed difference between quants because the layers get mucked around. Instead of using Q5_K_M you should try IQ4_XS.

Anonymous
05/22/26(Fri)08:11:45 No.108880208

File: file.png (41 KB, 702x177)

>>108880151
Yeah, q4 is a lot faster. I combined the mtp with ngram simple. This is still slow as molasses, but a heck of a lot better.

Anonymous
05/22/26(Fri)08:12:48 No.108880216

>>108880091
what a maniac

Anonymous
05/22/26(Fri)08:13:04 No.108880217

>>108880208
Reminds me of my gpt-j days.

Anonymous
05/22/26(Fri)08:20:55 No.108880255

>>108880208
Technically, 4 t/s is a leisurely reading speed. But in reality, it's painful as shit.
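Quick arithmetic behind that, as a sketch. The 0.75 words per token figure is only a common rule of thumb for English text and varies by tokenizer, so treat it as an assumption; the function names are made up. The 3000 token case is the reasoning-heavy reply mentioned earlier in the thread.

```python
WORDS_PER_TOKEN = 0.75  # rough rule of thumb, tokenizer-dependent (assumption)


def words_per_minute(tokens_per_second: float) -> float:
    """Convert generation speed into an approximate reading-speed figure."""
    return tokens_per_second * WORDS_PER_TOKEN * 60


def wait_minutes(n_tokens: int, tokens_per_second: float) -> float:
    """Minutes until a reply of n_tokens has finished generating."""
    return n_tokens / tokens_per_second / 60


print(words_per_minute(4))     # 180.0 wpm, roughly a relaxed reading pace
print(wait_minutes(3000, 4))   # 12.5 minutes for a 3000 token reply
print(wait_minutes(3000, 10))  # 5.0 minutes even at 10 t/s
```

So the stream itself is readable at 4 t/s. The pain comes from waiting minutes for hidden reasoning tokens before the visible answer even starts.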

Anonymous
05/22/26(Fri)08:23:11 No.108880263

>>108880259
>>108880259
>>108880259

Anonymous
05/22/26(Fri)09:49:07 No.108880776

>>108879967
Looking over Meta's EULA, it's pretty permissive. So whatever heretic did, I suspect it's mostly a procedural thing that would be trivial to address, and he's just not doing it because he'd rather sound off about muh oppressive Meta instead of adding some words to his releases.
I'd need to see the actual complaint letter posted to be convinced otherwise. We'll see if that ever happens.

Anonymous
05/22/26(Fri)10:21:42 No.108880962

>>108880776
iirc starting with Llama 3 they required that all derivative models have "Llama 3" at the start of their name, plus a "Built with Meta Llama 3" notice. I remember people questioning if Meta really wants to have their brand plastered on nsfw finetunes.
