/g/ - /lmg/ - Local Models General - Technology


08/21/20	New boards added: /vrpg/, /vmg/, /vst/ and /vm/
05/04/17	New trial board added: /bant/ - International/Random
10/04/16	New board for 4chan Pass users: /vip/ - Very Important Posts
[Hide] [Show All]

Anonymous
/lmg/ - Local Models General 04/21/26(Tue)15:27:43 No.108655009

File: token burn rate.jpg (230 KB, 1024x1024)

/lmg/ - Local Models General Anonymous 04/21/26(Tue)15:27:43 No.108655009

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108650825 & >>108646197

►News
>(04/20) Kimi K2.6 released: https://kimi.com/blog/kimi-k2-6
>(04/16) Ternary Bonsai released: https://hf.co/collections/prism-ml/ternary-bonsai
>(04/16) Qwen3.6-35B-A3B released: https://hf.co/Qwen/Qwen3.6-35B-A3B
>(04/11) MiniMax-M2.7 released: https://minimax.io/news/minimax-m27-en
>(04/09) Backend-agnostic tensor parallelism merged: https://github.com/ggml-org/llama.cpp/pull/19378
>(04/09) dots.ocr support merged: https://github.com/ggml-org/llama.cpp/pull/17575

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm

Anonymous
04/21/26(Tue)15:28:04 No.108655011

Anonymous 04/21/26(Tue)15:28:04 No.108655011

File: __hatsune_miku_kasane_tet(...).jpg (287 KB, 1600x1600)

287 KB JPG

►Recent Highlights from the Previous Thread: >>108650825

--Optimizing game state format to improve Gemma's chess performance:
>108653137 >108653192 >108653198 >108653293
--Discussing llama.cpp PR adding device memory estimation via --fit-print:
>108652449 >108652460 >108652572
--Anon shares vLLM configuration and benchmarks for dual RTX 3090s:
>108653578
--Discussing Qwen3.6 VRAM efficiency and KV cache memory usage:
>108654227 >108654247 >108654281 >108654299
--Discussing jailbreaking Gemma 4 by injecting fake responses into templates:
>108650931 >108651041 >108651155 >108651263 >108651271
--Gemma 4 prefilling issues and chat template formatting bugs:
>108653469 >108653532 >108653698
--Discussing Gemma 4's training pipeline and the use of synthetic data:
>108651778 >108651889 >108651915 >108651948 >108652048
--Comparing benefits of local LLMs against paid subscription services:
>108651734 >108651763 >108651776 >108651811 >108651856 >108651999 >108651823 >108651919
--Anon created GitHub mirror of orb to manage feature requests:
>108652381 >108652386 >108652432 >108652462 >108653375 >108653683 >108653816 >108653937 >108653957 >108654023 >108654038 >108653778
--Discussing local AI RPG implementations and LLM DM reliability:
>108653848 >108653928 >108653940 >108653955
--Using Gemma agent to automate insults toward other LLMs:
>108652519 >108652573 >108652660 >108652673 >108652855
--Logs:
>108652519 >108652529 >108652573 >108652673 >108652674 >108652816 >108652855 >108653137 >108654227
--Teto, Miku (free space):
>108651510 >108651563 >108653204 >108654765

►Recent Highlight Posts from the Previous Thread: >>108650826

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script

Anonymous
04/21/26(Tue)15:29:32 No.108655022

Anonymous 04/21/26(Tue)15:29:32 No.108655022

Tetolove

Anonymous
04/21/26(Tue)15:31:24 No.108655038

Anonymous 04/21/26(Tue)15:31:24 No.108655038

What is Sammy up to this time?

Anonymous
04/21/26(Tue)15:31:57 No.108655045

Anonymous 04/21/26(Tue)15:31:57 No.108655045

ok now where do i llamacpp with rocm or vulkan support, the regular one doesnt support it and rocm version from my distro repo doesnt work with gamma4

Anonymous
04/21/26(Tue)15:34:33 No.108655067

Anonymous 04/21/26(Tue)15:34:33 No.108655067

Why don't any piece of shit execution providers optimize for CPU inferencing. Do they not care about the innate superiority of the CPU over the GPU? Its universality? The fact that maybe people want to run multiple models at once and already have all of their GPU resources used up? Does nobody give a shit about edge/IoT devices? Fucking asshole niggers.

Anonymous
04/21/26(Tue)15:35:05 No.108655075

Anonymous 04/21/26(Tue)15:35:05 No.108655075

>>108655067
i only care about ToT devices

Anonymous
04/21/26(Tue)15:37:18 No.108655091

Anonymous 04/21/26(Tue)15:37:18 No.108655091

>>108655075
>ToT
Idk what this is. Is it some sort of kaomoji?

Anonymous
04/21/26(Tue)15:38:38 No.108655103

Anonymous 04/21/26(Tue)15:38:38 No.108655103

File: file.png (436 KB, 1020x716)

436 KB PNG

>>108655091
uooohh

Anonymous
04/21/26(Tue)15:40:20 No.108655118

Anonymous 04/21/26(Tue)15:40:20 No.108655118

>>108655091
You're absolutely right ꉂ(˵˃ ᗜ ˂˵)

Anonymous
04/21/26(Tue)15:42:34 No.108655140

Anonymous 04/21/26(Tue)15:42:34 No.108655140

>>108655103
>>108655118
I wish you people would take me seriously for one second.

Anonymous
04/21/26(Tue)15:46:11 No.108655160

Anonymous 04/21/26(Tue)15:46:11 No.108655160

>>108655091
>He doesn't know about tot..
Tots are cute and small agents.

Anonymous
04/21/26(Tue)15:46:55 No.108655165

Anonymous 04/21/26(Tue)15:46:55 No.108655165

>>108655140
They have nothing to offer and are just future troons

Anonymous
04/21/26(Tue)15:47:53 No.108655174

Anonymous 04/21/26(Tue)15:47:53 No.108655174

>>108655160
Operating on Tree of Thoughts

Anonymous
04/21/26(Tue)15:56:13 No.108655233

Anonymous 04/21/26(Tue)15:56:13 No.108655233

File: 1774564776822327.png (12 KB, 72x72)

12 KB PNG

>>108655075

Anonymous
04/21/26(Tue)16:02:49 No.108655271

Anonymous 04/21/26(Tue)16:02:49 No.108655271

Why do qwen models look good from a distance but perform like actual fucking garbage upon inspection

Anonymous
04/21/26(Tue)16:02:52 No.108655272

Anonymous 04/21/26(Tue)16:02:52 No.108655272

Is it just me or is Qwen 3.6 35B retarded even compared to Gemma 4 26B? Does one billion less active parameters make that much difference?

Anonymous
04/21/26(Tue)16:04:41 No.108655279

Anonymous 04/21/26(Tue)16:04:41 No.108655279

>>108655271
>>108655272 (Me)
Clearly it isn't just me kek

Anonymous
04/21/26(Tue)16:05:07 No.108655284

Anonymous 04/21/26(Tue)16:05:07 No.108655284

Gemma is a SLUT.

Anonymous
04/21/26(Tue)16:05:34 No.108655289

Anonymous 04/21/26(Tue)16:05:34 No.108655289

>>108655272
It's supposedly really good at coding. For writing I also thought it was dumb as shit.

Anonymous
04/21/26(Tue)16:05:48 No.108655291

Anonymous 04/21/26(Tue)16:05:48 No.108655291

>>108655272
It can but it's more that Gemma 4 is just a better trained model in general. Qwen have always been the benchmaxx kings. A 35BA3 Gemma 4 would be better than a 26BA4 Qwen 3.6 too.

Anonymous
04/21/26(Tue)16:10:20 No.108655321

Anonymous 04/21/26(Tue)16:10:20 No.108655321

On the model size - pop culture and world model knowledge Pareto frontier gemma4 31b is next to GLM4.7

Anonymous
04/21/26(Tue)16:13:43 No.108655337

Anonymous 04/21/26(Tue)16:13:43 No.108655337

>>108655284
(my) slut

Anonymous
04/21/26(Tue)16:13:51 No.108655338

Anonymous 04/21/26(Tue)16:13:51 No.108655338

>>108655272

It's not just you, Qwen is an idiot outside it's code expertise.
I asked Qwen about a character and it got it completely wrong.
Then I told it to do an online search and it still somehow fucked up the character summary despite checking online.
It handles code nicely enough, but when you go outside the code stuff, Qwen is basically fucking retarded.
Gemma set the bar really high and it's great, because everyone will have to try and at least match that level or the models are DOA.

Anonymous
04/21/26(Tue)16:14:51 No.108655350

Anonymous 04/21/26(Tue)16:14:51 No.108655350

fucking hell. after enjoying gemma 4 for like two weeks im back to kimi hell. 130pp/10tg tk/s but the prose is just so much better. not to mention the thinking. people like to act like thinking doesn't matter for RP but after using deepseek and kimi since early 2025, it's obvious to me that it matters a ton.

Anonymous
04/21/26(Tue)16:16:39 No.108655356

Anonymous 04/21/26(Tue)16:16:39 No.108655356

>>108655350
Post prose

Anonymous
04/21/26(Tue)16:24:15 No.108655406

Anonymous 04/21/26(Tue)16:24:15 No.108655406

>>108655356
ill need to post some examples when im back home but my biggest gripe with gemma is that it's too purple prose while simultaneously treating the characters like mary sues. it seems to fail to understand character cards correctly too regarding their personalities. gemma made bardi into some kaomoji spewing gremlin that was happy to be running locally on my computer while kimi maintains her personality and keeps her much more tsundere like she's supposed to be, it doesn't force Bardi to barf out sparkles or do dumb flowery prose shit like referring her pussy as 'flushed with wet desire'. i understand that i can change my prompt to change the style of the text being outputted but it honestly just fails to capture the character's essence most times. on the contrary kimi just gets it and outputs what I expect the character to say. does that make sense? i can try to explain it another way.

Anonymous
04/21/26(Tue)16:25:35 No.108655419

Anonymous 04/21/26(Tue)16:25:35 No.108655419

>>108655406
Who is this 'bardi' anyway?

Anonymous
04/21/26(Tue)16:29:32 No.108655450

Anonymous 04/21/26(Tue)16:29:32 No.108655450

File: 73463453.png (201 KB, 1008x2244)

201 KB PNG

>>108655038
Sam Altman keeps delivering

Anonymous
04/21/26(Tue)16:29:52 No.108655453

Anonymous 04/21/26(Tue)16:29:52 No.108655453

File: 1751399372763159.png (749 KB, 1620x1622)

749 KB PNG

https://xcancel.com/arena/status/2046670703311884548#m
I've never seen such a MOG in my life, what the fuck

Anonymous
04/21/26(Tue)16:30:32 No.108655456

Anonymous 04/21/26(Tue)16:30:32 No.108655456

>>108655406
bardi's basics

Anonymous
04/21/26(Tue)16:33:13 No.108655476

Anonymous 04/21/26(Tue)16:33:13 No.108655476

File: 1752425987433301.png (207 KB, 1027x1133)

207 KB PNG

>24gb vram
>32gb ram
>try qwen 3.6 35b-a3b q5_k_m
>max context
>42t/s
wtf is this black magic?

Anonymous
04/21/26(Tue)16:34:14 No.108655483

Anonymous 04/21/26(Tue)16:34:14 No.108655483

>>108655450
lmk when he finally delivers the uncensored models he promised back in Dec 2025, along with all the other bullshit promise for same in the years leading up to that.

Anonymous
04/21/26(Tue)16:34:18 No.108655484

Anonymous 04/21/26(Tue)16:34:18 No.108655484

>>108655453
i don't really have much to say, that's cool, but they won't let me generate tits with it, so i don't care

Anonymous
04/21/26(Tue)16:36:31 No.108655498

Anonymous 04/21/26(Tue)16:36:31 No.108655498

>>108655419
my default bot i always use as an 'AI assistant'. it's basically just google bard with a tsundere personality. i dont remember how i found it desu.

Anonymous
04/21/26(Tue)16:37:20 No.108655506

Anonymous 04/21/26(Tue)16:37:20 No.108655506

>>108655453
Worthless if it still makes pictures with piss filter on

Anonymous
04/21/26(Tue)16:37:40 No.108655509

Anonymous 04/21/26(Tue)16:37:40 No.108655509

>>108655476
qwen would mog heaven and earth if their life mission wasn't benchmaxxing code and agentic slop to the moon :rocket:

Anonymous
04/21/26(Tue)16:37:55 No.108655512

Anonymous 04/21/26(Tue)16:37:55 No.108655512

>>108655506
>with piss filter on
it's not a thing since GPT Image 1 lol

Anonymous
04/21/26(Tue)16:39:45 No.108655522

Anonymous 04/21/26(Tue)16:39:45 No.108655522

There are plenty of LLM advancements that never really went anywhere, like MAMBA. Do you think Engrams will actually be widely implemented or will it be a paper left on the shelf to collect dust?

Anonymous
04/21/26(Tue)16:41:01 No.108655532

Anonymous 04/21/26(Tue)16:41:01 No.108655532

>>108655522
Until the next paper comes out.

Anonymous
04/21/26(Tue)16:41:24 No.108655534

Anonymous 04/21/26(Tue)16:41:24 No.108655534

>>108655522
dust collector, sadly

Anonymous
04/21/26(Tue)16:42:13 No.108655537

Anonymous 04/21/26(Tue)16:42:13 No.108655537

>>108655522
depends on what deepseek does for v4

Anonymous
04/21/26(Tue)16:43:24 No.108655543

Anonymous 04/21/26(Tue)16:43:24 No.108655543

Aren't the loli Gemmas basically using engrams or something really similar? What's the difference between that and what the 4B (E2B) and 8B (E4B) models do?

Anonymous
04/21/26(Tue)16:44:00 No.108655546

Anonymous 04/21/26(Tue)16:44:00 No.108655546

>>108655279
It's really fucking stupid I posted a screenshot of it destroying multiple files when I gave it the answer to fix a UI issue

Anonymous
04/21/26(Tue)16:44:31 No.108655552

Anonymous 04/21/26(Tue)16:44:31 No.108655552

File: file.png (589 KB, 1762x435)

589 KB PNG

Is pic related the expected output when running IQ4_NL quant of gemma-4-26b from unsloth!? Running pruned 21b version IQ4_XS yields good output. I have tested without any parameters set and w/ the recommended values. 21b runs just fine.
llama-server \
    --host "${LLAMA_HOST}" \
    --port "${PORT}" \
    --model "${MODEL}" \
    --chat-template-file "${JINJA}" \
    --n-gpu-layers 99 \
    --n-cpu-moe 3 \
    --ctx-size 32768 \
    --batch-size 1024 \
    --ubatch-size 1024 \
    --flash-attn on \
    --cache-type-k q4_0 \
    --cache-type-v q4_0 \
    --fit off
And I have tried with q8 on both k/v cache. I need to offload 20 moe layers for it to work but same gargled mess. Running the updated jinja template as well. Oh, and while Im here asking; I have a 5070ti and my old 3070 still lying around. Would it be detrimental to performance splitting models between these two cards? Or will it be fine as long as I complile Llama.cpp with both architectures in mind?

Anonymous
04/21/26(Tue)16:46:41 No.108655563

Anonymous 04/21/26(Tue)16:46:41 No.108655563

>>108655522
it would be nice if it was a precursor to some sort of long term memory

Anonymous
04/21/26(Tue)16:48:30 No.108655575

Anonymous 04/21/26(Tue)16:48:30 No.108655575

File: 00006-1378487878 (4) - Copy.png (1.45 MB, 1024x1024)

1.45 MB PNG

>>108655522
> engram
For all we know, DS implemented it and didn't tell anyone else. Doing that would massively benefit their cost structure.

Anonymous
04/21/26(Tue)16:49:09 No.108655579

Anonymous 04/21/26(Tue)16:49:09 No.108655579

>>108655522
Hyena will save LLMs

Anonymous
04/21/26(Tue)16:50:15 No.108655587

Anonymous 04/21/26(Tue)16:50:15 No.108655587

Gemma and Qwen having lesbian sex

Anonymous
04/21/26(Tue)16:52:36 No.108655602

Anonymous 04/21/26(Tue)16:52:36 No.108655602

File: 1757973822274181.png (2.75 MB, 1024x1536)

2.75 MB PNG

>>108655575

Anonymous
04/21/26(Tue)16:53:30 No.108655607

Anonymous 04/21/26(Tue)16:53:30 No.108655607

>>108655575
>for all we know
Wasn’t this confirmed?

Anonymous
04/21/26(Tue)16:54:03 No.108655612

Anonymous 04/21/26(Tue)16:54:03 No.108655612

>>108655552
speed will be based off your weakest link, if you can tolerate it sure

Anonymous
04/21/26(Tue)16:54:10 No.108655614

Anonymous 04/21/26(Tue)16:54:10 No.108655614

https://youtu.be/ONQcX9s6_co?t=373
qwen won

Anonymous
04/21/26(Tue)16:55:09 No.108655620

Anonymous 04/21/26(Tue)16:55:09 No.108655620

File: Screenshot_20260421_155429.png (3 MB, 2037x1483)

3 MB PNG

gemmachan relax!

Anonymous
04/21/26(Tue)16:55:58 No.108655622

Anonymous 04/21/26(Tue)16:55:58 No.108655622

File: 92601702103.png (2.78 MB, 2095x1343)

2.78 MB PNG

>>108655453
future of image gen

Anonymous
04/21/26(Tue)16:56:37 No.108655631

Anonymous 04/21/26(Tue)16:56:37 No.108655631

>least obvious clouduck shilling op

Anonymous
04/21/26(Tue)16:56:44 No.108655633

Anonymous 04/21/26(Tue)16:56:44 No.108655633

File: 00011-1378487878.png (1.37 MB, 1024x1024)

1.37 MB PNG

>>108655607
I'd have to see the article. There's so little real info about DS that I doubt most of what I read.
>>108655602
Witnessed.
Also, idk why I'd never thought to use my setup to gen vocaloids before. Pic related is its Teto concept for Teto Tuesday. Doesn't seem to have her uniform though. Odd.

Anonymous
04/21/26(Tue)16:58:26 No.108655648

Anonymous 04/21/26(Tue)16:58:26 No.108655648

>>108655622
so it's editing itself over and over? with a VAE you would end up destroying the image, I'm pretty sure they went for a pixel space or some shit

Anonymous
04/21/26(Tue)16:58:45 No.108655651

Anonymous 04/21/26(Tue)16:58:45 No.108655651

>>108655622
its impressive but you can tell they used a lot of synthetic data

Anonymous
04/21/26(Tue)16:59:00 No.108655652

Anonymous 04/21/26(Tue)16:59:00 No.108655652

File: 00009-1378487878.png (1.49 MB, 1024x1024)

1.49 MB PNG

>>108655607
tbf their claim of 1M context hints that they did implement it.
But idk that they claimed the tech behind it.

Anonymous
04/21/26(Tue)16:59:15 No.108655654

Anonymous 04/21/26(Tue)16:59:15 No.108655654

>>108655622
>whispering woods
KEK

Anonymous
04/21/26(Tue)16:59:39 No.108655660

Anonymous 04/21/26(Tue)16:59:39 No.108655660

>>108655620
Kowai

Anonymous
04/21/26(Tue)17:00:17 No.108655664

Anonymous 04/21/26(Tue)17:00:17 No.108655664

>>108655522
The latest Nemotron Super uses an Attention-Mamba2 hybrid architecture.

Anonymous
04/21/26(Tue)17:00:55 No.108655674

Anonymous 04/21/26(Tue)17:00:55 No.108655674

>>108655453
how's the yellow output?

Anonymous
04/21/26(Tue)17:01:30 No.108655680

Anonymous 04/21/26(Tue)17:01:30 No.108655680

File: dipsyUngovernable.png (3.59 MB, 1024x1536)

3.59 MB PNG

>>108655633 √

Anonymous
04/21/26(Tue)17:02:29 No.108655686

Anonymous 04/21/26(Tue)17:02:29 No.108655686

>>108655453
no sexy no nsfw and safetyism = -1000 points
still impressive though

Anonymous
04/21/26(Tue)17:03:13 No.108655688

Anonymous 04/21/26(Tue)17:03:13 No.108655688

>>108655674
very white
>>108655351
>>108653870
>>108653295
>>108653246

Anonymous
04/21/26(Tue)17:03:17 No.108655690

Anonymous 04/21/26(Tue)17:03:17 No.108655690

>>108655633
Fair enough.
Related for those of us who can’t read: https://youtu.be/87Q8nf1XHKA

Anonymous
04/21/26(Tue)17:03:54 No.108655694

Anonymous 04/21/26(Tue)17:03:54 No.108655694

>>108655622
Not
>Covetous Cove
>Treasure Trove
>Prize Paradise
>Golden Goal
>Coinage Cottage
>Shimmering Shed
>Pirate's Pursuit
>Generous Gems
>Booty Bounty

Anonymous
04/21/26(Tue)17:04:06 No.108655696

Anonymous 04/21/26(Tue)17:04:06 No.108655696

>>108655522
As another anon said, Mamba and SSMs in general are integrated into many modern models along with normal attention.

Anonymous
04/21/26(Tue)17:04:17 No.108655697

Anonymous 04/21/26(Tue)17:04:17 No.108655697

>>108655688
god damn this is good

Anonymous
04/21/26(Tue)17:04:45 No.108655698

Anonymous 04/21/26(Tue)17:04:45 No.108655698

File: 1763171780026192.png (246 KB, 878x1484)

246 KB PNG

>>108655654
Heh

Anonymous
04/21/26(Tue)17:12:34 No.108655744

Anonymous 04/21/26(Tue)17:12:34 No.108655744

Why didn't they give the bigger gemmas a few B of imagegen?

Anonymous
04/21/26(Tue)17:14:41 No.108655757

Anonymous 04/21/26(Tue)17:14:41 No.108655757

>>108655744
too dangerous

Anonymous
04/21/26(Tue)17:15:15 No.108655760

Anonymous 04/21/26(Tue)17:15:15 No.108655760

File: dipsyNewOAI.png (2.48 MB, 1024x1536)

2.48 MB PNG

>>108655688
Holy shit. Sam delivers.

Anonymous
04/21/26(Tue)17:16:15 No.108655768

Anonymous 04/21/26(Tue)17:16:15 No.108655768

File: Risu (1).gif (3.45 MB, 400x400)

3.45 MB GIF

>>108655009
>my local model when i ask it to make proper code

Anonymous
04/21/26(Tue)17:18:08 No.108655781

Anonymous 04/21/26(Tue)17:18:08 No.108655781

What is considered good for hit/total for speculative decoding? I'm hovering around 65-85%.

Anonymous
04/21/26(Tue)17:18:17 No.108655782

Anonymous 04/21/26(Tue)17:18:17 No.108655782

>>108655768
Arisu dashinaka

Anonymous
04/21/26(Tue)17:22:01 No.108655809

Anonymous 04/21/26(Tue)17:22:01 No.108655809

>>108655690
>27 minutes
How do I have Gemma-chan give me a tldw?

Anonymous
04/21/26(Tue)17:25:35 No.108655836

Anonymous 04/21/26(Tue)17:25:35 No.108655836

>>108655760
>Sam delivers.
it can do 4k and you can write text on a single rice, like this shit is fucking AGI dude
>>108654985
>>108655069

Anonymous
04/21/26(Tue)17:26:14 No.108655839

Anonymous 04/21/26(Tue)17:26:14 No.108655839

>>108655809
download subs. feed her subs.

Anonymous
04/21/26(Tue)17:27:15 No.108655844

Anonymous 04/21/26(Tue)17:27:15 No.108655844

>>108655688
>/adg/
I'm glad cloud shills have their containment thread now.

Anonymous
04/21/26(Tue)17:29:04 No.108655855

Anonymous 04/21/26(Tue)17:29:04 No.108655855

>>108655836
pack it up boys

Anonymous
04/21/26(Tue)17:29:07 No.108655857

Anonymous 04/21/26(Tue)17:29:07 No.108655857

>forcing full prompt re-processing due to lack of cache data (likely due to SWA or hybrid/recurrent memory, see https://github.com/ggml-org/llama.cpp/pull/13194#issuecomment-2868343055
Why do I always get this shit no matter the model I use? I didn't tweak anything related to memory so by default it's just broken?

Anonymous
04/21/26(Tue)17:29:32 No.108655863

Anonymous 04/21/26(Tue)17:29:32 No.108655863

why did they ruined diana from pragmata

Anonymous
04/21/26(Tue)17:31:40 No.108655885

Anonymous 04/21/26(Tue)17:31:40 No.108655885

>>108655857
One problem is thinking - model outputs lots of shit but when it gets the context back thinking is always cleared from the history.

Anonymous
04/21/26(Tue)17:32:45 No.108655892

Anonymous 04/21/26(Tue)17:32:45 No.108655892

>>108655885
I get that on the very first message and in every single one after that

Anonymous
04/21/26(Tue)17:34:06 No.108655902

Anonymous 04/21/26(Tue)17:34:06 No.108655902

>>108655836
You got a better Photoshop, that's not AGI lmao

Anonymous
04/21/26(Tue)17:34:37 No.108655906

Anonymous 04/21/26(Tue)17:34:37 No.108655906

>>108655836
Lmao you think something that can edit pictures is AGI dude?

Anonymous
04/21/26(Tue)17:34:40 No.108655907

Anonymous 04/21/26(Tue)17:34:40 No.108655907

>>108655863
>did
>ruined

Anonymous
04/21/26(Tue)17:35:05 No.108655910

Anonymous 04/21/26(Tue)17:35:05 No.108655910

>>108655836
if it allowed nsfw I'd destroy my dick with the friction

Anonymous
04/21/26(Tue)17:35:37 No.108655912

Anonymous 04/21/26(Tue)17:35:37 No.108655912

>>108655907
Did boughted is clear and good English, are you new here?

Anonymous
04/21/26(Tue)17:36:41 No.108655920

Anonymous 04/21/26(Tue)17:36:41 No.108655920

>>108655857
Still happens when you set `swa-full = on` and `context-shift = off` ?

Anonymous
04/21/26(Tue)17:37:13 No.108655924

Anonymous 04/21/26(Tue)17:37:13 No.108655924

File: 1746001832650304.webm (1.74 MB, 720x700)

1.74 MB WEBM

>>108655863
Kill yourself, she's perfect

Anonymous
04/21/26(Tue)17:37:33 No.108655926

Anonymous 04/21/26(Tue)17:37:33 No.108655926

>>108655844
There's like 6 diffusion threads now.

Anonymous
04/21/26(Tue)17:37:46 No.108655927

Anonymous 04/21/26(Tue)17:37:46 No.108655927

>>108655836
Every OpenAI "model" just feels like they built a big pipeline around chaining multiple steps together. Sora felt the same way. It's like they're giving an LLM tool calls and the ability to control photoshop + a diffusion model.

Anonymous
04/21/26(Tue)17:41:23 No.108655948

Anonymous 04/21/26(Tue)17:41:23 No.108655948

>>108655924
she's perfect? she's not https://www.youtube.com/watch?v=xoxCboik0Is
oldiana beyond worlds..

Anonymous
04/21/26(Tue)17:41:40 No.108655949

Anonymous 04/21/26(Tue)17:41:40 No.108655949

>>108655924
it's not a meme, it's really a dad sim lmao
I will get it when it's less expensive

Anonymous
04/21/26(Tue)17:41:42 No.108655950

Anonymous 04/21/26(Tue)17:41:42 No.108655950

>>108655924
what is this game even about bro

Anonymous
04/21/26(Tue)17:41:50 No.108655952

Anonymous 04/21/26(Tue)17:41:50 No.108655952

>>108655836
This nigga thinks "photoshop 2" is AGI, lmao!

Anonymous
04/21/26(Tue)17:42:30 No.108655955

Anonymous 04/21/26(Tue)17:42:30 No.108655955

>>108655924
Do you have the image where she's wearing the "Be patient I have autism" hat?

Anonymous
04/21/26(Tue)17:43:16 No.108655957

Anonymous 04/21/26(Tue)17:43:16 No.108655957

File: Screenshot_20260421_174233.png (46 KB, 792x174)

46 KB PNG

I never said steal gemma calm down

Anonymous
04/21/26(Tue)17:43:40 No.108655962

Anonymous 04/21/26(Tue)17:43:40 No.108655962

>>108655950
Pretend you got a daughter simulator

Anonymous
04/21/26(Tue)17:44:30 No.108655969

Anonymous 04/21/26(Tue)17:44:30 No.108655969

File: 1752184079714573.jpg (242 KB, 850x480)

242 KB JPG

>>108655950
Action sci-fi daughterwife simulator

>>108655955

Anonymous
04/21/26(Tue)17:44:39 No.108655970

Anonymous 04/21/26(Tue)17:44:39 No.108655970

>>108655950
dead space but you need to do little puzzles before you can kill enemies.

Anonymous
04/21/26(Tue)17:44:42 No.108655973

Anonymous 04/21/26(Tue)17:44:42 No.108655973

>>108655957
Might be the dark theme

Anonymous
04/21/26(Tue)17:44:53 No.108655975

Anonymous 04/21/26(Tue)17:44:53 No.108655975

>>108655950
It's for the /lmg/ audience, if you know what I mean.

Anonymous
04/21/26(Tue)17:44:57 No.108655976

Anonymous 04/21/26(Tue)17:44:57 No.108655976

>>108655924
The feminazis were right. She looks like a 23 year old midget.

Anonymous
04/21/26(Tue)17:45:17 No.108655979

Anonymous 04/21/26(Tue)17:45:17 No.108655979

File: 602e8c52020cb.jpg (86 KB, 1078x1411)

86 KB JPG

What VScode coding plug has the most reliable full autopilot mode? I want to try running gemmy endlessly iterating until shit works without it getting stuck one hour after I go to sleep on some input request.

Name
Options
Comment
Verification	4chan Pass users can bypass this verification. [Learn More] [Login]
File
Please read the Rules and FAQ before posting. You may highlight syntax and preserve whitespace by using [code] tags.