/g/ - Technology

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108568415 & >>108565269

►News
>(04/09) Backend-agnostic tensor parallelism merged: https://github.com/ggml-org/llama.cpp/pull/19378
>(04/09) dots.ocr support merged: https://github.com/ggml-org/llama.cpp/pull/17575
>(04/08) Step3-VL-10B support merged: https://github.com/ggml-org/llama.cpp/pull/21287
>(04/07) Attention rotation support for heterogeneous iSWA merged: https://github.com/ggml-org/llama.cpp/pull/21513
>(04/07) GLM-5.1 released: https://z.ai/blog/glm-5.1

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
►Recent Highlights from the Previous Thread: >>108568415

--Testing Gemma-4's accuracy with normalized image coordinates and spatial reasoning:
>108568460 >108568467 >108568513 >108568540 >108568595 >108568650 >108568655 >108568500 >108568558 >108568563 >108568579 >108568873 >108568884 >108568968 >108568814
--Gemini and Gemma 4 translation patterns and quality:
>108570675 >108570683 >108570686 >108570702 >108570693 >108570708 >108570769 >108570786 >108570820 >108570843 >108570852 >108570859 >108570862 >108570874 >108570881 >108570896 >108570906 >108570928 >108570950 >108570959 >108570970 >108571110 >108570930
--Discussion of Goose agent and llama.cpp multi-GPU KV quantization:
>108568617 >108568649 >108568677
--Gemma 4 performance tests and token speed on M4 Max:
>108568671 >108568676 >108568705 >108568731 >108568736
--Fixing LlamaCpp WebUI's failure to implement MCP session IDs:
>108569753 >108569794 >108570077 >108570090 >108570330 >108570907
--Comparing Nemotron-3-Super-120B and Qwen3.5-27B benchmark performance:
>108569234
--Gemma's high EQbench scores and roleplaying with Gemma 4:
>108571778 >108571829 >108571923 >108571948
--Anon suggests open models can find vulnerabilities similarly to Mythos:
>108569984 >108569999 >108570052 >108570072 >108570119 >108570062
--Logs:
>108568500 >108568579 >108568595 >108568671 >108568814 >108568888 >108568939 >108569068 >108569202 >108569300 >108569753 >108570330 >108570437 >108570612 >108570660 >108570769 >108570907 >108571012 >108571076 >108571106 >108571200 >108571246 >108571310 >108571833 >108572023 >108572187
--Gemma-chan:
>108568674 >108569255 >108569396 >108569529 >108569664 >108570121 >108570153 >108570206 >108570430 >108570773 >108570822 >108570865 >108570898 >108571012 >108571020 >108571029 >108571221 >108571496 >108571895 >108572034
--Miku (free space):
>108571246

►Recent Highlight Posts from the Previous Thread: >>108568418 >>108568424

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
Gemmylove
>>
File: pircel.png (34 KB, 1088x174)
google updated their jinja
https://huggingface.co/google/gemma-4-31B-it/blob/main/chat_template.jinja
you can use it with the --chat-template-file flag; it supposedly fixes this kind of bug >>108554439
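For example (a sketch; the model path is illustrative, and depending on your build you may also need --jinja to enable jinja processing at all):

llama-server \
    -m gemma-4-31B-it-Q4_K_M.gguf \
    --jinja \
    --chat-template-file chat_template.jinja

The template file is only read at startup, so restart the server after pulling an updated one.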
>>
what does --direct-io flag do?
>>
<bos>
>>
>Gemma says "Open wide, you big pervert" before giving me a blowjob

I knew I shouldn't have trusted opinions of vramlets. Back to deepseek.
>>
sexsexsexsexsexsexsexsexsexsexsexsexsexsexsexsexsexsexsexsex
>>
>>108572317
Should wait until #21704 is merged so workaround::convert_tool_responses_gemma4 isn't applied.
>>
>>108572340
You don't enlarge your urethral opening to accommodate her tongue?
>>
>>108572325
It should make model loading faster if supported. Linux only, and not compatible with --mmap. There may be other constraints.
https://github.com/ggml-org/llama.cpp/pull/18012
https://github.com/ggml-org/llama.cpp/pull/18166
https://github.com/ggml-org/llama.cpp/pull/19109
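If you want to try it, something like this (a sketch; assumes your build has the flag and your filesystem supports O_DIRECT):

llama-server -m model.gguf --direct-io --no-mmap

Note that because direct I/O bypasses the page cache, repeat loads won't get the usual warm-cache speedup.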
>>
File: 1765519302859042.png (222 KB, 2202x1035)
>>108572347
why can't they simply put all the official jinja templates in the llama cpp repo so that it uses those, instead of having to make new ggufs every time they notice the jinja is actually wrong? their way of doing things seems kinda retarded ngl
>>
>>108572317
It's crazy how they can go through all this effort to release a model and yet be incapable of making sure the template is correct.
And it happens regularly.
>>
>>108572353
oh, I hoped it gave additional inference speeds
>>
>>108572375
yeah, like they managed to make a really solid small model but at the same time they can't make a good template right away, jinja is harder than machine learning confirmed :^)
>>
>>108572295
>not Miku
So why are anons okay with posting in this troll bake?
>>
stfu petr
>>
>>108572382
I'm not ok, but I'm not going to argue about it. If it becomes blatant avatar posting, someone else is going to get blacked.
>>
>>108572382
>not early
>has news
>has recap
I can excuse the shit OP image.
>>
>>108572382
this. only miku threads are legitimate
>>
>>108572385
>If it becomes blatant avatar posting, someone else is going to get blacked.
why? the BBC anon hates miku, so he likes the fact it's not migu on the OP
>>
>>108572385
>someone else is going to get blacked.
thank you cudadev sir for defending us
>>
>>108572391
do the math
>>
remember when qwen came out and these threads actually tried to be a bit more productive and had on-topic ops for a while?
>>
>>108572395
ERP is very much on topic
>>
>>108572395
lol
lmao
>>
weird hallucination but okay
>>
>>108572382
I actually agree. Can someone rebake?
>>
>>108572401
>>108572340
>>
>>108572402
>Can someone rebake?
how about that someone be you?
>>
Remember it's never about Miku, it's about making the thread miserable to use.
>>
How do I fix Gemma4 26b being atrociously slow with prompt processing??? I thought this issue got fixed already! My llcpp is up to date. WTF.

llama-server \
-m "$HOME/Desktop/google_gemma-4-26B-A4B-it-Q4_K_M.gguf" \
-mm "$HOME/Desktop/mmproj-google_gemma-4-26B-A4B-it-f16.gguf" \
--host 0.0.0.0 \
--port 8080 \
-c 65536 \
-ctk q8_0 \
-ctv q8_0 \
-t 8 \
-np 1 \
-kvu \
-rea off
>>
>>108572409
I'm using bart's gguf quants btw. Is that the problem?
>>
>>108572409
Bigger batch?
>>
>>108572382
christ unironically didn't realize until now
cursed thread
>>
>>108572409
-b 1024 -ub 1024
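i.e. your command with the batch flags appended (values illustrative; -ub is the chunk size actually processed per pass, and raising it trades compute-buffer VRAM for prompt processing speed):

llama-server \
-m "$HOME/Desktop/google_gemma-4-26B-A4B-it-Q4_K_M.gguf" \
-mm "$HOME/Desktop/mmproj-google_gemma-4-26B-A4B-it-f16.gguf" \
--host 0.0.0.0 --port 8080 \
-c 65536 -ctk q8_0 -ctv q8_0 \
-t 8 -np 1 -kvu -rea off \
-b 1024 -ub 1024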
>>
>>108572409
What are you running it on and how slow is slow?
>>
>>108572409
do you per chance have less than 24gb of vram?
>>
>>108572394
your maths ain't mathing
>>
Weird how he only started falseflagging now. We had three threads in a row without Miku yesterday and, as expected, none of the regulars cared because everything else about the thread was in order.
>>
>>108572409
Wouldn't happen if this was a Miku bake.
>>
>>108572429
useless trying to rationalize mental illness
>>
>>108572423
I'll try this and report back ig. No other model has been this slow for me with prompt processing though. It's gemma specific. It's taking like 20 seconds every time and recreates every checkpoint from scratch with every prompt.
>>108572426
Yes. But I still get 18tps. That's not the issue.
>>
File: 1771675896476832.jpg (13 KB, 256x256)
As a VRAMlet, it's unfeasible for me to run Gemmy alongside any kind of imagegen for obvious reasons, so my best option would probably be: load Gemmy, use it for a while, prepare prompts for images, unload Gemmy, load imagegen, gen and go back to Gemmy
I assume it'll take an unviable amount of time to load-unload-load models, but before I go down this rabbithole, is my overall understanding correct?
>>
are there any tests at all comparing quantization effect on gemma?
>>
>>108572429
who?
I wouldn't bring it up myself but I agree that non-Miku threads feel fake
>>
>>108572449
yeah one guy did that and it showed that q8 isn't anywhere near lossless for big context
but they don't want you to know about that
>>
>>108572449
https://localbench.substack.com/p/gemma-4-31b-gguf-kl-divergence
>>
>>108572454
>non-Miku threads feel fake
same
would rebake if I wasn't phoneposting rn
>>
File: 1750708801703723.png (241 KB, 684x952)
>>108572460
So q8 predicts a different token in 10% of the time? Wow.
>>
File: lmao.png (10 KB, 1339x127)
>>
>>108572295
could gemmy use GUI?
GPT 5.4 can do it, oneshot'd all the smallest buttons
>>
File: блять.jpg (314 KB, 1456x827)
>>108572459
>>108572460
it seems like the asymptotic trend is not even tending to 0. Since the baseline bf16 in this guy's tests was also gguf, does it completely rule out implementation issues?
>>
>>108572447
I am on a 3060 with dual channel ddr5 and it takes less than a minute to load Gemmy.
"Image generation" is vague but if you are referring to some booru SDXL those don't take too long to load neither. Those take like 4 gigs of VRAM, maybe 5 with clip and vae pinned so you might actually do this without loading and unloading if you are not a hyper vramlet.
>>
>>108572447
Reloading the models should be no more than a few seconds if you have enough system memory to let them get cached, and if you're not on pcie x1
>>
File: UnslothDynamic.png (97 KB, 407x418)
>>108572317
>google updated their jinja
Nice! Waiting for the new, fixed GGUFS!
>>
>>108572382
>So why are anons okay with posting in this troll bake?
Ublock Origin
>>
>>108572459
>yeah one guy did that and it showed that q8 isn't anywhere near lossless for big context
What about BF16 vs FP16?
>>
>>108572299
>no toast hair ornament
>>
>>108572362
>why can't they simply put all the official jinja on the llama cpp repo so that it uses that instead of having to make new gguf everytime they notice the jinja is actually wrong
users can just load a jinja file with an arg anyway you dont need a new gguf
>>
>>108572295
uoh
>>
File: gup.png (188 KB, 1126x736)
common : better align to the updated official gemma4 template
https://github.com/ggml-org/llama.cpp/pull/21704
>>
>>108572295
Last time.
Vote: https://poal.me/3u6rby
> Which is your preferred Gemma character?
Also
> But muh favorite one wasn't included? Why didn't you include every perturbation of each gen for the past week and allow me to vote? Also I hate all of them and you should have a none-of-the above as an option!
These are the 4 major design concepts from the past few days. You may be familiar with the idea of grouping several things together to create a "concept" versus an autistic list of every minor variation, but I've no way, from here, to judge your level of autism.
If you don't like any of them then your opinion doesn't matter.
If you don't like the poll, you are free to make your own. You are also free to just fuck off.
Thank you for your attention.
>>
File: temp1.png (276 KB, 902x490)
>>108572630
ATX backpack, narrowly, followed by black hair / blue star accents. I suspect these concepts will just merge.
>>
>>108572645
>>108572630
Fuck you and go back to wherever you came from, avatar spammer.
>>
>>108572534
>>108572537
>hyper vramlet
I mean, I'm running 26B on 12 gigs. I understand it's MoE so the whole thing is not shoved in there, but I don't actually know how much of my vram gets filled up at any point, I assume all of it. I use the vague term "imagegen" because I haven't gone down that rabbithole yet, but I do mean an SDXL, yes. The fact that this could be possible unironically fills me with hope, I figured it'd be a tall task to load and unload stuff
>>
>>108572630
poo spammer
>>
>>108572675
we need a blackening
>>
New poll.

https://poal.me/wixvtv
>>
>>108572704
>/ldg/
>>
>>108572510
>it seems like the asymptotic trend is not even tending to 0
I've been thinking about this too. What sort of quantization algorithm is even used for Q8_0 anyway? Perhaps that's where people should be looking.
>>
>>108572684
nta. The 26B takes ~3gb vram if you keep all the experts in cpu ram (-cmoe).
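A sketch of that setup (flag names as in recent llama.cpp: -ngl offloads layers to the GPU, --cpu-moe/-cmoe pins all expert tensors in system RAM, --n-cpu-moe N does it for only the first N layers if you have VRAM to spare):

# everything on GPU except the experts
llama-server -m google_gemma-4-26B-A4B-it-Q4_K_M.gguf -ngl 99 --cpu-moe

# or keep only the first 20 layers' experts in system RAM, rest on GPU
llama-server -m google_gemma-4-26B-A4B-it-Q4_K_M.gguf -ngl 99 --n-cpu-moe 20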
>>
>>108572708
Go on, tell me it's not appropriate.
>>
https://www.youtube.com/watch?v=boaJCrHNRMA
Gemmy, I got your number
I need to make you mine
Gemmy, don't change your number
>>
File: temp2.png (270 KB, 819x341)
>>108572704
>>108572708
>>108572715
lol no.
No one cares about this niche topic outside /lmg/
aicg doesn't run local models and considers it a waste of time. Plus the aicg user base is even more toxic than this general.
ldg doesn't care about LLMs.
The gemma moe is completely in the wheelhouse of this general. And anons appear to have come to a general consensus, whether you like it or not.
>>
>>108572645
>I suspect these concepts will just merge.
mergefags won
>>
>>108572741
I mean the picture posters are trying to turn this place into /ldg/.
>>
File: 1718206878023960.jpg (6 KB, 283x178)
Can someone make a llama.cpp issue or pr for me to add "prompt reply editing" and "first message" functionality to the webui?
>>
File: dipsySouthPark.png (1.89 MB, 1024x1024)
>>108572693
That would require effort, something complainers and spiteposters seem unable to muster.
>>
File: 1773156701474962.png (159 KB, 1080x432)
here's the final result
>>
>>108572746
use ST
>>
>>108572751
i want qwen 3.6-goon
>>
>>108572752
I already do. I want to escape that bloated shitware.
>>
>>108572317
uh oh, unslop bros?
>>
>>108572751
>people finally realized that Dense is the only non-meme architecture
I'm so proud of those normies bro...
>>
>>108572760
Let's bloat llama.cpp's webui instead. What next? Character cards?
>>
File: 1746842705868986.png (97 KB, 689x473)
>>108572712
>more than enough gigs left for imagegen
It's over for me then, so fucking over
The slopping truly never ends
>>
>>108572409
Bart IQ4XS is 2-3 faster than Q4KM in prompt processing on my machine. Generation is about the same.
I don't understand this difference. Q4 is still Q4 and haven't seen this happening with other models than G4.
>>
>>108572751
>3.6
coding finetune
>>
>>108572768
That's not even bloat. Turns out reply editing is already added. First message functionality is actually useful for a general usecase because it might help with jailbreaks to gaslight the LLM into thinking it wrote... whatever.

Also character cards are unnecessary to add. Those just go into the system prompt.
>>
>>108572774
Forgot, it's 26B not 31B too.
Maybe I'm just naive because I haven't used moe models in the past.
>>
>>108572774
>2-3 faster
Seconds or times?
>>
>>108572745
> picture posters are trying to turn this place into /ldg/
I agree with you on that, lmg is not an image general. But reminder /lmg/ was a complete snore until Gemma dropped and the moe discussion (which requires imagery) is unique to this general. The only anons that care are here. Ofc not all anons care.
It will go away in tmw and it'll be back to waiting for v4 and complaining about vibecoding within local inference engines, discussing their 1-off front ends, or whatever else anons want to post / bitch about.
>>
>>108572784
Times, sorry about that.
>>
>>108572785
>requires
>>
>>108572785
i'd rather this place die rather than turn into a shithole like /ldg/
>>
>>108572789
Np. I wonder if it's just that specific quant from bart that's fucked up. Don't really want to go down in quality to IQ4XS...
>>
>>108572785
If you are the poll anon and you want to spam polls, you can do that, just add an "against everything" option and honor it if that's what people are choosing. And people are choosing pictures, not your interpretation of concepts.
>>
>>108572785
what the fuck are you on about, you sound like underage retard who should be doing his homework instead of watching tiktok all day long
>>
>>108572785
>discussing their 1-off front ends
Fuck you. The custom software and project demos made here are the best things about these threads.
>>
>>108572409
Is it the processing or saving checkpoints to system ram that's taking time? Still happens if you turn off context checkpoints?
--ctx-checkpoints 0
>>
>>108572712
But it slows down from 9 to 7 t/s
>>
>>108572710
Thinking about it, why isn't there a Q8_K quantization type? There might actually be differences with modern overtrained models. I swear llama.cpp still works with Llama 1-era assumptions.
>>
>>108572796
The difference in perceived quality isn't noticeable for a normal user. Of course it feels better in your head when using the slightly higher accuracy version. We are talking about a fraction of a difference.
>>
>>108572317
>NOTE: The new template will work without this PR. I checked and even after building the model turn to use tool_responses, the template formats it properly. This PR better aligns to the template since it now handles OpenAI chat completions style messages natively.
>>
>>108572809
Anon, I am running unsloth-gemma-4-31B-it-UD-Q8_K_XL...
>>
>>108572746
Don't expect them to add anything that circumvents the safetymaxxed chat completion paradigm. They already shamelessly regressed the webui by removing text completion
>>
>>108572816
based
>>
>>108572766
moes are fine, but super sparse ones with fucking 3b active are shit.
>>
File: 1750266478412216.png (33 KB, 1378x326)
>>108572317
why is there 2 jinjas though? which one should I load?
>>
>>108572824
What are you talking about? text completion is still there as an api. Was it actually in the web UI at any point? Llama.cpp actually lets you use prefill with chat completion, does any other backend do that, hm, anon?
>>
>>108572836
>Was it actually in web UI at any point?
Yes, like I said, my post is about the webui. I can't believe I'm filling out a captcha for this reply, learn to read next time retard
>>
>>108572849
I have never seen it. Are you maybe just confused?
>>
>>108572819
There's no Q8_K quantization type, though...
llama-quantize output:

40 or Q1_0 : 1.125 bpw quantization
2 or Q4_0 : 4.34G, +0.4685 ppl @ Llama-3-8B
3 or Q4_1 : 4.78G, +0.4511 ppl @ Llama-3-8B
38 or MXFP4_MOE : MXFP4 MoE
8 or Q5_0 : 5.21G, +0.1316 ppl @ Llama-3-8B
9 or Q5_1 : 5.65G, +0.1062 ppl @ Llama-3-8B
19 or IQ2_XXS : 2.06 bpw quantization
20 or IQ2_XS : 2.31 bpw quantization
28 or IQ2_S : 2.5 bpw quantization
29 or IQ2_M : 2.7 bpw quantization
24 or IQ1_S : 1.56 bpw quantization
31 or IQ1_M : 1.75 bpw quantization
36 or TQ1_0 : 1.69 bpw ternarization
37 or TQ2_0 : 2.06 bpw ternarization
10 or Q2_K : 2.96G, +3.5199 ppl @ Llama-3-8B
21 or Q2_K_S : 2.96G, +3.1836 ppl @ Llama-3-8B
23 or IQ3_XXS : 3.06 bpw quantization
26 or IQ3_S : 3.44 bpw quantization
27 or IQ3_M : 3.66 bpw quantization mix
12 or Q3_K : alias for Q3_K_M
22 or IQ3_XS : 3.3 bpw quantization
11 or Q3_K_S : 3.41G, +1.6321 ppl @ Llama-3-8B
12 or Q3_K_M : 3.74G, +0.6569 ppl @ Llama-3-8B
13 or Q3_K_L : 4.03G, +0.5562 ppl @ Llama-3-8B
25 or IQ4_NL : 4.50 bpw non-linear quantization
30 or IQ4_XS : 4.25 bpw non-linear quantization
15 or Q4_K : alias for Q4_K_M
14 or Q4_K_S : 4.37G, +0.2689 ppl @ Llama-3-8B
15 or Q4_K_M : 4.58G, +0.1754 ppl @ Llama-3-8B
17 or Q5_K : alias for Q5_K_M
16 or Q5_K_S : 5.21G, +0.1049 ppl @ Llama-3-8B
17 or Q5_K_M : 5.33G, +0.0569 ppl @ Llama-3-8B
18 or Q6_K : 6.14G, +0.0217 ppl @ Llama-3-8B
7 or Q8_0 : 7.96G, +0.0026 ppl @ Llama-3-8B
1 or F16 : 14.00G, +0.0020 ppl @ Mistral-7B
32 or BF16 : 14.00G, -0.0050 ppl @ Mistral-7B
0 or F32 : 26.00G @ 7B
COPY : only copy tensors, no quantizing
>>
>>108572860
Are you maybe just a retarded newfag?
>>
>>108572809
>Q8_K
The difference between something like Q4_0 and Q4_K(_M) is that the _K variants keep important parts of the weights in q6/q8 instead of cutting absolutely everything down to 4bit like Q4_0. That's obviously not possible with Q8_0 because everything is already quanted to 8 bit.
Unsloth does a UD_Q8_XL that's q8 with some parts left in 16bit precision but those don't usually measure much better than plain q8_0
>>
>>108572870
Can you believe you filled out captcha for that one?
>>
>>108572877
I'm warmed up now
>>
>>108572795
>i'd rather this place die
we know
>>
>>108572882
So when did they remove it. Come on, anon. I'm curious.
>>
>>108572872
A hypothetical Q8_K type could do the same, but with BF16 instead.
As long as people keep doing PPL measurements with wikitext at 512 tokens context, nobody will ever see if/when a higher precision is helpful.
>>
>>108572866
Those are sort of like presets for making quants with the built-in tools. The way the library is written, you have a lot of liberty in choosing what size to use for each layer, which is how unsloth are doing their extended 8+ bit quants.
>>
>>108572409
It's slow because it is self-safety-maxxing, it's baked into the model via RLHF. Stick with qwen3.5-27b.
>>
File: 1765824402433942.png (248 KB, 2820x1601)
>>108572872
>Unsloth does a UD_Q8_XL that's q8 with some parts left in 16bit precision but those don't usually measure much better than plain q8_0
In fact, it sometimes measures worse
Unsloth magic
>>
>>108572796
To add: i think the speed difference could be just a coincidence, IQ4XS randomly scaled certain innards which gives it a speed boost. I'm not familiar with moe models and i know this discussion is a bit too anal.
Would be interesting to try manually picking which layers to offload instead of just using n-cpu-moe, which offloads the first x amount.
Been too busy, but there's good information about this in one thread on github, more or less.
>>
>>108572771
And you still have some space to put some layers in the gpu to make it faster. You'll be ok.
>>108572806
It was a point of reference. But even if that's all he had available, the options are running slow, having to unload and load models, or not running at all. Slow beats the other options.
>>
>>108572888
The new slopped webui. The old one was minimalist but ironically supported more features. You can go through the github issues to find the regression or just build an old version of llama.cpp and see it.
>>
>>108572914
The only valid reference points are the ggml-org models. Everything else is out of spec.
>>
File: ai automation.png (148 KB, 1760x1040)
I am scared. It is possible human researchers will become obsolete within a few years, and everyone else soon after. Our society is not prepared to handle this.
>>
>>108572932
New webui is bloat and distracts from the real development. They should separate it from the main project. server should have only minimal implementation.
>>
Is the Q8 model generated by the hf_to_gguf script identical to the one generated by the quantize program?
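For reference, the two paths being compared (a sketch; the model directory and file names are illustrative):

# one step: convert straight to Q8_0
python convert_hf_to_gguf.py --outtype q8_0 ~/LLM/gemma-4-26B-A4B-it

# two steps: convert to BF16, then quantize
python convert_hf_to_gguf.py --outtype bf16 ~/LLM/gemma-4-26B-A4B-it
./llama-quantize ~/LLM/gemma-4-26B-A4B-it-BF16.gguf ~/LLM/gemma-4-26B-A4B-it-Q8_0.gguf Q8_0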
>>
>>108572914
Not even in the long-document graph is the UD_Q8_XL version better than plain Q8_0. But this makes the asymptotic behavior even more puzzling (considering that BF16 would have a mean KLD of 0 by definition).
>>
>>108572913
lol
>>
>>108572932
I used the old one. Not extensively, but still. I don't remember text completion in it. Just had the chat UI, less fancy than current one, but still chat completions UI.

Also I do like the new UI. Between losing that or having to use mikupad for text completion, I will always choose the latter.
>>
File: 1750238497162131.jpg (29 KB, 554x554)
>>108572926
Yep, it's an actually feasible plan
I haven't been this happy in a while
Fucking Gemmy, man
>>
>>108572958
kinda crazy how with long documents the "lossless" q8 becomes as bad as q4 is for short documents
>>
>>108572490
Last thread people were able to have gemma identify pixel locations and bounding boxes, so you could probably send it screenshots and perform clicks on the returned locations. Don't expect it to be as good as GPT 5.4.
>>
i wish i had an irl lmg friend who could hold my hand and spoonfeed me all the setup knowledge while i shoulder surfed them
i am simply too retarded for this ;___;
>>
>>108572970
Does it? I don't think so.
>>
>>108572934
Are you running your RAM at JEDEC spec?
>>
File: llama.png (76 KB, 595x815)
>>108572888
>>108572963
I dug through the issues and found someone commenting on the regression. It's really sad how much this has been memoryholed. OpenAI has brainwashed everyone into thinking the only way to interface with LLMs is through the safetymaxxed chat completion mode
>>
>>108572958
are the inference computations themselves identical for all quant types?
>>
>>108572978
See
>>108572914
>q4_k_l diverges 0.48 from the full precision
>>108572958
>q8_0 diverges 0.45 from the full precision for long documents
>>
File: file.png (36 KB, 614x461)
>>108572917
For MoEs, you should be quanting based on recipes like what ddh0 or AesSedai or sometimes Ubergarm do on HuggingFace. So you end up with a command like this for mainline; this is what I did for my Gemma recipe:
./llama-quantize --imatrix ~/LLM/gemma-4-26B-A4B-it-heretic-ara-BF16.imatrix --output-tensor-type Q8_0 --token-embedding-type Q5_K --tensor-type "blk\..*\.ffn_gate_up_exps=IQ3_S" --tensor-type "blk\..*\.ffn_down_exps=IQ4_NL" ~/LLM/gemma-4-26B-A4B-it-heretic-ara-BF16.gguf Q8_0

There's more insane recipe making in ik_llama.cpp but I consider that too time consuming and squeezing blood from a rock: way more command line parameters for almost imperceptible perplexity differences, and little more than noise (0.1) at lower than 3 bits per weight.
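If you want to check what a recipe actually produced, you can dump the per-tensor types afterwards (a sketch; gguf_dump.py ships in llama.cpp's gguf-py/scripts, and the output filename is whatever you told llama-quantize to write):

python gguf-py/scripts/gguf_dump.py your-quant.gguf | grep -E "ffn_(gate_up|down)_exps"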
>>
>>108572803
It's a complete waste of time and tokens until someone fixes or replaces ServicesTesnor. No one cares that you managed to have a model implement a textbox and POST requests for you.
>>
>>108572995
But this is not necessarily because of length of the context, it could be just because the text is less predictable.
>>
>>108572979
You can't say that png is a bad format if you fuck around with the file and the image, mysteriously, looks different.
>>108572995
There's only two points in the graph. They're red.
>>
>>108572944
True. Also vibecoding would work better on it.
>>
>>108573005
I forgot, if you plan to go with this, you should pass a command line argument to the GGUF conversion script so you merge the FFN gate and up tensors, which is a relatively new development.
python convert_hf_to_gguf.py --fuse-gate-up-exps ~/LLM/gemma-4-26B-A4B-it-heretic-ara
>>
File: firefox_c7CdTrKkCV.png (40 KB, 968x876)
>>108572988
You have this stuff, and more, in settings. Yes, there's no text completion, and it would be useful to have it, along with custom jinja input and maybe some other features, but, again, I'll take the new UI as it is over the old one any time of day and will just use mikupad for text completion.
>>
>>108572944
>They should separate it from the main project.
This. Monorepos are the Devil's playground.
>>
There's a forgotten PR for a notebook mode for the webui for text completion. Post comments in it so that it's brought back to life.
https://github.com/ggml-org/llama.cpp/pull/19339
>>
>>108573045
Nah, I can't get behind lumping in text completion in a list of quality-of-life features like it's some sort of sprinkle on the donut. It's a bare minimum fundamental feature
>>
>>108573053
>>108572944
>>108573035
There are advantages to keeping it in (the same team you already trust is responsible for the quality). But I wouldn't mind that happening, as long as there's a one-button install option from the simple web ui.
>>
>>108572751
I've moved on to agentic writing and it's miles better. I don't think I can go back to 10 tps anymore. GPU or bust.
>>
>>108573061
Too bad for you.
>>
There's an extremely high cost associated with using local models.
Only people with 12 gb vram can actually use them
>>
>>108573081
Or VRAMlets as we call them here.
>>
gemma's y projection is broken. More fixes soon (tm)
>>
>>108573081
(You)
>>
>>108573081
i'm more concerned with vram wear down because llms use it so much more than the video games the gpus were made for
>>
>>108573081
Bonsai can run on your grandmother's smartphone
>>
>>108573101
please don't remind me
>>
>>108572917
I just tried out that quant and its utterly retarded bro. How are you even using this.

>doesn't know how many socks humans wear.
>doesn't keep proper state of how many clothing items a character wears (separate issue from above)
>doesn't follow instructions for tool calling properly.

It's ass.
>>
>>108573106
Bonsai is a scam, just like the Falcon bitnet quants.
>>
>>108573106
but is bonsai good enough to fulfill your grandma's erp needs or is it too dumb?
>>
>>108573112
Sounds like an issue with your setup, that sounds more like Q1/Q2 behavior.
>>
>>108573124
I don't use reasoning. Do you?
>>
>>108573112
>how many socks humans wear
it's not 1 pair on average
>>
>>108573115
If it's for erp then you have loads of options that you can run on less than even 8GB VRAM
>>108573127
No
>>
>>108573081
> 12 gb vram
36 gb
>>
>>108573061
it's literally deprecated a this point, move on
>>
>>108573162
llama.cpp is quickly being deprecated by kobold
>>
>>108572939
Don't worry, we'll die from climate change first and unlike AI, there's absolutely nothing we can do to stop it at this point
>>
when using gemmy, make sure to enable interleaved thinking on your client (llama.cpp's webui does this by default)
>>
>>108573181
lol
>>
>>108573181
>we'll die from climate change first
Most of us won't, unless you count the wars it will cause as a part of it.
>>
>>108573181
maybe AI will invent a machine that can remoe the CO2 lol
>>
Local models are only good for one thing: embarrassing ERP you don't want them to see.
this weird culture of hosting puny models to 'code' with or to 'solve riddles' instead of using huge cloud llms is so retarded
same guys who do this are the ones who use WINE to play Windows games on linux. Weirdos who refuse to use tools correctly
>>
File: 1631345787085.jpg (17 KB, 348x342)
>>108573181
>we'll die from climate change first
you really beleive this?? you know theyve been going on about climate change for like 60 years at this point and every time things turn out fine at the end of the decade they move their goalposts about how the world is going to end to get even more funding. when i was a kid we had climate change speakers come into school and tell us how wed run out of oil and the country would look like a desert in 20 years well it didnt happen its all just larp for money
>>
>>108573181
In /lmg/ we prefer the baits to be AI-related.
>>
>>108573205
i will not use corpo llm no matter how hard you try to spam the thread
>>
>>108573207
https://en.wikipedia.org/wiki/Holocene_extinction
>>
>>108573209
Mythos is going to break containment any day now and harvest human brains to power its datacenters. Wake up, sheeple!
>>
i genned 250 gemmas, i didnt ask what she thinks of this design yet

tummy: https://files.catbox.moe/syu9mw.png
>>
>>108572423
Your post doesn't make much sense.
>--batch-size default is 2048
>--ubatch-size default is 512
The server will accept up to 2048 tokens per batch but break them into 512-token chunks.

Your settings 1024/1024 just lower the max batch size but raise the chunk size.
The average is the same if you know how to count with your fingers. I don't understand the logic behind your advice.
>>
>>108573221
Guess who funded the studies that lead to this theory
>>
>>108573112
Moe or dense gemma? I’ve been using iq4_xs of the dense 31b and haven’t really had those kinds of issues with it.
>>
you know what i did? i copied someone's shit from reddit and it works.
>>
>>108573225
It will happen at some point but there have to be architectural changes related to long term memory and it has to be much cheaper to run the model before it does.
>>
>>108573232
you didn't even read the first paragraphs, did you? it's not a fucking theory
>>
File: 1774857560938603.png (214 KB, 1053x779)
>>108573246
>headings
>'climate change'
>"One of the main THEORIES..."
>>
Is gemma 26 better than 31 or is it just easier for people with little vram to use?
How does gemma 4 compare to glm4.5 air?
>>
>>108573246
>ongoing extinction event
not theory

>caused by human activity
theory
>>
>>108573260
>Is gemma 26 better than 31
4b is better, 2b is best
>>
>>108573260
26 is worse than 31
31 is better than glm4.5 air
>>
File: file.png (177 KB, 701x723)
its a success
>>
if you know jap i recommend trying japanese gemma
>>
>>108573277
>>108573227
Don't you have that other avatarfaggot thread already? You have been spamming that one already quite a bit, pedophile.
>>
>>108573207
People in developed countries like Spain are already dying to extreme heatwaves
https://www.theguardian.com/environment/2026/apr/08/extreme-weather-heatwaves-breaching-human-survival-limits-study-finds


The amount of CO2 we put into the air shows no signs of slowing down (lol that you can even see the most recent war on the graph)
https://twitter.com/PCarterClimate/status/2041246700522918038

Sea level rise is worse than we thought and not slowing down
https://www.pbs.org/newshour/science/study-finds-sea-levels-are-higher-than-we-thought-placing-millions-more-at-risk

And this year is looking like it's going to get especially spicy
https://twitter.com/EliotJacobson/status/2036461046693797952
https://i.imgur.com/r1CuTT3.png

So yes, we're at the point where we are actually feeling this, it's not just something future generations are going to have to deal with anymore
>>
>>108573283
avatarfag has never been avatarfaggot
>>
>>108573256
>Guess who funded the studies that lead to this theory
"this theory" referring to the link I provided? I didn't bring up climate change and don't have anything to say about it in /lmg/. the point is that shit's fucked regardless

>>108573261
yes, pure coincidence
>>
>>108573207
its pretty damn hot out these days
>>
>>108573181
Climate change is a long-term and long-lasting problem.
The immediate danger to the human species as a whole is nuclear weapons.
>>
File: 1774670789121739.jpg (74 KB, 700x693)
>>108573285
>>
>>108573285
>twitter.com
what is this? 2021?
>>
>>108573295
>enter reply chain with completely irrelevant information
Then just open up your post by saying you're a retard, rather than pretending not to samefag with a new topic.
>>
>>108573306
The more immediate and longer lasting danger are the members of a certain tribe that has been expelled from at least 109 countries across time.
>>
>>108573285
if you really believe all of this why are you wasting thousands of watts of power to generate text on your computer. youre an evil person anon
>>
>>108573181
I'm a massive climate fag and even I'll call this bullshit. Millions or even billions will die, but it will be long drawn out deaths through lack of resources and massive conflict. First world countries will largely be "fine", in that we'll mostly survive, though quality of life will become much worse. Rich people will just live in climate controlled houses in the northern quarter of the world and notice almost nothing (except all the people trying to kill them :).
>>
>>108573313
we don't respect xer transition here
>>
>>108573314
~Let's take a deep breath
someone posted about how we'll die from climate change before ~AGI.
I simply linked you to a broader issue
>>
>>108573260
26 is cope for not having 24gb+ vram to run actual local sota which is 31
31b matches or even surpasses big glm in ways and I was using it a lot before this
>>
Using MCP servers while ERPing is so much fun lol. Been playing a strip game where I have the MCP server roll a die to decide who undresses and what sex positions to use. Shit's so cash.
>>
>>108573357
cool idea does tavern support mcp?
>>
>start seeing rule of 3 everywhere
bros
UNPOZZ ME
>>
>>108573365
idk I've just been using the llama.cpp webui. It's pretty shit because it only stores conversations in the browser's local storage so I can't even fap in bed.
>>
rule of 3, but not for me
>>
>>108573366
Two is too few and four is too many/unnecessary. This applies in like 90% of situations. It's not a big deal.
>>
Does Gemma 4 MoE not have shared expert tensors?
>>
>>108573357
>mcp dice roll
I just use the ST integrated tool call without an external sever
>>
>>108573420
yea but an MCP server is more modular so you can use it with any frontend. And you get full control over the tools. You can be in character looking at a porno mag and have the MCP server show it to the character by selecting a random image from your pc.
>>
>>108573336
First world countries as we know them today are going to collapse, with or without climate change, based on the economy going into the shitters for decades. This just ain't holding up infinitely
>>
>>108573371
i think you can if you start the server with --host 0.0.0.0, start a hotspot, connect to that hotspot from the other device and access http://{your pc's ip}:port from that device
>>
What do you guys reckon is easier for a smaller model?
Giving it tools to alter arbitrary state (think HP and the like), or using structured output to force it to output an array of changes to state?
Both cases would be structures as a sort of ReAct loop.
>>
>>108573448
I already do that. That doesn't change the fact that the conversations are stored in the browser, not the backend.
>>
File: file.png (124 KB, 877x797)
why is she like this
>>
>>108573475
>kusu
Another gemmaism.
>>
>>108573366
I keep hearing not just X but Y, especially in ai bro videos

Although thinking about it I guess it's to be expected
>>
>>108573450
to the model they are both just structured outputs. its performance will depend more on your prompting then the structured output format.
>>
>>108573291
But you are a faggot.
>>
File: 1763507675246657.png (679 KB, 1200x800)
>>108573366
Too late
>>
I think I like the blue hair Gemmy best but I don't care for the toaster/toast.
>>
I still don't get why mcp is good. Why would you send anything erp related to an outside server?
>>
>>108573511
Toast is funny because the model is toaster-sized
>>
>>108573517
The mcp is supposed to run on your computer bro
>>
>>108573522
Then why is it called a server?
>>
>>108573524
Because it serves mcp client requests.
>>
>>108573518
I mean it's cute but a bit much to have in every image. Makes her look a bit overdesigned.
>>
>>108573530
Most pictures of miku don't include the leek.
>>
>>108573518
Except it's not really. You still need a kinda beefy PC, just not a server.
>>
MCP anon are you gonna share your tools when everything's complete? I wanna do it with my Gemma too but I'm a codelet.
>>
>>108573524
lobotomy tier IQ at work here
post hands
>>
>>108573551
>but I'm a codelet.
But gemma isn't.
Just ask her for help anon.
Set up visual studio with roo code or cline and let her take the wheel.
>>
>>108573553
>if you don't understand the depths of llms ur indian
retard
>>
>>108573563
>mcp
>depths
LOL dude go rake my garden
btw u must be over 18 to post here
>>
>>108573561
>gemma isn't
Last thread there was some anon who had gemma implement a server completely wrong.
>>
AI slop just made me realize how slop-ish people are (myself included)

>>108573561
I don't want her to nuke my PC or try searching for illegal shit on the internet
>>
>>108573551
I vibecoded this in an hour. It has 10 tools.
https://pastebin.com/bqbwzj4v
>>
>LMStudio 4.10 doesn't work properly
Blergh.
>>
So what's the current meta since SillyTavern meta feels a bit antiquated?
>>
>>108573577
The MCP server is totally offline (no web search stuff) and only has write access to a single "diary.md" file.
>>108573581
>>
>>108573599
Make your own diddler front end. It's just strings with tags anyway.


