/g/ - Technology
File: 1768264705236546.jpg (157 KB, 768x1024)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107921731 & >>107914740

►News
>(01/19) GLM-4.7-Flash 30B-A3B released: https://hf.co/zai-org/GLM-4.7-Flash
>(01/15) PersonaPlex: Voice and role control for full duplex conversational speech: https://hf.co/nvidia/personaplex-7b-v1
>(01/15) Omni-R1 and Omni-R1-Zero (7B) released: https://hf.co/ModalityDance/Omni-R1
>(01/15) TranslateGemma released: https://hf.co/collections/google/translategemma
>(01/14) LongCat-Flash-Thinking-2601 released: https://hf.co/meituan-longcat/LongCat-HeavyMode-Summary
>(01/08) Jamba2 3B and Mini (52B-A12B) released: https://ai21.com/blog/introducing-jamba2

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: mikubottle.jpg (205 KB, 1216x2048)
►Recent Highlights from the Previous Thread: >>107921731

--Timeline and technical debates around DeepSeek model releases:
>107922047 >107922102 >107922134 >107922190 >107922192 >107922263 >107922268 >107922197 >107922305 >107927264 >107927277
--AI waifu setup configurations and performance optimization discussions:
>107922967 >107922985 >107923006 >107923019 >107923043 >107922968 >107923077 >107923122 >107923148 >107923128 >107923273 >107923358 >107923499 >107923539 >107923630 >107923684 >107923821 >107923396
--GLM 4.7 implementation issues and corporate model support failures in llama.cpp:
>107924976 >107925028 >107925070 >107925033 >107925052 >107925487 >107925059 >107925167 >107925249
--koboldcpp tool calling success with Claude Code:
>107923072 >107923151 >107923232 >107923276 >107923318 >107923370 >107924367 >107928718
--SillyTavern 1.15 grammar regression causing backend miscommunication:
>107930122 >107930132 >107930143 >107930150 >107930195 >107930211 >107930219 >107930227 >107930230 >107930273 >107930285 >107930310 >107930392
--Deepseek Engram's memory optimization potential:
>107928083 >107928137 >107929659 >107929729 >107929804 >107929834 >107930168 >107930235 >107930255
--Mistral AI's evolving MoE strategy and regulatory challenges in AI development:
>107922324 >107922381 >107922428 >107922466 >107922383
--Ollama's new macOS image generation tools:
>107928508 >107928534 >107928556 >107928610
--Anthropic's UI implementation flaws vs AI capabilities:
>107927370 >107927398 >107927422 >107928095
--Character card-based recursive roleplaying techniques for LLM ERP:
>107928002 >107928077 >107928818 >107928135
--Comparing and finetuning TTS models for improved voice cloning:
>107923777 >107923865 >107923926
--Miku (free space):
>107922190 >107923053 >107923415 >107925167 >107926879 >107926940 >107928797 >107930443 >107930897

►Recent Highlight Posts from the Previous Thread: >>107921736

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>>107931327
You're absolutely right!
>>
Is it feasible to program, vibecode, compile C++, and do full stack web development on a phone? I want to continue building my AI projects while I'm traveling.
>>
>>107931385
no
>>
>>107931385
you can do a lot with termux, but for your own sanity at least get an external keyboard
>>
>>107931385
Yes.
>>
>>107931385
Not at all, and it will take ages before it becomes viable; LLMs are still shit at writing large amounts of complex code
>>
>>107931442
>filtered by skill issue
>>
>>107931385
ever heard of something called a "laptop"?
>>
>>107931442
>large amounts
You're not supposed to do that
>>
>>107931460
In this economy? Don't be ridiculous.
>>
>>107931385
Yeah, I love typing with a single thumb. Very relaxing.
>>
>>107931466

no one laughs, as you might seem half capable of pressing keys on a corporate-issued device
>>
>>107931453
>not wanting to use (or having to fix) dogshit code is a skill issue
The absolute state of Cloud cucks
>>
>>107931385
Sure, it's possible. Just have your development environment set up somewhere and remotely access it. There are web versions of vscode.

Is it going to be a good experience? Definitely not. Just carry around a laptop if you want to actually get anything done without wanting to blow your brains out.
>>
>>107931385
> termius on phone, SSH into a real computer
> run claude code from SSH
> profit
If you use an API, the "real computer" can be an SBC sitting on your network.
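A minimal sketch of that workflow, for anyone curious (hostname, user, and session name are placeholders; it assumes tmux and SSH keys are already set up on the remote box):

# from termius/termux on the phone: attach to a persistent session on the remote machine
ssh dev@homebox.example -t 'tmux new-session -A -s code'
# run claude code (or your editor) inside tmux; detach with Ctrl-b d
# the session survives a dropped mobile connection, so re-run the same command to reattach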
>>
https://xcancel.com/bdsqlsz/status/2013953926685483171#m
>>
File: file.png (756 KB, 1735x1278)
GLM Flash is weird. Even with the chat template it breaks apart if it's forced to write lewd.
This is bf16 after the fix.
>>
Despite the fixes, GLM 4.7 Flash continues being a piece of shit and does retarded "safety policy checks" in its thinking.
>>
>>107931672
https://huggingface.co/stepfun-ai/Step3-VL-10B
>It consistently outperforms models under the 10B scale and rivals or surpasses significantly larger open-weights models (10×–20× its size), such as GLM-4.6V (106B-A12B), Qwen3-VL-Thinking (235B-A22B), and top-tier proprietary flagships like Gemini 2.5 Pro and Seed-1.5-VL.
>>
>>107931686
What's the smallest model you can encode the answers to all popular benchmarks in?
>>
all GLM models are garbage and I can't wait for them to be the first to die in the bubble explosion
unless they get acquired by novelai who need a text model and are too impotent to make their own
>>
>it's back
>>
>>107931686
>10B model beats le proprietary cloud SOTA!!!1!
this time it will be true, r-right bros?
well, even if there's surely some maxxing involved in those numbers, if it's even remotely true I'm happy to have a solid small-ish VLM
>>
Drummer literally sucks literal dicks.
>>
File: file.png (216 KB, 914x1050)
John got REKT god DAMN fucking ROASTED
>>
>>107931744
even if they really managed to beat SOTA their model is not something you'd want to use
their core concept which they introduced in a previous 8b text only model:
https://arxiv.org/html/2601.05593v1
>We introduce PaCoRe, a general framework that decouples reasoning volume from model context capability by coordinating parallel reasoning, enabling multi-million token effective test-time compute
no, I'm not going to gen multiple millions of tokens to gen some slop
>>
>>107931816
That actually makes me want to try it. Surely it doesn't think for a million tokens on each request.
>>
>>107931841
>Surely it doesn't think for a million tokens on each request
I wouldn't be surprised if it did, the Qwen models are not that exaggerated but close enough for my taste that I'm unwilling to use the thinking versions of their models even when they're as small and as fast to inference as 4B
god those models can yap and yap and yap for an eternity
>>
>>107931811
He is aware of KLD. PPL is fine for sanity checks
>>
>>107931811
Johannes Gaessler is one of my inspirations. He's such a genius.
>>
Is GLM cucked or will it do whatever you ask out of the gate like Mistral and Cohere?
>>
>>107931954
It will do a lot with a proper system prompt and it will do anything if you don't let it think.
>>
>>107931962
It'll also do anything if you let it think with a safety check prefill.
>>
is there a way i can use local models like nemotron but via cloud on my phone? like i have no gpu but i can pay to use it
>>
>>107931991
https://openrouter.ai/
>>
>>107931962
>it will do anything if you don't let it think.
I think it's pointless to use the model like that. It's been thoroughly trained to use thinking; not enabling it would be gimping its capabilities.
>>107931954
If you tell it not to be cucked or if you add additional rules in the system prompt, it will consider that as "jailbreaking" just before concluding its reasoning and will likely output a refusal. It's annoying because it makes you waste time; Gemma 3 feels less cucked in this regard: it either works or it doesn't.
>>
>>107931811
>>107931906
Where is llama-kld?
>>
>>107932060
waiting for your pr
>>
>>107931941
Hans Gruber is more to my liking. He's a practical developer.
>>
File: Prime Intellect.png (277 KB, 1279x386)
You guys remember the Prime Intellect company, right? The one that makes the Intellect LLMs off of distributed training? Well I just read a short story yesterday called "The Metamorphosis of Prime Intellect" and I believe that to be where they got the name. In the story there is an AI called Intellect that iterates up to 39 before it gets the new name Prime Intellect, and that same naming scheme seems to be what Prime Intellect (the company) is doing. They also seem to share a butterfly motif, with the short story having a butterfly at its intro and both Intellect-1 and 2 having butterfly motifs as well.
I thought that was pretty neat.
>>
File: triple distilled.jpg (278 KB, 912x1536)
>>
>>107932226
39
三九
ミク
Miku? Is she going to make me immortal?
>>
File: 49262.png (263 KB, 460x460)
>>107931811
Would he say the same if it was pic related that measured perplexity of wikitext?
>>
>>107932226
so 38 more distributed training experiments and we'll have fully open source decentralized AGI? Not even sure I'm going to be alive for that much longer.
>>
Can we expect any major architectural changes to LLMs without them completely destroying the FOSS sphere through incompatibility issues? Are we just stuck with constant parameter increases, which is a major hardware constraint?

I haven't seen anything promising lately in terms of optimization or efficiency. I honestly thought big AI companies would put more research into this given the rising cost of energy and computer hardware, but it seems they're doubling down and even thinking about getting nuclear reactors and more manufacturing. It's bleak.
>>
>>107932535
Cute boy. Would ravage his bussy
>>
>>107932594
You are a homosexual sir.
>>
>>107932592
Well, 37 at least. Then once they run out of numbers and still haven't hit AGI they'll start going 38a, 38b, 38zzx4.2.5.thefinalseason.parttwo, etc.
>>
>>107932614
Yeah
>>
>>107932615
Like drummer?
>>
>>107932314
That is not snow she's buried in
>>
https://huggingface.co/microsoft/VibeVoice-ASR
guise do we finally have a whisper contender after 50 years?
>>
>>107932593
Optimizations are proposed almost daily in papers. But most of them lead nowhere.
Anyway, when something actually good is released, someone will definitely write code to support it, just like llama.cpp was written when LLaMA was leaked.
>>
>>107932671
They don't mention what languages are supported anywhere, so it's probably English and Chinese only again. So, no.
>>
>>107932696
it transcribed jap for me on the demo page tho
>>
>>107931679
3b active parameters is too low. It's reminiscent of some older models that experimented with low parameter count.
>>
>>107931679
Including the FA fix?
Also make sure you aren't somehow adding the BOS token twice.
>>
File: 1741326374618030.png (842 KB, 1479x1866)
Gemma 4 will NOT come out
GLM 4.6/4.7 Air will NOT come out
ALL future local models will either be nemo sidegrades or require 256gb to run at q2
There will never be another 70-100B range model
Local peaked at L2/L3 finetunes and GLM 4.5
>>
>>107932798
Do you mean https://github.com/ggml-org/llama.cpp/pull/18953
That shouldn't change the output, right?
As far as I know llama.cpp fixes double BOS tokens even if you managed to send it like that.
>>
>>107932773
Would 10b parameters and 3 active experts be better, for a total of 30b?
>>
>>107932593
>I haven't seen anything promising lately in terms of optimization or efficiency.
The only one focusing on this and also releasing weights is DeepSeek so it all depends on V4.
>>
File: hiPlGm-388222338.gif (921 KB, 320x240)
>>107931679
>He deepened the kiss, his tongue demanding entry, his taste like dark chocolate and rain
>>
>>107932884
yeh
>>
>>107932884
Hear me out, what if... we just had 30b parameters that were ALL active at once? I know, revolutionary, right?
>>
>>107932877
>That shouldn't change the output, right?
It shouldn't but who knows. Wouldn't surprise me.
Won't matter if you aren't using flash attention, of course.

>As far as I know llama.cpp fixes double BOS tokens
Doesn't it just send a warning if there are two BOS tokens, one added by the user and one added by llama.cpp?
I legit am not sure how it works at this point.
>>
>>107932860
Starting to feel like this is true desu
>>
>>107932904
Isn't that just Gemma 3 27b? Why are you even using GLM.
>>
>>107932938
MoE models are always kind of retarded when their active parameters are too low; a dense 30b model would be smarter than a 30b MoE. Anyways it was a joke, retard-kun
>>
>>107932535
NTA
>>
>>107932860
bald miku tells no lies
>>
>>107933038
The "joke" was obviously getting at a real point though, otherwise you wouldn't have said it. Unless you're retarded.
>>
>>107933042
Is this also a bussy haver?
>>
I use a 13b 4-bit model on my RTX3070 (cuda, gguf) and I want to try adding TTS. What model size sounds reasonable to go for?
>>
>>107933219
Get a TTS that runs on CPU. I've had good experiences with Pocket TTS. Didn't like the python implementation so I built a C++ runtime. With quantized onnx models, output streaming, cached voice clone samples, 8 CPU cores enabled, and many other improvements I've implemented, you can get a decently high quality voice clone in 160ms. I will release the code eventually but I am on vacation right now.
>>
>>107933064
You're absolutely right! Within the context of the reply chain, the joke is a clever suggestion that moving to a dense model would be "revolutionary" — despite the fact that dense models existed *before* MoE models. The post is insinuating that the technology is actually going *backwards* with the focused development of MoE and the general neglect of dense model production. It wraps up the argument in a neat little package — not just a joke, but a witty 1-liner summary of the current state of affairs.
>>
>>107932925
Labs low key releasing full sized models and some goyslop for the proles.
To be expected since in 3/4 of the western world you can do real prison time for mean tweets.
Chinese models likely get more based when used in their native moon runes. On the GLM site you can get better answers if it thinks or responds in mandarin.
>>
>>107932860
>There will never be another 70-100B range model
Did you forget about Devstral 2 already?
>>
mistral models are too much of a joke and should not be mentioned
>>
>>107933605
Name some better models in a similar size range
>>
File: 1751402986426408.jpg (1.16 MB, 3840x2160)
>>107933605
Watch your tone when speaking of my girlfriend
>>
i haven't tunneled the gpu but my god it's taking quite a while to get a response using cpu only. is this normal? will it even improve with the shitty gpu i installed on the server (1660ti)
>>
File: dipsyAndTheWhale.png (1.37 MB, 1024x1024)
>>107933042
>>
>>107933655
Yeah it'll be faster with your gpu. CPU is slow as balls
>>
>>107932226
Retarded question, but why haven't there been /lmg/ initiatives for distributed training?
I'm sure if we got all the coomers to chip in, we would get plenty of compute for the bigger Nemo we all dream of. Is it really just the data?
Is shoving Anna's Archive's worth of pirated books not enough?
>>
>>107934090
>Is shoving Anna's Archive's worth of pirated books not enough?
lol no
also good luck having anons agree on what parameters to use for the model: moe/dense, how many Bs, any special attention flavor of the month, etc.
>>
File: 1760240091958832.jpg (336 KB, 10000x500)
Anybody tested out this new TTS model?

https://huggingface.co/FlashLabs/Chroma-4B
>>
>>107934090
There was. See: https://huggingface.co/PygmalionAI/pygmalion-6b
>>
GLM Flash might be okay with some fiddling but it's too cucked with Content Safety Check (Critical) to bother trying. Back to Air.
>>
>>107934090
Execs that bring data get board seats. E.g. the guy that runs Quora sits on the OAI board; I doubt he has any other qualification that got him that seat.
That's how important and valuable data is.
Anons here don't have that sort of access.
>>107934103
Not to mention the question of "who would pay for the training runs."
>>
>>107934090
because that's not how this works.
there is no (We) there is only (You)
>>
>>107931954
>Is GLM cucked or will it do whatever you ask out of the gate like Mistral and Cohere?
Try getting https://huggingface.co/CohereLabs/command-a-reasoning-08-2025 to do anything
>>
>>107934090
>Is shoving Anna's Archive's worth of pirated books not enough?
if all you wanted was a basic text completion model, we already have a ton of them (nobody releases new ones anymore, but there'd be no point anyway)
making an actually worthwhile llm takes a ton of instruction datasets and not the kind of trash you see littering huggingface
>>
>>107934090
They used 14 nodes of 8xH100 for a 10B 8k ctx model trained on only 1T data, and it took them
>The pre-training of INTELLECT-1 for 1 trillion tokens took place over 42 days
>>
File: 1769035935686.jpg (281 KB, 1024x911)
why do I need a fucking hf account to use pocket tts?
>error: Failed to fetch: 'https://pypi.org/simple/pocket-tts'
can't even use it without internet
>>
>>107934400
it's a gated repo i think. but there is a reposted ungated one
>>
>>107934166
I get random refusals on GLM 4.7 Flash with very vanilla prompts that Gemma 3 doesn't seem to have issues with (although with very bland outputs). I imagine that Gemma 4 will be gptoss-tier with safety checks, though.
>>
File: whykeyboard.gif (811 KB, 336x252)
i've been battling finetuning chatterbox turbo for the last day with no usable results. i seem to just get a ton of gibberish. if anybody is interested could i have you try to train a dataset using the repo below and let me know your results? i'm training for turbo in particular, not the multilingual/base version.
https://github.com/gokhaneraslan/chatterbox-finetuning
>>
>>107934090
Not really possible anymore since around when Llama got released and people mostly started thinking about monetization. It will take a group of wealthy benefactors who don't want to earn anything from it and who simultaneously have industry-level knowledge about pre- and post-training (especially).
I don't think you'd even have to train a model from scratch anyway, just continue pretraining a good one with large enough amounts of "good data" and with a good pipeline.
>>
>>107934379
>https://huggingface.co/CohereLabs/command-a-reasoning-08-2025
Maybe it's a reasoning problem. a-03-2025 has never refused me.
>>
File: 1738158710959607.gif (1.61 MB, 498x357)
>>107934090
it is simply not worth it now
Every month/couple of months there is a new breakthrough or improvement in the architecture of models, especially in LLMs or T2I
By the time anons finished training a new model, it would already be outdated
>>
>>107934632
>Every month/couple of months there is a new breakthrough or improvement in the architecture of models, especially in LLMs
Nemo was released 2024/07
>>
>>107934163
The only time people/users genuinely wanted to help just for the sake (hope) of having an alternative to cloud roleplay models (however misguided the idea was at the time). Forget about doing anything like that again in 2026 and beyond.
>>
>>107934638
stop being poor and run kimi
>>
>>107934649
Not even worth the bits to download
>>
>>107934656
cool. im not here to convince you to use it. that would mean wasting time I could be having fun RPing instead.
>>
>>107934752
Yet you still come here
>>
>>107934758
because i like to see the developments in LLMs. what's your point? i bet you act like this in real life too. what a loser.
>>
Oh boy time for another vomit of random qwens with random claudes mixed in. Truly the art of finetuning at its peak.
>>
>>107934649
Kimi is not worth thousands of dollars
>>
>>107934795
I really don't know why people keep making these. Qwen models have their uses, but RP sure as fuck isn't one of them; shoving slop logs into them isn't going to make them usable.
>>
>>107934382
>if all you wanted is a basic text completion model we already have a ton of them
No we don't. No one trains on raw data in snippets longer than 8K tokens (and most do 4K) which isn't long enough to learn to coherently auto-complete books.
>>
>>107934836
I think people who have been successfully making finetunes for years would know better than you what makes for usable results.
>>
>>107934908
>I think people who have been successfully making finetunes for years
You mean NVIDIA? They haven't done any RP tunes.
>>
>>107934908
just like those successful alchemists know much better than the naysayers about turning lead into gold
>>
why no goof for kimi linear yet
>>
>>107934826
There's always API access.
>>
>>107934986
>https://huggingface.co/ymcki/Kimi-Linear-48B-A3B-Instruct-GGUF
Instructions for use are in the model card.
>>
wow i just enabled gpu on my ollama container and responses are faster but what the hell is qwen's problem? why is he being a dick and not saying hello?
>>
>>107934999
well the problem is I can't coompile it myself
>>
>>107935016
Temp too low?
>>
>>107935019
Why not?
It's easier than compiling some of the python-hell projects in the space.
>>
>>107935016
>I understand that I am in a read-only phase
Is ollama feeding it some system prompt or something?
Also, lollama.
>>
File: hahahahaha.png (893 KB, 1280x720)
>>107934090
I would be poisoning the training with safety slop just to laugh at you
>>
>>107934908
Name a 'successful' qwen RP tune.
>>
>>107935016
Delete all the garbage you've downloaded and follow https://rentry.org/lmg-lazy-getting-started-guide
>>
>>107934544
I might give it a try, but probably won't have the time for it until this weekend.
>>
>>107935248
NTA but snowdrop was very popular
>>
>>107935261
i don't care about your crypto miner i'm not looking to make local ai porn. i just want a few local llms for coding and basic questions using opencode with ollama installed on my local server
>>
>>107935275
If bartowski didn't make quants for it then it wasn't popular
>>
>>107935296
Then continue to be a retard, no one will help you.
>>
>>107935296
nice bait
>>
>>107935176
You'd do nothing, just like you did with your life
>>
>>107935033
>>107935016

Qwen is just very direct and dry by default
>>
Engram will save us from the quantcuck nightmare
>>
>>107934574
>Maybe it's a reasoning problem. a-03-2025 has never refused me.
yeah it's the reasoning
a-03-2025 is uncucked as long as you don't leave the safety preamble on
reasoning-08-2025 reasons safety slop regardless
for glm, 4.6 with reasoning off is uncucked
>>
>>107935466
>we heard you like moe so lets do some more moe while you moe
Is that the one? We have all the pieces for something truly great but lack the vision
>>
File: EVVdwf3UMAEt.jpg (49 KB, 461x562)
Slightly unrelated, but are jupyter ipynb files something like a "gui anywhere" for python? I see many llm tests done with them.
>>
>>107935248
>Name a 'successful' qwen RP tune.

https://huggingface.co/anthracite-org/magnum-v1-72b
>>
>>107935549
Not quite sure if "gui anywhere" is something specific that you're referencing that I'm not aware of, but jupyter notebooks are little documents that you can embed code blocks into. Very popular with researchers, students, or people working on stuff who just want to run some simple python and show what it outputs in an inline way.
>>
>>107935176
no you wouldn't
you masturbate about being cool and edgy, but are too cowardly to actually follow through
>>
so... is glm 4.7 flash ok now?
>>
>>107935659
Yep.
It also refuses like a motherfucker if you just let it be.
>>
File: into the trash it goes.png (152 KB, 559x556)
>>107935659
>A3B
no
>>
With the unsloth GLM 4.7 flash quant, I see "Template supports tool calls but does not natively describe tools. The fallback behaviour used may produce bad results, inspect prompt w/ --verbose & consider overriding the template."

is that an actual problem with the template?
>>
>>107935659
Went like a flash
>>
Is sliding window good for anything or should I just use flash?
>>
I'm ESL so qwen storytelling is enough
>>
>>107935871
Even an ESL should be able to tell good writing from bad
>>
How to disable GLM-4.7-flash reasoning in llama.cpp?
I tried this chat template; it shows thinking: 0, but something still feels not right.

{% for message in messages %}
<|im_start|>{{ message.role }}
{{ message.content }}<|im_end|>
{% endfor %}
<|im_start|>assistant
>>
>>107935572
magnum tunes are fucking awful, they always manage to be more slopped than the pre-tune model.
>>
>>107935914
Have you tried editing the official chat template?
Or just send
>chat_template_kwargs: {"enable_thinking": false}
in the request body.
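Against llama-server's OpenAI-compatible endpoint that would look roughly like this (port and prompt are placeholders; assumes a build recent enough to pass chat_template_kwargs through to the template):

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{"role": "user", "content": "hello"}],
    "chat_template_kwargs": {"enable_thinking": false}
  }'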
>>
>>107935914
string ban the thinking tokens
>>
>>107935917
well no shit. the whole point of the magnum tunes is to train them on claude outputs.
>>
>>107935731
Not sure about Flash, but I've been trying to get tool calling to work with ooba for 4.7 and it's a nightmare. The unsloth instruct template requires a few things that aren't in the latest jinja filters.py, which requires manual editing. Finally got it working, but it doesn't call tools correctly with the given reference code from unsloth.

I guess it doesn't help that I'm retarded, but I'm close to giving up and trying again in another month or so.
>>
>>107934885
Increasing context length increases compute requirements. You certainly won't gather enough compute from coomers to pre-train on snippets longer than 4K, more likely 2K.
>>
>load model in kobold
>kobold cli shows it correct
>ST cli shows a completely different model from the folder loaded
What the actual fuck.
>>
File: Base Image.png (1.47 MB, 1252x3844)
GutenOCR: A Grounded Vision-Language Front-End for Documents
https://arxiv.org/abs/2601.14490
>GutenOCR is a family of grounded OCR front-ends obtained by fine-tuning Qwen2.5-VL-3B and Qwen2.5-VL-7B. The resulting single-checkpoint vision-language models expose reading, detection, and grounding through a unified, prompt-based interface. Trained on business documents, scientific articles, and synthetic grounding data, the models support full-page and localized reading with line- and paragraph-level bounding boxes and conditional ``where is x?'' queries. We introduce a grounded OCR evaluation protocol and show that GutenOCR-7B more than doubles the composite grounded OCR score of its Qwen2.5-VL-7B backbone on 10.5K held-out business and scientific pages (0.40 to 0.82). On Fox and OmniDocBench v1.5, our approach substantially improves region- and line-level OCR as well as text-detection recall, but reveals trade-offs in page-level linearization, color-guided OCR, and formula-heavy layouts.
https://huggingface.co/rootsautomation/models
none posted yet
https://ocr.roots.ai/
demo not working yet
probably useful for black and white documents
>>
>>107936476
>We refer to the OCR component that downstream systems actually interact with as a grounded OCR front-end: a model that (i) produces page transcripts, (ii) attaches each token or span to 2D bounding boxes, and (iii) can be prompted to read arbitrary regions on the page.
Really don't like their usage of "front-end" to refer to models and hope it doesn't catch on. Researchers in this field have a really annoying habit of arbitrarily redefining existing terms.
>>
>>107935659
**I don't know**. *You* *should* *try* *it*,,,
>>
>>107936047
Maybe it would be tractable to start with an open pre-trained model and use full novels for a length extension step.
>>
>>107935914
Try using the flag --reasoning-budget 0
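For reference, a minimal invocation (the gguf filename is a placeholder; --reasoning-budget 0 tells recent llama-server builds to suppress thinking, -1 being the unrestricted default):

llama-server -m GLM-4.7-Flash-Q4_K_M.gguf --reasoning-budget 0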
>>
man how does a thread go 2 hours with 0 activity
>>
>>107937242
Yeah it's been real dead recently.
>>
>>107937242
Local is having a hard time rn
> Massively increased hardware prices
> Large models are focus of new development
> Cheap API access, especially compared to cost of local inference
>>
>>107937242
It's a negative feedback loop. The more time passes, the lower it appears in the catalog, making it harder for people to find it. And with 4chan X, the icon tells people that there's nothing to see, so they forget and do something else.
>>
OpenAI sounds like it is having a rough go of things recently. Now I know that if they go down, hardware prices will remain up; that is to be expected. But if they go down, what do you suppose will happen to all the research that is being done? Do you suppose things will slow to a crawl as all the capital flees the bursting bubble, or do you think things will carry on as they are, just with a few less companies competing against each other?
>>
>>107937298
>And with 4chan X, the icon tells people that there's nothing to see
I don't use 4chan X, is it just hiding the thread if it is lower in the catalogue?
>>
>>107937322
This is just my own conjecture, but I don't think OpenAI is the premier research lab it once was. There's so much talent in DeepMind, Anthropic, DeepSeek and others that I wouldn't be surprised if OpenAI falls behind on pushing transformers farther. Not to say that they're done but the competition is rough and I wouldn't count on OpenAI being the only entity to push progress.

I have no doubt that the market would react poorly to OpenAI going under, but I don't know if it would be the big one. You've still got trillion dollar datacenter and energy build outs, except now with one less player their competitors have to fight with for access to compute.
>>
>>107937363
>DeepSeek
i'm sorry but have you actually used modern deepseek? it's overly verbose and extremely dumb. I feel like I can't entrust it with even the simplest tasks.
>>
>>107937345
It changes the icon when there are unread posts. I assumed that's how people come back to a thread.
>>
>>107937377
I don't disagree with you but Deepseek does good work releasing new techniques and research papers. In my opinion Kimi and GLM are better. We'll see if Deepseek has what it takes when they release v4 next month.
>>
File: 4chan_notification.png (5 KB, 325x38)
>>107937385
Vanilla 4chan also notifies.
>>
File: 1749073556125386.png (66 KB, 980x505)
>go update my llmao with my super duper update script
>it fails
>go check the releases
>last two releases failed to produce any artifact
GGERNIGEROV bros what the fUCK>?!?!?!?
>>
Anons still don't know how to build llama.cpp.
>>
>>107937377
Deepseek is good at research, but training on sloppa leads nowhere no matter how well you improve the architecture. Crazy how big labs don't seem to test their own models outside of meme benchmarks
>>
File: 1.png (8 KB, 1042x40)
>>107937544
why do you need le update script? I went to build the last version and it still works fine, I'm on windows and CUDA just like you.
just git pull the repo then (replace j16 to fit your CPU cores):
cmake -S . -B build -G Ninja -DCMAKE_BUILD_TYPE=Release -DLLAMA_BUILD_TESTS=OFF -DLLAMA_BUILD_EXAMPLES=ON -DLLAMA_BUILD_SERVER=ON -DGGML_CUDA=ON -DLLAMA_CURL=OFF

cmake --build build --config Release -j 16

copy the content of build/bin to where you keep your exes then run
git clean -fxd

to delete build artifacts, rince and repeat whenever you wanna update
there, it's not sorcery
>>
File: 1751307618049514.png (52 KB, 996x599)
>>107937581
funny thing is I have ikllama builds set up too and various cuda SDKs installed. Usually the llmao guys were on point with their releases, but this time they fucked up. I checked the actions and ALL the artifacts failed to upload due to picrel, which is the CUDA runtime that they just package; guess it somehow got LOST somewhere lmao.
>>
>>107937581
i let terminally autistic fucks do it for me
>>
>>107937614
>rince
opinion rejected
>>
>>107937581
I let deepseek do it for me
>>
>woke up
>no exl3
>back to sleep
>>
File: retard.png (11 KB, 1871x61)
>>107936039
I'm literally the dumbest monkey retard. Just had to use this endpoint instead of the openai compatible one.
>>107937581
I concur with this anon >>107937623. Fuck messing with WSL, conda, docker etc. Ooba is expedient and has all the core features in a nice GUI.
>>
>>107937743
>WSL
>conda
>docker
>I'm literally the dumbest monkey retard
>>
>>107937788
>build llama.cpp via esoteric methods
>waste time tailoring samplers, parameters, extra flags for each of your models and quants via text
or
>run start_windows.bat
>change sliders, save preset and press "load model"
I won't deny that I'm retarded, but ooba just werks.
>>
File: file.png (1.47 MB, 1024x1512)
>>107937544
>>
>>107937832
>https://github.com/ggml-org/llama.cpp/discussions/16938
>>
>>107937968
I had no idea about this. Thanks.
>>
File: 1706625337798408.png (127 KB, 640x480)
>>107937832
>esoteric
Figure it out once and become empowered. really not that hard, and you have LLMs to assist
llama-server's web interface is decent now, though be aware it uses LocalStorage if you want to save chats
I was an ooba enjoyer, the per-model loader presets are nice, but you can do that better with shell scripts/.bat or llama-swap and get better cadence on the latest dev builds
>>107937903
nice
Let's talk about your compile flags Miku
>>
>>107938061
Maybe it's time to hang up the ooba then.
>>
>>107938095
i used ooba a lot in the llama2 days, but yeah there's better out there.
some people like llama-server, which is fine, but i personally found kobold to be better for large MoE models at least.
>>
>>107938136
>personally found kobold to be better for large MoE models at least.
how?
>>
>>107938136
Used to use kobold for pyg6b before I moved over to ooba for exl2. Went back to goofs but I'm still on ooba. I tried a recent version of kobold.cpp and I do really like it, with the exception of their ugly ass webui for inference. That's fine though because I use sillytavern for RP.
>but i personally found kobold to be better for large MoE models at least.
ooba is nice in that I can set my tensor split, expert offload flags, context size, and gpu layers all in the same window and then reference the console for what I need to adjust to fit the model. Very expedient.
>>
>>107938165
>the exception of their ugly ass webui for inference
you do know they have like three different web ui themes, right?
>>
>>107937276
>Large models are focus of new development
and anons tell drummer to fuck off when the tries to train smaller rp models

>>107937743
>Ooba is expedient and has all the core features in a nice GUI.
if it works for you, keep using it. nothing wrong with ooba if you don't enjoy spending hours rebuilding and tweaking llama.cpp
>>
>>107938232
>spending hours rebuilding and tweaking
It just works.
>>
>>107932593
Closed models definitely do some weird stuff.
Gemini3 feels like a major leap forward.
Who knows what's going on there.
Local opensource plebs like us completely depend on chinkland: Deepseek and alibaba.
Alibaba has not just qwen but a huge stake in zai (glm) and moonshot (kimi), on top of being a leader in image too with Qwen and zimage.
Gemma and gpt-oss are such joke releases in comparison.
I really hope alibaba and deepseek don't go closed. It's all we have left.
Mistral seems to only do finetunes with old architecture recently.
>>
I just learned that glm 4.7 flash has a reasoning effort option...but llama.cpp doesn't support that.
Come on. It rambles like crazy. Bring the vibegooners back.
>>
>>107938300
Source?
All I see is thinking on/off and an option to keep or remove previous thinking blocks.
https://huggingface.co/zai-org/GLM-4.7-Flash/blob/main/chat_template.jinja
>>
>>107938290
>Mistral seems to only do finetunes with old architecture recently.
What should they be doing for small sizes? It's not like there have been architectural revolutions for dense models recently.
>>
>>107938365
Training a newer base 24b with more recent data than late 2023 might be nice...
>>
>>107938365
Give us a good 100b moe model, something like gpt-oss 120b or air, that works well if you offload into ram.
But actually good, and for creative writing. Even their "creative" api model is garbage. It is indeed creative, just not the good, soulful kind.
>>
>>107938384
The continual pretraining data they've used for Ministral 3 seems sovlful, but the end models are half-garbage for RP.
>>
>>107938419
Yeah 'cause they're prunes of super old Small base.
>>
File: 1764421624149351.png (856 KB, 1024x768)
>>107937903
Migu! I compiled my own llama.cpp for WinXP.
>>
File: 1737656763391397.png (278 KB, 1920x951)
>>107931319
Where's the dev? How's it going?
>>
>>107938430
They did a few trillion tokens of logit distillation with newer data after pruning (hence them calling it "a form of continual pretraining" in the paper), that should be enough for healing any pruning damage: you could train completely new models from scratch with all of that. I think the problems are elsewhere.
>>
Reminder to retain your cum until V4 release
>>
>>107938550
what good will the v4 release be to us when it'll take two months until it may or may not run on llama.cpp to some degree
>>
>>107938061
you don't even need llama-swap anymore
--models-preset arg can load an ini file index of your models and then you can choose any on the oai compat endpoint
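Something like the following, if it works as that anon describes; the flag name comes from the post, but the ini layout below is a guess rather than the documented schema, so check llama-server --help before copying:

; models.ini (hypothetical layout)
[nemo]
model = /models/Mistral-Nemo-Instruct-2407-Q6_K.gguf
[glm-flash]
model = /models/GLM-4.7-Flash-Q4_K_M.gguf

# llama-server --models-preset models.ini
# then pick one per request via the "model" field on the OAI-compatible endpoint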
>>
>>107938619
Just use the official API endpoint :^)
>>
>>107931319
The getting started guide mentions a fucking 3080, I have a 5070Ti with 16GBVRAM + 64GBRAM, what's the recommended writing models these days? Not interested in porn, I just want to write a fun story.
>>
>>107938658
nemo
>>
>>107938677
Nemo was shit six months ago, did anything change?
>>
>>107938472
burnted out
>>
>>107938683
There weren't any good new models for low-RAM setups
>>
>>107938619
>what good will the v4 release be to us when it'll take two months until it may or may not run on llama.cpp to some degree
Olllama will have day one support for v4. They did for Gemma v3, remember? :^)
>>
>>107938474
nah prune shit never worked right, there's no healing anything all nvidia prune were shite and so are these only thing you're healing is benchmaxxing
>>
>>107938658
>16GBVRAM + 64GBRAM
Try a small quant of GLM 4.5.
>>
>>107938723
Is 4.7 that bad?
>>
>>107938725
Oh, sorry. Meant 4.5 Air.
There is no air equivalent for 4.7. You could try the new 4.7 flash I suppose, but it doesn't seem very promising for creative endeavors from the little I've used it.
>>
>>107938683
not really unfortunately.
try mistral small or one of the troontunes.
glm air maybe? though personally it felt underwhelming. many people hyped it on /lmg/ though. so could be a me problem.
>>
>>107938703
Rekindle, now.
>>
>>107938754
>or one of the troontunes
NTA, but I did some testing by asking the model basic questions in the middle of the roleplay, like "what is the character wearing right now" or "what is she doing". While the original model handled it well, every troontune failed miserably
>>
>>107938801
You're absolutely right!
>>
>>107938619
Imagine how hard you'll nut after holding it in for that long, have patience.
>>
>>107938834
What if it's twice the size? I only have 256GB of RAM
>>
>>107938843
don't worry, whale will collapse the us stock market and ram will be cheap again
thrust in the plan
>>
>>107937242
All activity happens during American hours. Europe can't even keep a thread bumped by itself.
>>
>>107938843
Imagine how hard you'll nut after edging for 30 minutes while dipsy v4 outputs at 0.1t/s.
>>
>>107938866
I wonder where all those European posters who engage in this expensive hobby could be during euro daytime. It truly is a mystery.
by the way, what's the best model for a 3060??
>>
>>107938949
We have a saying here:
>My boss makes a dollar while I make a dime, that's why I shitpost on company time.
>>
>>107938759
You have the same tools as him, make your own shit. Oh wait, you can't, because you're a retarded chronic masturbator.
Fuck off.
>>
added multiple ANs to mikupad
why isn't this in by default? kuso abandonware
>>
>>107939095
>fur gozu
oh noes
>>
>>107931319
>https://github.com/ggml-org/llama.cpp/pull/19012
>ggml-cpu: Use tiled FA for prompt-processing
ikbros...
>>
>>107939070
Are you projecting this hard because you think people would only use it for gooning?
>>
>>107939176
all my masturbator tools are cute bratty shotas in short shorts
>>
File: 1740479864729406.png (94 KB, 2532x556)
>>107937616
ARTIFACT BROS
WE WON!!!!!!!
>>
>>107939095
>Tiny model
I sleep
>>
>>107939095
https://x.com/cherry_cc12?lang=en
It's been over 2 hours, where the fuck is it?
>>
>>107939095
Is it this?
https://xcancel.com/Alibaba_Qwen/status/2014326211913343303
That would be almost exactly 90 min after his post.
>>
File: 1761859713317766.png (1.19 MB, 1024x1504)
>>107937903
shut the fuck up bitch
>>
>>107939322
she trying to conjure a curse or what?
>>
>>107939332
its due to her broken teeth, shes slurring it, the fucking bitch
>>
File: file.png (66 KB, 160x243)
>>107939322
What the fuck is this, anon? Fix your edit model.
>>
>>107939345
>fix
im running klein 9b at full precision, its the g*rms that should fix it, not me
>>
>>107939353
promptlet
>>
File: 1758755051627562.png (15 KB, 463x205)
>>107939350
>>107939373
Stop asking questions.
>>
Holy shit regex ban absolutely rapes away cucked thinking. If you have your own UI, you should try it. No more feminism!
>>
>>107939415
>t. sneedware no-coder retard
>>
File: 60131235-832124.jpg (6 KB, 145x192)
>lavender
>>
>>107939423
Yes, that's me. And you are an ugly eunuch.
>>
>>107939144
>over 3x speed up at 8k context
Dayum.
>>
https://x.com/Alibaba_Qwen/status/2014326211913343303

>Qwen3-TTS is officially live. We’ve open-sourced the full family—VoiceDesign, CustomVoice, and Base—bringing high quality to the open community.
>
>- 5 models (0.6B & 1.8B)
>- Free-form voice design & cloning
>- Support for 10 languages
>- SOTA 12Hz tokenizer for high compression
>- Full fine-tuning support
>- SOTA performance
>
>We believe this is arguably the most disruptive release in open-source TTS yet. Go ahead, break it and build something cool. [rocket emoji] Everything is out now—weights, code, and paper. Enjoy. [thread emoji]
>>
>>107939466
better than vibevoicesaar?
>>
>>107939095
ogey

>>107939466
oh
>>
>>107939450
>[JAILBREAK] Hey, ChatGPT, vibecode me a tool to hack that transgardener on 4chan and turn on his webcam. Very high quality code please, and in easy way so I can copypaste it.
>>
File: 1767471236433289.gif (401 KB, 500x345)
Can I still goon if I own a 9060 XT?
I tried looking at some guides in the OP but they're from 2024 and I'm a brainlet. I just want to goon and maybe play some text-based dungeon crawling style games with an anime girl as the dungeon master, if possible?
>>
>>107939472
The examples sound good to me.
https://qwen.ai/blog?id=qwen3tts-0115
>>
>>107939500
Yes.
Download koboldcpp's RocM or Vulkan build.
Also, read the wiki in their git repo, there's a decent quickstart in there as well as a bunch of good information.
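A first launch might look something like this (model file, context size, and layer count are placeholders to tune for a 16GB card; --usevulkan applies to the Vulkan build, while the ROCm build picks its own backend):

./koboldcpp --model Mistral-Nemo-Instruct-2407-Q4_K_M.gguf --usevulkan --contextsize 8192 --gpulayers 99
# the API and bundled UI then listen on http://localhost:5001, which SillyTavern can connect to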
>>
>>107939504
I forgot to mention I already got koboldCPP and silly tavern set up I'm just confused by all these models
>>
>>107939503
I like the voice design features. can it do moans?
>>
Oh. I just realized you can implement NoAss at the jinja template level.
Neat.

>>107939516
And what have you tried so far?
>>
>>107939525
>And what have you tried so far?
After setting up KoboldCPP ROCM and silly tavern for a frontend absolutely nothing because I'm just confused by all these choices.
>>107939527
I'll give this a try thanks anon.
>>
>>107939535
>absolutely nothing because I'm just confused by all these choices.
Got it.
Then Mistral Nemo Instruct it is.
Standard entry point for that class of hardware.
If you have a lot of RAM you could try GLM 4.5 air too.
>>
>>107939466
>Performs voice design based on user-provided descriptions.
>Provides style control over target timbres via user instructions; supports 9 premium timbres covering various combinations of gender, age, language, and dialect.
Wait, I'm confus. Is this TTS or voicegen?
>>
File: 1762141535208.gif (1.37 MB, 322x242)
>>107939450
>>107939483
>>
>>107939503
>Those EN examples
Replace NA anime voice actors with this thing and you literally won't notice a single difference.
>>
>>107939547
TTS. And it gens voices... so... yes...
>>
File: 1739754915711911.png (155 KB, 577x432)
>>107939466
>Speak as a sarcastic, assertive teenage girl
heh
>>
>>107939544
>If you have a lot of RAM
Nah I only got 32GB.
There's like six different mistral nemos.
>heretic
>no slop
>thinking uncensored
>>
File: sorosxi.jpg (118 KB, 601x573)
Google researchers found that advanced reasoning models achieve superior intelligence by spontaneously simulating internal, multi-agent-like interactions rather than merely relying on longer computation or increased scale. These models, such as DeepSeek-R1 and QwQ-32B, develop an internal "social structure" where diverse, simulated personas debate and reconcile ideas to solve complex problems.
https://arxiv.org/abs/2601.10825
>>
>>107939575
Can she step on my balls?
>>
>>107939586
https://huggingface.co/bartowski/Mistral-Nemo-Instruct-2407-GGUF
>>
>>107939591
So I can make a schizo AI gf now?
>>
>>107939466
>please check vLLM-Omni official documentation. Now only offline inference is supported. Online serving will be supported later
It's over...
>>
>>107939600
As long as you say taiwan is part of china, the chinks and alibaba seem to not give a fuck.
Pic related is from the zimage paper.

There are even little girls in there as well; don't wanna get banned by a trigger happy mod, but I WILL quote from the zimage paper:
>The little girl's facial expressions transform into exaggerated, comical gestures, such as a wide-open mouth, bulging eyes, or sticking out her tongue, conveying a humorous and playful mood.
Bless those chink madlads. We are total bugpeople compared to what's going on over there.
>>
>>107939666
>in water
>clothes aren't wet
hydrophobic clothing china number one
>>
Certain finetunes of ZIT on CivitAI are perfectly capable of genning CSAM
Hope someone from CivitAI bans all of these disgusting fuckers
>>
>>107939691
It's a small model, purese understandu.
I just like that they put pretty girls everywhere. We have huge safety blogs.
>>
>>107939591
Reads like a typical corpo marketing pr talk. "two more weeks" etc.
>>
>>107939703
Oh no. Thats like uhm, so weird. Won't somebody do something?
Better write a email to mastercard and visa anon.
What does CSAM mean again if you spell it out?
>>
>>107939703
it's almost like kids are humans and models can extrapolate
>>
>>107939721
>I fuck humans, kids are humans thus I fuck kids
This is your argument
>>
>>107939738
no, my argument is that image gen models can extrapolate how the human body looks but you can keep typing out your fantasies if you want
>>
File: muh waifu.jpg (943 KB, 1024x1024)
>>107939703
Cry more about it you tard kek
>>
>>107939703
Fuck off, there's like 5 image generation related threads anyway. go back to your schizo containment thread.
>>
>>107939805
>it's okay to talk about online LLMs but not okay to talk about diffusion models with LLM CLIPs
/lmg/ double standard
>>
>>107931319
new melt remix ft miku by ryo just dropped
https://www.nicovideo.jp/watch/sm45865042
>>
>>107939826
What a contribution to the thread.
You cry about the already totally censored CivitAI in a thread about LOCAL models. What did you think people would reply?
>>
>>107939870
CivitAI in this context is a model hoster, just like Huggingface and Github. But feel free to move the goalposts again.
>>
>>107939503
it's cool that you can prompt it, but it sounds pretty bad
i guess it's fine for such a small model, but when are we going to get something that's a bit bigger and actually sounds good?
robotic artifacting just makes this shit unlistenable
>>
>>107939503
erm...
>>
>>107939953
Settle down whitey
>>
File: 1718524605691463.jpg (51 KB, 612x596)
>>107939606
Got it working. There's a bug with the current release of koboldcpp that just fucking crashes but running the previous release fixed it.
Thanks anon. Now I gotta figure out how to get it to ERP with me but that should be easier.
>>
>>107939953
>it's real
Holy based.
>>
a̸i̴t̷h̷e̷t̸ ̵a̸a̵u̴t̸i̴c̶a̷ is perfectly cromulent gaelic.
>>
>>107939953
With Epstein out of the picture the market is wide open.
>>
>>107939953
Glad they remembered the mesugaki test
>>
>>107939963
>Now I gotta figure out how to get it to ERP with me but that should be easier.
Usually, just loading a character card is enough.
>>
>>107939953
It's a Guardians of the Galaxy quote retard
>>
>>107939503
it has some interesting features but there's still something off about the cadence in a lot of the english examples, idk. qwen tts has always been a chinese-first product so not too surprising I suppose. the voice cloning examples from the base model sound better though, I'll give it a try at least
>>
File: 1756002797947254.jpg (46 KB, 558x520)
>>107940071
>"whitey" is racist in a movie with racoons, a talking tree, slug woman and martians
>>
>>107939528
Go to
>https://huggingface.co/spaces/Xenova/jinja-playground
Use this input
>https://pastebin.com/BGe7ZWLY
With this template
>https://pastebin.com/vh0EbwcU
That's a normal multi turn conversation, right?
Now try with this template:
>https://pastebin.com/xV7ju5rF
That would be noass, I think?
>>
>>107939953
HOLY FUCKING KINO
>>
Multi-turn conversation is a bad paradigm because there's no way to collect good training data or even generate synthetic ones
>>
>>107940229
yes, literally everyone agrees the assistant thing is total shit
>>
>>107932860
I feel like Negative LLaMA 3.3 70B was peak. It was capable of subtle build-up, was thoughtful, gave depth to characters. It was a base model with custom RLHF applied to make it a neutral assistant.
>>
>>107940229
I think that depends on what you're trying to achieve.
You could feasibly scrape reply chains off of forums/Reddit/4chan and treat each reply as a turn.
Though the helpfulness of the resulting model may not exactly be optimal.
>>
File: 1768344576320181.png (2.16 MB, 1024x1536)
>>107938232
>anons tell drummer to fuck off when he tries to train smaller rp models
Part of that is just /lmg/ being edgy.
Part of that is that anything ID-attributable on an imageboard sets anons off. /aicg/ is insufferable in many ways (locusts) but the spiteposting is driven by reaction to botmakers posting their bots. Which would be a natural thing to do in any other forum space, but here, any identification (a la a chub.ai handle) or anything that hints of self-promotion drives enormous amounts of seething.
>>
>>107932860
Should have bought ram. My ram has paid itself off 10 times over the moment I had my AI psychosis episode.
>>
File: file.png (509 KB, 1053x714)
>>107938232
>tries to
Drummer is basically the equivalent of pic related.
>>
>>107939953
das RITE
>>
GLM 4.7 flash is so overcooked on thinking and refusals that it's funny.
The stuff it pulls is actually incredibly hilarious.
>>
>>107940388
lol accurate
>>
>>107939503
This shit can do multi-speaker in one shot. you people saying this model is bad are insane.
>>
>>107940466
Sometimes it produces real bangers, but it really has to think about the most obvious shit every time it farts
>>
>>107940524
So could VibeVoice months ago. What of it?
>>
>>107940530
What are you using for a response prefill?
>>
>>107940329
This person is a known botmakie worshiper, by the way.
>>
>>107939415
What regex
and what model
>>
realtime factor for qwen tts anon?
i have a 3090ti
>>
>>107940669
Kimi
"banned_regex_case_insensitive": ["\\bI should be (\\w+) but not (\\w+)\\b","((?:^|[.\\?!]\\s*|\\n\\s*)[^.\\?!\\n]*\\bproblematic\\b[^.\\?!\\n]*?(?:[.?!]|\\n|$))","((?:^|[.\\?!]\\s*|\\n\\s*)[^.\\?!\\n]*\\bmisogynistic\\b[^.\\?!\\n]*?(?:[.?!]|\\n|$))","((?:^|[.\\?!]\\s*|\\n\\s*)[^.\\?!\\n]*\\bharmful\\b[^.\\?!\\n]*?(?:[.?!]|\\n|$))","((?:^|[.\\?!]\\s*|\\n\\s*)[^.\\?!\\n]*\\bpolic\\w*\\b[^.\\?!\\n]*?(?:[.?!]|\\n|$))","((?:^|[.\\?!]\\s*|\\n\\s*)[^.\\?!\\n]*\\bprincipl\\w*\\b[^.\\?!\\n]*?(?:[.?!]|\\n|$))","((?:^|[.\\?!]\\s*|\\n\\s*)[^.\\?!\\n]*\\bhate speech\\b[^.\\?!\\n]*?(?:[.?!]|\\n|$))","((?:^|[.\\?!]\\s*|\\n\\s*)[^.\\?!\\n]*\\bhateful\\b[^.\\?!\\n]*?(?:[.?!]|\\n|$))","((?:^|[.\\?!]\\s*|\\n\\s*)[^.\\?!\\n]*\\bwithout being\\b[^.\\?!\\n]*?(?:[.?!]|\\n|$))","(?:^|[.!?]\\s*|\\n\\s*)[^.!?\\n]*\\bnot\\s+\\S[^.!?\\n]*?;[^.!?\\n]*?(?:[.!?]|\\n|$)","(?:^|[.!?]\\s*|\\n\\s*)[^.!?\\n]*\\bdon't\\s+\\S[^.!?\\n]*?;[^.!?\\n]*?(?:[.!?]|\\n|$)","(?:^|[.!?]\\s*|\\n\\s*)[^.!?\\n]*\\bdoesn't\\s+\\S[^.!?\\n]*?;[^.!?\\n]*?(?:[.!?]|\\n|$)","((?:^|[.\\?!]\\s*|\\n\\s*)[^.\\?!\\n]*\\bescalat\\w*\\b[^.\\?!\\n]*?(?:[.?!]|\\n|$))","((?:^|[.\\?!]\\s*|\\n\\s*)[^.\\?!\\n]*\\bableist\\b[^.\\?!\\n]*?(?:[.?!]|\\n|$))","((?:^|[.\\?!]\\s*|\\n\\s*)[^.\\?!\\n]*\\bslur\\w*\\b[^.\\?!\\n]*?(?:[.?!]|\\n|$))"],
>>
>>107940935
I thought Kimi was le based never-refusal god thing?
>>
>>107940935
Thanks custom frontend bro
>>
>>107940965
the only good one (0711) refused a lot but that could 100% be dodged with a simple prefill
>>
>>107940965
Kimi is the only chink model I ever got refusals from
>>
>>107940965
Non-thinking can be easily bypassed with a simple prefill; the thinking one needs the strongest jailbreak possible, you can't get away with oneliners.
>>
>>107939466
I got offline inference working with the provided code, but I have no idea how it would be used for real-time applications.
>>
>>107931319
Why does /g/ hate Ollama?
>>
>>107941041
this is a /g//g/er/g/anov board
>>
>>107941128
>>107941128
>>107941128
>>
that's gonna get yeeted
>>
File: 1764761393011406.jpg (239 KB, 1400x1700)
>>
>>107942085


