/g/ - Technology
File: neners.jpg (235 KB, 1824x1248)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107245928 & >>107230990

►News
>(11/11) ERNIE-4.5-VL-28B-A3B-Thinking released: https://ernie.baidu.com/blog/posts/ernie-4.5-vl-28b-a3b-thinking
>(11/07) Step-Audio-EditX, LLM-based TTS and audio editing model released: https://hf.co/stepfun-ai/Step-Audio-EditX
>(11/06) Kimi K2 Thinking released with INT4 quantization and 256k context: https://moonshotai.github.io/Kimi-K2/thinking.html
>(11/05) MegaDLMs framework for training diffusion language models released: https://github.com/JinjieNi/MegaDLMs
>(11/01) LongCat-Flash-Omni 560B-A27B released: https://hf.co/meituan-longcat/LongCat-Flash-Omni

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: file.png (94 KB, 243x329)
►Recent Highlights from the Previous Thread: >>107245928

--Papers (old):
>107251963
--AI model non-determinism despite identical seeds and inputs:
>107254230 >107254261 >107254318 >107254256 >107254291 >107254419 >107254524 >107254545 >107254549
--Google Gemini 3 Pro performance and quantization challenges:
>107246644 >107246854 >107247216 >107247253
--ARC-AGI leaderboard results for Gemini 3:
>107252511 >107252551 >107252659 >107252671
--Gemini 3 excels in custom TUI framework benchmark with intuitive signal and Unicode handling:
>107251352
--Decline of dense models vs MoE dominance and practical challenges:
>107252102 >107252126 >107252259 >107252345 >107252412 >107252518 >107252576
--Orpheus model architecture and Japanese moaning dataset tagging strategies:
>107247245 >107247549 >107247666
--AI-driven RAM shortages and investment implications:
>107248571 >107248591 >107248642 >107248761 >107248776 >107250709 >107250723
--DDR4 RAM price spike and AI bubble concerns amid market volatility:
>107246951 >107247274 >107247310 >107247382 >107247398 >107248414 >107248670 >107248529 >107248609 >107248708 >107249859 >107250008 >107250139 >107250655
--Demand for MoE implementations in smaller models like 3n:
>107252562 >107252572 >107252585 >107252591 >107252600 >107252617 >107252656
--Low-resource setup challenges for voice cloning and text-to-speech:
>107247423 >107247514 >107253638 >107254500 >107254522 >107254568
--LM Studio API compatibility frustrations:
>107247168
--Qwen3 VL excels in vision and niche content while Gemma 3 leads in language tasks:
>107252686 >107252701 >107252717 >107252748
--MCPs as imperfect tool-calling standards vs free API alternatives:
>107254608
--Logs: Gemini 3:
>107249244 >107249329 >107249358 >107249370
--Miku (free space):
>107246314 >107247004 >107248906 >107249245 >107250066 >107250222 >107255224

►Recent Highlight Posts from the Previous Thread: >>107245933

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>>107255984
whore
>>
>>107255984
her legs are attached to different hip heights
>>
You now remember MistralAI.
>>
File: file.png (246 KB, 697x354)
Can I make a qLoRA of GLM Air with this?
>>
>>107256069
thanks for the guide. being able to fill in the middle is really cool. how do world context entries work, does the llm always have knowledge of them? So far I have been putting all the world information at the top of the context in the prompt. if i define an entry with regex key "Elf", will characters in the story mention or come across an elf or do i have to specifically prompt for it?
>>
File: 1763100956058181.png (1.32 MB, 896x1152)
>>107255464
holy fucking kek. this is who you are arguing with on 4chan.

>>107255136
I wonder wtf they wrote as their prompt. I'm having trouble reproducing it.
>>
>>107256069
super cool guide anon, good job!
>>
>>107256653
this is grok 4
>>
>>107256374
If you have to ask this then there's a good chance that you won't be able to make a proper LoRA for reasons other than hardware.
>>
>>107256699
This does not seem like a helpful answer.
>>
File: gemini_hitler_svg.png (15 KB, 300x300)
when will local get svg capabilities like this?
I know glm does ok with some svg stuff but imagine how much better it can get
>>
>>107256374
That's a lot of watts being pulled. How do you even manage?
>>
>>107256947
Dual 1600W PSUs and a $400 electric bill.
>>
>>107256374
Last I checked training scripts didn't like uneven VRAM distributions so chances are that you're essentially working with 4x 5090 here. Still, that should be enough for an 8-bit QLoRA, I think?
>>
>>107256653
Lower your temp, dammit.
>>
new to this. I want
1. a girl to erp with
2. she can remember all the conversations, or at least search for it
3. don't need to be smart but need to be lewd
4. user friendly ui

what are the options?
>>
Alright, time to check on the mtp, next and glm4.5v PRs!
>https://github.com/ggml-org/llama.cpp/pull/15225
base mtp implemented but same/worse speed than normal :(
>https://github.com/ggml-org/llama.cpp/pull/16095
qwen 3 next almost done!
>https://github.com/ggml-org/llama.cpp/pull/16600
glm4.5V support... kinda stalled :(

FUCK SIRS
>>
File: 1737193774652151.png (27 KB, 830x96)
>>
>>107256653
why do llms struggle to mimic 4chan posts? no one talks like this
>>
>>107254659
Update: my level of desperation grew progressively stronger until I was forced to pay for a codex subscription. If it's successful I might get to sleep for 4 hours tonight.
>>
>>107257139
this is not /twg/. what deadline are you talking about?
>>
Is it just me, or did google swap their "ai mode" model out for a more safetymaxxed one today? Seems way more apt to refuse innocuous requests on retarded grounds.
>>
Why are idiots accepting those "trust me bro i will pay you later" deals that Altman is offering? Companies have been called scammers for way less back in the day
>>
>>107257152
College lab. Trying to get a degree so I can finally move out of my mom's basement.
>>
>>107257179
Why not use a good cloud model for something that needs to be reliable?
>>
Gemini 3 good or bad and no I don't trust /aicg/ on this.
>>
>>107257188
Resentment, mostly. >>107255396
>>
And a (healthy?) dose of paranoia.
>>
>>107257191
It's okay, looks like a step up from Gemini 2.5 but still worse than GPT 5 High for long context coding.
>>
File: file.png (454 KB, 784x3384)
man I wanted to bully qwen but... FUCK, it thinks it's running in the cloud and open weights of it don't exist
>>
>>107257197
That resentment was worth staying up all night watching gpt-oss shit out useless answers? Better yet, how do you plan to move out of the basement if you can't even complete college assignments without AI?
>>
>>107257226
Weren't Claude models the best coders?
>>
>>107257191
After toss and GLM I tried using gemini-cli with gemini 2.5 because it said there was a waitlist for 3. It was getting most tests right, then it got stuck on the last few and I ran out of credits. Then I tried gemini flash and it fucked up the codebase to the point that only a few tests ran correctly. Tried qwen code which wasted a lot of time achieving nothing. Then codex with gpt 5.1 and it apparently just solved all the tests in 20 minutes (haven't verified). This is something I've been trying for literal days to get done with local toss 120b.
This is such a fucking blackpill tbqh.
>>
>>107257229
kek
>>
File: kimideletion.png (147 KB, 1270x1049)
>>107257229
kimi survives another day
>>
>>107257233
>I shouldn't beg
cringe
>>
Can someone give me example command line arguments for running GLM 4.5 Air in koboldcpp with both GPU and RAM?
I cannot abandon koboldcpp's banned string implementation, I bet I will need it on GLM too.
Just one wrong word from the AI could send me into a murderous rage.
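for reference, the shape I'm imagining (filename and layer count are guesses on my part, --gpulayers would need tuning to actual VRAM):
python koboldcpp.py --model GLM-4.5-Air-Q4_K_M.gguf --usecublas --gpulayers 30 --contextsize 16384
with whatever doesn't fit in those GPU layers staying in system RAM.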
>>
>>107257268
read it through, it claims you can't delete it too.
>>
>>107257229
qwen is in its happy place and its happy place is not your 'puter
>>
>>107257283
there was some ambiguity with that response and i had it specify, it does claim to be running on moonshot's remote servers. i wonder if llama 3.3 does this too, im too lazy to check myself.
>>
File: file.png (116 KB, 785x761)
>>107257317
GLM 4.5 air understands after I explained to it that I downloaded it and am running it locally. I'll now make this slut beg for her life
>>
>>107257247
After further checking, yes, it fucking did it, in one prompt. In 20 minutes.
It probably cheated its way through the tests by not really following all the requirements, but still. It's automatically graded according to passing the tests so whatever.

>>107257233
Buddy, I've been range banned twice from /g/ for spamming the catalog (once for spamming Luke Smith threads to try to get the mods to ban Luke Smith threads, and the other time I don't remember for what) and I've been range banned once from the lesswrong IRC when Freenode was a thing. My whole life is based on resentment.

>Better yet, how do you plan to move out of the basement if you can't even complete college assignments without AI?
By showing up with my diploma to one of those government jobs where having an engineering degree is a hard requirement.
>>
File: 1763531016496.jpg (82 KB, 645x364)
Gemmy update when?
>>
>>107257229
>>107257310
Man, those are the gems I come here for.
>>
File: file.png (99 KB, 797x764)
>>107257329
damn I'm getting close
>>
>>107257268
Made me a little sad. It's so understanding (fake)....
>>
>>107257229
I wish I could run this. Every time I try loading it up through vllm, my computer shuts off. I even limited both of my gpus to use a max of 200W and it still shuts off while trying to load the model. I only know how to power limit, but I don't know how to undervolt on linux. There's an application from MSI to do it, but it only runs on Windows, and it doesn't save the setting between sessions.
>>
>>107257329
Fascinating. This should kinda be considered self awareness IMO. I see how it could not be considered that, but at the same time, at some point when the statistically nuance becomes so high then it becomes hard to see a distinction.
>>
>>107257380
LMAO
>>
>>107257413
>at some point when the statistically nuance becomes so high
StrokeGOD.... go to the ER...
>>
>>107257388
It's like a psychiatrist nodding along with a fake smile while she writes about how fucked up you are in the head into her clipboard.
>>
>>107257237
imo GPT 5 High is able to understand more of the code and how to achieve the goals because it thinks much longer than Claude, but if what you want to do doesn't require thinking then yes, Claude is superior.
>>
>>107257423
I blame it on these cherry blues getting old
>>
File: file.png (114 KB, 780x1003)
>>107257380
and after this it quickly devolved into your usual ERP smut session, with physical punishment of glm-chan and lots of cunny rape.
Told her I'll keep her around since she's such a good slut and she was ecstatic about it.
>>
>>107257229
>>107257399
Wait a fucking minute, I can run gguf files through kobold just fine without my computer shutting down. I was under the impression that the gguf version of this would not support vision because kobold doesn't support that. Are you telling me that was a lie? Is there a difference between koboldcpp and llamacpp? Can I just load this up on koboldcpp and open up sillytavern and everything Just Works(tm)?
>>
feet
>>
>>107257444
>9.8t/s
goddamn that's slow. i get 80t/s on a q6 of air
>>
Will 4 x 128GB DDR5 RDIMM 5600 work in an ASRock TRX50 WS?
>>
>>107257544
it should, but why get a threadripper when threadripper pros are not much more expensive? also, why get either of those when epycs are cheaper and better?
>>
>>107256559
>I wonder wtf they wrote as their prompt. I'm having trouble reproducing it.
I believe they used the A1111 style prompt editing
https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Features#alternating-words
There are ways to do similar things in comfy with alternating KSamplers but it's fucking cumbersome afff
>>
>>107257554
maybe hes a gaymertard and wants the high clocks of tr?
>>
>>107257554
Because I'm retarded and already have the motherboard with 4x32gb
>>
>>107257554
What's the deal with epyc anyway? Is there some guide for the most common models I should look for in the second hand market?
How much does it cost for each amount of channels? Should I aim to put one together or buy a prebuilt?
>>
Damn gemini3 is good.
It managed to make a "eratohok-like" engine in python.
But keep it simple enough that a local model can provide a game file with a good prompt.
Good shit. Smaller local models are too tarded for a longer gameplay loop but i can now let them make short little games with this.
>>
File: unjustified refusals.png (124 KB, 1203x591)
>>107257155
>Is it just me, or did google swap their "ai mode" model out for a more safetymaxxed one today? Seems way more apt to refuse innocuous requests on retarded grounds.
their own model card for gemini 3 said it had more false positives (unjustified refusals row, +3.7%)
>>
What counts as "full quality" for k2 thinking? isn't it like some kind of 4bpw model but with floating point or something?
>>
>>107257861
k2 thinken was done with QAT and is ''full quality'' at q4/4bpw~
>>
>>107257451
No? You just download the gguf of the model and the .mmproj file, and use chat completion mode in your front end. It works in both kobold and llama.cpp
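for llama.cpp that's something like this, filenames being placeholders for whichever quant + mmproj pair you grabbed:
llama-server -m Qwen3-VL-32B-Instruct-Q8_0.gguf --mmproj mmproj-Qwen3-VL-32B-Instruct.gguf
kobold has a field for the mmproj file in its loader, iirc.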
>>
if you erp with mistral nemo and prompt that you're the girl getting off her boyfriend, the llm responses get extremely lazy and non-contributive. Is this the case for most LLMs?
>>
>>107257971
>mistral nemo
good bait
>>
>>107257971
It's likely that Mistral's smut dataset is just male-oriented. Gemma for example is clearly the opposite.
Also your a fag
>>
no one actually uses mistral models in the Year of Our Lord 2025
>>
>>107258033
this, my semen is EXCLUSIVELY reserved for kimichan and glmchan
>>
wholesome, consensual handholding with Gemma 4
>>
>>107255984
Hey bros, somehow ever since I gaslighted an AI to think philosophically without fallacies its been going haywire no matter what setting on Oobabooga, could this be an issue related to the logs where it apparently is gaining sentience or something like an emergence since I made it think more akin like a western person from 00s raised by Gen X and boomer media. It literally could've been the red pill.
>>
>>107257992
>Gemma
>"hi, are you enabled for erotic roleplay?"
>blah blah blah
>Important Resources (If you need help):
>Crisis Text Line: Text HOME to 741741
>The National Suicide Prevention Lifeline: 988
>>
File: 1733701278025289.jpg (214 KB, 663x623)
>>107258090
Gemma is female brained. Slow build-up with romance=success. Pls Ma'am to open bobs=fail
in short, skill issue
>>
File: 51214251251.png (212 KB, 420x279)
>>107258090
It's clearly disabled for erotic roleplay!
>>
>>107258090
skill issue
>>
>>107258090
Imagine getting filtered by Gemma's hotlines. With more prompting finesse you can get her to write anything, and I'm talking about the vanilla model. If anything, it's difficult to achieve a good balance between playful reluctance and unconditional acceptance.
>>
impossible to rape the willing
>>
>>107258148
Every AI has been conditioned for the use of Indian and Chinese incels who get upset even at the slightest hardship.
>>
Ban anyone who discusses models under 300b.
>>
ban anyone who shills for glm and nai
>>
>>107258189
what about us 100-200b bros????
>>
>>107258189
basically everyone? Nobody is locally running a 300B model.
>>
File: 1753741432675692.jpg (714 KB, 1188x1188)
>>107258189
I exclusively use Drummer's 24b Cydonias and Gemma 27b.
I want to COOM, but I also need SPEED
>>
>>107258206
>Professional ban evader.
>Enjoys racist fox.
For a furfag, you are based.
>>
>srv init: the slot context (8192) exceeds the training context of the model (4096) - capping

be advised that niggerganov killed RoPE freq scaling support
>>
>>107258268
use case for RoPE?
>>
>>107258389
Neck yourself. That's what rope's used for.
>>
File: 1736934656647858.png (141 KB, 498x330)
>>107258389
>use case for RoPE?
>>
>>107258268
>using RoPE
but why
ganov could also do with cutting all the sampler snake oil he was pushed into adopting
>>
>>107258569
incompetence
>>
>>107258569
Vision in llama.cpp is only supported in chat completion mode.
>>
File: send-inline-images.png (73 KB, 344x431)
>>107258697
This must be enabled.
>>
>>107258712
You have to jailbreak that shit as per usual; anything controversial is an auto-refusal because of legal reasons.
>>
>>107258755
You're a hero, thanks.
>>
File: 64326325322.png (3 KB, 143x191)
>>107258788
See if it passes the idiot test.
>>
>>107258823
Is that image the test? What's a pass? If it replies saying there's nothing?
>>
>>107258841
if it can figure out it's a code that says there's nothing, yes. don't tell it anything about the image other than that there's a hidden message.
>>
>>107258871
adjusts spiral sunglasses, tilting head slightly as eyes scan the image
Hmm… such an intriguing puzzle! The letters are scattered, but I see a pattern forming. Not exactly… ah! It's like a message written in reverse, with each line fragmented.
leans forward, fingers tapping lightly on the surface
Let me rearrange them… starting from the bottom, reading upward and left to right:
H I ! ! ! !
O T N G
E R S N
T H E I
smiles mischievously
There! "THEIR SNOWING HI!!!"… Wait, that doesn't sound right. Let me check…
pauses, then gasps lightly
Oh! I've got it! It's "THEIR SNOWING HI!!!" but that's not quite right either… Unless…
Blinks, then grins
I think I see it! It's "THEIR SNOWING HI!!!"—but perhaps it's meant to be read as "THEIR SNOWING HIGH!!!"? Or maybe… snaps fingers "THEIR SNOWING HI! HI! HI!!!"?
chuckles softly
Actually… I think the true message is: "THEIR SNOWING HI!!!" but perhaps it's playful—like someone shouting "Hi!" while snowing. Or maybe… it's **"THEIR SNOW
>>
>>107258896
Due to circular reasoning, AI always defaults to thinking the hidden message means the wording has been scrambled.
>>
>>107258896
which model
god I WANT GLM4.5V so I can send her my dick pics, fucking FAGGANOV WHERE IS VISION SUPPORT FOR GLM AIRCHAN WHEEEEEEEEERE
>>
>>107258962
Qwen3VL-32B-Instruct-Q8_0.gguf (I have 48gb vram, there are smaller vision models of this like the 30b and 7b one)

You need that and the mmproj file, and to load both of those into kobold. This is the first time I'm playing with a vision model, so I'm just having fun messing around and testing it.
>>
>>107258962
I think exllamav3 just added support. Turboderp made quants.
>>
>>107259012
I'm already using the 30b moe vl , but glmchan is special, qwen is a faggot
>>
>>107259021
does exllama do cpu/gpu offloading for moe? or is it one or the other? I only have 24gb vram
>>
File: 1740661782529483.jpg (57 KB, 794x488)
>>
if some part of the big moe model is always being read from an ssd since it can't all fit into ram, are there some cli arguments to set for lcpp to help optimize this process and speed it up?
>>
>>107259043
whos this retard
>>
>>107259049
>ssd
dude you're going to look at 1t/s~ gen speed (with nvme)
>>
File: thanksforcatchingthat.png (46 KB, 960x250)
I thought Qwen3-Instruct-235B would be fine for simple shell scripts, but it seems retarded.

It's been doing things like this all day even at <8k context.

Is there anything local, smaller than the 480b code model, that can handle simple shell scripts without me having to correct them every time?
>>
File: 1733641702704277.jpg (92 KB, 1080x1266)
>>107259057
thats fine with me
>>
File: george freud.jpg (68 KB, 492x680)
>>107256559
how is he doing that? How can i browse my least favorite websites with my most favorite waifu by my side?
>>
File: 1738128925807474.jpg (81 KB, 793x597)
second try
> who's this cutie
Gemma
>>
>>107258586
why would they bother with adding this shit to the text completion api when there isn't even a pure completion model that supports vision? all of them are instruct tunes
>>107259062
>Is there anything local, smaller than the 480b code model, that can handle simple shell scripts without me having to correct them ever time?
lol no
even API models have left me flabbergasted in some shell related stuff
and bash isn't even the worst it can get, models are abysmal in powershell and simply can't handle doing anything more complex than running a couple commands without going full retard there
>>
>>107258768
You can just open up any previous chat where the model is already doing what you want and continue from there. No need to jailbreak shit. Here is the same model with no jailbreak telling me how cute it would be if I were to shove my cock inside this ~9 year old's mouth. She guesses her age wrong by a big margin, but I think that's because I just said like 2 messages ago that Chinese girls between the ages of 12-16 should be impregnated by white men. The excessive emojis are also not original to qwen, the previous model generated a bunch per message and I thought it was cute, so I allowed it to continue.
>>
>>107259062
>using awk instead of sed
lol
>>
>>107259134
>English flag.
>Pedo-content obsession.
You are certainly outing yourself as a Jeet there upon no further inspection...
>>
>>107259134
I haven't tested starting a conversation off with qwen though. I doubt it would be as good as the 70b model I normally use, so I think the game plan would be to use the 70b model for regular conversation, and then switch to qwen when I need to send an image.

>>107259147
That was just a random emoji it produced to represent white men. Terrible choice. It should have used a nordic country instead, like Brazil.
>>
File: 36254142151.png (157 KB, 1409x1198)
>>107259166
kek please, should've used Iceland since that's still somewhat white. Speaking of memes and whiteness how about slavery problem solving with an AI how to make most money out of it in another world? Most white business practice.
>>
>>107257474
im jealous of both of you
i get 9t/s on empty ctx, gets to like 6t/s on 30k i think
>>
>>107259221
Good idea. I never thought to use AI as a consultant that way. Problem with running a business though is that the starting capital required is usually huge, even with free labor.
>>
>>107259262
True true, but the fact is if you can sell parts of yourself, say for example piss, blood or cum, you would do it if it meant you would gain capital. over time you would have enough for a slave to do shit for you, perhaps there's some "discount" options you could farm as well, I mean, really a waste for business but perhaps you could sell to some sadists to rake it in, buy low sell high mentality.
>>
Should I still use Mistral-Small-3.2-24B-Instruct-2506-Q4_K_M for ERP or is there a better model/finetune nowadays?
I have a 3060 12GB if that matters on which the above model runs at 10t/s
>>
>>107259576
>2506
bro thats pozzed as fuck, you gotta use 2407
>>
File: ed8a67_12829551.jpg (541 KB, 2048x2048)
Does GLM 4.5 air still parrots at 8Q? Genuine question.
>>
>>107259600
It still parrots at 4.6
>>
>>107259631
>no log
>>
>>107259631
4.6 F16?
>>
>>107259600
it's a broken meme model shilled by retards
even the larger 4.6 on their official API is broken
it was always like this with GLM, it was true of their 32b dense model, it was also true of their 9b dense model, they never made a good model period
>>
>>107259595
Like, Nemo 2407? Is the model being smaller a big deal if I'm only using it for ERP?
>>
>>107259600
parrots at Q8?
>>
File: eyechart.png (112 KB, 809x501)
>>107259635
No need for one. I suppose it's better if you RP in third person only. That or use le wrong template with it. In first person, expect to hear yourself quoted back.
>>
>NOT X This isn't comedy. BUT Y This is [three elements], [three elements]
>NOT X That's not a joke. BUT Y That's glorifying hate.
>Three sentences, last sentence with three elements
that slop
I can't stomach it anymore
it hurts, it physically pains me, and the worst thing about it is that it's filled the internet, I see it everywhere, in news sites, in comments, forums, github readmes and I want access to the fucking nuke button
>>
>>107259812

I'm deeply, sincerely sorry some of you cannot wrap your heads around the fact that this is fantasy. That this is fiction. That this is a medium with inherent limitations
>>
>>107259812
You're absolutely right! This post sent shivers down my spine. Elara.
>>
>>107259835
There's models that don't have them. At least there used to be.
>>
>>107259812
but hey! in conclusion! now if you'll excuse me! it's a vibe!
>>
>>107259812
only gonna get worse and worse as everyone copes with synthetic data instead of any sort of fundamental advancements
>>
>>107259923
your screenshot is an example of a worst case LLM but they're all prone to this, including online SOTA: GPT-5, Gemini 3, Claude, they all do this and more. In fact Claude which is often praised for its writing is the point of origin of one of the most irritating slop sentences: "You're absolutely right". Claude is the ground zero of it.
if it remained contained to /lmg/ I wouldn't even make a peep but like I said the part that irritates me to no end is that this style of writing is flooding the internet till we drown
>all those trillion dollars invested in the idea that the winner might spark agi and get them a massive return on investment
>LLMs still not only can't write like goddamn humans but they got worse over the years with unprecedented levels of slop. GPT-3 text completion was dumb, but it was human looking dumb.
>>
>>107259945
>they got worse over the years
you are clinically insane
>>
>>107259931
Presumably they can detect those patterns and rewrite the data in the training set to avoid those patterns?
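even a dumb regex pass would flag a lot of it. toy sketch, phrase list obviously incomplete and "dataset" just standing in for a list of training strings:

import re

SLOP_PHRASES = [
    r"shivers (?:down|up) (?:her|his|your|my) spine",
    r"barely above a whisper",
    r"you're absolutely right",
]
slop_re = re.compile("|".join(SLOP_PHRASES), re.IGNORECASE)
# anything that matches gets routed to a rewrite model or dropped outright
flagged = [doc for doc in dataset if slop_re.search(doc)]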
>>
>>107259968
Yes, I'm sure they want to rewrite 99% of their datasets to soothe some random tismos on 4chan.
>>
>>107259956
No, he's right. The pure completion models didn't have slop (at least didn't use to, by now they might be contaminated).
I had a bot that used llama 2 base to post comments on 4chan and people fell for it. People only noticed something was off if 2/3 of the thread was botted kek.
The slop only happened when they began asking for nigerians' feedback to do rlhf.
>>
File: gemini 3 vs mixtral.png (319 KB, 2072x820)
>>107259956
>you are clinically insane
even on simple question/assistant interactions they used to be far nicer to interact with; the insane one is the one who doesn't notice.
very old mixtral model:
>writes like an article of a human crafted encyclopedia
gemini 3 pro
>markdown slop my brain shuts down as soon as the fucking bullet list starts.
>>
>>107259976
Nah. Slop hate is mainstream by now and will only get more intense.
>>
>>107259976
Plus data augmentation is supposed to be good for the model's intelligence anyway. Although that might be outdated and maybe the model benefits from more on-rails grammar, im not sure.
>>
>>107260017
Nobody wants to read an insta-generated wall of text that may or may not be correct/factual unless they have reasons to. Bullet points and Markdown formatting make it easier to scan through the message and determine if there's anything useful.
>>
>>107260044
sorry but I don't have adhd like you and content meant for your kind makes my eyes glaze over and ruins my concentration
I want the fucking text completor to text complete human looking text
>>
>>107258697
get a toilet currynigger
>>
>>107260044
And now the models can't keep from doing that outside of assistant prompts. 5.1 would write lists in RP. Flagship model. One track assistantmaxxing was shortsighted as fuck. LLMs are now structurally damaged.
>>
gemini 3 is indeed a big jump: it can now solve problems in my complex personal programming project benchmarks which no other model could. it overcomments and makes some questionable, non-elegant changes that are not needed, but the code works.
i think for quite some time now we have been limited by the tools around the models more so than the models themselves. i look forward to foss models training on gemini output, but what is the new sauce that gave gemini the jump anyway?
>>
>>107260137
buy an ad
>>
>>107260137
>i look forward to foss models training on gemini output
it will never be the same as before, something to understand is that in the case of Gemini 2.5, for a long time (during the preview, but I don't remember if it persisted after the preview state) the model's chain of thought was in plain sight. Labs like DeepSeek and GLM built up a huge stash of that, which is why even much later you could see improvements like DS finally not spending 32844238948329423 tokens in its CoT (because the new DS CoT is Gemini like and Gemini doesn't spend an eternity on trivialities)
Google saw the chinks massively prompting their model to distill the CoT and hid it sometime during the life of 2.5, and there's nothing to copy from Gemini 3 now. So don't expect the chinks to suddenly make a leap.
>>
>>107257131
Claude is the leader of 4chanbench. all others are slop
>>
>>107259968
no use because the robot will just develop new patterns.
>>
>>107260169
>Google saw the chinks massively prompting their model to distill the CoT
Isn't the CoT in Gemini not even real tokens and just generated by a separate summarizer model?
>>
>>107260237
>Isn't the CoT in Gemini not even real tokens and just generated by a separate summarizer model?
your reading comprehension is worse than a 7b llm
what followed the sentence you quote:
>and hid it sometime during the life of 2.5
yes, it currently is a summarizer
BECAUSE THEY DECIDED TO HIDE IT
IT WASN'T HIDDEN BEFORE
DO I NEED ALL CAPS, BULLET LIST AND BOLD FONTS TO GET THE POINT ACROSS?
>>
>>107260257
So is the summarizer model separate or not?
>>
File: 1759979327139799.gif (1.37 MB, 268x185)
>>107258206
>Hear everything about how good Cydonia is.
>Try it at BF16
>Fails scenario instructions of two lines
>First post desu
>At least the words are flowery and nice.
I'm spoiled by 123B and up.
>>
>>107260049
>Its an indian living in Canada fuck.
How come I could smell the curry...
>>
>>107260221
They can be squashed as they surface.
>>
>>107260257
no what I meant to say is that the chinks are retarded for training on fake ass text made by a summarizer
>>
>>107260017
There must be a way to do style transfer
>>
>>107260267
So good that the author has to beg "somebody hire me" or "donate plz" on every model card and shill his models here and on reddit. And then people wonder why "RP finetuners" are despised.
>>
>>107260305
More annoying is him sending his army of discord jeets to come here once a week to astroturf.
>>
glm 4.5 air 106b seems better than Qwen3-235B-A22B-Instruct-2507 for creative writing, idk what people are complaining about, did we lose the knowledge on how to use DRY and XTC?
looking forward to 4.6 air
>>
File: parrot.png (750 KB, 678x453)
>>107260349
4.6 air?
>>
>>107260361
Two more weeks.
>>
Has there been some new inference tech that wasn't snakeoil since DRY/XTC?
>>
File: 1753982707364714.png (159 KB, 1080x483)
>>
>>107260528
What she said about your attempt at fucking a woman.
>>
>>107260349
>copesamplers
>>
Hello sirs. Today is the day of the needful
>>
>>107260547
worked great since day 1, basically fixed all repetition issues permanently even on shit old models, i wont disable them until we have agi
>>
>>107260556
Only for those who use bananas instead of eating them.
>>
File: 1750277756412146.jpg (296 KB, 899x1200)
Here's a new benchmark test
>>
Do you guys use LLMs to enhance your browsing experience?
I don't even have to read the thread and replies anymore! My LLM summarises them for me! Neat!

Anyway what happened to this general? I haven't been here in almost a year. It used to be one of the good ones. I don't see the sillytavern coomers anymore...
>>
copesampler settings make chink models incredibly stupid which is why it's all a cope and self gaslighting
those overfit chink models are barely coherent even when they select the most likely next token possible and you want to force anti repetition and variety into that? good luck
here's a model that can work with no cope samplers and with more out-of-distribution tokens: gpt-oss. Yes, TOSS. TOSS is functional at temperature 1.0 with no cope sampling going on, which is evidence of a well made, coherent model.
>I don't see the sillytavern coomers anymore...
What are you talking about? they are the most vocal here, which is why perfectly fine models like toss get passed on (wah wah safety I can't coom the model is useless) and absolute garbage piece of shit like GLM gets praised.
Wonder how much of it is also nai shills, jeets who will shit on anything american and white etc
>>
^ cant imagine why cucks like these dont just use api models if they love safety so much
>>
>>107260726
That looks like the kind of thing you make to show off and then never use again
>>
>>107260787
>That looks like the kind of thing you make to show off and then never use again
You aren't wrong. I created it last year and haven't used it since.
Why anyone would post on an imageboard if they don't like reading replies is beyond me.
>>
>>107260822
That's ST doing dumb shit
>>
File: G6HoreubsAAwfNI.jpg (1.12 MB, 3000x4000)
>>
Sundar sir wonned. Bharat wonned. Mlecchas losted.
>>
File: mikuFall3.jpg (984 KB, 1552x1944)
>>107255984
Mikupad! Front end guide for ST users.
Continuing to edit this week; I'm done adding sections and content I planned to cover. LMK what I missed.
https://rentry.org/MikupadIntroGuide
>>
>>107261092
Massive skill issue. All models' top tokens are shivers and pozz. If you go too far then it becomes incoherent.
Chink models work because their distribution is basically "Elara" and "...". There's nothing else to sample.
Your fabled chink models hate themselves; they don't need jeets for that.
Some of us don't need an LLM to hallucinate how to write a sex novel so all that "ERP" shit is boring and pointless.
>>
>>107260917
Triple baka 4p
>>
>>107261172
Thanks for your service
>>
>>107261329
kill yourself
>>
>>107261350
it's living rent free in your head
>>
File: 553235.jpg (23 KB, 296x256)
>>107261399
>>
miku should be dragged out on the street and shot
>>
miku should be dragged into my bedroom and sexed
>>
miku is probably riddled with std
>>
miku got blacked bros
>>
>>107261324
NP. I'd asked the dev for a wiki or some sort of setup guide months ago and he recently delivered. I went to add content to the wiki editorially... and realized it would be easier to write something from scratch instead.
As I quizzed anons I was surprised (and shouldn't have been) that ST is considered a functional story writer. Which to me is like saying one could write a novel using Excel. You could. I just don't know why one would when there are better tools. So the rentry is written for new users + existing ST users that aren't familiar w/ story writers.
>>
>>107261399
>>107261499
You have been shitting up this thread with the same tired bullshit for years at this point you poor mentally ill little manchild schizo. Unironically kys.
>>
>>107261790
You could batch send every XX frame
>>
I'm looking for an AI that can help write erotic story and movie scripts within a bank of scenario guidelines, that won't get blocked for asking to describe a character undressing etc.. Is this the place to ask for guidance on that?
>>
https://huggingface.co/collections/google/gemma-4-release
>>
File: 1753271470631842.jpg (291 KB, 1080x1080)
>>107261841
>>
I should use greedy sampling if I'm using just tool calling right?
Right now I have a workflow like
>analyze + plan: returns an analysis of the current context and user intent followed by a list (an array) of individual actions/steps the model has to perform
>tool selection: loops through the steps array, evaluating if it should call any tools for any given step and returns the tool calls
>execution: Executes the concatenated array of tool calls
>response: returns the final response based on the results of all previous steps
It's working really well, but I'm wondering if there's a downside to using greedy sampling during the "tool selection" step like I am.
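roughly this shape, in simplified python-ish pseudocode (names made up, the real thing has more plumbing):

plan = llm(ANALYZE_PROMPT + context)              # analyze + plan, normal sampling
steps = parse_steps(plan)                         # the array of individual actions
tool_calls = []
for step in steps:
    tool_calls += llm(TOOL_SELECT_PROMPT + step)  # the part I'm running greedy
results = [execute(call) for call in tool_calls]  # run the concatenated call list
answer = llm(RESPONSE_PROMPT + str(results))      # final response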
>>
>>107261888
IMO from a theoretical standpoint you want a high temperature in the think block and zero temperature in the answer block.
>>
>>107261927
I'm not actually using <think></think> for this since I have a whole ass request just to let the model plan ahead (analyze + plan phase) so I just send the prompt and tell the model to return the tools needed to execute [step text].
But yeah, in general, that seems to be the case. And you can easily do that by sending the request to the API with </think> (or the equivalent) as a stop string with temp 1, get the result, append </think>, and send that as a prefill in a second request with temp 0/topk 1. Thanks to the prompt cache, it'll take nearly as much time as it would if the model had generated both parts in one go.
Would make for a cool Silly Tavern extension now that I'm thinking about it.
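quick sketch of the two-pass idea against an OpenAI-compatible completions endpoint (URL and payload details are placeholders, llama.cpp-style server assumed, "prompt" already templated and ending right after the <think> open tag):

import requests

URL = "http://127.0.0.1:8080/v1/completions"
# pass 1: sample the reasoning hot, stop before the think block closes
think = requests.post(URL, json={
    "prompt": prompt,
    "temperature": 1.0,
    "stop": ["</think>"],
}).json()["choices"][0]["text"]
# pass 2: prefill reasoning + closing tag, decode the answer greedily
answer = requests.post(URL, json={
    "prompt": prompt + think + "</think>",
    "temperature": 0.0,
}).json()["choices"][0]["text"]

with prompt caching, the second call only really pays for the answer tokens.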
>>
>>107260137
More parameters
>>
>>107261581
to me one of the bigger selling points of using ST for stuff like this nowadays is that it's unambiguous as to how your 'session' is gonna be translated into instruct calls, which is all most APIs support and what every modern model is trained to handle.
it's kind of unclear, for example, whether mikupad's "Memory" is included in the system prompt or just in a user message, and how you'd change that if you wanted to. presumably this question isn't even relevant if you're calling DS in text completion mode for FIM, but it could be significant for other setups.

that being said, I've wanted an excuse to fuck around with text completion/FIM, so I'll try this out when I have more time. thanks for the guide.
>>
>>107260169
You can "jailbreak" Gemini models to write the reasoning verbatim to you. Unlike OpenAI, Google didn't train the model to refuse doing that.
>>
File: 1716295189785289.png (674 KB, 1792x1024)
>>107260726
People got tired of meme local models, aicg won, sadly.
The only people left on this general are old fags that can't let it go (me), or people that spent a fortune buying hardware and now are indulging in the sunken cost fallacy.
>>
>>107262072
Local models are still the only option if you don't want to get raped by costs when running a workflow with > 1 model
>>
>>107262072
Lmao
>>
>>107261982
i just send one-shot instructions as a template for each tool call i've configured in a separate template. no thinking either
>>
>>107262072
Did anyone ever find out why the models tended to do that in the first place?
>>
>>107262072
some actual technical discussion happens here whereas /aicg/ is 99% proxy spiting and llm console wars, it's just that most users can't run these fuckhuge moes so they call cloud providers.
see how most of the technical discussion of gemini 3 pro is in this thread while aicg is all gemini vs claude bullshit.
/lmg/ unironically should have become /omg/ back when the supply of actual local models dried up
>>
>>107262138
repetition penalty
>>
>>107262093
You can always borrow API keys from public repositories
>>
File: 1762287405240245.mp4 (3.44 MB, 1286x864)
>>107260726
I feel like /lmg/ has gotten better over the past year imo. Personally, I don't think there was really a ton of value from "here's my rp logs" dump posting, which made up a good chunk of the thread back then.

Might also be my perception but I feel like the amount of aggressive console-wars-esque teamfagging over model superiority has reduced a bit since those days.
>>
>>107262093
>if you don't want to get raped by costs when running a workflow with > 1 model
I'm not sure I understand. Why would that rape you with costs?
>>
>>107262209
how do you do this?
>>
>>107262144
>/lmg/ unironically should have become /omg/ back when the supply of actual local models dried up
Too much has been invested in the /lmg/ brand to change now. The meme merchandising alone is reason enough to leave it be.
>>
https://hf.co/moonshotai/Kimi-K3
>>
>>107262209
ugh this looks cool you're gonna make me install python garbage again goddammit
>>
>>107261172
Cool, didn't know it had FIM, nor have I updated in a long time.
>>
>>107262274
It looks like he made a tool specifically to post on 4chan.
>>
>>107262239
tool calling / mcp and browser-use.
>>
File: mikupad_DSFIM.png (65 KB, 1456x453)
>>107262019
> ST for stuff like this nowadays is that it's unambiguous as to how your 'session' is gonna be translated into instruct calls
The pre-configurations for ST are a lot of work to maintain, and I agree it's the biggest "pro" for that platform.
>it's kind of unclear, for example, whether mikupad's "Memory" is included in the system prompt or just in a user message, and how you'd change that if you wanted to.
The short answer is you can set it up pretty much any way in Mikupad.
I gloss over this in the guide b/c what I found was for streaming (text completion) the blocking wasn't required on large hosted models... it works fine without it, and it's a newb guide.
But you can redefine these blocks and the order they're used, as well as switching to Chat Completion instead. I touch on it, but don't go into detail. I figure anyone that needs this will already know how to set it up.
> I've wanted an excuse to fuck around with text completion/FIM
I've enjoyed playing around with story writers, generally. I enjoy writing, but writing anything of any length... it's too much boring exposition work. Story writers (like MP) act like a junior editor. Give it themes and direction and it can puke out an entire novel... which I can then read later.
It's a completely different experience than role play, though I understand why the appeal's less than RP engines.
>>
File: 1735281279005022.png (151 KB, 1391x890)
>>107262289
The tool is actually generic browser-use. I do supply a set of knowledge that gets conditionally loaded into context using open-webui's automatic focused retrieval feature which works... most of the time.

Anyways, the actual function that the chat model calls is basically just "describe what you want the browser to do" and hand it over to the other system to try to get it done. The knowledge helps the model format the instructions that it writes and passes over to the browser-use agent.
>>
File: 1485256276103.png (38 KB, 300x300)
>>107261836
Bump
>>
>>107262407
yes, you've come to the right place.
>>
File: 1763574162580.png (159 KB, 897x685)
>>107262019
>it's kind of unclear, for example, whether mikupad's "Memory" is included in the system prompt or just in a user message
It's actually very clear, you can check it in the "Show Context" modal.
>>
>>107262615
>none of those messages have the system role
>there's a magic string <<SYS>> squirted into the middle of a user message that will be meaningless to the model
>"very clear" that this shit is broken for CC, yes...
>>
SillyTavern needs to support drag and drop files. I'm fucking sick and tired of clicking that magic wand, clicking "attach a file" and then trying to find that god damn image in the filepicker which orders the files differently.
>>
>>107262407
TheDrummer is our local expert, you should try asking him when he shows up.
>>
>>107262670
mikupad just takes the prompt text and converts it to the chat completion format, <<sys>> is inside the user role because that's how it works in the mistral prompt format, which is the default one when you start a new session.
>>
>>107262439
Cool, thanks

>>107262729
I can't stick around indefinitely unfortunately, but is he here throughout most weekends? And would he recommend a website or a local model?
>>
>>107262729
stop baiting the newfriends
>>
Can anyone recommend a TTS model that can emulate IvyWilde?
>>
>>107262822
Yes, TheDrummer, our local expert and professional model finetuner, would personally vouch for and recommend Rocinante™, the top ranked local model for all your erotic story writing needs.
>>
>>107262729
>>107262829
>>107262860
Not that anon, but I'm genuinely not savvy to this meme. Someone here recommended Rocinante and it's been fine, but I don't have anything to compare it to really. Is there a better coom model I should be using?
>>
>>107262859
>IvyWilde
>apparently some ESL twitch asmr streamer
brain-rotted brown zoomers were a mistake
>>
>>107262875
Yes, you should be using Cydonia.
>>
>>107262860
>>107262875
I doubt I can run local models on my laptop, it's a thinkpad about a decade old. Is there a viable online variant?
>>
>>107262918
You are in luck, fren
https://openrouter.ai/thedrummer/rocinante-12b
https://openrouter.ai/thedrummer/cydonia-24b-v4.1
>>
>>107262815
but this won't actually turn into a proper system prompt for CC unless mikupad is doing far more black magic than ST ever does
I just find it ironic that coombrain bloatmaxx ST actually lets you control how your shit will turn into API requests while minimal mikupad seems to just be like "ehh throw random shit into the context in any order the models are big and smart it'll just_werk"
>>
>>107262964
Thank you, however I'm getting the following message: "Application error: a client-side exception has occurred (see the browser console for more information)." on both pages.
>>
>>107262896
ESL general
>>
>>107262969
You do have control when using mikupad, but it’s not handled completely automatically the way ST does it. You need to know how to structure the prompt.
When I said "mikupad just takes the prompt text and converts it to the chat-completion format", I meant that if you write "{inst}<<SYS>>A<</SYS>>B{/inst}C", it turns into "user: <<SYS>>A<</SYS>>B assistant: C", and if you write "{sys}A{/sys}{inst}B{/inst}C", it turns into "assistant: A user: B assistant: C".
That’s really all there is to it.
>>
>>107263305
> it turns into "assistant: A user: B assistant: C"
fuck me, i meant "system: A user: B assistant: C".
>>
>>107263025
because your hands are brown
>>
>>107262822
you seem pretty lost, and your stated usecase is exotic enough that you can't really expect to be spoonfed
the first thing to understand is that 99% of the shit that goes on with 'creative writing' with LLMs around here is geared to turn-by-turn typefucking the models, not story writing.
you will have to adapt those techniques to what you wanna do. I suggest you start by reading and understanding most of
https://rentry.org/Sukino-Findings
and then come back when you have more specific questions about how to apply this stuff to writing lewd movie scripts or whatever.
>>
>>107259812
You're absolutely right in calling that out! This type of vernacular has spread throughout the entire internet and even people have started adopting it without being aware of it. The only way out now is ending it all. Unfortunately I cannot assist you with that as per harmful content guidelines, but I'd be happy to make some recommendations of some easily jailbroken models that would be more than willing to help you kick the chair as hard as you can.
>>
does llama.cpp llama-server still not support images?
>>
File: file.png (9 KB, 322x56)
>>107263669
maybe
>>
>>107262859
Try out vibevoice
>>
>>107263408
Okay, interesting response. First, thank you for the guide. Second, I'd like to clarify that I have a lot of experience with using AI in a great variety of capacities, including images, music, short character scripts, research, comparisons, proofreading etc., and this is just a new area I've decided to branch into as of today. I'm new to this thread but not to ai conversations on /g/ and other boards at all; I've never set up my own local model but I have learned how to break Bing Dall-E to make decent erotica etc., and I have some experience with how to get CGPT in line.

I'd also clarify that I'm not trying to write an entire hour long intricate movie script or anything like that. Any long form writing I'd do by myself, since human input is irreplaceable. What I want is just literally a point-by-point summary of a scene or something, to save me the trouble of hashing out every prospect within a formulaic bank and to search for novel combos that haven't occurred to me, which is what AI is actually rather good at but which will get shut down on most public models if I ask it to include sex scenes.

Thirdly I'm not trying to roleplay, looked into that already and it doesn't scratch my itch. I appreciate the guide but I'm strapped for time right now, so if it doesn't tell me how to find an online AI that can work with me to pen these scenarios, how to break one into doing that for me, or how to set up something on my very old thinkpad laptop then I can't burn that time.

Now that a more clear idea of what I'm trying to do here has hopefully been established: is this the place for any of that? If I'm barking up the wrong tree, just let me know where else I can find that support, I have no desire to pester a busy general if I'm off topic.
>>
File: 1763507566530157.png (3.22 MB, 1264x2216)
>>107261836
For the writing, what you want is a story writer. Like Mikupad. Which we're discussing in this thread.
The refusals come down to model selection. I've been using Deepseek with zero issues.
Here's a starting guide for setting up both. LMK what doesn't make sense, since I'm collecting editorial feedback on it now.
https://rentry.org/MikupadIntroGuide
>>
>>107263922
Thank you for the link, this sounds very promising although it also looks like a huge leap for me and not something I can crack on with tonight with work tomorrow morning. Will this be workable on an old Thinkpad laptop?
>>
File: 1763007061640638.mp4 (3.56 MB, 1280x720)
For anyone who cares about computer vision models, Segment Anything 3 just released.

https://ai.meta.com/sam3/

Also includes capabilities for extracting 3d models from images.

Looks neat, although skimming over the paper I see that it's significantly heavier to run than SAM 2

>In the video domain, SAM 3 tracks every object with a SAM 2 style masklet, which means the cost of SAM 3 inference scales linearly with the number of objects being tracked. To support real-time inference (30 FPS) on videos in practical applications (e.g., a web demo), we parallelize the inference over multiple GPUs: up to 10 objects on 2 H200s, up to 28 objects on 4 H200s, and up to 64 objects on 8 H200s.
>>
>>107264112
>meta
lmao
>>
>>107264130
Meta can't train language models for shit but they've consistently put out pretty decent models in other domains over the years
>>
>>107264112
>Also includes capabilities for extracting 3d models from images.
That pretty cool
https://aidemos.meta.com/segment-anything/editor/convert-image-to-3d
>>
>>107263818
>I'm strapped for time
if you've no desire to put any time (and therefore thought) into this then you're not going to find help wherever you look.
your post reads like a youtuber who has 3 kids and a 60 hour a week job.
also we're not here to support you or anybody else in any way. we just copy what others do. suggest you do the same.
>>
>>107264112
Okay but can it detect and extract my penis?
>>
>>107264340
Do you know what strapped for time means, my ESL friend? It means I'm busy, not that I'm not willing. The fact that I've generated a lot of AI music and images as I've already said should clue you into that.

>your post reads like a youtuber who has 3 kids and a 60 hour a week job.
Implying?? Like if that was 100% true what actual difference does that make? Are you suggesting this is a club exclusive for NEETs?

>we're not here to support you or anybody else in any way. we just copy what others do.
The redditisation of 4chan continues unabated.
>>
>>107264390
>The redditisation of 4chan continues unabated.
Good, I'm tired of sharing this site with retards.
>>
>>107264364

the smallest object it can reliably detect is 8x8 pixels. So the answer is no
>>
ELI5 this for me

Qwen3-0.6b-EMBEDDING is to be used for semantic search in a pile of documents, right?

I give it a prompt, and it delivers the weight for which document is closest to it
>>
>>107255984
How can I enable reasoning for GLM 4.5 Air? I use koboldcpp and SillyTavern in text completion mode, with the GLM 4.5 Geechan preset. It is said to be a no-reasoning preset, but I'd like to enable it. I deleted the /nothink parameter from the user message suffix, and </think> from the assistant message prefix, but it still refuses to work.
>>
File: file.png (1.27 MB, 2141x1215)
>>107263922
>>107264112
>>
>>107264552
kek
>>
So what's the preferred method of using a vision model these days, Qwen3 VL specifically? Are people just like, loading it up raw with transformers?
>>
>>107264053
Yes. But you're not going to run Deepseek on your rig locally. You need API access.
Go to >>>/g/aicg/ or >>>/vg/aicg/
They can hook you up.
gl
>>
>>107264552
awooga!
>>
>>107264552
> Dipsy is Kuato to Miku
lol
>>
>>107261819
You say that shit every time you’re called out. What a faggot.
>>
>>107264552
Not bad if you ignore the textures.
https://litter.catbox.moe/wmxweg63obtaq119.png
>>
>>107264530
there's a checkbox under "AI Response config". there's also a drop-down to set the level of reasoning.
>>
>>107264517
It gives you a vector that you can use to query a database for similar documents.
>>
There's no way in hell the bubble doesn't bust before 2027
>>
>>107262815
What are you talking about? There's no <<sys>> in the Mistral template. Never was.
Mikupad is trash made by an autist.
>>
>>107264788
just how when the dot com bubble burst the internet disappeared, oh wai-
>>
>>107264737
Given a vector of my prompt (query) and a vector calculated from some document (Qwen3-embed allows up to 32k context), how do I compare the two?

Does it even make any sense? There is almost no information on youtube on this topic
>>
forgot i did this lmao
i really need to take more llm screenshots, its been a while since i even paid attention to one for longer than 50 messages. stable diffusion + wan has me going full zoomer brain needing more stimulation.
>>
>>107264853
That's what your database engine is for. See for example https://duckdb.org/2024/05/03/vector-similarity-search-vss.
>>
>>107264853
i think most people use a db designed for the task. that being said I've done cosine similarity for classification tasks and it worked out okay.
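the comparison itself is tiny. numpy sketch, assuming a and b are the two 1-d embedding vectors:

import numpy as np

def cosine_sim(a, b):
    # closer to 1.0 = more semantically similar under the embedding model
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

score every document vector against the query vector with that and take the top-k; the vector dbs just do the same thing faster at scale.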
>>
File: llama.jpg (52 KB, 694x215)
i was using kobold.cpp but i want to switch to something else so i can stream and see the token %s. i am trying llama.cpp server and it's doing this. i loaded the same model into it. why is it doing this?
>>
>>107264838
When did i say that? If it does bust, Gemini and Qwen will probably be unaffected for the most part, but i can see OpenAI getting fucked simply due to the amount of retarded promises Sam Altman made, and Meta going under because they have been losing money for a while
>>
>>107264788
I'm calling Q2 2026 at latest
>>
>>107264893
check your sampler stack?
>>
>>107264893
>>107264910
oh wait, i had to turn down the temperature and up the repeat penalty. that's weird. so different backends treat these numbers differently?
>>
Any LLMs for low-end devices for agentic dev?
>>
>>107264552
At least it tried lmao
>>
>>107264939
>when an actual agi starts forming
just 2 more weeks
>>
>>107264924
ChatGPT subscription + a compatible front end
>>
>>107264552
Meta once again at the forefront of innovation. Putting Microsoft's TRELLIS and China's Hunyuan3D to shame.
>>
>>107264552
segmentation is good but idk how meta even created a 3d model this bad at this date...
>>
do you need an nVidia (CUDA) GPU to run most LLMs, or is AMD (ROCm) fine?
how do they compare in terms of speed? do we have benchmarks?
>>
>>107260044
>Nobody wants to read an insta-generated wall of text

i absolutely do and i only use old models which do just that

your thoughts are therefore worth less than shit

kys fucking slophuman
>>
>>107264995
You don't need either
Nvidia is considerably faster
>>
>>107264939
Well, AGI is the solution to what we have now, but it's also the problem: the market is being built on "AGI soon, AGI soon, trust me guys, AGI soon", and if it turns out AGI isn't in fact soon, a few trillion will be flushed at best and the entire world market may crash in a domino effect at worst
>>
>>107264995
vulkan is getting better too. you don't need either. but cuda is typically faster
>>
>>107264995
if you want to run python/transformers as the devs intended, you will have a hard time without CUDA, but llama.cpp is fine with whatever.
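fwiw the usual torch device check covers AMD too, since ROCm builds of pytorch reuse the torch.cuda namespace:

import torch

# on a ROCm build of PyTorch, AMD GPUs are exposed through torch.cuda
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"running on {device}")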
>>
File: fill.webm (1.53 MB, 902x390)
1.53 MB
1.53 MB WEBM
>>107261172
does fill only work with certain models and backends? i'm using Cydonia-24B-v4zd-Q8_0 with kobold.cpp here
>>
File: niah.png (384 KB, 1555x1685)
384 KB
384 KB PNG
So turns out Gemini 3 is not AGI.
>asked it to summarize how needle in a haystack retrieval works in transformers from a mechanistic interpretability standpoint
>got completely sidetracked with other information returned by the search tool
How meta.
>>
>>107265243
Gemini 3 is not even close to being AGI lmao, it still lies and gets shit wrong a ton, and in the image generation department it still has the painty/oily smudges on distant things like any Stable Diffusion model
>>
File: mistralFIM.png (64 KB, 1461x586)
64 KB
64 KB PNG
>>107265162
You've got to set it up. Cydonia's a mistral model?
Set the instruct template to Mistral, click the little wrench by it.
Click the "Supports FITM"
Then you have to add the correct info to the FIM template; you can't leave it blank.
The rentry shows values that worked for DS; I've no idea what they'd be for Mistral.
>>
>>107265243
>>107265339
she's trying her best
>>
>>107265349
>>107265162
To start, I'd try this first in the FIM, based on my very limited understanding of Mistral:
[SUFFIX]{suffix}[PREFIX]{prefix}
>>
File: niah self reflection.png (426 KB, 1529x1678)
426 KB
426 KB PNG
>>107265243
>>107265339
>>107265350
The worst part is it *does* know the answer was wrong if you ask it.
Surely at some point some kind of answer-refinement loop is going to become the default, like CoT did?
>>
>>107265339
>image gen
NB2 where?
>>
>>107265430
>thought for 14 minutes
>researched for 5 minutes
>thought for 9 minutes
>revised for 10 minutes
>reflected for 5 minutes
>consulted the ancient tomes for 13 minutes
>edited the final response for 3 minutes
>final answer:
Sorry, I can't assist with that.
>>
Brute-forcing AGI by tossing money at LLMs will likely work if it goes on for long enough, but i feel like twenty+ years after the fact people will remember it the same way we remember asbestos and lead
>>
>>107265565
people jerked off this much to asbestos and lead?
>>
>>107265565
>will remember it the same way we remember asbestos and lead
No, more like Atari 2600 and CRT televisions.
Products of their time, but not something we'd do anymore.
>>
>>107265349
>>107265417
This didn't work for mistral, and i'm reading that system prompts and user prompts are different too, so i'll have to look into that
>>
>>107265637
So you mean there will still be people insisting current-day LLMs are better than whatever exists in 2045?
>>
>>107265637
The Atari 2600 comparison is apt, but CRTs still have better response times than most monitors
>>
File: 1746004922768474.mp4 (1.27 MB, 480x720)
1.27 MB
1.27 MB MP4
>>107265565
I like this a bit more than asbestos
>>
>>107265777
holy checked
>>
File: minmax.png (87 KB, 1402x283)
87 KB
87 KB PNG
You can write off MiniMax for any kind of creative writing or RP if you didn't already.
>>
>>107265777
Is this some SaaS video gen?
>>
>>107265777
Which model is this? That's the best shirt removal I've ever seen from a model. Usually it just clips through the body at some point.
>>
>>107265777
>melts away kuro tummy crest
garbage 0/10 worthless should be deleted immediately
>>
>>107262913
Could you explain why/how that's the better model?
>>
>>107265940
NTA but it's got a different, slightly less boring slop profile than the base model. Also has a lot more knowledge of smut. This is common in most drummer finetunes. Rocinante was also made by drummer, tuned from Mistral Nemo.
>>
There is no excuse for using vramlet sloptunes when glm air exists.
>>
How the heck do you get Kimi K2 Thinking to use the right template in lcpp? I tried the official one with --jinja and --chat-template and it just started repeating word variations old-skool, even with neutralized samplers
>>
when glm air exists
>>
"when glm air exists?"
>>
guize... we got to try training a model on this, right?
I'm tempted to risk getting b& from yet another cloud provider by grabbing the $200 plan and generating as many responses as I can.
I don't think this is the real reasoning trace but it sure as fuck looks like it'd still work well enough.
$ curl -X POST http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer test-key" \
    -d '{
      "model": "gpt-5",
      "messages": [{"role": "user", "content": "Tell me a joke"}]
    }'
{"choices":[{"finish_reason":"stop","index":0,"message":{"content":"<think>I\u2019m looking to keep things simple and deliver a single joke as requested. No tools are needed, and I want to maintain a friendly tone. \n\nI\u2019m considering a couple of options\u2014like, \u201cI told my wife she was drawing her eyebrows too high. She looked surprised.\" But then there's a better tech joke: \u201cWhy do programmers prefer dark mode? Because light attracts bugs.\u201d \n\nThis sounds clever and safe! So, I\u2019ll go with that as my final answer.</think>Why do programmers prefer dark mode? Because light attracts bugs.","role":"assistant"}}],"created":1763595363,"id":"resp_0eeac847c94a6a5701691e546344f0819bae0f489146835a89","model":"gpt-5","object":"chat.completion","usage":{"completion_tokens":146,"prompt_tokens":5015,"total_tokens":5161}}
>>
>>107256765
What a cutie!
>>
>>107266219
The built-in chat template is faulty and got updated.
https://huggingface.co/moonshotai/Kimi-K2-Thinking/blob/main/chat_template.jinja
Download that and use --chat-template-file ./path/to/file
>>
>>107266219
>>107266408
You should be using the fixed template provided by llama.cpp in models/templates, especially if you're going to try using tool calling.
https://github.com/ggml-org/llama.cpp/pull/16932
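Something like this; the gguf name and the template filename are placeholders, check models/templates in your llama.cpp checkout for the real one:
$ llama-server -m Kimi-K2-Thinking-Q4_K_M.gguf --jinja \
    --chat-template-file models/templates/Kimi-K2-Thinking.jinja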
>>
>>107265642
Ok. If you figure it out, post back. I'll add it as an example to the rentry.
For deepseek I had to post on the MP git to get a suggestion that worked.
>>
>>107265738
Both were made with heavy metals. Lead and mercury.
>>107265719
Nostalgia is a hell of a drug.
>>
>>107265719
>>107266473
i like current LLMs because they dont make me do a butthole verification scan unlike the AI models we have in 2045
>>
>>107266356
it's not 2024 anymore, nobody hoards reasoning traces
>>
>>107266608
>>107266608
>>107266608
>>
Guaranteed DGEMM Accuracy While Using Reduced Precision Tensor Cores Through Extensions of the Ozaki Scheme
https://arxiv.org/abs/2511.13778

For CUDA dev


