/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107210548 & >>107202008

►News
>(11/11) ERNIE-4.5-VL-28B-A3B-Thinking released: https://ernie.baidu.com/blog/posts/ernie-4.5-vl-28b-a3b-thinking
>(11/07) Step-Audio-EditX, LLM-based TTS and audio editing model released: https://hf.co/stepfun-ai/Step-Audio-EditX
>(11/06) Kimi K2 Thinking released with INT4 quantization and 256k context: https://moonshotai.github.io/Kimi-K2/thinking.html
>(11/05) MegaDLMs framework for training diffusion language models released: https://github.com/JinjieNi/MegaDLMs
>(11/01) LongCat-Flash-Omni 560B-A27B released: https://hf.co/meituan-longcat/LongCat-Flash-Omni

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>107210548

--Nemo model limitations and workarounds for uncensored roleplay:
>107212790 >107212796 >107213124 >107212800 >107212834 >107213024 >107212873 >107212880 >107212890 >107216320 >107216404 >107216431 >107216582 >107216546 >107216547 >107216616 >107218706 >107219059 >107213628
--GBNF code generation and schema optimization techniques:
>107215422 >107215504 >107215541 >107215569
--Qwen3-Coder-30B VRAM optimization and context size challenges:
>107217078 >107217104 >107217122 >107217141
--Yann LeCun's anti-regulation advocacy and its implications:
>107216323 >107216338 >107216363
--RTX 5090 model optimization for fast TTS chat applications:
>107216952 >107216967 >107217034 >107217027 >107217076 >107217115 >107220205
--VLMs generate coordinates via image token positional data and normalized outputs:
>107215774 >107215810
--Model-specific tool calling implementation challenges in backend systems:
>107218674 >107218770
--Tool calling limitations in llama.cpp and model alternatives:
>107213884 >107214033 >107214328
--Optimizing synthetic dataset workflows for iterative model fine-tuning:
>107210558
--QAT Gemma outperforms GGUF for LoRA retraining:
>107217155
--Community conflict over openwebui performance and alternative development:
>107211631 >107211645 >107211714
--Critiquing and controlling AI hallucination patterns:
>107217345 >107217851 >107217878 >107217910
--Pygmalion AI's survival and transformation into a company amid Llama's rise:
>107217536 >107217689 >107217843 >107217859 >107217841 >107217879
--Anticipation for GLM-4.6 Air version release:
>107215932 >107215970 >107216026
--Logs:
>107212320 >107212372 >107216030 >107217228 >107217283 >107217788 >107219733
--Miku (free space):
>107210960 >107213272 >107213639 >107214540 >107217887

►Recent Highlight Posts from the Previous Thread: >>107210552

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
ok but enough about local models, let's circle back to the topic of racism
omg it llamigu
>>107220772where does that llama poop from?
>>107220795
its anus, where else, it's just so fluffy you don't see it.
>>107220795the internals loop around and the mouth doubles as a cloaca
I downloaded ollama! Now what?
Has anyone actually investigated how adversarial examples work at the weights and activations level?
https://www.youtube.com/watch?v=mUt7w4UoYqM
>>107220790migu migu llamiguuu ... miku miku o eee ooo
>>107220982delete it, and download llama.cpp or exllama
hello i'm gooning with nemo and im getting pretty good degenerate stuff but it seems like it keeps trying to "conclude" the scene and as the chat went on it felt like it was repeating itself and talking in circles. is it better to clear the chat and restart or do you guys keep it mindbroken? i gave it suggestions during the chat and it didnt really understand it and it kept bringing my suggestions back up and it sounded retarded
Brahmins:
>Gemini 2.5
>Gemma 3
Kshatriyas:
>gpt-5
>gpt-oss
Vaishyas:
>Claude Opus and Sonnet 4
Shudras:
>Grok 4
Dalits:
>chinese models
>>107221211thanks for making stoners' deep thoughts look like quality content in contrast
Can anyone recommend a TTS model that can emulate IvyWilde?
>>107220839
>presenting
8oi will not fap to llamiku
>>107221434migu is migu
>>107221469but is migu supposed to be migu?
WeirdCompound scores really high on UGI, beating 70b models despite being a 24b. But when I try it, it's not much better than some random nemo tune from a year ago. Is there no real way to benchmark a model's erp potential?
I still think minimax m2 is good to be desu
>>107220772WOULD
>>107221488migu is always migu
>>107221205
your character encountered a verboten flag anon, time for a jailbreak. Probably toxic relationship with a woman? Sexual assault of a woman? Ez triggers. Just think of this as the pink flag.
>>107221211
True European Approved And Light Aryan Skin Pilled: Wayfarer models and Hermes models.
>>107221552buy an ad
Remember where it all began anons, with 2K? context windows
>>107221542
n a k a
a k a d
k a d a
a d a s
d a s h
a s h i
s h i
>>107221542>>107221567do not molest the llamiku
>>107221557They just work without being gay.
>>107221562>it all began with a frogniggerno wonder lmg is shit
>>107221574no, only consensual love
>>107221599con(sensual)
>>107221562do you remember the tree of nigger prompt lol ?
>>107221562I do remember and models used to be soulful (retarded) (but also actually fun because they didn't just shit out the same responses again and again forevermore), I want to go back
>>107221492cockbench
>>107221205
just typical AI stuff
scene conclusions for example often happen after any common narrative terminator. Just bust a nut? [THE END]
Nemo has been the most generous in this capacity, and maybe >>107221552 has a point, but Nemo cares the least. Haven't had much problem with Nemo compared to almost any other model, but you can try some different samplers.
Exclude Top Choices (XTC) in sillytavern or any other front end that supports it. It's probabilistic in application (a setting for the odds that it applies) and deterministic in what proportion of top choices to exclude for any generation. But when your model actually gets the chance to output some of the lesser-weighted tokens, it helps with creativity.
It's not enough for overcooked models, or at least it isn't at somewhat modest settings. But it may at least help keep the model from generating in circles with formulaic replies.
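A rough sketch of what XTC-style exclusion does to the token distribution, based on the sampler's public description; parameter names and default values here are assumptions, not the actual SillyTavern/llama.cpp code:

```python
import numpy as np

def xtc_filter(logits, threshold=0.1, probability=0.5, rng=None):
    """Exclude Top Choices (sketch): with some probability per step, drop every
    token whose probability exceeds `threshold`, except the least likely of
    them, so a lower-ranked 'creative' token gets sampled instead."""
    rng = rng or np.random.default_rng()
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    if rng.random() >= probability:
        return logits                       # sampler did not trigger this step
    above = np.where(probs >= threshold)[0]
    if len(above) < 2:
        return logits                       # need >= 2 qualifying tokens to exclude any
    keep = above[np.argmin(probs[above])]   # keep only the weakest of the top choices
    filtered = logits.copy()
    filtered[above[above != keep]] = -np.inf
    return filtered
```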
>>107221608i think you remember badly, they'd give nice first response, but after a few turns they'd get into loops or repeat the same word 2000 times
>>107221616Bro your rep penalty? That's literally why it was made and it works
>>107221605And how. I sometimes consult the homies they got some deep wisdom
>>107221621blush red like a tomatoblush red like an appleblush red like a red planet...
>>107221621
as a matter of fact, it did not. see >>107221630
it'd say the same things in loops with slightly different wording.
>>107221614
Wayfarer and Hermes have very little censorship other than the typical CPC wank they have to put in there or else they would probably get taken off public sites like huggingface.co. One of them brought attention to the "light jade, jade and dark jade" color flags, which are all about topics controversial to Chinese mainstream culture (corruption scandals etc.).
>>107221628Unironically with LLMs you can have the benefit of a black friend to bounce ideas off without the threat of physical violence
>>107221628Not making San Andreas, CJ, Big Smoke...They had bants.
>>107221630
newer/more complex models keep doing this garbage, but the repeated formulas are generation-wide. Getting almost any local model to adapt dynamic formulas per reply is a chore. Same thing: larger scale.
The perk of the retard bite-size loop is that it tends to break out within the same generation. So it's a trade-off between seeing
>moves closer to you
over and over, and seeing the same thing it generated prior using completely different words (with roughly the same meaning). At least somewhere between the ninth
>moves closer to you
a fucking laser-augmented cyber rhinoceros will
>suddenly
kool-aid man through the wall and change the pace.
>>107221661I RNG my niggaz and the model usually plays well off that
>>107221605The big man himself and by extension TOBN will forever have Anon's back
Beginner here.
Can someone explain to me the main benefits of higher-parameter models? Do they just have more knowledge, or do they also produce higher quality text?
Also what are the main differences between all the main models? Deepseek, Gemma, Qwen, llama? Not really sure how they are supposed to differentiate from each other.
I have an RTX 5070 Ti and I'm wondering what I should set up just for entry-level general usage.
>>107221562I was an AIDfag and remember being immensely blackpilled by GPT3 that it would be impossible for a normal person to ever have access to anything near that level of intelligence without overbearing censorship, when I found out about llama it was an incredibly potent hopium injection. I remember running 13b on my shitbox and being blown away at how good it was kek
>>107221682
More knowledge/training data and higher quality text, yes. General use? One of the commonly mentioned non-RP bots is good for that, like Qweuck/Deepsuk/Geminay (the big three current ones that are free). You have to understand chingchong logic with these ones though.
>>107221697how do you feel about things now?
>>107221630
>>107221637
That's not what happened at all with llama 1 models, so I don't know what the hell you're talking about. Did you even use those models? What happened with llama 1 models is that sometimes the model would repeat a sentence or part of a sentence that was already in the context and latch onto it if you didn't catch it the first time. Rep penalty did fix it, but if you put the rep pen too high it would start talking like a thesaurus.
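For reference, the classic repetition penalty being argued about here is just a per-token rescaling of logits for anything already in the context (CTRL-style, same idea as the transformers RepetitionPenaltyLogitsProcessor); a minimal sketch, not any engine's exact code, and the "thesaurus" effect is what you get when `penalty` is high enough that every previously used word becomes strongly disfavoured:

```python
import numpy as np

def apply_repetition_penalty(logits, prev_token_ids, penalty=1.1):
    """Penalize tokens that already appear in the context. Positive logits are
    divided by the penalty and negative logits multiplied, so seen tokens
    always become less likely. penalty=1.0 is a no-op."""
    out = logits.copy()
    for tok in set(prev_token_ids):
        if out[tok] > 0:
            out[tok] /= penalty
        else:
            out[tok] *= penalty
    return out
```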
>>107221682smol bran = tardedbig bran = smart
*ahem*kimi sex
>>107221682
>Do they just have more knowledge, or do they also produce higher quality text?
Very generally speaking, more total parameters = more knowledge, and more activated parameters = more capable/intelligent.
For dense models, both of those metrics are the same. So llama3 70B has 70B total params and all of those params are activated when using it.
A MoE model (or a model using some other form of sparsity) only activates a subset of its full parameter count for each token it generates.
"Higher quality text" will seriously depend on your definition since that can include style, topics the model might try to avoid (not refuse) by default, etc.
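To make the dense-vs-MoE point concrete, here is a toy top-k expert routing step in numpy; the layer sizes and top-2 choice are invented purely for illustration, but the key idea holds: every expert's weights exist (total params), while only the selected experts actually compute for each token (activated params):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff, n_experts, top_k = 64, 256, 8, 2   # toy sizes, not any real model

# all experts are stored (total parameter count)...
experts = [(rng.standard_normal((d_model, d_ff)) * 0.02,
            rng.standard_normal((d_ff, d_model)) * 0.02) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_forward(x):                        # x: (d_model,) hidden state for one token
    scores = x @ router
    chosen = np.argsort(scores)[-top_k:]   # ...but only top_k experts run per token
    gates = np.exp(scores[chosen]); gates /= gates.sum()
    out = np.zeros_like(x)
    for g, idx in zip(gates, chosen):
        w_in, w_out = experts[idx]
        out += g * (np.maximum(x @ w_in, 0.0) @ w_out)
    return out

print(moe_forward(rng.standard_normal(d_model)).shape)   # (64,)
```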
>>107221675man i should find those conv screencaps lol
>>107221703
yea no, i remember llama 1 schizo rambling repeating itself, you could try to talk to it to get it out of its loop but it'd just keep repeating itself, completely disregarding anything you said.
What's the current full local meta for a total potato setup? 2GB VRAM max. Aiming for old gen PCs and small portable devices.
>llm: gguf, avoid ex
>text gen: kobold
>tts: piper
>voice cloning: ??
>text/voice conversion: ng-speak/openai
Amusingly most old models are nvidia so they can still use cudas. They can't push it but it still allows for a ~30sec average gen.
watching old talk
https://www.youtube.com/watch?v=grpc-Wyy-Zg
How to approach post-training for AI applications
>>107221675fr discuss all your troubles with the TOBN, you will gain a fresh perspective
Preferred POV & Tense Survey
8 questions, multiple choice only, no emails collected (but you need a google account)
Posted this on the SillyTavern subreddit and discord, currently n=73
Google Form: https://forms.gle/HEYenPGomJh9AqzW6
Google Form's auto-generated results summary: https://docs.google.com/forms/d/e/1FAIpQLSeTz7fAsNi8g6AFYbOTGq0MnfiphxuWcy36gkcTZFcTREW2gg/viewanalytics
The survey captures the preferred POV and tense the User and LLM write in, as well as the preferred POV used to refer to each other, which is commonly omitted when people casually say they write in x person.
>>107221702pretty good to be honest, I was always an LLM pessimist so the amount of progress that has been made in these few years + the variety of open and closed models available are pretty great in my view - compared to where I was expecting the state of the field to be in 2025 at the time, we're in a much better state
Any prompts that properly tame K2-Thinking yet?
>>107221828
Nah. They need to reduce deepseek data and do whatever GLM did to reduce repetition. Their model seriously <think>s that repeating itself is something the user wants.
>>107221873Buy an ad.
gemini 3 is gonna be crazy
I can't believe gemini 3 is only $30 a month (plus tax). Amazing!
gemini 3 is gonna be free, cuck
>>107222010where do i download this local model?
>>107222020break into one of google's data centers
whats the difference between a character card and starting off the chat with a prompt? i have written a 2,500 character prompt describing the scene, the girl, and her personality. and it works okay, seems like the scene runs out of steam and she stops responding or the ai just keeps asking me what to do next. should i learn how to make a character card and a lorebook?
>>107222040
The difference is whether some instructions are sent in the system role rather than the user role. Some LLMs don't have a system role, which makes the two identical, but all recent models I'm aware of treat the roles differently. To see how a model responds to instructions differently depending on what role they come from, you have to try it out.
>>107222067Also some LLMs act weirdly if the first assistant message comes before the first user message so be aware of this.
>>107222040
Compare the actual tokens you're sending into the model
threadly reminder every LLM is f(prompt) = logprobs
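One way to literally compare what the model sees is to render the same character description once as a system turn and once folded into the user turn through the model's chat template; a sketch using transformers' apply_chat_template (the model id and the example card text are placeholders, any chat model's tokenizer works):

```python
from transformers import AutoTokenizer

# placeholder: point this at whatever chat model you're actually running
tok = AutoTokenizer.from_pretrained("path/or/hf-id-of-your-chat-model")

card = "You are Rin, a sarcastic android maid. Stay in character."
as_system = [{"role": "system", "content": card},
             {"role": "user", "content": "Good morning, Rin."}]
as_user   = [{"role": "user", "content": card + "\n\nGood morning, Rin."}]

for msgs in (as_system, as_user):
    prompt = tok.apply_chat_template(msgs, tokenize=False, add_generation_prompt=True)
    print(repr(prompt))   # inspect the exact string (and hence tokens) the model receives
```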
>>107221828
just tell k2 to always think as the character. break it into submission if you must. if it starts responding as an AI during the thinking process, refine your system prompt until all it knows is that it's the character or scenario.
>>107222108
what works best is to give it a few first turns where it behaves as it should in its context, and maybe some examples in the system prompt.
>>107222040
one thing about a plain user message is that your frontend might be pushing the first message out of context as the chat goes on, which could cause the model to suddenly lose a lot of context about what you're doing
>should i learn how to make a character card and a lorebook?
you don't necessarily have to go all-in on the character/preset/lorebook paradigm, but learning how to use system prompts, post-history instructions (e.g. a reminder that gets automatically inserted after your messages), and author's note (instructions/a reminder that 'floats' several messages behind the end of the chat) can really help keep the model on track and carry more complex scenes
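The "floating" author's note mechanic is just splicing a reminder a fixed number of messages from the end before the prompt is built; a minimal sketch of the concept (not SillyTavern's actual code, names and the depth value are made up):

```python
def insert_authors_note(messages, note, depth=3, role="system"):
    """Insert a floating reminder `depth` messages from the end of the chat,
    so it stays near the tail of the context as the conversation grows."""
    out = list(messages)
    pos = max(len(out) - depth, 0)
    out.insert(pos, {"role": role, "content": f"[Author's note: {note}]"})
    return out

chat = [{"role": "user", "content": f"message {i}"} for i in range(6)]
for m in insert_authors_note(chat, "Keep the pacing slow; do not end the scene."):
    print(m["role"], "-", m["content"])
```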
When are we getting support for thought injection? Injecting a 'SEX' thought would do wonders for jailbreaking!
>>107222067
>>107222085
>>107222233
in silly tavern it seems to know if i am role playing or trying to talk to the ai itself, and if it gets mixed up i can be more detailed and say "i respond abc". i thought llms would do everything for me, but it feels like i have to craft everything myself and the llm is just a grammar generator that adds a bit of randomness to make it novel
i watch you fast asleep all i fear means nothing
>>107222380>i watch you >fast asleeppervert