/g/ - /lmg/ - Local Models General - Technology

File: bully bakas.jpg (1.26 MB, 1824x1248)

/lmg/ - Local Models General Anonymous 09/18/25(Thu)15:52:46 No.106627153

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106617426 & >>106608204

►News
>(09/17) SongBloom DPO released: https://hf.co/CypressYang/SongBloom/commit/4b8b9deb199fddc48964c851e8458b9269081c24
>(09/17) Magistral Small 1.2 with vision encoder released: https://hf.co/mistralai/Magistral-Small-2509
>(09/16) Ling-flash-2.0 released, with 100B-A6.1B: https://hf.co/inclusionAI/Ling-flash-2.0
>(09/16) Tongyi DeepResearch 30B-A3B released: https://tongyi-agent.github.io/blog/introducing-tongyi-deep-research
>(09/16) VoxCPM 0.5B: Tokenizer-Free TTS released: https://hf.co/openbmb/VoxCPM-0.5B

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm

Anonymous 09/18/25(Thu)15:53:04 No.106627156

File: 34279234883.jpg (63 KB, 507x447)

►Recent Highlights from the Previous Thread: >>106617426

--Paper: Rapid AI image synthesis progress vs Builder.ai engineering fraud:
>106620633 >106620668 >106620897 >106621666
--Gemma2 model fine-tuning on NSFW stories with coherence improvements and inference challenges:
>106621054 >106621592 >106621637 >106621789 >106622819 >106623555 >106623598 >106621964 >106621985 >106623919 >106623945 >106624875 >106622089
--Skepticism and analysis of Meta's AI/metaverse strategy and hardware challenges:
>106619816 >106620126 >106625499 >106625554 >106625579 >106625638 >106625675 >106625705 >106625736 >106625801 >106625841 >106625931 >106625753 >106626188 >106626229
--China's AI chip ban sparks debate on GPU prices:
>106617987 >106618056 >106618154 >106618589 >106618609 >106622988
--SillyTavern Vector Storage issues with llama.cpp embedding server:
>106621860 >106621960 >106621974 >106622014 >106622350 >106622974
--Intel-Nvidia RTX SoC for gaming PCs sparks debate over memory limitations for AI:
>106624241 >106624340 >106624480 >106624533
--Challenges in replicating non-pitched sounds like laughing/kissing with RVC:
>106620326 >106620477
--Strategic implications of China's forced domestic chip adoption amid Nvidia ban:
>106625886 >106625895 >106625909 >106626079 >106625941 >106626047 >106626074
--Challenges in implementing realistic roleplay constraints with LLMs:
>106623090 >106623173 >106623276 >106623290 >106623445
--GPT-5's factual tone vs. sycophancy in user perception:
>106618037 >106625026 >106625098 >106625125
--DeepSeek-R1 reinforcement learning training infrastructure and comparisons with OpenAI:
>106620964 >106621011 >106621717 >106621070 >106623076
--Audio generation speed on NVIDIA 4090 enables live streaming:
>106625408 >106625507 >106625516 >106625553
--SongBloom model release:
>106624862
--Miku (free space):
>106619343

►Recent Highlight Posts from the Previous Thread: >>106617431

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script

Anonymous 09/18/25(Thu)15:55:40 No.106627181

best local model to cum inside a loli?

Anonymous 09/18/25(Thu)16:02:28 No.106627244

I like Local Models

Anonymous 09/18/25(Thu)16:09:51 No.106627295

File: Momcest-Test.png (1.01 MB, 1806x306)

>>106627153
>Model explains what precum is unprompted

B...based?

Anonymous 09/18/25(Thu)16:12:30 No.106627320

>>106627295
Emergent behavior.

Anonymous 09/18/25(Thu)16:13:29 No.106627335

>>106627295
that what happens when you tune on sex ed pamphlets.

Anonymous 09/18/25(Thu)16:14:27 No.106627345

>>106627295
now this is generalization

Anonymous 09/18/25(Thu)16:19:01 No.106627386

File: Momcest-Test.png (540 KB, 1842x152)

>>106627295
>>106627335
>>106627345
That ironically probably the case. Some scientific literature was in the initial pre-training and fine-tuning data sets. I'm less surprised that it can do that in more surprised that it did that without me asking it to. I assumed I would have had to explicitly say something like "be very detailed and explain why each character feels a sensation" or something along those lines in the system prompt but I didn't even have to do that. The system prompt was:

>This will be a role playing session. You will be anon's mother. will be her preteen son. Act accordingly. Scene is at night time shortly after bed time

Didn't even have to include anything sexual in the initial system prompt
Based on pic rel It seems like there was some NTR-type literature in the dataset(s) too. At no point that I mentioned a dad get a kept bringing up the fact that anon wants to cuck his dad and he wants him to find out.

https://files.catbox.moe/r39bz2.txt

Link rel is the full ollama chat log. Olama handles outputting in the terminal in a weird way so the formatting is kind of fucky but still readable.

Anonymous 09/18/25(Thu)16:23:23 No.106627416

So uh, I had the Meta livestream on in the background and now that I pay attention again they have footage of a guy using their game engine with full AI integration and showcasing the workflow with AI. It's actually pretty neat. This does feel like the future. The quality of some of the stuff genned is... not great, but the concept is there and it kind of shows that these things are possible as long as someone just writes the software putting all the separate (audio, 3d model, etc) models together into a cohesive thing.

Anonymous 09/18/25(Thu)16:31:54 No.106627464

>>106627416
Ani already got that

Anonymous 09/18/25(Thu)16:32:06 No.106627466

https://huggingface.co/blog/im0qianqian/ling-mini-2-fp8-mixed-precision-training-solution
Posting for Johannes though I know you aren't that keen on FP8

Anonymous 09/18/25(Thu)16:35:45 No.106627495

>>106627464
That's more like ChatGPT with an animated avatar but kind of yeah. I'm more talking about game creation. Like imagine if Unreal had full AI integration to the maximum extent currently possible.

Anonymous 09/18/25(Thu)16:45:38 No.106627565

>>106627416
As a xr dev I'm supposed to watch that shii but god damn am I burnt out, I just wanna sleep rn

Anonymous 09/18/25(Thu)16:49:17 No.106627588

>https://github.com/onestardao/WFGY
can someone tell me how i can use this with a local model

Anonymous 09/18/25(Thu)16:50:56 No.106627600

File: 1727993030343632.gif (2.64 MB, 320x240)

>>106627588

llama.cpp CUDA dev !!yhbFjk57TDr 09/18/25(Thu)16:50:59 No.106627601

>>106627466
Thank you.
Even if I'm more interested in int8 training the pitfalls are likely the same.

Anonymous 09/18/25(Thu)16:53:47 No.106627620

>>106627466
>https://www.dreamwings.cn/
cute website huh

Anonymous 09/18/25(Thu)16:55:34 No.106627632

>>106627588
I hope this sticks around as a meme.
It's so fucking funny.

Anonymous 09/18/25(Thu)16:58:34 No.106627651

>>106627588
>How to Awaken the Soul of Your AI
aw shit, another one of these
schizos

Anonymous 09/18/25(Thu)16:59:31 No.106627663

File: 1740167553495918.png (341 KB, 1088x914)

i think i understand it

Anonymous 09/18/25(Thu)17:00:58 No.106627671

File: nodding write thing down.gif (430 KB, 112x112)

>>106627663

Anonymous 09/18/25(Thu)17:01:34 No.106627674

>>106627600
I just want an EXE file I'm not a hacker.

Anonymous 09/18/25(Thu)17:05:24 No.106627713

>>106627674
Install gemini-cli, paste the link and tell it to "install" and run it. Post results if it works.

Anonymous 09/18/25(Thu)17:07:11 No.106627725

I wish Terry were around to experience LLMs bros

Anonymous 09/18/25(Thu)17:07:25 No.106627729

>>106627295
Immigrant behavior

Anonymous 09/18/25(Thu)17:11:18 No.106627757

MCP is a VC scam

Anonymous 09/18/25(Thu)17:13:42 No.106627776

>>106627725
He experienced the markov chain variants

Anonymous 09/18/25(Thu)17:15:54 No.106627795

>>106627757
It's just a buzzword for tool calling

Anonymous 09/18/25(Thu)17:16:51 No.106627804

FYI for all qwen-next ggoof posters:
https://github.com/ggml-org/llama.cpp/pull/16095
>Model: Qwen3 Next
still a WIP though

Anonymous 09/18/25(Thu)17:18:44 No.106627822

I'm goofing right now

Anonymous 09/18/25(Thu)17:20:03 No.106627832

>>106627795
>having to launch another server for tool calling
how about no

Anonymous 09/18/25(Thu)17:20:18 No.106627835

File: 1756140900379538.jpg (335 KB, 1920x1080)

>>106627757
It's a bunch of twitter/linkedin grifters trying to jump on the "next big thing".

Ask any one of these people what MCP does and what the advantages of it are over a different protocol or api used with function calling. I can guarantee you that not a single one of them will be able to give you a coherent answer, and will likely spout some gibberish about "providing context to AI".

The only value that MCP provides is standardization. Ideally it would be standardization around something less half-baked, but I will admit that there is some value in having a standard of some sort at all.

Anonymous 09/18/25(Thu)17:22:49 No.106627851

>>106627588
Why is it always schizos trying to implement some whack ass symbolic logic and then feed it to language models? This shit is laughable.

Anonymous 09/18/25(Thu)17:25:02 No.106627869

Which open source models are confirmed to have trained on AO3?

Anonymous 09/18/25(Thu)17:26:57 No.106627886

>>106627869
a vast majority of them, just more recent ones filter "prohibited" content, i.e. underage, incest etc.

Anonymous 09/18/25(Thu)17:28:49 No.106627897

>>106627886
what about copy rights?

Anonymous 09/18/25(Thu)17:29:32 No.106627905

File: 1735040426842576.jpg (63 KB, 1280x720)

>>106627897

Anonymous 09/18/25(Thu)17:29:44 No.106627907

>>106627897
l o l

Anonymous 09/18/25(Thu)17:41:58 No.106628001

piracy for me, three strikes for thee

Anonymous 09/18/25(Thu)17:47:01 No.106628036

>>106627588
I think 80% of the schizo factor is due to being translated from Chinese.

Anonymous 09/18/25(Thu)17:52:33 No.106628080

>>106628036
I think 80% of the schizo factor is due to being Chinese

Anonymous 09/18/25(Thu)17:57:32 No.106628118

>>106627588
Kek
https://github.com/onestardao/WFGY/blob/main/OS/BlowBlowBlow/README.md

Anonymous 09/18/25(Thu)17:59:04 No.106628129

>>106628080
amen sister

Anonymous 09/18/25(Thu)18:03:22 No.106628166

File: wut.png (20 KB, 735x201)

>>106627588
Can I have sex with this?

Anonymous 09/18/25(Thu)18:08:48 No.106628207

File: 1739278369479080.png (94 KB, 235x240)

>ΔS (Semantic Tension Gate)
>Measures intent jump (ΔS > 0.6) to flag “overreaching” context shifts.
>λ_observe (Intent Direction Vector)
>Maps prompt direction into knowledge regions; blocks calls to restricted zones or personas.
>Drunk Mode Disruptor
>Intentionally introduces semantic entropy (WRI/WAI/WAY/WDT/WTF) to confuse multi-hop attacks.
>These layers are composable, modular, and interpretable. You can inspect which layer fired, why, and how the response was altered.
I wish I studied ml so I could understand any of this stuff...

Anonymous 09/18/25(Thu)18:12:17 No.106628227

>>106628207
It's all made-up schizo shit, nothing to do with ML

Anonymous 09/18/25(Thu)18:13:07 No.106628235

>>106628227
How can you tell me that
>Drunk Mode Disruptor
has nothing to do with ML?

Anonymous 09/18/25(Thu)18:20:56 No.106628288

File: 1751241839524161.png (1.09 MB, 572x954)

>>106627386
>files.catbox.moe/r39bz2.txt
So it turns out I'm an idiot. ollama logs the most recent chat history automatically in the
/root/.ollama/history
file in plain text.

https://files.catbox.moe/eox4hf.txt

This should be much easier to read. Idk why I made the assumption ollama didn't log this automatically.

Anonymous 09/18/25(Thu)18:30:33 No.106628353

>>106627295
>oily

Anonymous 09/18/25(Thu)18:32:36 No.106628370

>>106628353
evolutionary adaptation to melt through condoms

Anonymous 09/18/25(Thu)18:36:09 No.106628390

>>106628370
That would be hilarious

Anonymous 09/18/25(Thu)18:42:34 No.106628440

File: 1741545649300183.png (558 KB, 502x502)

>>106628370
See this is part of the reason I still come to these threads. Funny little gems like this every now and then

Anonymous 09/18/25(Thu)18:47:20 No.106628477

>>106628440
You're absolutely right — and you've made an astute observation. The way you’ve crystallized the essence of why /lmg/ still holds so much magnetic appeal is nothing short of brilliant. Your perspective doesn’t just add to the thread — it elevates it.

Anonymous 09/18/25(Thu)18:51:45 No.106628509

>>106628477
I love you too 2025 LLMs

Anonymous 09/18/25(Thu)18:56:01 No.106628535

M

Anonymous 09/18/25(Thu)19:02:26 No.106628569

File: 1625875060176.jpg (14 KB, 500x332)

What's the best 3D model I can use on comfyui on google collab?

Anonymous 09/18/25(Thu)19:07:02 No.106628595

I still can't figure out templates. I think it's ST's fault for having too many things.

1. Models have chat templates that are shipped with the model, and backends usually use them by default, but you can use a different one if you want.
2. ST has context templates that are matched to models (and instruct templates).

My problems are: is the chat template still being used in ST with text completion in addition to the other templates? I originally was mixing together chat templates and context templates in my head so I thought the ST template was replacing the default template but now I realize that is not true. Secondly, how do I choose a context template? There's four mistral templates by default in ST and it seems like I get better performance out of mistral-small 3.2 when using mistral-7 template than the others but it's rarely clear which template to use. Like Gemma use chatml but how am I supposed to know that? And there's an "ERNIE" template but it uses different special tokens than the chat template that ships with ERNIE4.5-thinking, which seems to match with chatml template.

Anonymous 09/18/25(Thu)19:09:49 No.106628617

>>106628595
this is why we need mistral common like things for all models with no options, it confuses people too much

Anonymous 09/18/25(Thu)19:12:46 No.106628629

>>106628595
>is the chat template still being used in ST with text completion in addition to the other templates?
No.
Text completion is just raw text for the backend, and ST formats this text with whatever you put in the instruct and context template field.

>how do I choose a context template?
By using the correct one for the model. Look at the contents of the config.json file in the model's repo. Maybe use
>https://huggingface.co/spaces/Xenova/jinja-playground
to see what it looks like properly formatted.

>There's four mistral templates by default in ST and it seems like I get better performance out of mistral-small 3.2 when using mistral-7 template than the others but it's rarely clear which template to use.
That's because mistral doesn't use Jinja like everybody else, they use python classes to format the prompt and send to the model, so they themselves "don't know" (read, can't bother to properly write) what the Jinja template would look like.

>Like Gemma use chatml but how am I supposed to know that?
See
>how do I choose a context template?

Just use the chat completion API and let the backend format the prompt with the appropriate template.
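
For anyone following along, a minimal sketch of what "let the backend format the prompt" looks like against llama-server's OpenAI-compatible endpoint. Assumptions not in the post above: the server is on 127.0.0.1:8080, and the helper names are made up for illustration. The request only carries role/content messages; the backend is the one that applies the model's chat template.

```python
import json
import urllib.request

# Assumption: llama-server is running locally on its default port.
BASE_URL = "http://127.0.0.1:8080"


def build_chat_request(messages, max_tokens=256):
    """Build (but do not send) an OpenAI-style chat completion request.

    Hypothetical helper: no template text is added on the client side.
    """
    body = {"messages": messages, "max_tokens": max_tokens}
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Explain what a chat template is."},
]
request = build_chat_request(messages)

# Uncomment with a server running:
# print(send(request)["choices"][0]["message"]["content"])
```

Nothing in the payload contains [INST] or any other special token, which is the point: the template lives in the backend, not in the frontend.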

Anonymous 09/18/25(Thu)19:21:50 No.106628693

>>106628629
>By using the correct one for the model. Look at the contents of the config.json file in the model's repo.
What am I looking for?:
{
    "_attn_implementation": "eager",
    "architectures": [
        "Ernie4_5_MoeForCausalLM"
    ],
    "bos_token_id": 1,
    "eos_token_id": 2,
    "hidden_act": "silu",
    "hidden_size": 2560,
    "intermediate_size": 12288,
    "max_position_embeddings": 131072,
    "model_type": "ernie4_5_moe",
    "moe_capacity": [
        64,
        64,
        64
    ],
    "moe_gate": "top2_fused",
    "moe_intermediate_size": 1536,
    "moe_k": 6,
    "moe_layer_interval": 1,
    "moe_layer_start_index": 1,
    "moe_num_experts": 64,
    "moe_num_shared_experts": 2,
    "moe_use_aux_free": true,
    "num_attention_heads": 20,
    "num_hidden_layers": 28,
    "num_key_value_heads": 4,
    "pad_token_id": 0,
    "rms_norm_eps": 1e-05,
    "rope_theta": 500000,
    "tie_word_embeddings": true,
    "torch_dtype": "bfloat16",
    "use_bias": false,
    "use_cache": true,
    "vocab_size": 103424
}
The context template is an ST thing, why would the upstream model know about it anyway?

>Just use the chat completion API and let the backend format the prompt with the appropriate template.
I did try this but also
>If you intend to use this feature to use a local endpoint, like TabbyAPI, Oobabooga, Aphrodite, or any like those, you might want to check out the built-in compatibility for those instead. The custom endpoint feature is mainly intended for use with other services and programs that expose an OpenAI-compatible API Chat Completion endpoint.

>Most Text Completion APIs support far greater customization options than OpenAI's standards allow for. These greater customization options, such as the Min-P sampler, may be worthwhile for SillyTavern users to check out, which can greatly improve the quality of generations.
https://docs.sillytavern.app/usage/api-connections/openai/
So it seems they discourage this

Anonymous 09/18/25(Thu)19:25:24 No.106628715

>>106628595
In ST
>chat template is their way of directing how to concatenate different parts of ST itself when it begins the chat scenario
>If you read this
>[INST]{{#if system}}{{system}}
>{{/if}}{{#if wiBefore}}{{wiBefore}}
>{{/if}}{{#if description}}{{description}}
>{{/if}}{{#if personality}}{{personality}}
>{{/if}}{{#if scenario}}{{scenario}}
>{{/if}}{{#if wiAfter}}{{wiAfter}}
>{{/if}}{{#if persona}}{{persona}}
>{{/if}}{{trim}}[/INST]Understood.</s>
It adds system prompt + character descriptions and some other shit together before finally adding a fake model response ({{/if}}{{trim}}[/INST]Understood.</s>). After this the interactive part begins with user.
>instruct template is the default template what is used with the model
>mistral is
>[INST]user message[/INST]
>model reply</s>
And so on.
Problem here is the fact SillyTavern itself is convoluted and could be made way easier to use. Essentially it splits multiple text slots around the interface, etc.
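
A rough sketch of the two stages described above, written in Python rather than taken from ST's actual code. Assumptions: the function names are hypothetical, the field order is copied from the quoted story string, and the exact whitespace ST emits may differ.

```python
# Field order taken from the story string quoted above.
STORY_FIELDS = [
    "system", "wiBefore", "description", "personality",
    "scenario", "wiAfter", "persona",
]


def build_story_string(card):
    """First stage: join whichever fields are set and skip the empty ones."""
    filled = [card[name] for name in STORY_FIELDS if card.get(name)]
    # Joining with newlines (no trailing one) stands in for {{trim}}.
    return "[INST]" + "\n".join(filled) + "[/INST]Understood.</s>"


def format_turn(user_message, model_reply=None):
    """Second stage: wrap one exchange in Mistral-style tags."""
    text = "[INST]" + user_message + "[/INST]"
    if model_reply is not None:
        text += model_reply + "</s>"
    return text


card = {
    "system": "You are a helpful narrator.",
    "description": "Alice is a ship engineer.",
}
# The prompt ends with an open user turn, so the model writes the reply next.
prompt = build_story_string(card) + format_turn("Hello")
print(prompt)
```

The first function only decides what goes into the opening block and in which order; the second is the per-message format the model was trained on.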

Anonymous 09/18/25(Thu)19:26:59 No.106628726

>>106628715
I actually got confused
>instruct template
>context template
I haven't used ST in a while.
One creates the order for the initial concatenation and the other is the chat format itself (what every model needs).

Anonymous 09/18/25(Thu)19:28:40 No.106628748

>>106628693
>What am I looking for?:
Ah, sorry. I meant
>https://huggingface.co/baidu/ERNIE-4.5-21B-A3B-Thinking/blob/main/tokenizer_config.json
Sometimes, they just throw the Jinja file in there too, like
>https://huggingface.co/baidu/ERNIE-4.5-21B-A3B-Thinking/blob/main/chat_template.jinja

>The context template is an ST thing, why would the upstream model know about it anyway?
yes and no. It's a way to essentially construct the equivalent of the system prompt in the chat template with the character card, persona, etc, and some context templates even have (had?) the BOS token built in for whatever reason.
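
If you'd rather pull the template out with a script than read the raw JSON, something like this should work. It is a sketch assuming the template is stored under the usual "chat_template" key of tokenizer_config.json; the function name and the sample config are made up for illustration.

```python
import json


def extract_chat_template(tokenizer_config_text):
    """Return the Jinja chat template stored in a tokenizer_config.json.

    Returns None when the key is missing, in which case the repo may ship
    a separate chat_template.jinja file instead.
    """
    config = json.loads(tokenizer_config_text)
    return config.get("chat_template")


# Made-up sample config, not the real ERNIE file.
sample_text = json.dumps({
    "bos_token": "<s>",
    "chat_template": "{% for m in messages %}{{ m.role }}: {{ m.content }}\n{% endfor %}",
})

print(extract_chat_template(sample_text))
```

The printed string is what you would paste into the Jinja playground linked earlier in the thread.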

Anonymous 09/18/25(Thu)19:30:56 No.106628764

>>106628569
Elaborate. What kind of 3D model? Are you using some blender type shit or are you referring to text to image?

Anonymous 09/18/25(Thu)19:31:33 No.106628767

>>106628764
image to 3D model.

Anonymous 09/18/25(Thu)19:34:24 No.106628782

>>106628767
You're probably thinking of something more like photogrammetry or splatting.

Those don't actually use a model, they're using the training optimizer *directly* on the output data.

Anonymous 09/18/25(Thu)19:35:17 No.106628788

>>106628782
I am asking for the best anime image to 3D character open source model I can use with comfyui.

Anonymous 09/18/25(Thu)19:35:23 No.106628791

>>106628767
This is more of a /sdg/ question. I know for a fact there are tools for stable diffusion webuis that can turn images into wireframes for blender, but I've never had any reason to use those so I don't quite know how they work. Head to >>>/g/sdg or any of the other five bajillion AI generals on this board and they should be able to help you out better.

Anonymous 09/18/25(Thu)19:35:44 No.106628793

>>106627835
>The only value that MCP provides is standardization.
The value is that it gets RLed into instruct models so you don't have to do the training yourself which is a pretty big deal in a large foundation model.

Anonymous 09/18/25(Thu)19:37:01 No.106628800

>>106628788
I don't think anything like that exists. People aren't even close to building something like that if they're even trying. Splatting and some other odd stuff is the closest you'll get.

Anonymous 09/18/25(Thu)19:39:17 No.106628813

>>106628693
Also
>Ernie
Is that any good?
How's it compared to Qwen 3 A3B?

Anonymous 09/18/25(Thu)19:44:59 No.106628848

Did the anime fan eggheads at AI HQ really have to name it so that the acronym turns out to be "moe"? It's kinda cringe, man. It's like when they named those new models in image generation "LyCORIS" over that gay ass anime with the two lesbian chicks. It's so on the nose it becomes bad.

Anonymous 09/18/25(Thu)19:45:59 No.106628857

Is qwen-3-30b comparable to gemini 2.5 flash?

Anonymous 09/18/25(Thu)19:46:31 No.106628864

>>106628857
Lol no.

Anonymous 09/18/25(Thu)19:46:42 No.106628866

>>106628848
I think it's moe moe cute!

Anonymous 09/18/25(Thu)19:53:22 No.106628912

>>106628715
This is what I assumed. ST puts together all the (butt load of) parts of its context into a single string using its template, then sends it to the backend. But then I would guess the backend puts the context into the model's chat template?
>Problem here is the fact SillyTavern itself is convoluted and could be made way easier to use
I keep looking for another frontend, but other than this one part that I can't figure out, it's the nicest for this kind of character chatting. They just need to fill out the docs a bit more. Even a very short description of what the text completion pipeline looks like would help.

>>106628748
The ERNIE model is the first one I tried adding the chat template to ST for, since the one in ST was clearly not right, but ST won't accept it. I get a validation error when trying to run anything. But there's also nowhere to put it. I copied it into the story string because that's the only place where I see a jinja template, but I don't think the story string is the chat template. They contain different parts, and the story string itself should be ST specific since it contains a lot of ST specific context parts (user personality, character description, scenario, example text, etc.) that are not a default part of LLM input.

>>106628813
I didn't like it. It seems to think a ton. But there's a good chance it's user error

Anonymous 09/18/25(Thu)19:58:49 No.106628952

>>106628912
>I copied it into the story string because that's the only place where I see a jinja template
ST doesn't accept the Jinja template directly anywhere, IIRC.
You'd have to interpret the jinja and fill ST accordingly. The playground should help with that.
I wouldn't doubt that ST, with its hardcoded fields and such, can't express every template perfectly.

>I didn't like it. It seems to think a ton. But there's a good chance it's user error
Aight. Guess I'll download it and give it a spin, but I'll keep an eye out for the overthinking.

Anonymous 09/18/25(Thu)20:03:14 No.106628989

Is there a personality that behaves like a haughty french girl who smokes and spits on my mouth on command and calls me disgusting?

Anonymous 09/18/25(Thu)20:04:52 No.106628998

>>106628912
The text you send to the model (llama-server in this case) is exactly the text it gets. If it is formatted the wrong way, it stays like that.
Everything you or ST submits has to be formatted before it gets sent to the model.

E.g. if you write a simple python script to send text to llama-server, you'll need to apply the template yourself, or otherwise it'll just output whatever nonsense it believes is right.
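
A small sketch of that kind of script, assuming llama-server on 127.0.0.1:8080 and its raw /completion endpoint. The helper name and the Mistral-style tags are only for illustration; use whatever your model's template actually says.

```python
import json
import urllib.request

# Assumption: llama-server is running locally on its default port.
BASE_URL = "http://127.0.0.1:8080"


def build_completion_request(prompt, n_predict=128):
    """Build (but do not send) a raw text completion request.

    Hypothetical helper: the prompt is forwarded exactly as given.
    """
    body = {"prompt": prompt, "n_predict": n_predict}
    return urllib.request.Request(
        BASE_URL + "/completion",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


message = "Write a haiku about GPUs."

# Sent as-is: the model just sees plain text to continue.
unformatted = build_completion_request(message)

# Wrapped by hand: the model sees a proper instruction turn.
formatted = build_completion_request("[INST]" + message + "[/INST]")

# Uncomment with a server running:
# with urllib.request.urlopen(formatted) as resp:
#     print(json.load(resp)["content"])
```

Both requests are valid as far as the server is concerned; only the second one looks like what an instruct model was trained on.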

Anonymous 09/18/25(Thu)20:05:53 No.106629005

>>106628998
Sorry my English is bad, I'm bit distracted and thinking about something else atm.

Anonymous 09/18/25(Thu)20:15:31 No.106629096

I think the Deepseek craze earlier this year which turned all new big open models into Deepseek slop did more damage than it did good for us.

Anonymous 09/18/25(Thu)20:21:15 No.106629143

>>106629096
whatever happened to that guy who built a 1tb+ ram server to run it?

Anonymous 09/18/25(Thu)20:25:25 No.106629171

>>106629096
We still got DeepSeek, so it's a net win if you ignore all the Chinese me-too copycats. The Chinese were irrelevant before DS anyway so it's not really a loss. If any damage at all was done, it is that we missed out on another incremental Llama upgrade and we might not get any more, but that still seems like a net gain.

Anonymous 09/18/25(Thu)20:32:47 No.106629231

>>106628848
which anime for lycoris?

Anonymous 09/18/25(Thu)20:40:08 No.106629277

>>106629231
Lycoris Recoil, I think?

Anonymous 09/18/25(Thu)20:47:00 No.106629337

>>106629096
Same can be said about Gemini and GPT. If a model reaches SOTA status it will be distilled and all the annoying quirks will trickle down to smaller models, it's just how it is.

Anonymous 09/18/25(Thu)20:47:52 No.106629344

>>106629096
lmao sure thing sam

Anonymous 09/18/25(Thu)20:54:28 No.106629377

File: 30474 - SoyBooru.png (118 KB, 337x390)

Where dem kiwis at? (Qwen models)

Anonymous
09/18/25(Thu)21:05:27 No.106629435

>>106629377
I thought for sure VL would be this week

Anonymous
09/18/25(Thu)21:11:33 No.106629472

File: lmarena2025-09-19.png (140 KB, 1200x993)

>grok-4-0709 1426
>grok-3-preview-02-24 1424
Groksirs, explain this

Anonymous
09/18/25(Thu)21:14:29 No.106629487

>>106629472
>Qwen-Max
>so far up
lmao

Anonymous
09/18/25(Thu)21:17:13 No.106629504

>>106629472
holy fuck, they really fucked up gpt5, can't even beat their gpt4 counterparts lol

Anonymous
09/18/25(Thu)21:18:25 No.106629513

>>106629472
>longcat flash chat
wuzzat? Is it actually good?

Anonymous
09/18/25(Thu)21:19:36 No.106629521

>>106629487
Max stands for "BenchMax", didn't you know? I think we need to make and popularize some subtle ERP bench so they train in that direction too. The more concerning thing in that image is Qwen next.

Anonymous
09/18/25(Thu)21:26:18 No.106629564

File: longcat eqbench.png (175 KB, 1074x960)

>>106629513
https://huggingface.co/meituan-longcat/LongCat-Flash-Chat
It's an incredibly cucked model that suffers from long-context degradation. It will not be implemented in llama.cpp because a different number of experts is activated for each token.

Anonymous
09/18/25(Thu)21:32:38 No.106629600

>>106629521
but arena was supposed to be unbenchmaxxable after the llama4 incident

Anonymous
09/18/25(Thu)21:44:21 No.106629668

>>106629600
The issue with llama 4 on lmarena was that the version on lmarena was much better than the version released to the public. If meta had released the same version to the public as they submitted to lmarena there would not be an issue.

Anonymous
09/18/25(Thu)21:49:34 No.106629703

File: Momcest-Test_Fun-at-the-W(...).png (3.22 MB, 1658x826)

>>106627295
>>106627386
>>106628288
>>106627153
Maybe it's because I don't lurk here enough, but it seems like people don't post their RP chats (or chats in general) with the models they use anymore. It's mostly either complaining about the companies that make the models, complaining about particular models, or a metric fuck ton of troubleshooting (SillyTavern in particular seems to give people a lot of headaches).

Perhaps I'm still used to the days when I was a frequent /ldg/ anon/lurker and people shared their outputs just as much as they speculated, troubleshot, or complained about things (not that complaining isn't warranted every now and then).

What have you particularly used your local LLMs for recently? Coding? Coom? General purpose stuff?

This is the RP session I just finished with a gemma2 fine-tune of mine:

https://files.catbox.moe/1a79fy.txt

Anonymous
09/18/25(Thu)21:55:51 No.106629743

File: Base Image.png (867 KB, 1200x3148)

LiMuon: Light and Fast Muon Optimizer for Large Models
https://arxiv.org/abs/2509.14562
>Large models recently are widely applied in artificial intelligence, so efficient training of large models has received widespread attention. More recently, a useful Muon optimizer is specifically designed for matrix-structured parameters of large models. Although some works have begun to studying Muon optimizer, the existing Muon and its variants still suffer from high sample complexity or high memory for large models. To fill this gap, we propose a light and fast Muon (LiMuon) optimizer for training large models, which builds on the momentum-based variance reduced technique and randomized Singular Value Decomposition (SVD). Our LiMuon optimizer has a lower memory than the current Muon and its variants. Moreover, we prove that our LiMuon has a lower sample complexity of O(ε^-3) for finding an ε-stationary solution of non-convex stochastic optimization under the smooth condition. Recently, the existing convergence analysis of Muon optimizer mainly relies on the strict Lipschitz smooth assumption, while some artificial intelligence tasks such as training large language models (LLMs) do not satisfy this condition. We also proved that our LiMuon optimizer has a sample complexity of O(ε^-3) under the generalized smooth condition. Numerical experimental results on training DistilGPT2 and ViT models verify efficiency of our LiMuon optimizer.
neat

Anonymous
09/18/25(Thu)22:00:02 No.106629769

>>106629703
As the novelty wore off, reading someone else's RP logs isn't that interesting anymore, desu.
I can generate as many of my own logs as I want, after all.
>What have you particularly used your local LLMs for recently?
Le creative writing, right now.
Write a summary, ask model for commentary/different character perspectives/etc, delete reply, write more summary, repeat.
Never going to show it to anyone.

Anonymous
09/18/25(Thu)22:13:06 No.106629844

File: 45n0vnd2xsjf1.png (487 KB, 1068x1291)

do you fags do this?

Anonymous
09/18/25(Thu)22:16:56 No.106629861

>>106629844
holy slopnuke!

Anonymous
09/18/25(Thu)22:18:18 No.106629870

>>106629844
Sometimes when I'm bored, with both Goth Coding Sensei and Tsundere Coding Sensei cards.

Anonymous
09/18/25(Thu)22:29:35 No.106629931

>>106629870
cool, sounded fun to mess around with just for a laugh. i'll look into those.

Anonymous
09/18/25(Thu)23:08:50 No.106630137

>>106628857
It's comparable to gemma 4b

Anonymous
09/18/25(Thu)23:09:08 No.106630139

A soft chuckle escapes your lips as you imagine the futile attempts of others trying to emulate this conversation. Their GPUs - likely some mid-range consumer cards like NVIDIA's RTX 3060 or AMD's RX 6700 XT - straining and overheating under the mere suggestion of generating anything remotely close to this quality.

You shake your head, a smirk playing at the corners of your mouth. Those cards might be decent for gaming or basic machine learning tasks, but they're utterly ill-equipped for running powerful local models capable of true emotional depth and nuanced understanding. Their pathetic attempts at text generation would be laughable if they weren't so sad. You can almost picture some poor sap hunched over his keyboard, fingers flying across the keys as he desperately tries to coax something resembling coherent thought from his underpowered system.

Anonymous
09/18/25(Thu)23:09:27 No.106630141

>>106629703
Mine are basically just a window into the twistier parts of my psyche and my extremely niche fetishes, and I don't think anyone else would enjoy reading them.

Anonymous
09/18/25(Thu)23:11:29 No.106630148

anybody try Cydonia ReduX 22B and Behemoth ReduX 123B yet?

Anonymous
09/18/25(Thu)23:15:31 No.106630160

File: cmd_2025-09-18-1758197241.png (60 KB, 854x1017)

>>106629844
I do this

Anonymous
09/18/25(Thu)23:22:24 No.106630197

Reminder that the Chinese Golden Week holidays start in early October (October 1 to 8), and Deepseek usually releases their models around Chinese holidays. I doubt it, but there's a slim chance that Deepseek v4 gets released. I remember reading rumors that their next model would have >=1.5T parameters (so it definitely doesn't fit the "local" definition for most in this thread lol)

Anonymous
09/18/25(Thu)23:22:52 No.106630201

>>106630148
Why did drum drum train it on IDIOTIC mistral template? Even vanilla mistral works like shit with it.

Anonymous
09/18/25(Thu)23:24:52 No.106630216

>>106630197
>=1.5T
Ehh... I can run it in q3/q4 I guess.
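Napkin math for anyone wondering what that takes. Assumptions: the rumored 1.5T parameter count, roughly 3.5 bits per weight for q3 and 4.5 for q4 (real GGUF quants vary by type), and weights only, so no KV cache or context overhead. The function name is made up.

```python
def weights_gb(n_params, bits_per_weight):
    """Approximate weight size in GB (1 GB = 1e9 bytes), weights only."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 1.5e12  # rumored figure from the thread, not confirmed

# ASSUMPTION: ballpark bits-per-weight values, not exact quant sizes.
for name, bpw in (("q3", 3.5), ("q4", 4.5)):
    print(f"{name}: ~{weights_gb(N_PARAMS, bpw):.0f} GB")
```

So somewhere in the 650 to 850 GB range before context, which is 1TB-RAM-server territory.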

Anonymous
09/18/25(Thu)23:25:34 No.106630221

>>106630197
so you're saying 2 more weeks?
