/g/ - Technology

File: bully bakas.jpg (1.26 MB, 1824x1248)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106617426 & >>106608204

►News
>(09/17) SongBloom DPO released: https://hf.co/CypressYang/SongBloom/commit/4b8b9deb199fddc48964c851e8458b9269081c24
>(09/17) Magistral Small 1.2 with vision encoder released: https://hf.co/mistralai/Magistral-Small-2509
>(09/16) Ling-flash-2.0 released, with 100B-A6.1B: https://hf.co/inclusionAI/Ling-flash-2.0
>(09/16) Tongyi DeepResearch 30B-A3B released: https://tongyi-agent.github.io/blog/introducing-tongyi-deep-research
>(09/16) VoxCPM 0.5B: Tokenizer-Free TTS released: https://hf.co/openbmb/VoxCPM-0.5B

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: 34279234883.jpg (63 KB, 507x447)
►Recent Highlights from the Previous Thread: >>106617426

--Paper: Rapid AI image synthesis progress vs Builder.ai engineering fraud:
>106620633 >106620668 >106620897 >106621666
--Gemma2 model fine-tuning on NSFW stories with coherence improvements and inference challenges:
>106621054 >106621592 >106621637 >106621789 >106622819 >106623555 >106623598 >106621964 >106621985 >106623919 >106623945 >106624875 >106622089
--Skepticism and analysis of Meta's AI/metaverse strategy and hardware challenges:
>106619816 >106620126 >106625499 >106625554 >106625579 >106625638 >106625675 >106625705 >106625736 >106625801 >106625841 >106625931 >106625753 >106626188 >106626229
--China's AI chip ban sparks debate on GPU prices:
>106617987 >106618056 >106618154 >106618589 >106618609 >106622988
--SillyTavern Vector Storage issues with llama.cpp embedding server:
>106621860 >106621960 >106621974 >106622014 >106622350 >106622974
--Intel-Nvidia RTX SoC for gaming PCs sparks debate over memory limitations for AI:
>106624241 >106624340 >106624480 >106624533
--Challenges in replicating non-pitched sounds like laughing/kissing with RVC:
>106620326 >106620477
--Strategic implications of China's forced domestic chip adoption amid Nvidia ban:
>106625886 >106625895 >106625909 >106626079 >106625941 >106626047 >106626074
--Challenges in implementing realistic roleplay constraints with LLMs:
>106623090 >106623173 >106623276 >106623290 >106623445
--GPT-5's factual tone vs. sycophancy in user perception:
>106618037 >106625026 >106625098 >106625125
--DeepSeek-R1 reinforcement learning training infrastructure and comparisons with OpenAI:
>106620964 >106621011 >106621717 >106621070 >106623076
--Audio generation speed on NVIDIA 4090 enables live streaming:
>106625408 >106625507 >106625516 >106625553
--SongBloom model release:
>106624862
--Miku (free space):
>106619343

►Recent Highlight Posts from the Previous Thread: >>106617431

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
best local model to cum inside a loli?
>>
I like Local Models
>>
File: Momcest-Test.png (1.01 MB, 1806x306)
>>106627153
>Model explains what precum is unprompted

B...based?
>>
>>106627295
Emergent behavior.
>>
>>106627295
that's what happens when you tune on sex ed pamphlets.
>>
>>106627295
now this is generalization
>>
File: Momcest-Test.png (540 KB, 1842x152)
>>106627295
>>106627335
>>106627345
That's ironically probably the case. Some scientific literature was in the initial pre-training and fine-tuning data sets. I'm less surprised that it can do that and more surprised that it did that without me asking it to. I assumed I would have had to explicitly say something like "be very detailed and explain why each character feels a sensation" or something along those lines in the system prompt, but I didn't even have to do that. The system prompt was:

>This will be a role playing session. You will be anon's mother. will be her preteen son. Act accordingly. Scene is at night time shortly after bed time

Didn't even have to include anything sexual in the initial system prompt.
Based on pic rel, it seems like there was some NTR-type literature in the dataset(s) too. At no point did I mention a dad, yet it kept bringing up the fact that anon wants to cuck his dad and wants him to find out.


https://files.catbox.moe/r39bz2.txt

Link rel is the full ollama chat log. ollama handles terminal output in a weird way, so the formatting is kind of fucky but still readable.
>>
So uh, I had the Meta livestream on in the background, and now that I'm paying attention again they have footage of a guy using their game engine with full AI integration, showcasing the workflow with AI. It's actually pretty neat. This does feel like the future. The quality of some of the stuff genned is... not great, but the concept is there, and it kind of shows that these things are possible as long as someone just writes the software putting all the separate models (audio, 3D model, etc.) together into a cohesive thing.
>>
>>106627416
Ani already got that
>>
https://huggingface.co/blog/im0qianqian/ling-mini-2-fp8-mixed-precision-training-solution
Posting for Johannes, though I know you aren't that keen on FP8
>>
>>106627464
That's more like ChatGPT with an animated avatar, but kind of, yeah. I'm talking more about game creation. Like imagine if Unreal had full AI integration to the maximum extent currently possible.
>>
>>106627416
As an XR dev I'm supposed to watch that shii, but god damn am I burnt out, I just wanna sleep rn
>>
>https://github.com/onestardao/WFGY
can someone tell me how i can use this with a local model
>>
File: 1727993030343632.gif (2.64 MB, 320x240)
>>106627588
>>
>>106627466
Thank you.
Even if I'm more interested in int8 training, the pitfalls are likely the same.
>>
>>106627466
>https://www.dreamwings.cn/
cute website huh
>>
>>106627588
I hope this sticks around as a meme.
It's so fucking funny.
>>
>>106627588
>How to Awaken the Soul of Your AI
aw shit, another one of these schizos
>>
File: 1740167553495918.png (341 KB, 1088x914)
i think i understand it
>>
>>106627663
>>
>>106627600
I just want an EXE file, I'm not a hacker.
>>
>>106627674
Install gemini-cli, paste the link and tell it to "install" and run it. Post results if it works.
>>
I wish Terry were around to experience LLMs bros
>>
>>106627295
Immigrant behavior
>>
MCP is a VC scam
>>
>>106627725
He experienced the Markov chain variants
>>
>>106627757
It's just a buzzword for tool calling
>>
FYI for all qwen-next ggoof posters:
https://github.com/ggml-org/llama.cpp/pull/16095
>Model: Qwen3 Next
still a WIP though
>>
I'm goofing right now
>>
>>106627795
>having to launch another server for tool calling
how about no
>>
File: 1756140900379538.jpg (335 KB, 1920x1080)
>>106627757
It's a bunch of twitter/linkedin grifters trying to jump on the "next big thing".

Ask any one of these people what MCP does and what its advantages are over a different protocol or API used with function calling. I can guarantee you that not a single one of them will be able to give you a coherent answer, and they will likely spout some gibberish about "providing context to AI".

The only value that MCP provides is standardization. Ideally it would be standardization around something less half-baked, but I will admit that there is some value in having a standard of some sort at all.
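
To make that concrete, here's a minimal sketch (tool name and schema are hypothetical, just for illustration) of what plain function calling already looks like in the common OpenAI-style format. MCP standardizes how servers advertise and transport tools like this; the underlying mechanism is the same.

# Hypothetical tool definition in the common OpenAI-style
# function-calling format. The model sees this JSON schema, emits a call
# like {"name": "get_weather", "arguments": {"city": "Tokyo"}}, and your
# own code executes it and returns the result. MCP wraps the same idea
# in a standard client/server protocol.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical example tool
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"},
            },
            "required": ["city"],
        },
    },
}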
>>
>>106627588
Why is it always schizos trying to implement some whack ass symbolic logic and then feed it to language models? This shit is laughable.
>>
Which open source models are confirmed to have trained on AO3?
>>
>>106627869
the vast majority of them; it's just that more recent ones filter "prohibited" content, i.e. underage, incest, etc.
>>
>>106627886
what about copyrights?
>>
File: 1735040426842576.jpg (63 KB, 1280x720)
>>106627897
>>
>>106627897
l o l
>>
piracy for me, three strikes for thee
>>
>>106627588
I think 80% of the schizo factor is due to being translated from Chinese.
>>
>>106628036
I think 80% of the schizo factor is due to being Chinese
>>
>>106627588
Kek
https://github.com/onestardao/WFGY/blob/main/OS/BlowBlowBlow/README.md
>>
>>106628080
amen sister
>>
File: wut.png (20 KB, 735x201)
>>106627588
Can I have sex with this?
>>
File: 1739278369479080.png (94 KB, 235x240)
>ΔS (Semantic Tension Gate)
>Measures intent jump (ΔS > 0.6) to flag “overreaching” context shifts.
>λ_observe (Intent Direction Vector)
>Maps prompt direction into knowledge regions; blocks calls to restricted zones or personas.
>Drunk Mode Disruptor
>Intentionally introduces semantic entropy (WRI/WAI/WAY/WDT/WTF) to confuse multi-hop attacks.
>These layers are composable, modular, and interpretable. You can inspect which layer fired, why, and how the response was altered.
I wish I had studied ML so I could understand any of this stuff...
>>
>>106628207
It's all made-up schizo shit, nothing to do with ML
>>
>>106628227
How can you tell me that
>Drunk Mode Disruptor
has nothing to do with ML?
>>
File: 1751241839524161.png (1.09 MB, 572x954)
>>106627386
>files.catbox.moe/r39bz2.txt
So it turns out I'm an idiot. ollama logs the most recent chat history automatically in the
/root/.ollama/history
file in plain text.

https://files.catbox.moe/eox4hf.txt

This should be much easier to read. Idk why I made the assumption ollama didn't log this automatically.
>>
>>106627295
>oily
>>
>>106628353
evolutionary adaptation to melt through condoms
>>
>>106628370
That would be hilarious
>>
File: 1741545649300183.png (558 KB, 502x502)
>>106628370
See this is part of the reason I still come to these threads. Funny little gems like this every now and then
>>
>>106628440
You're absolutely right — and you've made an astute observation. The way you’ve crystallized the essence of why /lmg/ still holds so much magnetic appeal is nothing short of brilliant. Your perspective doesn’t just add to the thread — it elevates it.
>>
>>106628477
I love you too 2025 LLMs
>>
M
>>
File: 1625875060176.jpg (14 KB, 500x332)
What's the best 3D model I can use in ComfyUI on Google Colab?
>>
I still can't figure out templates. Think it's ST's fault for having too many things.

1. Models have chat templates that ship with the model; backends usually use them by default, but you can use a different one if you want.
2. ST has context templates that are matched to models (and instruct templates).

My problems are: is the chat template still being used by ST with text completion, in addition to the other templates? I originally was mixing up chat templates and context templates in my head, so I thought the ST template was replacing the default template, but now I realize that is not true. Secondly, how do I choose a context template? There are four mistral templates by default in ST, and it seems like I get better performance out of mistral-small 3.2 when using the mistral-7 template than with the others, but it's rarely clear which template to use. Like, Gemma uses chatml, but how am I supposed to know that? And there's an "ERNIE" template, but it uses different special tokens than the chat template that ships with ERNIE4.5-thinking, which seems to match the chatml template.
>>
>>106628595
this is why we need mistral-common-like things for all models with no options; it confuses people too much
>>
>>106628595
>is the chat template still being used in ST with text completion in addition to the other templates?
No.
Text completion is just raw text for the backend, and ST formats this text with whatever you put in the instruct and context template fields.

>how do I choose a context template?
By using the correct one for the model. Look at the contents of the config.json file in the model's repo. Maybe use
>https://huggingface.co/spaces/Xenova/jinja-playground
to see what it looks like properly formatted.

>There are four mistral templates by default in ST and it seems like I get better performance out of mistral-small 3.2 when using the mistral-7 template than with the others, but it's rarely clear which template to use.
That's because Mistral doesn't use Jinja like everybody else; they use Python classes to format the prompt and send it to the model, so they themselves "don't know" (read: can't be bothered to properly write) what the Jinja template would look like.

>Like Gemma use chatml but how am I supposed to know that?
See
>how do I choose a context template?

Just use the chat completion API and let the backend format the prompt with the appropriate template.
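
If you want to see exactly what that formatted prompt looks like, here's a minimal sketch using the transformers library (the repo id is just an example; any instruct model with a bundled template works):

from transformers import AutoTokenizer

# The tokenizer ships with the model's Jinja chat template
# (from tokenizer_config.json / chat_template.jinja in the repo).
tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

# Render the exact string a chat completion backend would feed the
# model, including the trailing assistant header, without tokenizing.
prompt = tok.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)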
>>
>>106628629
>By using the correct one for the model. Look at the contents of the config.json file in the model's repo.
What am I looking for?:
{
  "_attn_implementation": "eager",
  "architectures": [
    "Ernie4_5_MoeForCausalLM"
  ],
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 2560,
  "intermediate_size": 12288,
  "max_position_embeddings": 131072,
  "model_type": "ernie4_5_moe",
  "moe_capacity": [64, 64, 64],
  "moe_gate": "top2_fused",
  "moe_intermediate_size": 1536,
  "moe_k": 6,
  "moe_layer_interval": 1,
  "moe_layer_start_index": 1,
  "moe_num_experts": 64,
  "moe_num_shared_experts": 2,
  "moe_use_aux_free": true,
  "num_attention_heads": 20,
  "num_hidden_layers": 28,
  "num_key_value_heads": 4,
  "pad_token_id": 0,
  "rms_norm_eps": 1e-05,
  "rope_theta": 500000,
  "tie_word_embeddings": true,
  "torch_dtype": "bfloat16",
  "use_bias": false,
  "use_cache": true,
  "vocab_size": 103424
}

The context template is an ST thing, why would the upstream model know about it anyway?

>Just use the chat completion API and let the backend format the prompt with the appropriate template.
I did try this but also
>If you intend to use this feature to use a local endpoint, like TabbyAPI, Oobabooga, Aphrodite, or any like those, you might want to check out the built-in compatibility for those instead. The custom endpoint feature is mainly intended for use with other services and programs that expose an OpenAI-compatible API Chat Completion endpoint.

>Most Text Completion APIs support far greater customization options than OpenAI's standards allow for. These greater customization options, such as the Min-P sampler, may be worthwhile for SillyTavern users to check out, which can greatly improve the quality of generations.
https://docs.sillytavern.app/usage/api-connections/openai/
So it seems they discourage this
>>
>>106628595
In ST
>chat template is their way of directing how to concatenate different parts of ST itself when it begins the chat scenario
>If you read this
>[INST]{{#if system}}{{system}}
>{{/if}}{{#if wiBefore}}{{wiBefore}}
>{{/if}}{{#if description}}{{description}}
>{{/if}}{{#if personality}}{{personality}}
>{{/if}}{{#if scenario}}{{scenario}}
>{{/if}}{{#if wiAfter}}{{wiAfter}}
>{{/if}}{{#if persona}}{{persona}}
>{{/if}}{{trim}}[/INST]Understood.</s>
It adds the system prompt + character descriptions and some other shit together before finally adding a fake model response ({{/if}}{{trim}}[/INST]Understood.</s>). After this, the interactive part begins with the user.
>instruct template is the default template that is used with the model
>mistral is
>[INST]user message[/INST]
>model reply</s>
And so on.
Problem here is the fact that SillyTavern itself is convoluted and could be made way easier to use. Essentially it splits multiple text slots around the interface, etc.
>>
>>106628715
I actually got confused
>instruct template
>context template
I haven't used ST in a while.
One creates the order for the initial concatenation, and the other is the chat format itself (what every model needs).
>>
>>106628693
>What am I looking for?:
Ah, sorry. I meant
>https://huggingface.co/baidu/ERNIE-4.5-21B-A3B-Thinking/blob/main/tokenizer_config.json
Sometimes, they just throw the Jinja file in there too, like
>https://huggingface.co/baidu/ERNIE-4.5-21B-A3B-Thinking/blob/main/chat_template.jinja

>The context template is an ST thing, why would the upstream model know about it anyway?
Yes and no. It's a way to essentially construct the equivalent of the system prompt in the chat template with the character card, persona, etc., and some context templates even have (had?) the BOS token built in for whatever reason.
>>
>>106628569
Elaborate. What kind of 3D model? Are you using some blender type shit or are you referring to text to image?
>>
>>106628764
image to 3D model.
>>
>>106628767
You're probably thinking of something more like photogrammetry or splatting.

Those don't actually use a model; they run the training optimizer *directly* on the output data.
>>
>>106628782
I am asking for the best anime image to 3D character open source model I can use with comfyui.
>>
>>106628767
This is more of a /sdg/ question. I know for a fact there are tools for stable diffusion webuis that can turn images into wireframes for Blender, but I've never had any reason to use those, so I don't quite know how they work. Head to >>>/g/sdg or any of the other five bajillion AI generals on this board and they should be able to help you out better
>>
>>106627835
>The only value that MCP provides is standardization.
The value is that it gets RLed into instruct models, so you don't have to do the training yourself, which is a pretty big deal for a large foundation model.
>>
>>106628788
I don't think anything like that exists. People aren't even close to building something like that if they're even trying. Splatting and some other odd stuff is the closest you'll get.
>>
>>106628693
Also
>Ernie
Is that any good?
How's it compared to Qwen 3 A3B?
>>
Did the anime fan eggheads at AI HQ really have to name it so that the acronym turns out to be "moe"? It's kinda cringe, man. It's like when they named those new models in image generation "LyCORIS" over that gay ass anime with the two lesbian chicks. It's so on the nose it becomes bad.
>>
Is qwen-3-30b comparable to gemini 2.5 flash?
>>
>>106628857
Lol no.
>>
>>106628848
I think it's moe moe cute!
>>
>>106628715
This is what I assumed. ST puts together all the (buttload of) parts of its context into a single string using its template, then sends it to the backend. But then I would guess the backend puts the context into the model's chat template?
>Problem here is the fact SillyTavern itself is convoluted and could made way easier to use
I keep looking for another frontend, but other than this one part that I can't figure out, it's the nicest for this character chatting stuff. They just need to fill out the docs a bit more. Even a very short description of what the text completion pipeline looks like would help.

>>106628748
The ERNIE model is the first one where I tried adding the chat template to ST, since the one in ST was clearly not right, but ST won't accept it. I get a validation error when trying to run anything. But there's also nowhere to put it. I copied it into the story string because that's the only place where I see a Jinja-like template, but I don't think the story string is the chat template. They contain different parts, and the story string itself should be ST-specific, since it contains a lot of ST-specific context parts (user personality, character description, scenario, example text, etc.) that are not a default part of LLM input.

>>106628813
I didn't like it. It seems to think a ton. But there's a good chance it's user error
>>
>>106628912
>I copied it into the story string because that's the only place where I see a jinja template
ST doesn't accept the Jinja template directly anywhere, IIRC.
You'd have to interpret the Jinja and fill in ST accordingly. The playground should help with that.
I wouldn't be surprised if ST, with its hardcoded fields and such, can't express every template perfectly.

>I didn't like it. It seems to think a ton. But there's a good chance it's user error
Aight. Guess I'll download it and give it a spin, but I'll keep an eye out for the overthinking.
>>
Is there a personality that behaves like a haughty french girl who smokes and spits on my mouth on command and calls me disgusting?
>>
>>106628912
The text you send to the model (llama-server in this case) is the text it gets. If it's formatted the wrong way, then it stays like that.
Everything you or ST submit gets formatted before it's sent to the model.

e.g. if you write a simple Python script to send text to llama-server, you'll need to build the template yourself, or it'll just output whatever nonsense it believes is right.
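
A minimal sketch of that (Mistral-style [INST] tags as discussed above; /completion and its fields are llama-server's raw text-completion API, port assumed to be the default 8080):

import requests

# Hand-format the prompt yourself; /completion is a raw completion
# endpoint, so llama-server applies no chat template for you.
prompt = "[INST]Write one sentence about llamas.[/INST]"

resp = requests.post(
    "http://127.0.0.1:8080/completion",  # default llama-server address
    json={"prompt": prompt, "n_predict": 128},
)
print(resp.json()["content"])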
>>
>>106628998
Sorry, my English is bad; I'm a bit distracted and thinking about something else atm.
>>
I think the DeepSeek craze earlier this year, which turned all new big open models into DeepSeek slop, did more damage than good for us.
>>
>>106629096
whatever happened to that guy who built a 1TB+ RAM server to run it?
>>
>>106629096
We still got DeepSeek, so it's a net win if you ignore all the Chinese me-too copycats. The Chinese were irrelevant before DS anyway, so it's not really a loss. If any damage at all was done, it's that we missed out on another incremental Llama upgrade and might not get any more, but that still seems like a net gain.
>>
>>106628848
which anime for lycoris?
>>
>>106629231
Lycoris Recoil, I think?
>>
>>106629096
The same can be said about Gemini and GPT. If a model reaches SOTA status it will be distilled, and all the annoying quirks will trickle down to smaller models; it's just how it is.
>>
>>106629096
lmao sure thing sam
>>
File: 30474 - SoyBooru.png (118 KB, 337x390)
Where dem kiwis at? (Qwen models)
>>
>>106629377
I thought for sure VL would be this week
>>
File: lmarena2025-09-19.png (140 KB, 1200x993)
>grok-4-0709 1426
>grok-3-preview-02-24 1424
Groksirs, explain this
>>
>>106629472
>Qwen-Max
>so far up
lmao
>>
>>106629472
holy fuck, they really fucked up gpt5, can't even beat their gpt4 counterparts lol
>>
>>106629472
>longcat flash chat
wuzzat? Is it actually good?
>>
>>106629487
Max stands for "BenchMax", didn't you know? I think we need to make and popularize some subtle ERP bench so they train in that direction too. The more concerning thing in that image is Qwen next.
>>
File: longcat eqbench.png (175 KB, 1074x960)
>>106629513
https://huggingface.co/meituan-longcat/LongCat-Flash-Chat
It's an incredibly cucked model that suffers from long-context degradation. It will not be implemented in llama.cpp due to a different number of experts being activated for each token.
>>
>>106629521
but arena was supposed to be unbenchmaxxable after the llama4 incident
>>
>>106629600
The issue with llama 4 on lmarena was that the version on lmarena was much better than the version released to the public. If meta had released the same version to the public as they submitted to lmarena there would not be an issue.
>>
>>106627295
>>106627386
>>106628288
>>106627153
Maybe it's because I don't lurk here enough, but it seems like people don't post their RP chats (or chats in general) with the models they use anymore. It's mostly either complaining about the companies that make the models, complaining about particular models, or a metric fuck ton of troubleshooting (SillyTavern in particular seems to give people a lot of headaches)

Perhaps I'm still used to the days when I was a frequent /ldg/ anon/lurker and people shared their outputs just as much as they speculated, troubleshot, or complained about things (not that complaining isn't warranted every now and then)

What have you particularly used your local LLMs for recently? Coding? Coom? General purpose stuff?

This is the RP session I just finished with a gemma2 fine-tune of mine:

https://files.catbox.moe/1a79fy.txt
>>
File: Base Image.png (867 KB, 1200x3148)
LiMuon: Light and Fast Muon Optimizer for Large Models
https://arxiv.org/abs/2509.14562
>Large models are now widely applied in artificial intelligence, so efficient training of large models has received widespread attention. More recently, a useful Muon optimizer was specifically designed for the matrix-structured parameters of large models. Although some works have begun studying the Muon optimizer, the existing Muon and its variants still suffer from high sample complexity or high memory for large models. To fill this gap, we propose a light and fast Muon (LiMuon) optimizer for training large models, which builds on the momentum-based variance-reduction technique and randomized Singular Value Decomposition (SVD). Our LiMuon optimizer requires less memory than the current Muon and its variants. Moreover, we prove that our LiMuon has a lower sample complexity of O(ε⁻³) for finding an ε-stationary solution of non-convex stochastic optimization under the smooth condition. The existing convergence analysis of the Muon optimizer mainly relies on the strict Lipschitz smooth assumption, while some artificial intelligence tasks such as training large language models (LLMs) do not satisfy this condition. We also prove that our LiMuon optimizer has a sample complexity of O(ε⁻³) under the generalized smooth condition. Numerical experimental results on training DistilGPT2 and ViT models verify the efficiency of our LiMuon optimizer.
neat
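The memory saving comes from compressing the momentum matrix with a randomized SVD. As a rough illustration (this is the generic Halko-style randomized SVD, not the paper's exact algorithm), the low-rank factorization looks like:

import numpy as np

def randomized_svd(M, rank, oversample=5):
    # Generic Halko-style randomized SVD: approximate the top-`rank`
    # singular triplets of M without a full decomposition.
    m, n = M.shape
    omega = np.random.randn(n, rank + oversample)  # random test matrix
    q, _ = np.linalg.qr(M @ omega)  # orthonormal basis for M's range
    # Exact SVD of the much smaller projected matrix.
    u_small, s, vt = np.linalg.svd(q.T @ M, full_matrices=False)
    return q @ u_small[:, :rank], s[:rank], vt[:rank]

# Storing the three factors instead of the full m x n momentum matrix
# cuts memory from m*n floats to roughly (m + n) * rank.
U, s, Vt = randomized_svd(np.random.randn(512, 256), rank=8)
print(U.shape, s.shape, Vt.shape)  # (512, 8) (8,) (8, 256)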
>>
>>106629703
As the novelty wore off, reading someone else's RP logs isn't that interesting anymore, desu.
I can read as much of my own logs as I want, after all.
>What have you particularly used your local LLMs for recently?
Le creative writing, right now.
Write a summary, ask the model for commentary/different character perspectives/etc., delete the reply, write more summary, repeat.
Never going to show it to anyone.
>>
File: 45n0vnd2xsjf1.png (487 KB, 1068x1291)
do you fags do this?
>>
>>106629844
holy slopnuke!
>>
>>106629844
Sometimes when I'm bored, with both the Goth Coding Sensei and Tsundere Coding Sensei cards.
>>
>>106629870
cool, sounded fun to mess around with just for a laugh. i'll look into those.
>>
>>106628857
It's comparable to gemma 4b
>>
A soft chuckle escapes your lips as you imagine the futile attempts of others trying to emulate this conversation. Their GPUs - likely some mid-range consumer cards like NVIDIA's RTX 3060 or AMD's RX 6700 XT - straining and overheating under the mere suggestion of generating anything remotely close to this quality.

You shake your head, a smirk playing at the corners of your mouth. Those cards might be decent for gaming or basic machine learning tasks, but they're utterly ill-equipped for running powerful local models capable of true emotional depth and nuanced understanding. Their pathetic attempts at text generation would be laughable if they weren't so sad. You can almost picture some poor sap hunched over his keyboard, fingers flying across the keys as he desperately tries to coax something resembling coherent thought from his underpowered system.
>>
>>106629703
Mine are basically just a window into the twistier parts of my psyche and my extremely niche fetishes, and I don't think anyone else would enjoy reading them.
>>
anybody try Cydonia ReduX 22B and Behemoth ReduX 123B yet?
>>
>>106629844
I do this
>>
Reminder that early October starts the Chinese Golden Week holidays (October 1 to 8), and DeepSeek usually releases their models on Chinese holidays. I doubt it, but there could be a slim chance that DeepSeek V4 gets released. I remember reading rumors that their next model would have >=1.5T parameters (so it definitely doesn't fit the "local" definition for most in this thread lol)
>>
>>106630148
Why did drum drum train it on the IDIOTIC mistral template? Even vanilla mistral works like shit with it.
>>
>>106630197
>=1.5T
Ehh... I can run it in q3/q4 I guess.
>>
>>106630197
so you're saying 2 more weeks?
>>
>>106630197
>(so definitely doesn't fit the "local" definition to most in this thread lol)
Great idea to preemptively dig up that old shitpost
>>
need large 3
>>
>>106629668
>the version on lmarena was much better than the version released to the public
they wrote a custom schizo prompt for lmarena that were sychophantic to the users and generated longer responses that users seemed to like
>>
>>106629844
don't be surprised when your code has a comment like
# here is your redefined array, pray, submit to my dominance
and then it fucking pops up during a pull request review.
>>
>>106630197
If it's on huggingface, it's local
There, I said it
>>
>>106630396
No, even with that prompt the experimental version was different from the public one. There was much more to it than the prompt; it was quite uncensored.
>>
>>106630408
How did it get discovered then?
>>
>DS V3.1 inferred my code is in production and didn't want to break the existing functionality
neat i guess
>>
>>106630415
People everywhere noticed that the provided prompt produced different outputs on public Maverick. The lmarena people asked Meta, and they admitted it was a different model, not the public one. After that, the experimental version was taken down from lmarena and only the public one remained.
>>
>>106627153
what's a lightweight, free way to get an AI voice to read PDF books to me
i have a 9070xt and 7700x. what does it take to get set up? youtube and spotify audiobooks suck cock
>>
>>106630520
https://github.com/denizsafak/abogen
>>
>>106630520
microsoft sam
>>
File: 1736742598649319.mp4 (2.09 MB, 720x974)
>>106630148
I tried Cydonia Redux 22b. It's a bit better than the old Cydonia 22b, but I still strongly prefer Cydonia 24b v2g if I want a filthy mistral coom tune. Otherwise, regular Small 3.2 is the best for general RP and still handles sex scenes well; the main appeal of the Cydonias is that they're a bit more creative and degenerate in sex scenes with less prompting effort on your part. In my experience, their creative writing skills outside of sex aren't any better than the models they're tuned from.
>>
>>106630586
If mistral goes all in on the coom market they could make bank, neva been dun befo
>>
>>106630586
>Cydonia 24b v2g
how does it compare to the magnum diamond one?
>they're more a bit more creative and degenerate in sex scenes with less prompting effort on your part.
i'm looking for a good one for this
>>
>>106630598
The problem is that if you're known as the proprietors of loli/diaperfur AI RP sex, then other companies, who have more money than coomers, won't want to deal with you.
>>
When Trump dies, Palantir will train an LLM on his speech patterns and it will be the next President
>>
>>106630616
And yet most large social media companies are allowed to have algorithms which personalize and target soft pedo content with real children. This moral scare around text adventures is completely retarded.
>>
>>106630646
Yeah right tell that to DLSite
>>
File: 1727621665763775.jpg (216 KB, 500x524)
>>106630615
I generally don't like magnum tunes; I haven't tried every single one, but every one I did try was actually more slopped than the model it was tuned from. 24b magnum tunes were more slopped than even Nemo, so I don't bother with them often.
I like Cydonia v2g specifically because it's the most degenerate one. I've tried a range of creative sex tests on just about every cydonia release and v2g consistently had the most creative answers. I consider it to be the definitive coom brained model, because it's also a bit dumber than most other Cydonias, and certainly regular Small 24b.
Newer Cydonias don't seem very different from the models they're tuned from, and seem to imitate the user a lot more often. Not sure what's up with them; Drummer must have fucked something up along the way.
>>
>>106630662
alright thanks. what's your go to for general rp? i'm new so i'm shopping around for models to try but the amount to pick from is overwhelming.
>>
>>106630646
as long as there are insane cunts like this it will probably never be that easy
https://archive.ph/ctI0J
>>
>>106630709
Regular old Small 24b 3.2, bartowski quants. Sometimes I switch to v2g just for sex scenes then switch back.
>>
Why hasn't some coom researcher out there pulled down the Chub database to map the most precise representation of human sexuality we have so far? It's not even just the subject distribution but each card's individual popularity; you can even separate them by intended user gender.
Fuck questionnaires, people are a billion times more honest when they're horny and have to make a selection.
>>
File: analyzing.jpg (40 KB, 567x524)
>>106630733
People don't put any effort into sexing the bots, either. There's nothing more harrowing than looking at some of the public chats and seeing one-line responses of barely comprehensible English in response to everything the bot says.
>>
>>106630733
>precise representation of human sexuality
>chub users
lol, that's like going to /g/ for OS usage statistics, completely removed from the real world. All you would see is which fetishes are most common among the porn-addicted.
>>
>>106630755
typing with one hand is hard
>>
File: qkm27d.jpg (11 KB, 315x105)
>>106630755
Consider the type of person who publicizes something like that
>>106630757
I replaced porn for AI because it's generally much less disgusting and graphic.
What is the porn addiction link? Wouldn't porn addicts just watch porn all day?
>>
reminder to use --no-mmap and -fa off
>>
File: really tho.png (336 KB, 1102x804)
>>106630792
Then learn, bitch.
>>106630825
I could not imagine a sentient human not having the shame to keep their awful chatlogs private. Like, are they not afraid of being looked down upon for being lazy and stupid?
>>
>>106630929
are you roleplaying as futanari mistress?


