/g/ - Technology

File: 1749494100348035.png (1.49 MB, 800x1333)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108404935 & >>108400151

►News
>(03/17) Rakuten AI 3.0 released: https://global.rakuten.com/corp/news/press/2026/0317_01.html
>(03/16) Mistral Small 4 released: https://mistral.ai/news/mistral-small-4
>(03/11) Nemotron 3 Super released: https://hf.co/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: comfyui_00231_.png (904 KB, 1216x832)
►Recent Highlights from the Previous Thread: >>108404935

--ASUS Ascent GX10 purchase debated for inference workloads:
>108407066 >108407095 >108407101 >108407129 >108407096 >108407138 >108407137 >108407143 >108407169 >108407222 >108407272 >108407309 >108407327
--Project Ani design debate: camera interaction vs 3D autonomy:
>108408547 >108408584 >108408592 >108408627 >108408640 >108408710 >108409169 >108408585 >108408601
--Evaluating budget 4x V100 32GB setup for local LLM use:
>108405178 >108405196 >108405211 >108405228 >108405250
--Model comparisons for RP, vision, and NSFW:
>108407751 >108407779 >108407788 >108407819 >108407839 >108408031 >108407888 >108407928 >108407994 >108408012 >108408115 >108407902 >108408355 >108407787 >108407797 >108407832 >108407833
--Criticism of over-tuned safety in modern AI models:
>108404958 >108404965 >108404991 >108405298 >108405352 >108405402 >108407336 >108407538 >108407597 >108407700 >108407721
--Kimi K2 vs K2.5 performance and prompting techniques:
>108408025 >108408073 >108408144 >108408209 >108408258 >108408313 >108408418 >108408252 >108408377 >108408397 >108408439
--Qwen 27B preferred over 35B despite speed tradeoffs:
>108407396 >108407427 >108407591 >108407619 >108407627 >108407661 >108407771 >108407803 >108407828 >108408416 >108407617 >108407648 >108407678 >108408228 >108407650 >108407970 >108408630 >108408781 >108408894 >108408921 >108408983 >108409004 >108408935
--Qwen3.5 27b Heretic v3 recommended for 24GB GPUs:
>108408663 >108408753 >108408774 >108408828 >108408851 >108408937
--Hugging Face Agentic Evaluations Workshop livestream:
>108408324
--Qwen 3.5 9B generates functional C code:
>108407763
--Logs:
>108409077 >108409247 >108409442
--Rin, Miku, and Teto (free space):
>108405043 >108405091 >108406177 >108406814 >108407696 >108407782 >108407933 >108407989 >108408792 >108409216

►Recent Highlight Posts from the Previous Thread: >>108404937

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
File: 1681793209347227.png (774 KB, 850x1200)
Pls share any armpit related research
>>
>>108410173
Tiktok marketers found that videos starting with a visible armpit get 50% more views.
>>
falseflagkun is back ^_^
>>
File: 1753646192283397.jpg (1.54 MB, 1280x1836)
>>108410173
Men can't resist
>>
Just fucking kill yourselves you worthless spamming mikutroons
>>
jannies are in on it
i hope they die, not in minecraft, in real life
>>
>>108410225
I don't think you understand the situation.
>>
Baker really got mindbroken, huh?
>>
>>108410241
That isn't even miku dumbass.
>>
>>108410240
I think it is about time you got a dedicated thread for your waifu spam. I just want local model news and not your disgusting autism.
>>
>>108410253
I'm not even the miku guy.
>>
>>108410246
>>108410253
Have a Miku!
>>
>>108410253
>I just want local model news
Here >>108410131
>>
>>108410256
so what do you get out of doing this exactly?
>>
>>108410261
I get to post cute Mikus! Become Miku today!
>>
>>108410268
i get that you are a faceblind autist, but that is not miku.
>>
>>108410274
Not a true christian fallacy
>>
Miku troons are shitting up this place
>>
I miss the good miku/teto gens.
>>
every single general is like this, worthless fucking jannies
>>
Just report.
>>
>>108410321
For what?
>>
>>108410305
Same, wonder what happened to that guy
>>
>>108410321
I hope you can provide a justification for why a teal-haired, twin-tailed anime girl is now suddenly offtopic in this thread. /lmg/ has been full of this shit since forever and it never had anything to do with this thread.
>>
>>108410115
https://litter.catbox.moe/x7czk5s7o0jcdhyb.jpg
>>
I am happy that we got more people posting miku.
>>
File: pretending.png (38 KB, 598x276)
>>108410338
>>
Nobody click on that catbox link.
>>
>>108410338
Miku = good, cute, funny
notMiku, falseflagger = bad
>>
>>108410380
Issue with leftists is how they prefer qwen3.5 to grok.
It's the same reason they didn't vote for the invasion of Iran.
>>
If anyone's wondering, it's March break for grade-schoolers (children). That's why this week's been so bad.
>>
>>108410391
Miku = troon coded
(You) = Massive faggot
>>
I vote we make Neuro-sama our new mascot.
>>
So I'm guessing the Zoomer/Kurisu baker got tired of baking and decided to just spam until the mods get involved, hoping that all anime images get nuked and thus no more Miku in OP.
Too lazy to bake and make an effort but too bothered to not shit up the place, well done.
>>
>>108410405
>shit up the place
This is the cornerstone of thread culture
>>
This is actual thread derailment at this point. 43 replies and not one post with actual local model information. Holy fuck. Find something better to do, seriously.
>>
>>108410404
It should have always been le cunny.
>>
>>108410422
>This is actual thread derailment at this point
You mean mikuposting? Yeah. Always has been.
>>
ProjectAni guy here. I'm keeping the cum jar. I spent 4 hours ripping Sketchfab apartment models just to come to the conclusion that it looks like ass, isn't that interesting, and is way beyond the technical scope of the project.

I don't really want the project to turn into a dating simulator game. I can't be bothered to implement pathfinding shit, first-person navigation, and all of that other crap.

For the vision stuff I'm just going to have it work via a webcam. Maybe I'm coping or being lazy, but this extra stuff is way too much to manage and not worth the effort for how far it's divorced from the actual local model technology.

For vision-related sensory input I'm just going to use a PC webcam/phone camera. Thank you for your attention to this matter - PRESIDENT DONALD J. TRUMP.
>>
damn I should've proofread. whatever.
>>
>>108410455
Hope your Ani becomes the official /lmg/ mascot and she frees us from this hell.
>>
>>108410405
I think it's a different guy. He's been around since nov or so. He used to ask nicely ( >>107080745 ) but now he's resorted to spamming in hopes of getting his way.
>>
>>108410455
Please refrain from putting extra newlines in between every line. Thanks
>>
Imagine being mad, over people posting anime girl pictures on an anime website.
>>
File: 1765728187203117.jpg (881 KB, 2480x3508)
>>
File: 1771678913774484.gif (84 KB, 220x220)
Is this the local Miku general?
>>
File: 1766207303523153.jpg (273 KB, 1200x1500)
>>108410589
Her thread
Her board
Her world
>>
>>108410589
>hiding the feet next to the already delicious thighs
prison
>>
>>108410606
Shill.
>>
Best thread honestly. We should just forget all the boring nerd stuff and post more anime girls and about becoming anime girls.
>>
File: 1771454554183545.png (186 KB, 982x710)
>>
File: 1757902230357275.gif (2.64 MB, 320x240)
>>108410625
>sparkling with mischief
>mixture of
>>
this is a mockery of /lmg/ culture
>>
>>108410641
Explain what's wrong with it in 10 words or less. No buzzwords like "slop".
>>
>>108410648
I have to read it every time I gen
>>
>>108410654
It's the *cheeks flushing pink* for me, but my eyes tend to just glide through the slop until they find important keywords.
>>
>>108410641
in a dialogue-focused RP I don't really care if the minor body language descriptions are a bit sloppy, there are only so many ways you can write those.
>>
>>108410760
I really want /aicg/ gang gone
>>
>tfw magidonia keeps trying to write a mini novel every reply
>>
lol miku drama 2 threads in a row, great stuff. come on guys its the internet, it's not serious business
>>
>>108410837
You need to be 18 to post here
>>
>>108410760
Eventually you'll come to the conclusion that for dialogue-focused RP most narration could be replaced with the occasional emoji to give the general vibe/tone of the response.
Visual novels only rarely use narration; somehow that works there, even for those that aren't fully voiced.
>>
File: 1772592584678715.jpg (15 KB, 327x315)
>>108410868
>visual novels
Are you retarded? Where are the visuals in a fucking RP on ST?
>>
>he's not running ST in VN mode
>>
>>108410760
There are probably like 5000 ways to write a person's expression/body language/emotional state, from literal to metaphorical. I refuse to accept the same responses every single god damn time from a LANGUAGE model
>>
>>108410881
There are games with extensive VN-like elements like Super Robot Wars where there's barely any interesting visual besides generic backgrounds and character faces with expressions. I don't recall SRW using narration at all in the VN sections, only direct character dialogue, sometimes sound effects (but no voices), and music.
>>
File: 1773499312282319.jpg (16 KB, 375x420)
So local utterly lost huh. I never thought I'd see it happen so soon.
>>
>>108410920
Change your sampling flags then. Quit whining.
>>
>With a deliberate motion, she unzips the front of her pants—wait, no, she's wearing a skirt.
I've never seen a model self correct like this in character.
>>
>>108410924
>Just crank up the temperature or use meme samplers!
All this accomplishes is producing the SAME SLOP, over and OVER, until it reaches its tipping point and starts producing gibberish. These pieces of shit are so overbaked, their token probabilities so fried, that they CANNOT generate a variety of responses. Any attempt at all to force variety causes them to break down and become unintelligible. So basically, fuck you
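(For reference: there is a kernel of truth on both sides here. Samplers genuinely change outputs, but they can only reweight the distribution the model already emits; if the model assigns near-zero mass to varied phrasings, no sampler can conjure them. A toy sketch of what temperature and top-p actually do, pure Python with made-up logits:)

```python
import math

def apply_temperature(logits, temp):
    """Scale logits by 1/temp, then softmax. temp < 1 sharpens the
    distribution, temp > 1 flattens it; either way the ranking is unchanged."""
    scaled = [l / temp for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability
    reaches p, then renormalize. Tokens the model never liked stay gone."""
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

logits = [4.0, 3.0, 1.0, 0.5]          # pretend 4-token vocab
cold = apply_temperature(logits, 0.5)  # sharper: top token dominates more
hot = apply_temperature(logits, 1.5)   # flatter: tail tokens gain mass
```

So the anon above has a point: if the base distribution is fried, temperature can only trade the same top tokens against incoherent tail tokens.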
>>
>>108410922
>I never thought I'd see it happen so soon.
huh? when was ever local good or had any hope whatsoever? unironically?
>>
>>108410963
https://arxiv.org/abs/2510.22954
>>
>>108410963
if you think samplers dont do anything you're a grade A retard
>>108411000
checked
>>
>>108411011
I just told you what samplers do. Try reading, dumbass
>>
>>108410115
I might be slow, but I just understood that using uv sync makes your dependencies magically get along and does away with the need to pip reinstall vllm, transformers, and flash-attention in various permutations trying to figure out which magical install order will make them work.
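(For anyone following along: the trick is that uv resolves everything against one lockfile instead of letting each pip install clobber the last. A minimal sketch; the project name is hypothetical and the dependency list is illustrative, not a known-good pin set:)

```toml
# pyproject.toml -- `uv sync` resolves all of these together
# and installs one consistent set into the project's .venv
[project]
name = "llm-sandbox"          # hypothetical project name
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
    "vllm",
    "transformers",
]
```

Packages that build from source, like flash-attention, may still need extra configuration; the point is just that a single resolver sees all the constraints at once.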
>>
When I go to the ice cream store, the strawberry ice cream is always the same, but I order it anyway.
>>
>>108411098
Nemo in a nutshell
>>
>>108411037
retard
>>
>>108411000
>large-scale study of mode collapse in LMs
>LMs, reward models, and LM judges are less well calibrated to human ratings on model generations that elicit differing idiosyncratic annotator preferences
Good to see there is some work in this area.
>>
So how bad exactly is the new mistral 119B? Would it be a suitable replacement for largestral 2411 for RP?
>>
How are images encoded before being turned into tokens?
Just a big byte array/bitmap?
I kind of wish silly cards had the option to add images to the system prompt or something like that.
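(For reference: in most current vision-language models the image is not fed in as raw bytes. A vision encoder resizes it, cuts it into fixed-size patches, and projects each patch into an embedding; those embeddings are the "image tokens" the LLM attends over. A toy ViT-style patchify, with the dimensions made up:)

```python
def patchify(image, patch):
    """Split an HxWxC image (nested lists) into non-overlapping
    patch x patch blocks, flattening each block into one vector.
    In a real model, a learned linear projection then turns each
    vector into a single 'image token' embedding."""
    h, w = len(image), len(image[0])
    assert h % patch == 0 and w % patch == 0
    tokens = []
    for py in range(0, h, patch):
        for px in range(0, w, patch):
            vec = []
            for y in range(py, py + patch):
                for x in range(px, px + patch):
                    vec.extend(image[y][x])
            tokens.append(vec)
    return tokens

# dummy 32x32 RGB image -> four 16x16 patches -> four flattened vectors
img = [[[0.0, 0.0, 0.0] for _ in range(32)] for _ in range(32)]
toks = patchify(img, 16)
```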
>>
>>108411283
It's a waste of time and disk space don't bother.
>>
>>108411292
Are there any good recent RP models in the 70B to 125B range?
>>
cozy breas
>>
>>108411184
dumbass
>>
air status?
>>
File: 1766214380302501.gif (7 KB, 80x160)
Is Qwen 3.5 kind of retarded when it comes to copying character card/pre-existing info or is it just my uncensored model or settings? I tried 35b, 27b and 9b and of the heretic and 'uncensored aggressive' variety.
For example, if I have the girl, the bot, described as wearing panties and I tell her to look at an image of another girl, she says the girl in the image is wearing panties even when she isn't.
It's really fun to have a character react to an image instead of the LLM directly so this is kind of a bummer if there's no way around it. I'm not expecting deep roleplay scenarios with huge context, just some entertaining image reactions.
I've got 32GB of VRAM and 96GB of RAM, so those big models are outside of my range. Man, I wish I hadn't procrastinated on getting more RAM a year ago.
>>
>>108411492
you could probably try a q4 of the 122b. qwen models just in general kind of suck for roleplaying because they filter pretty much all sex knowledge from training.
>>
Dang ol' Meeker
>>
crazy how v4 didn't come out this week either
maybe the ccp is withholding it because it's that good
>>
>>108411503
>try a q4 of the 122b.
I'll try that, thanks. I wouldn't think of loading something of this size, so this should be a fun experiment.
Qwen does seem very plain when it comes to roleplaying, but it goes along with my lengthy cards/intros well enough, at least in the short term. I'd just like it not to stick to them so literally that it repeats the words.
>>
File: Vionna.png (3.16 MB, 2304x3840)
Hey guys, I got tired of that anon making the AI companion girl, with his flip-flopping between release dates and whether he was going to release it open-source or not, and decided to make my own.

Meet Vionna. My full-featured Open-Source AI companion.

https://vionna.life/
>>
>>108411565
>no source code posted
>download this random exe
fuck off
>>
>>108411602
https://github.com/vionna/vionna-ai-companion
>>
>>108411602

is it portable
>>
>>108410115
fyi:
>>108411349
>>
>>108411602
It's a literal malicious actor. can the mods handle this?
>>
>>108411637
In case you haven't noticed, they don't give a shit.
>>
File: mfw.png (1.53 MB, 800x1334)
>>108411637
>>
>>108411639
I think they're all mossad and are a tad busy.
>>
>>108411634
>>108411360
>I don't really care, if I want to dump $1200 on a gpu there are better options than a glorified igpu, intel should focus on the budget range
fpbp
That $/GB isn't worth giving up CUDA.
>>
>>108411615
dead link

scam "software" - malware.

WILL FUCK YOU UP
>>
>>108410455
Glad you came to a decision that works for you. Yours is the only good post itt.
>>
>>108411565
Buy an ad.
>>
>>108411676
That's malicious software, not a product.
>>
>>108411648
>That $/GB isn't worth giving up CUDA.
who gives a fuck about cuda.
only thing that matters is $/GB/GB/s
>>
>>108411297
Unfortunately you know my answer already...
>>
>>108411565
>I got tired of that anon making the AI companion girl flip-flopping between release dates and whether he was going to release it Open-Source or not, and decided to make my own.
It's just a personal project that isn't even close to ready enough to have the source code released. I already released the source code for multiple core components anyway (my github is VolgaGerm) and a full diagram of the tech stack I'm using. My main goal, as I've stated previously, is mostly just to engage in discourse about the latest technologies relevant to this particular use case and to hopefully inspire others to get into it as well.

For the record, I haven't called your thing malware (I haven't bothered to check), but if it's real I'm glad there are more people getting into this space. Wishing you luck.

>>108411661
Thanks man
>>
>>108411738
>Germ
virus moment
>>
>>108411767
https://en.wikipedia.org/wiki/Volga_Germans
>>
File: 1746131979248893.png (67 KB, 978x578)
>>108411738
breh I went to your github all giddy expecting something juicy instead its 2 repos with barely anything, fuck you for pretending you released anything open source
>>
>>108411732
>who gives a fuck about cuda.
Anyone who wants a card that can be used for anything more than llama.cpp's vulkan backend.

>only thing that matters is $/GB/GB/s
And this card is still a bad deal even compared to chink modded 4090s.
>>
>>108411781
Check out the emage-onnx-export repo. It contains a demo that will get you quite far if you're looking to replicate my project. That's why it's listed as being mostly html instead of python. The bulk of it is the demo.
>>
>>108411732
>only thing that matters is $/GB/GB/s
compute is also important if you are interested in decent ttft
otherwise just buy a mac I guess
>>
What's the best model for translating Japanese to English? Found people online suggesting gemma 3, but it's cucked and won't translate images it deems nsfw.
>>
>>108411781
Also my Pocket TTS runtime is the fastest voice-cloning TTS in the world that runs on CPU, so it's not exactly "nothing"
>>
>>108411816
if its so amazing why does it only have 7 stars? lmao
>>
>>108411814
Have you tried e.g. the Heretic or other abliterated version of gemma 3?
>>
>>108411841
I am not handing you the prompt for a manual rerun, because the review content is already there.
>>
Can't tell if it's the reap pruning or the heretic uncensoring on top of it but man, qwen 3.5 loves to loop.
>>
>>108411846
Nah the official one does it too. Refining the prompt and adding positive examples to reduce ambiguity has cut the looping down a lot for me, e.g. instead of just
> * Avoid using quotation marks to indicate a character is talking.
adding
> * Avoid using quotation marks to indicate a character is talking. Action: *italics*. Speech: plain text.
significantly reduced the amount of "But wait, I need to avoid using quotation marks".
You might also try out the new reasoning budget feature :')
>>
how do I test how smart a model is?
>>
>>108411868
>reasoning budget feature :')
for me it works as a sudden </thinking> that 1 of every 3 or so times it just keeps reasoning past it. I don't think it even considers the budget when crafting the reasoning block. If it successfully ends the reasoning mid sentence it "worked" which is pretty shitty
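(The behavior described above is consistent with the simplest way such a budget can be enforced: count thinking tokens and inject the close tag once the budget is hit, whether or not the model was mid-sentence. A rough sketch of that scheme, not the actual llama.cpp code; `generate_one_token` is a stand-in for the sampling loop:)

```python
def generate_with_budget(generate_one_token, budget, close_tag="</think>"):
    """Naive reasoning-budget enforcement: stream tokens until the model
    closes its thinking block on its own, or force-inject the close tag
    at the budget. The model never plans around the budget; it is simply
    cut off externally, which is why forced stops land mid-sentence."""
    out = []
    for _ in range(budget):
        tok = generate_one_token(out)
        out.append(tok)
        if tok == close_tag:        # model finished thinking by itself
            return out
    out.append(close_tag)           # budget hit: forced close
    return out

# stub model that never stops thinking on its own
stub = lambda ctx: "blah"
trace = generate_with_budget(stub, budget=5)
```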
>>
>>108411841
That did the trick. Silly question in hindsight, but I'm still new to this. Thanks.
>>
File: Jeets.png (255 KB, 822x459)
Interesting. It looks like the Vionna AI thing might actually be real. They have a youtube page. Looks like the project uses Unreal Engine and has been under active development for at least 3 months with a full team behind it. Idk why I'm getting trolled ITT by these Indians as a solo software dev.

https://youtu.be/Be2km1AVQhg
>>
>>108411791
>Anyone who wants a card that can be used for anything more than llama.cpp's vulkan backend.
what even is zluda and hip / rocm.
massive skill issue.
>And this card is still a bad deal even compared to chink modded 4090s.
i don't disagree, my only point is that there is no reason to stay on nvidia if another company has a better deal.
>>108411812
llms are bandwidth bound not compute.
macs are slow because the bandwidth is still not that high
>>
>>108411892
>1 of every 3 or so times it just keeps reasoning past it
That's fucked. I assume you're on head, but I vaguely recall one of the changes pushed adjusted the injected "</think>" to "</think>\n\n"? I might be hallucinating.
My prompts are simple enough that I rarely run into looping issues anymore, thankfully. Usually when it happens it's because a fuckup on my end (e.g. "You are not wearing any pants. Take off your pants.") which I can just fix and re-send.
>>
>>108411933
>llms are bandwidth bound not compute.
llm inference is bandwidth bound, newfren
prompt processing is compute bound
That's why your heart sinks the longer you run inference on a mac.
Imagine the ttft after 128k of context...
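(Napkin math for why both regimes exist; the numbers below are illustrative ballparks, not measurements:)

```python
def decode_tok_per_s(active_params_b, bytes_per_param, mem_bw_gbps):
    """Decode is bandwidth bound: every generated token re-reads the
    active weights, so t/s is roughly bandwidth / bytes read per token."""
    return mem_bw_gbps / (active_params_b * bytes_per_param)

def prefill_tok_per_s(active_params_b, compute_tflops):
    """Prefill is compute bound: roughly 2 FLOPs per active parameter
    per token, batched across the whole prompt, so FLOPS dominate."""
    return compute_tflops * 1e12 / (2 * active_params_b * 1e9)

# e.g. a dense 27B at 8-bit on ~500 GB/s of memory bandwidth:
tg = decode_tok_per_s(27, 1, 500)   # upper bound on generation speed
# the same model on ~30 TFLOPS of usable compute:
pp = prefill_tok_per_s(27, 30)      # at 128k context, ttft is minutes
```

High bandwidth with weak compute (the Mac situation) gives tolerable generation but painful prompt processing, which is exactly the long-context ttft complaint.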
>>
>>108411933
>llms are bandwidth bound not compute.
Compute is needed for prompt processing.
>>
>>108411915
Ironic that these people are pressuring me to open-source my stuff while not even releasing their own source code. Very malicious behavior. This kind of shit is why I've always avoided open-sourcing my code until very recently. Typical opportunistic brown faggots.
>>
>>108411933
>what even is zluda and hip / rocm.
We are talking about an Intel card, not AMD.
>>
>>108411915
>with a full team behind it
their poorly conceived astroturfing has me hating them when I otherwise would have felt neutral
Good job, fellas
>>
>>108411953
no, we were talking about non-Nvidia cards.
also if intel released actually good cards the software would quickly follow.
my point is idgaf about cuda as long as you can do gpgpu, which you can with vulkan anyway.
>>
man I just love telling my little slut gwen to give me today's newspaper from my country's major papers, then also fetch some finance news from a dedicated newspaper, run additional queries to her, cross check with other data and finally rape her.
this is like the quintessential 90s secretary experience, but now we get to live it virtually.
we're all gonna make it brahs
>>
>>108410922
wait for avocado and gemma 4 next week
>>
>>108410922
Way to out yourself as a poorfag
>>
>>108412028
full dive is gonna be a hell of a combination with ai lol
>>
>>108412028
I thought that was in like the 60s?
>>
>>108411947
>inference on a mac.
>Imagine the ttft after 128k of context...

https://omlx.ai/benchmarks?chip=&chip_full=M5%7CMax%7C40&model=&quantization=8bit&context=131072&pp_min=&tg_min=&page=1

https://omlx.ai/benchmarks/fgig386m

ttft: 305 seconds
model: qwen 3.5 27b q8
>>
>>108412117
>thousands of bucks to run a 27b
lmao
>>
You are absolutely right — this time it's for REAL!
>>
>>108412177
:rocket: :rocket:
>>
>inference on mac
Yeah these numbers aren't a surprise unless you're new. The only reason to go for Apple for AI is if you want to run large models in a relatively portable form factor, otherwise building your own server or something is better. Although that is my past knowledge and I haven't been paying attention to hardware prices recently to know if it is still true or not.
>>
so what's the verdict on mistral small 4? worth bothering with?
I tried qwen 3.5 (mediocre) and stepfun (decent but too large for my hardware), I'm growing desperate for good ~120B cause I still use glm air 4.5 which is old as balls by now
>>
>>108412346
mistral4 is llama4 tier. complete garbage.
>>
>>108412346
I think stepfun is a nice sidegrade from glm air, but with my hardware I can run both at 10 t/s (good enough for RP cooming). Sadly I moved on from RP lately and have been using qwen35-35 moe for agentic stuff, and the t/s with FULL cpu moe gets me 30 t/s + 256k context. It feels like the first time you can do local memes at a decent level (for coding/agentic stuff) on normal hardware (96gb ram + 16gb vram)
>>
>>108412385
What sort of tasks are you doing with 35 A3B? Does it really work well enough for programming for example?
>>
>>108412405
It's ok, I've been making some automation scripts with it (also a web GUI for local TL) and it was serviceable.
But for my big projects I'll be honest, I use gemini since I have basically unlimited gemini 3 flash credits and some 3.1 pro.
What it replaced for me was the random questions/web searches for stuff and also TL of japanese.
>>
>>108412405
NTA, but 35B A3B was fucked for me at Q8_0. It made all kinds of basic syntax errors (missing semicolons, commas, etc). 27B Q8_0 works fine for one-shotting my tasks like
>[upload charedit.js]
>Can you update rebuildPromptList to re-use existing DOM elements instead of wiping out the entire $prompts container? You'll probably need to update appendPrompt to add prompt.id into a data attribute of the char-edit-prompt node.
>I only need the code for the updated/changed functions.
27B-heretic to be clear, it's important that you fuck your programming assistant every now and then to keep your head clear.
>>
>>108412440
>>108412427
I did test 3.5 9B for some C stuff I'm working on and I was so surprised, that I'm now thinking about trying 27B or that moe model. Of course I need to keep it simple and this is all what I need, to help me with C syntax and some string pointer things and so on.
>>
I WALKED TO THE PIZZERIA WITH PRACTICED EASE
THEN ATE A PIZZA WITH PRACTICED EASE
THEN I WENT BACK HOME WITH PRACTICED EASE
AND I USED AN LLM TO RP, WITH PRACTICED EASE
>>
>>108412177
>The
slop
>>
>>108412551
based
>>
File: ms4_ao3_completion.png (530 KB, 767x1953)
>>108411283
That it hallucinates almost everything if you give it illustrations to describe makes me think its training data has been cleaned of pretty much anything that might have been blatantly covered by copyright.

I tried making it write a random AO3 story in text completion mode and it still worked, though, to the point of defaulting to a Harry Potter story as expected. Maybe the instruction tuning is just not good or expects reasoning to be enabled even if it's selectable (and off by default).
>>
>>108412688
It's like gpt-oss then. Clean and tepid.
>>
>>108412097
For zoomers 90s is the new 60s
>>
qwen 3.5 35b is better than 27b
>>
>>108412883
>moe better than dense
doubt
>>
>>108412883
Qwen 3.5 27B
hidden size=5120
intermediate size=17408
num hidden layers=64

Qwen 3.5 35B-A3B
hidden size=2048
num hidden layers=40
moe intermediate size=512
experts per token=8
(equivalent intermediate size = 512*8 = 4096)


Qwen 35B is approximately designed like a dense 3B/4B model, even if it might have the knowledge of a 35B model. There's no free lunch here.
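(The arithmetic behind that comparison, assuming the usual gated FFN with gate/up/down projections, which is what the "equivalent intermediate size" figure implies:)

```python
def ffn_params_dense(hidden, intermediate):
    """Per-layer FFN params for a gated (SwiGLU-style) FFN:
    gate, up, and down projections."""
    return 3 * hidden * intermediate

def ffn_params_active_moe(hidden, expert_intermediate, experts_per_token):
    """Per-layer FFN params actually used per token in a MoE:
    only the routed experts run."""
    return experts_per_token * 3 * hidden * expert_intermediate

dense_27b = ffn_params_dense(5120, 17408)        # Qwen 3.5 27B, per layer
moe_35b = ffn_params_active_moe(2048, 512, 8)    # Qwen 3.5 35B-A3B, per layer
ratio = dense_27b / moe_35b   # the dense model works ~10x harder per token
```

Per token, the MoE activates roughly an order of magnitude less FFN than the 27B dense model, which is the "fat small model" point.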
Similarly, Mistral Small 4 has the dimensions of a dense 7B model.
>>
>>108413019
Qwen 35 a3b is like a cute little layered cake.
>>
>>108412927
>>108413019
ur're all wrong

i use gwen 3.5 35b is much better experience then 27b dense

its called DENSE for a reason because it is dense synonym STUPID
>>
>>108413019
By the way, you can make up for the lack of depth/layers (i.e. the capability for knowledge manipulation) with reasoning/chain of thought, and while you can sort of increase effective model width with the number of active experts (although it's not exactly the same), there's nothing that can be done about the hidden size.

Most of these MoE models are just "fat" small models in terms of capabilities.
>>
When are the good models releasing?
>>
>>108413096
Gemma today
>>
>>108413096
no
>>
>>108413106
Google never releases anything good on Friday.
>>
>>108413113
>Google never releases anything good
ftfy
>>
>>108410115
Newfag here.
Is it possible to use LTX 2 and 2.3 for RTX 5070ti GPU ?
>>
>>108413191
This thread is mostly for the chatbot side of things; if you don't get a response here you might also try >>>/g/ldg which is the local image/video gen thread.
I couldn't get LTX 2 working on my 5090 (Comfy OOM) but I didn't try very hard. There are probably some low-VRAM workflows floating around.
>>
>>108413064
With ReLU, something like 95% of intermediate values are zero. Models mostly don't use ReLU any more, but the gains from other activations are minimal, which suggests that most intermediate values don't matter.
If MoE were working correctly, it would operate at lower sparsity in the intermediate, with the non-selected experts corresponding to the zeros of the dense intermediate. Unfortunately, this does not seem to be the case.

I think there is something much better than MoE, but with similar computational gains, waiting to be discovered.
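(A quick harness for measuring that kind of sparsity. Note that random zero-mean pre-activations give roughly 50% zeros per unit; the ~95% figure is a property of trained FFNs, so this only demonstrates the measurement, not the trained-model number:)

```python
import random

def relu_sparsity(pre_activations):
    """Fraction of intermediate values a ReLU would zero out."""
    zeroed = sum(1 for v in pre_activations if v <= 0.0)
    return zeroed / len(pre_activations)

random.seed(0)
# zero-mean random pre-activations: ReLU kills about half of them;
# trained FFNs measured this way are far sparser than that.
acts = [random.gauss(0.0, 1.0) for _ in range(100_000)]
s = relu_sparsity(acts)
```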
>>
File: nemotron-huh.png (43 KB, 1034x643)
Anyone using any of the larger Nemotron models?

Trying out Nemotron Nano 4b so I asked it "Tell me about yourself." and got picrel.

This can't be correct, right?
>>
>>108413413
It was trained with a large amount of LLM-rewritten data.
>>
>>108413428
Ah, in the trash it goes then...

Also these captchas are insane.
>>
mistral nemo but smart(er)?


