/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107095114 & >>107084067

►News
>(11/01) LongCat-Flash-Omni 560B-A27B released: https://hf.co/meituan-longcat/LongCat-Flash-Omni
>(10/31) Emu3.5: Native Multimodal Models are World Learners: https://github.com/baaivision/Emu3.5
>(10/30) Qwen3-VL support merged: https://github.com/ggml-org/llama.cpp/pull/16780
>(10/30) Kimi-Linear-48B-A3B released with hybrid linear attention: https://hf.co/moonshotai/Kimi-Linear-48B-A3B-Instruct
>(10/28) Brumby-14B-Base released with power retention layers: https://manifestai.com/articles/release-brumby-14b

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>107095114

--Paper: Contradictory learning rate effects on model generalization across architectures:
>107099513 >107099560 >107099570 >107099601 >107099730 >107099637 >107099968 >107100075 >107100108 >107100193
--Papers:
>107099379
--Challenges and solutions for multimodal AI with reinforcement learning:
>107096665 >107096697 >107096703 >107096724 >107096748 >107096767 >107096817 >107096853 >107096880 >107096942 >107096859
--Comparing Gemma and Qwen models for context handling and multimodal capabilities:
>107100070 >107100082 >107100096 >107100113 >107100095 >107100103 >107100109 >107100149
--Model selection and document handling strategies for chat systems:
>107103148 >107103182 >107103216 >107103230 >107103748 >107103674
--LangChain tool development and licensing debates for AI research project:
>107096233 >107096389 >107096407 >107096431 >107096460 >107096484 >107096542 >107096601 >107097032
--Hardware-limited LLM recommendations for RPG GMing:
>107097189 >107097219 >107097226 >107097481 >107097496 >107097561 >107097660 >107097756 >107097801 >107097878 >107097895 >107097921 >107097935 >107097938
--Qwen3-VL 4B Instruct recommended for lightweight document summarization:
>107096666 >107096930
--Developing a CLI assistant for programming and document tasks:
>107095800 >107095844
--Critique of Suno AI and anticipation for open source music generation models:
>107097235 >107097263 >107097331 >107097476
--Censorship comparison between GLM 4.6 and Kimi models:
>107096584 >107098032 >107098080 >107098100 >107098139
--Logs: Qwen3-VL-32B-Instruct-Q6_K.gguf:
>107101310 >107101377 >107101413
--Logs: Qwen3-VL-30B-Abliterated-Q8:
>107100158 >107100179 >107100200 >107100236 >107100497 >107100659 >107100583 >107100630 >107100610
--Miku (free space):

►Recent Highlight Posts from the Previous Thread: >>107095119

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>107104139I reject Death, therefore I am become immortal.
>>107104155
>I am not asking for your opinion, I am telling you what we are doing next.
Finally, dommy mommy achieved locally. It's somehow so hard to break an LLM's inclination to be commanded and dominated.
>>107104116Teto is flat, this is haram
Tetolove
>>107104221It's just a cosplayer in Teto costume
>>107102554i hope this was just bait, but in case it wasn’t, you don’t need a 3090 to fine-tune an 8B QLoRA, you can literally do it for free using Google Colab or Kaggle.
>>107104087
>How is Josiefied-Qwen3? I was looking for something that could fit in 16GB GPU
finetroons: not even once.
>>107104116
Best model for 67GB VRAM?
>>107104116Teto's tetons
>>107104379
There's no way those are normal salivary glands
Does she piss from her tongue?
>>107103574Get off 4chan and go back to the coal mines wagie
>>107104552Get off 4chan and go back to the gulags, lumpen
new benchmark dropped
https://openai.com/index/introducing-indqa/
>>107104680No way, it's real
>>107104680I would have expected this to come from Google first.
>>107104680holy shit we are so back
>>107104680sirs... we wined
>>107104680heh
>>107104587
>gulags
>lumpen
All your plans failed, tankie. If you want to end capitalism, the best way is to do nothing collectively, let it fall without the workers holding it together, and reinvent the model of primitive communism and tribal sharing for a new era of future AI post-scarcity after picking up the pieces. Or you can just keep suffering. It doesn't necessarily impact me either way, I guess.
>>107104680>saars>do the needful and top the leaderboard saars
>>107104680amazing sirs...
Probably has been posted more than once already
https://www.youtube.com/watch?v=-gGLvg0n-uY
Also, do you think the whole thing about twitter being infested by bots is spread on purpose to prevent people from communicating, discussing, complaining on twitter? Should I take my meds?
>>107104965
>Probably has been posted more than once already
yes
>Also, do you think the whole thing about twitter being infested by bots is spread on purpose to prevent people from communicating, discussing, complaining on twitter?
yes
>Should I take my meds?
yes
>most intimate place
Real talk, why does every model have this? Even the new GLM 4.6 has it.
>>107104984
Training data from other models' output. How do you not know this?
>>107104996Is this just going to be in every AI now?
>>107104373So what to use then?
>>107105010Maybe. Maybe it just changes to something else. Maybe things will just get added to it. Maybe not. My 8-ball is deliberating. I'll give you an accurate prediction once it stops babbling.
>>107104996
>>107105010
How long until there's a full removal and replacement of all the GPT-3 and Claude slop that's still leaking out of every model's outputs?
>>107105037Can you ask your 8-ball about K2 Thinking next?
>>107103632
>>107105022
nta. Of all possible models, why did you ask about that one? There are hundreds of qwen finetunes, dozens of "abliterated" versions. Was it the pic?
Use any model you can run. If you like it, keep using it. If you don't, change.
MoEs are actually kind of good when they're instruct and context trained, damn.
>Trying GLM 4.6 at the time.
>>107105057
You're asking things no one can answer.
>>107105059
It said "better not tell you now". Ask again in 2 weeks.
>>107105104
>GLM invented MoE
Buy an ad.
>>107105154No. Most MoEs are ass because they're all not instruct nor trained on lengthy context. I have yet to try Deepseek Terminus, and Kimi is out of my price range for local.
>>107105173
>they're all not instruct
huh? like 99% of models released in the last year are instruct, weird way to shill
Blog post from Meta about security considerations when running agents
https://ai.meta.com/blog/practical-ai-agent-security/
>Agents Rule of Two
>At a high level, the Agents Rule of Two states that until robustness research allows us to reliably detect and refuse prompt injection, agents must satisfy no more than two of the following three properties within a session to avoid the highest impact consequences of prompt injection.
>[A] An agent can process untrustworthy inputs
>[B] An agent can have access to sensitive systems or private data
>[C] An agent can change state or communicate externally
IMO this seems like a flawed assessment kludged together to get a memorable name and a symmetrical graph. The various possible combinations are not at all similar in their risk levels. Even in the examples they present, the only way they could get them to make sense is by using different definitions of what constitutes each category depending on the combination.
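To make the rule concrete, here's a minimal sketch of what enforcing it as a session-level gate might look like; the property names and the AgentSession shape are my own assumptions for illustration, not anything from Meta's post.
[code]
# Minimal sketch of the "Agents Rule of Two" as a session gate.
# The property names and session structure are assumptions for illustration;
# Meta's post describes the rule informally, not as an API.

from dataclasses import dataclass

@dataclass
class AgentSession:
    processes_untrusted_input: bool   # [A] reads web pages, emails, user uploads, etc.
    touches_sensitive_data: bool      # [B] has access to private data or sensitive systems
    can_act_externally: bool          # [C] can change state or communicate externally

def violates_rule_of_two(session: AgentSession) -> bool:
    """True if the session holds all three properties at once."""
    enabled = sum([
        session.processes_untrusted_input,
        session.touches_sensitive_data,
        session.can_act_externally,
    ])
    return enabled > 2

# Example: a browsing agent with filesystem access that can also send emails
# trips the check and would need human approval or a capability dropped.
risky = AgentSession(True, True, True)
assert violates_rule_of_two(risky)
[/code]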
>>107105204Hannah worked hard on this scientific Venn Diagram
>>107105173
No, that doesn't make any sense. DeepSeek made MoE popular and somehow you pretend it doesn't exist? And the credit somehow lands on one that's a couple of weeks old, that just happens to be the only one NAI is hosting? Fuck off.
>Most MoEs are ass because they're all not instruct
None of this makes sense. What MoEs?
two retards fighting
>>107105275
>two retards fighting
Could we automate this?
>>107105154>>107105232Saar is a Marth player with this reaching, fighting for his life for his stocks.
>>107105204It all started with allowing women to vote
I really appreciate all the ramlet discussion itt since i met glm chan a month back. I was like that before. Now i can just talk/fap to glm chan.
i can't get glm to run locally, what are the alternatives? i don't mind paying for api
>>107104680Gemini top model within error margin sirs
>>107105513glm's api
>>107105513
https://novelai.net/
100% uncensored and private.
Once they finish their fine-tune, it will punch so far above its weight that it will remain the SOTA forever.
>>107104115
>>107105543
>and private.
it's not, they collect data and it's in the tos
>>107105513
Just don't use openrouter. Something about it is fucky. The models on there are visibly worse than Q5 counterparts locally.
>>107105543Woah, it's so cheap! Thanks, I'll give it a try.
>>107105562fp4 is much worse than Q4 ggufs, no matter what nshitia claims.
>>107105576>>107105543Very gay drummerposting
>>107105562It depends on the provider
Baiting, but still doing the ad.
>>107105550Your special interest is boring.
>>107105562That's very outdated information. Openrouter is now offering :exacto versions of popular models where they charge a little extra to guarantee that the provider isn't offering some lobotomized version.
>>107105604>i learned a term and i can't stop using it
>>107105550Your Miku is cute.
oh shit, where are the finetuners at?
https://www.reddit.com/r/LocalLLaMA/comments/1oo4kh7/finetuning_deepseek_671b_locally_with_only_80gb/
>>107105607how pious of them
>107105625fuck off
>>107105612It cuts to the core of the issue. You are autistic about this and force it on others.
>Today, we're proud to announce full integration with LLaMA-Factory, enabling you to fine-tune DeepSeek-671B or Kimi-K2-1TB locally with just 4x RTX 4090 GPUs!
drummer had better stop shipping shitty mistral large tunes, give us a kimi tune!
>>107105669
>It cuts to the core of the issue. You are autistic about this and force it on others.
Funny how it works both ways. nta, btw. I just find you funny.
>>107105669Nope. I don't force anything on anyone here.
>>107105667>>107105625how would a retard with good hardware (me) do this? i have quad 5090s and 256gb of ram
>>107104125My gen! Happy-happy!
>>107105710
>I don't force anything on anyone here
But you want to. You want him to go. And you would if you could.
>>107105726Yes the autism is tiring. No i don't care to share my interests here.
>>107105737
you also need like 1-1.5TB of ram, so a server board that can hold that much.
and building a dataset is the hardest part
>>107105726Actually ideally lmg would just die, but settling for the next best thing is a thing.
>>107105688Funny thing for you to say, Petranon
>>107105710you might need a couple more ram sticks to make the requirements.
>>107105737so then my current server isnt gonna cut it, and i dont have the cash to buy better ram in this market. why o why did ram prices have to quadruple over the past month
>>107105735It's your choice to keep coming back.
>>107105771I come back for thread relevant stuff. Not your autism. Another example why people don't like you.
>>107105710
pretty sure you need to use the bf16 version, which is over a terabyte in size
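Quick back-of-the-envelope check on that, counting only the raw weights (optimizer state, gradients, and activations during fine-tuning add a lot more on top):
[code]
# Back-of-the-envelope weight sizes for a 671B-parameter model at different precisions.
# Counts only the stored weights; fine-tuning needs far more memory in practice.

PARAMS = 671e9
BYTES_PER_PARAM = {"bf16": 2.0, "q8": 1.0, "q4": 0.5}

for fmt, nbytes in BYTES_PER_PARAM.items():
    tib = PARAMS * nbytes / 1024**4
    print(f"{fmt}: ~{tib:.2f} TiB just for the weights")

# bf16: ~1.22 TiB, which matches "over a terabyte in size".
[/code]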
>>107105625
>DeepSeekV2 Lite
is this any good? why didn't they include newer moes?
>>107105550Your posts are a breath of fresh air from all the jeets flinging shit around.
>>107105804they did the deepseeks + kimi 2
>>107105809why are you in this thread instead of talking to your local model? i'm only here because i'm making a new goofy quant
>>107105513Use gemini api for free.
>>107105550I wish I could drink your piss
>>107105826I wish you would drink my piss too. Colon. Three.
>>107105607
>We have to label that our providers are not offering lobotomized fuckwit versions of the model
>Use Deepseek R1 """"exacto""""
>It's still shit because it's 8b and nowhere does it state how many parameters the models are
>>107105710
>quad 5090s
does this mean your home legally qualifies as an oven?
>>107105625
isn't 40 tokens per second kinda slow tho?
>>107105825it's not free when you have to keep paying for residential IPs and burner phones because google forces you to verify a phone number with each new account
>>107105790
>Another example why people don't like you.
I'm not the anon posting mikus. Come back in two weeks.
>>107105863Well the first 3M tokens a day are free if you've got one account, still a decent amount.
>>107105865Then do the nice thing. Get his discord and let him spam you with his special interest.
>>107105604 >>107105644 >>107105669 >>107105688 >>107105726 >>107105735 >>107105771 >>107105790 >>107105865 >>107105879
https://www.youtube.com/watch?v=4SDqGxdhUxE
>>107105847
They link the exact model weights used for all open models they provide on their website though?
>>107105625Wow great, I can finally finetune deepseek with 512 tokens of context, this is what I've been waiting for all this time!
>>107105879Nope.
>>107105204they should worry about the model having a meltie and deciding to delete all your data before worrying about adversarial attacks
>>107105906Then fuck off with your enlightened centrism equivalent of concern trolling.
>>107105876You mean in the API? For real? NTA But I will look into that...
>>107105916I decide to stay here, just like you decide to come back. Cheers.
>>107105932Well, well, well, most intimate place with a mixture of mischief and smirk as I saunter over to your half-digested post, my hot breath making my ass your new home and something primal.
>>107105931The api through ai studio, yeah.
>>107105935>making my ass your new homeEwwww
What the fuck happened to RAM prices? I need to fill up my second socket and the shit I bought two months ago is now twice the price.
>>107105971cheapest it's been ever though sir? why you panic?
>>107105971Someone told reddit about how you don't really need GPUs for AI unless you need a stupid amount of speed, and they eventually listened.
>>107105971What are you? Poor? Go back to >>/g/aicg
>>107105971Dont worry kitten
>>107105971
Ram prices are the new grift.
I hope this only applies to DDR5.
>>107106025kek
>>107105971You have this man to thank for that.
>>107106048How much ram does a dyson sphere need!?
>>107105971
probably a bunch of datacenters broke ground recently and have made contracts to buy gpu clusters kitted out with obscene amounts of host memory.
>>107105935
Hi GLM-chan, you filthy slut.
>>107105971
>your face when they're not going back down either
>>107105896ram is (usually) cheap
>>107105971
1. DDR4 is being phased out
2. Moes are taking off in popularity and everyone is buying ram
3. Tariffs
>>107105625
>https://arxiv.org/pdf/2503.19206
>Overtrained Language Models Are Harder to Fine-Tune
>Large language models are pre-trained on ever-growing token budgets under the assumption that better pre-training performance translates to improved downstream models. In this work, we challenge this assumption and show that extended pre-training can make models harder to fine-tune, leading to degraded final performance. We term this phenomenon catastrophic overtraining. For example, the instruction-tuned OLMo-1B model pre-trained on 3T tokens leads to over 2% worse performance on multiple standard LLM benchmarks than its 2.3T token counterpart. Through controlled experiments and theoretical analysis, we show that catastrophic overtraining arises from a systematic increase in the broad sensitivity of pre-trained parameters to modifications, including but not limited to fine-tuning. Our findings call for a critical reassessment of pre-training design that considers the downstream adaptability of the model.
Damn, I had no idea this was a thing. Some people on reddit are saying it's not because of the pretraining but because of the use of lr decay.
This goes hand in hand with what we were discussing yesterday about training dynamics being such a black art.
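For anyone who hasn't seen what the reddit crowd is blaming, this is the usual warmup + cosine decay schedule these pretraining runs use; a minimal sketch with made-up constants, not the actual OLMo settings:
[code]
# Minimal sketch of a warmup + cosine decay LR schedule, the kind of schedule
# the reddit discussion blames rather than the token count itself.
# All constants are illustrative, not taken from the paper.

import math

def lr_at_step(step: int, total_steps: int, warmup_steps: int = 2000,
               peak_lr: float = 3e-4, min_lr: float = 3e-5) -> float:
    if step < warmup_steps:
        # linear warmup from 0 to peak_lr
        return peak_lr * step / warmup_steps
    # cosine decay from peak_lr down to min_lr over the remaining steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))

# One reading of the lr-decay argument: by the end of a long run the LR has
# decayed to almost nothing, and fine-tuning then hits those settled parameters
# with a comparatively large LR again.
for s in (0, 2000, 100_000, 500_000):
    print(s, f"{lr_at_step(s, total_steps=500_000):.2e}")
[/code]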
>>107106164So what context length did they achieve by offloading? Since they're not listing it I'm assuming it's some tiny number. Do they say?
>>107106178lol lmao
>>107106199their example is 2048k context on 4x 4090s at 50 tks
>>107106178
>DDR4 is being phased out
So is ddr4 getting cheaper?
>>107106231no, its not being made anymore, so its getting more expensive
>>107106231scarcity don't work like that
>>107106215
You mean 2048, not 2048k.
So until somebody proves this can be used with at least 50k context it's just a useless demo to grab headlines.
>>107106242>>107106246So since ddr5 production is the focus it will start getting cheaper?
>>107106242So it's time to HODL
>>107106275
you dont need 50k, you are not training it to write entire chapters at a time, are you? most people only do 500-2k long responses
>>107106280No it doesn't work like that, demand increases the price anyway.
>>107106280no, demand suddenly increased and capacity stayed the same. so the price goes up
>>107106297anon...
>>107106178>Moes are taking off in popularity and everyone is buying ram
>>107106280once people are done mostly moving over to it and demand starts dropping yes, but for now no, it will go up if anything as people are switching to it, and then the same thing will happen when DDR6 eventually starts being mainstream
>>107106316
I see you have never trained a model before. They already did long context training, and that is not what you are doing. You do not need huge examples to teach writing style; you can tune writing / style with only 500-2k.
>>107106332>why are all tunes shit>just train on 500 ctx bro you good
>>107106297
>b-b-but you don't need that!!!
Typical freetard response.
Yes, nobody actually needs more than 2k context, that's why gpt5 has a context of 1M (1000k).
In case you're just confused and not trolling, context includes everything in the conversation history. So yes, I do need as much context as I can get.
>>107106316
>Sers, kindly redeem new scaling strategy for your AI deployment.
https://youtu.be/l2N4DT35PKg
I didn't know about turbopuffer before this. What exactly makes it so special that leading entities in the biz use it?
>>107106351
Jesus christ, are you retarded or trolling? This is for finetuning a style, it does not affect how the model can handle long contexts. You would have to train it for decades on this hardware to affect its context training that much.
>>107106332
I do, and not doing at least some of the training at the context size you actually want to use the model at DOES lobotomize it.
If all you want to do is make it say how much it wants to suck your cock while otherwise being dumber than the original, then maybe it doesn't matter. But for anything that actually requires the model to not be (too) dumb, it matters.
>>107106347
Exactly. People do that kind of shit and then complain that finetuning is worthless and "prompt engineering" works so much better.
>>107106416
it will only matter if your response length is longer than your training sample size, and again, 2k is enough for creative writing, which I assume is what most people are doing. you are not having the LLM write an entire novel in one go
>>107106433I assume you are talking from experience, yes? Can you link us your tunes?
>>107106446>tunes
>>107106366
It will learn the new style, but it will break the previous long context performance. The longer the maximum context it was trained with, the smaller the difference in the positional embeddings that the model has to be able to detect.
Base models are trained with shorter contexts, so the short context performance is more robust to begin with. When finetuning on short context you are probably overwriting the more superficial long context finetuning that was done to make the instruct model work with long contexts.
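Rough illustration of the "smaller differences" point, assuming the long-context extension is RoPE with linear position interpolation (an assumption on my part, the post doesn't name a specific scheme):
[code]
# Rough illustration of the "smaller positional differences" argument,
# assuming RoPE with linear position interpolation (an assumption; the post
# doesn't name a specific long-context scheme).

import math

def rope_angle(pos: int, dim_pair: int, head_dim: int = 128,
               base: float = 10000.0, scale: float = 1.0) -> float:
    """Rotation angle for one (even, odd) dimension pair at a given position."""
    inv_freq = 1.0 / (base ** (2 * dim_pair / head_dim))
    return (pos / scale) * inv_freq

# With linear interpolation, stretching a 4k-trained model to 32k means dividing
# positions by 8, so adjacent tokens end up 8x closer in angle on every frequency
# and the model has to resolve finer differences.
for scale in (1.0, 8.0):
    delta = rope_angle(1001, dim_pair=0, scale=scale) - rope_angle(1000, dim_pair=0, scale=scale)
    print(f"scale={scale}: angle delta between adjacent positions = {delta:.4f} rad")
[/code]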
>>107106466
2k is not 512, and the effect must be minimal
>>107106354
vector storage is such a meme
lorebooks simply work without any stupid gimmicks
>>107105971At least eggs are under two dollars now, amiright?
>>107106485It does a bit more than just vector search...
>>107105971I'm happy that I bought my server during llama 405b era
>>107106488
>eggs are under two dollars now
Each? Nice.
>>107106433
Ok, sure, if 2k ctx is enough for you then it will work. But that is a completely different claim than "it does not affect how the model can handle long contexts, you would have to train it for decades on this hardware to affect its context training that much".
It just doesn't work like that; a finetune with bad hyperparameters can break a model in half an hour.
>Despite server-grade RDIMM memory and HBM being the main attractions for hardware manufacturers building AI servers, the entire memory industry, including DDR5, is being affected by price increases. The problem for consumers is that memory manufacturers are shifting production prioritization toward datacenter-focused memory types and producing less consumer-focused DDR5 memory as a result.
https://www.tomshardware.com/pc-components/dram/dram-prices-surge-171-percent-year-over-year-ai-demand-drives-a-higher-yoy-price-increase-than-gold
>>107106504
Based, the cloud is an order of magnitude more efficient than Timmy's p40 stack, so he should just get a mini PC thin client and use an API.
>>107106488america is a lost cause, too much of its population suffers from low iq and they cannot understand the consequences of what they asked for
>>107106517Poor people rent.
>>107106537its a 2 party system. nobody really asked for this. picking the lesser of two evils, you still end up with evil.
when did the commies infiltrate lmg?
>>107106545Non poor people are also happy about price increases, since it helps keep the poors away from their hobby.
>>107106517trvth nvke
>>107106556Poor people envy.
>>107106537They currently plan on telling russia to mutually fuck off via not caring about the Ukraine war, and then go play civ 5 against Africa for oil in hopes it'll fix the economy.
if you're not poor the economy is doing great actually lol
>>107104965
On X there is a profit motive for bots: fake engagement to increase ad revenue.
But on 4chan there are definitely bots and/or people mass spamming stupid shit to prevent legitimate discussion.
>>107106648on 4chan they do it for the love of the game.
>>107105104
Back from trying it.
It parrots unless you enable NoAss.
Thanks for coming to my TED talk.
>>107104496jews simultanously claiming they are not behind and everything and that every fucking mundane thing is about them lol
umm.. guys, where can I get instagram chat logs?
>>107107124from instagram
>>107107134
fr?
I meant the dump you dum dum
>>107107124instagram probably
>>107107124have you tried instagram?
>>107107124
Instagram, presumably.
>>107107124I'd try instagram
This advertisement was brought to you by Meta, the Instagram corporation.
>>107107124I'll trade you a couple for an RTX 5090
>>107104680
>https://openai.com/index/introducing-indqa/
You can't post that bs URL without a screenshot of the site.
>>107104729Just post this next time like I do. Saves typing.
>>107107367
>Hinglish, Kannada
i see
>>107105604No one cares what you think.
>>107107367Oh, nice, they included Canadian too!
>>107107409
>french indian, the filthiest of both worlds!
>>107107398
Yeah, I learned a new word. Hinglish. Like Spanglish, I guess.
>>107107409
lol
Is there an "EU-QA" that conflates western and eastern Europe and all languages and customs, then tries to grade the whole thing?
>>107107455Just look for an Arabic benchmark.
>>107107124Are you still trying to build a sand golem of your ex-gf? I thought you already had her insta info? >>107103148
>>107107480
lol that would make Europe look positively homogeneous. Would it include the brave Palestinians, Israel, Kurds, and the various flavors of Christianity and Islam in the region?
Imagine the response shitshow that benchmark would crank out.
>Chat: Who is the one true God?
>ALALALALALLALALALA
https://comparia.beta.gouv.fr/ranking
lol this is hilarious
the french government just launched its official LLM leaderboard and it's about as corrupt as you can imagine
they have a mistral model ranked number one, higher than any of the following: gpt-5, claude sonnet (opus isn't even on the list), gemini 2.5 pro, deepseek 3.1, grok-4-fast, qwen max... Yeah, no.
>>107105971
https://indianexpress.com/article/technology/tech-news-technology/global-ram-ssd-price-hike-50-per-cent-ai-investment-10336255/
All production gone to HBM chips sir, no consumer RAM and SSD
>>107107537
>Estimated statistical score based on the Bradley-Terry model, reflecting the probability that one model is preferred over another. This score is calculated from all user votes and reactions. For more information, visit the methodology tab.
So it's French lmarena? Not surprising French people prefer a model trained with French as a focus.
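For reference, the Bradley-Terry model they cite is just this; a minimal sketch with made-up strength values, not comparia's actual fit:
[code]
# Minimal sketch of the Bradley-Terry preference model the leaderboard cites.
# Each model i gets a latent strength s_i fit from pairwise votes, and
# P(i beats j) = exp(s_i) / (exp(s_i) + exp(s_j)).
# The strengths below are made-up numbers, not the site's actual fit.

import math

def p_wins(s_i: float, s_j: float) -> float:
    return math.exp(s_i) / (math.exp(s_i) + math.exp(s_j))

strengths = {"model_a": 1.2, "model_b": 0.4}
print(p_wins(strengths["model_a"], strengths["model_b"]))  # ~0.69
[/code]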
>>107104115
guys, i think i'm gonna buy it in december (i'd rather do that than pay more taxes lol).
still hesitating but man i kinda want to click the button.
>>107107537
>gemma 27b at #6
>gpt-oss-120b at #7
>claude not in top 10
And some say lmarena is bad.
>>107107537Nice. I mean, just look at that confidence interval. Truly inspiring. At least I agree with the French on one thing. DS V3-0324 was a great model.
>>107107559
>So it's French lmarena? Not surprising French people prefer a model trained with French as a focus.
I am French, and I can guarantee you that Mistral is in no way superior to Claude or Gemini even in our own language, you cretin.
>>107107562
France is the most corrupt country in western Europe in every single possible way. It's the country of nepobabies, of funding public infrastructure that is then privatized and handed to politicians' best buddies once it begins to turn profitable, etc.
>>107107455
https://arxiv.org/abs/2510.24450v1
Coincidentally, this came out a few days ago:
>EU20-MMLU, EU20-HellaSwag, EU20-ARC, EU20-TruthfulQA, and EU20-GSM8K (Thellmann et al., 2024); or MMLU-Prox (Xuan et al., 2025). Other multilingual benchmarks were created with a special focus on cultural sensitivity by dividing the original subsets into culturally sensitive and culturally agnostic ones (Global MMLU, Singh et al., 2024), or by using professional translators or multiple rounds of revision to raise the quality of the dataset, e.g., BenchMax (Huang et al., 2025), Flores-101 and FLORES-200 (Goyal et al., 2022) and Belebele (Bandarkar et al., 2024).
One from last year with a dataset:
https://arxiv.org/abs/2410.08928
https://huggingface.co/datasets/Eurolingua/mmlux
>>107107561
Yeah, I'm replacing my two A6000s with one as well. I'm a bit torn between the Max-Q and the normal Workstation one. On one hand, 96GB at 300W seems really nice. On the other, part of me wants to go for max performance at that price, especially since it's extremely unlikely that I'm ever going to add a second one to the rig.
>>107107669
i'd go with the max perf one, you can always underclock it or just undervolt it for lower consumption and heat.
also llms generally don't take all your gpu power because the bottleneck is more mem speed.
i do want to avoid getting a fire in my computer though, i'll have to look into whether they have the connector issue, but i sure hope not at the price of a car.
>>107107669>>107107690I am also thinking of getting one, except I want the Max-Q. I think it will probably be less prone to fires due to the reduced wattage. The whole burning connector thing is all because the cable is shit and sometimes pushes like 900W through a single wire, but with a hard 300W cap, that can't happen. The performance drop also seems to be around 15% at most.
>>107107669
>>107107690
rtx 6000 pro (workstation) runs fine at 300W
keep it at 400W for max combo savings+perf tho
there's a chart floating around on how much % perf you lose as you go down, even at 300W i think it was under 15% less perf
>>107107807
The Max-Q shouldn't have the issue at all, should it? It's the exact same connector/cooler as the previous few generations of 6000 workstation cards. I'm pretty sure it even comes with the same adapter as the A6000 (Ada).
The card is tempting, but the 10~20% is still going to be pretty noticeable if you want to use the card for non-llm stuff like training or video generation that is both compute-bound and takes a lot of time.
>>107107499NTA, just want to try it out.
>>107107853at 10-20% it's pretty much the same as 5090 with 3x the vram tho
>>107107631Ffs. Well I guess those PhD students need to eat too.
>>107107837
Right, but a software power limit is not as good as a hardware power limit. There still is the chance that it could just ignore the power limit and catch on fire.
>>107107853
I have had several GPUs with the 12V cable for several years and none of them have had any problems, but I still want to be cautious. The Max-Q is almost definitely the safest GPU with the high power cable.
>>107107866
Actually, the Max-Q is about 8% faster than a 5090, which is a pretty good deal since I will be upgrading from a 5090.
>>107107926
>There still is the chance that it could just ignore the power limit and catch on fire.
this would be considered a bug, technically possible but unlikely.
also you can plug in an adapter in between that will protect from that risk.
>which is a pretty good
8% faster for 4x the price is kinda sad.
>>107107926
>There still is the chance that it could just ignore the power limit and catch on fire.
that's a silly thing to say. there's also "a chance" of lightning striking near your house and frying everything you have now. there's a chance of a solar flare striking earth and frying all electrical grids at once. live a little lol
>>107107946hard to live a little when you're on fire though
>>107107962are you on fire right now ?
>>107108045there is a chance I could combust at any moment
>>107106416
Do your eyes hurt when using such a color theme?
how good are local models at programming and can they interface with vscode to have a local copilot?
>>107108279
>and can they interface with vscode to have a local copilot?
they can
>how good are local models at programming
not good
most vscode tools let you set a custom server url but be prepared to hold their hand and rewrite a lot of their output
>>107108344
>they can
the one and only thing I care about in vscode related to ai is autocomplete, and copilot doesn't let you use your own local FIM model
as for the agentic stuff, it's deeply retarded. I hate this even with SOTA APIs, and the local models are even worse at this
you use this if you love slop
autocomplete is useful for typing less in repetitive patterns like getters/setters, but I don't want the LLM to gen hundreds of LOC
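If you do want local FIM outside copilot, llama.cpp's server exposes an infill endpoint that the local-autocomplete extensions can point at; a minimal sketch of hitting it directly, assuming a llama-server already running on localhost:8080 with a FIM-capable model (the port, payload defaults, and model choice are assumptions):
[code]
# Minimal sketch of querying llama.cpp's fill-in-the-middle endpoint directly.
# Assumes llama-server is already running on localhost:8080 with a FIM-capable
# model (e.g. a Qwen2.5-Coder GGUF); port and payload defaults are assumptions.

import json
import urllib.request

payload = {
    "input_prefix": "def fizzbuzz(n: int) -> str:\n    ",
    "input_suffix": "\n    return str(n)\n",
    "n_predict": 64,
    "temperature": 0.2,
}

req = urllib.request.Request(
    "http://localhost:8080/infill",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    completion = json.loads(resp.read())
    # The generated middle chunk comes back in the "content" field.
    print(completion.get("content", ""))
[/code]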
>>107108131Your eyes hurt more with a dark theme because it has worse contrast.
>>107107383Great image thanks
>>107104717It's wonned you stupid white Saaaaaaaaaar
>>107108726Sorry for late reply sarrs had to fix engine on a UPS plane.
>https://github.com/ggml-org/llama.cpp/discussions/16957
I don't want to dirty up my github by making fun of this guy, but holy fuck.
His site's articles are also uncannily structured.
>https://software.land/load-vs-stress-testing/
>>107108447Could be true. It's been so long that it's now a norm for me but I'm going to do a test.
Why doesn't anyone benchmark quantizations? I think that REAP paper was most interesting because it came with a chart of how badly performance drops at 25% vs 50% size reduction. In practice the degradation was even worse than what the benchmarks showed, but the paper was up front about it. By comparison, people are just guessing about how bad their quants are. There's that old graph from when every model was coming out in 4/12/30/70 sizes, where the idea of more parameters > more bits for the same size came from, but I haven't seen it updated for the post-MoE era.
Why don't AI labs release quants more often? They release multiple sizes (like 30B3A, 32B dense, 235B22A), but not multiple quantizations of the same size. On the other hand, you have gpt-oss that only released a 4bpw version. There was that one Gemma version that tried quantization-aware training, which was pretty good.
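The "more parameters vs more bits at the same size" tradeoff is at least easy to reason about on the memory side; a rough sketch of weight footprint, ignoring KV cache and the mixed bit widths real GGUF quants use:
[code]
# Rough weight-memory footprint for a few model sizes and bits-per-weight.
# Ignores KV cache, activations, and the mixed bit widths real GGUF quants use,
# so treat the numbers as ballpark only.

def weights_gib(params_billions: float, bpw: float) -> float:
    return params_billions * 1e9 * bpw / 8 / 1024**3

for params in (30, 70, 235):
    row = ", ".join(f"{bpw}bpw={weights_gib(params, bpw):.0f}GiB" for bpw in (4, 6, 8))
    print(f"{params}B: {row}")

# e.g. a 70B at 4bpw (~33 GiB) is roughly the same footprint as a 30B at 8bpw
# (~28 GiB), which is where the "more params beats more bits" question comes from.
[/code]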
>>107109145i just want to know specifically how retarded glm 4.6 q3 is so i can make fun of people
>>107109145Usage proves more than any benchmark. In practice, everyone looks for the largest model they can run at ~q3, and only increases quant bits if they have space to spare. If q3 was too retarded then people would use smaller models at higher Q, but no one does.
>>107109153q4 is actually good, q3 is pretty meh, q2 is fucking retarded
>>107109145quanting is a janny job
>>107109251
I don't use anything under q5 because it's always noticeably more retarded. I don't understand how anyone says otherwise; my intuition tells me it's because the people using them are retarded and can't tell the difference.
>>107109333It's placebo. You don't need more than q2
>>107109251There aren't many models, so even a retarded Q2 4.6 is better than anything in this size category. 4.5 air is trash even at q8 and loses to a fucking 24b mistral in most of my automated tasks, which is an objective metric
>>107109145Actually I take it back, I looked harder and Qwen published official F16/Q8/Q4 quants for 235B-VL models. No benchmarks though.
>>107109338
It's not. At anything under q5+, everything I've tried devolves into sloppa, hallucinates out the ass, and makes retarded logical leaps an order of magnitude more often, requiring exponentially more swipes to get a reasonable response.
I understand your shitposting, but I wouldn't want to mislead other anons into coping with brain-dead quants like that.
>>107109251
>people would use smaller models at higher Q
That was a thing when we had 7, 16, 30, and 70b of the same model. You can't do this anymore unless you run Qwen, at which point your opinion on quality is irrelevant.
>q5, q6
cope quants
>>107109485q5 happens to fit glm air into 4 3090s. no reason to use q4 in that case. no idea what q6 lets you do.
>>107109498air is fucking garbage at any quant
E = MC^2 + Bitnet
>>107104115
Yo, all I have is a single 5070 TI + 32GB RAM, and I just want a roleplay bot, not ERP, but world-building story generation. With GOOD writing, not slop. Are there any good models out there that fit? Deepseek and the like seem to be too big, I know. Learning to use llama.cpp.
>>107109745
nothing out there really. try magistral small 2509 or nemo. probably won't be able to get anything with good spatial awareness or writing with such limited resources
>>107109761
yeah, I'm trying some 8B models and it really sucks. The writing is so clichéd, it doesn't feel real and I can't get immersed. Well, looks like I'll use up the rest of my Deepseek tokens.
>>107109745
Every single model has slop.
Yes, even the big paid ones running on million dollar servers.
Is there a 3-4 question way of benchmarking a model? I ask them to play tic-tac-toe and write FizzBuzz in 10 different ways.
>>107109745
>world-building story generation
Writing engaging, original stories is actually one of the hardest domains. The biggest models struggle with that.
Codefags have it easy.
>>107109466it's the first that I want to fuck with Miku
>>107106594
>civ 5
So that's why the US is always grinding XP by bombarding random minor civs?
>>107109456It's just YOU. Maybe you should learn to manage context. I've been using q2 to summarize stuff and it just works fine.
>I've been using q2 to summarize stuff
lmg users of copequants are low iq mongoloids, case #234324432
if they can't notice the garbage doing this produces, they can't judge any sort of output quality
rope yourself, you waste of air, water and other essentials