/g/ - Technology

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106738470 & >>106729809

►News
>(09/30) GLM-4.6: Advanced Agentic, Reasoning and Coding Capabilities: https://z.ai/blog/glm-4.6
>(09/30) Sequential Diffusion Language Models released: https://hf.co/collections/OpenGVLab/sdlm-68ac82709d7c343ad36aa552
>(09/29) Ring-1T-preview released: https://hf.co/inclusionAI/Ring-1T-preview
>(09/29) DeepSeek-V3.2-Exp released: https://hf.co/collections/deepseek-ai/deepseek-v32-68da2f317324c70047c28f66
>(09/27) HunyuanVideo-Foley for video to audio released: https://hf.co/tencent/HunyuanVideo-Foley

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: no particular reason.jpg (306 KB, 1536x1536)
►Recent Highlights from the Previous Thread: >>106738470

--Paper: Sequential Diffusion Language Models:
>106743521 >106743844
--Papers:
>106743597 >106743650 >106744459
--Building multimodal RAG systems with morphik.ai and qwen3-vl:
>106738676 >106738891 >106739314 >106739374 >106739595 >106739732 >106739776 >106739975 >106740025 >106740155 >106740221 >106740302 >106740526 >106740708 >106740911 >106744029 >106744053 >106746008 >106739682 >106739744 >106739155
--GLM-4.6 model update with 200K token context window for enhanced agentic tasks:
>106744045 >106744056 >106744058 >106744346
--LoRA in RL matches full-finetuning performance with 2/3 resource usage:
>106740379
--Proposing a parameter to reduce formatting-driven repetition in model outputs:
>106745814 >106745845
--GLM-4.6 outperforms larger Deepseek despite fewer parameters:
>106746646 >106746877 >106747030 >106747076 >106747443
--Speculation about glm4.6 vision integration and existing model limitations:
>106742710 >106742714 >106742770
--VRAM pooling vs local execution with RTX 3090 24GB for LLMs:
>106739448 >106739546
--Lorebook limitations in bypassing model safeguards and context flooding risks:
>106739761 >106739782 >106739799 >106739840 >106739924 >106741168 >106741221 >106743857 >106741428 >106742363 >106742372 >106741468
--Hardware recommendations for running LLMs like Mistral Nemo GGUF with VRAM/RAM considerations:
>106746751 >106746759 >106746768 >106746787 >106746788 >106746962
--AI model refusal behavior and alignment training critique:
>106745464 >106745474 >106745520 >106745561 >106745611 >106745697 >106746211
--GLM-4.5 transformers compatibility update:
>106744567
--GLM 4.6 performance comparison with Qwen3-Coder using CC-Bench metrics:
>106743685 >106743785 >106743807
--Miku (free space):
>106739332 >106739379 >106739390 >106740160 >106745506

►Recent Highlight Posts from the Previous Thread: >>106738476

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
this thread is going to hell
>>
File: 1733491296330353.png (613 KB, 694x1238)
local lost
https://www.youtube.com/watch?v=gzneGhpXwjU
>>
>>106748605
it is mikutroon quality. you asked for this.
>>
>>106748605
this is hell, bloody basterd bitch
>>
How does the 8b qwen MoE compare to GLM air for ERP?
>>
File: file.png (16 KB, 330x512)
ooeeoo
>>
>>106748610
NGL it does look pretty impressive. But how much does it cost to use?
>>
File: saddaaasda.jpg (776 KB, 1012x4297)
needs a little love in terms of scene-to-scene consistency and manual touchups (none, it's SaaS, you take your credit-genned result and you like it)
visual quality is high in places but even the chair shapes are fucked, which was a surprise
>>
>>106748610
like it reproduced the Us Open court from scratch? wtf
>>
>>106748605
this thread will be alive again when AI eventually gets the ban hammer somewhere in the world
>>
>>106748655
omg it is migu
>>
>>106748674
sama's self insert!?
>>
>>106748610
this is one of the worst things I've ever seen. Literally who asked for an AI only social media platform? Why wouldn't people just upload their aislop to regular social media?
The character cameo feature will surely cause some PR incidents for openai. Teens will make lewd gens of their real schoolmates. Yeah openai probably have some safety filters in place but what if the original cameo recording is taken with a revealing outfit or the girl has massive tits?
The whole concept is retarded from conception.
>>
Thoughts on applying some noise in layers to make models more creative?
>>
>>106748605
This thread is the V4/R2 waiting room, and it has been a loooong wait.
>>
>>106748610
This is pretty much just on the level of Veo 3, which was released 5 months ago.
The 8-10 second duration threshold seems to be a hard problem, because you'd think we would have something longer by now.
>>
File: 1759254054173564.png (330 KB, 431x945)
>>106748610
>"our model is great at physics"
>not flowing out of the tap but out of his mouth
>tap levitating in the air
>nonsensical body movements that only models from a year ago had
lmao, what the fuck were openai doing all these months despite having sora 1 all that time ago? this is just brutal
>>
>>106748706
isn't that like just increasing the temperature but at the layer level?
>>
>>106748736
desu I never expected OpenAI to reach Google's level for video models, so that is impressive in itself
>>
JOHHHNNNN WHERE IS MY SEX?
>>
>>106748706
There's a project called DRUGS that does exactly that.
>>
>>106748752
Yeah, but that's usually applied at the output level only. I wonder what would happen if we applied it in other parts of the model.
>>
Re-suggesting an old suggestion for a new meme sampler:
Wiggle
A random rotation, within a specified range, is applied to each tensor at the time of inferencing.
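The suggestion above can be sketched in a few lines. This is a toy illustration only, not an implementation in any real inference engine; the function name, the degree-range parameter, and the choice of a single Givens rotation per tensor are all made up here. The idea: rotate a tensor by a small random angle in one randomly chosen coordinate plane, which perturbs it while (unlike additive noise) preserving its overall L2 norm.

```python
import numpy as np

def wiggle(tensor, max_degrees=1.0, rng=None):
    """Rotate `tensor` by a small random angle in one randomly chosen
    coordinate plane (a Givens rotation). Orthogonal, so the tensor's
    overall L2 norm is preserved."""
    rng = np.random.default_rng() if rng is None else rng
    flat = tensor.astype(np.float64).reshape(-1)
    i, j = rng.choice(flat.size, size=2, replace=False)
    theta = np.deg2rad(rng.uniform(-max_degrees, max_degrees))
    c, s = np.cos(theta), np.sin(theta)
    flat[i], flat[j] = c * flat[i] - s * flat[j], s * flat[i] + c * flat[j]
    return flat.reshape(tensor.shape)

w = np.arange(12, dtype=np.float64).reshape(3, 4)
w2 = wiggle(w, max_degrees=5.0, rng=np.random.default_rng(0))
# norm-preserving perturbation, unlike plain additive noise
```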
>>
>>106748751
>the skin texture is good
>the face likeness is great
I think it's the best model at reproducing someone else's behavior; for a company this focused on "safety", that's why they went so hard on it lol
>>
>>106748736
>>106748751
I don't understand how Sora could create one minute long videos in February of 2024 that looked better than this shit
>>
>>106748777
>one minute long videos in February of 2024
they weren't one minute long lol, those were short 5 sec clip videos
>>
ama getter shriller
>>
>>106748774
Good thing that you can't reliably make a model adhere to anything so I expect a bunch of celebrities cursing niggers by the end of today
>>
>>106748777
OpenAI was targeted by everyone for poaching. They ain't got no brains left.
>>
>>106748786
https://www.youtube.com/watch?v=tRSdt5kmeW0
>>
>>106748700
their idea is to contain AI slop spam on specific AI slop platforms. this serves the purpose of...fuck, I don't know...so they can train freely on all the user generated content without having to worry about copyright? create a containment zone for deepfakes that are all watermarked by their app? honestly I'm all for it. you ever thought about how forums and chans would look like if zoomers didnt have tiktok? so if this can achieve the same thing for low quality sloppers, i'm all for it
>>
>>106748798
>They ain't got no brains left.
Idk man, if they have no brains left, Sora 2 wouldn't look this decent
>>
how long until Google drops their next video model now?
>>
>>106748812
oh yeah that's true, my b, maybe they just continued from the last frame to continue the video or some shit, we never had the occasion to get more than 10 sec video from any API model so far
>>
>>106748767
Probably wouldn't be too different from what happens in models with mixed quantization.
>>
>>106748714
>he still believes in deepseek after their recent releases
it's over, just accept it anon. ZAI are our new saviors now
>>
>>106748706
>>106748762
https://github.com/EGjoni/DRUGS
This guy explored the topic
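The basic idea can be sketched like this. This is a hedged numpy toy, not DRUGS's actual implementation (which hooks into the transformer's internal hidden states): add Gaussian noise to a layer's output activations, scaled to their magnitude, so randomness enters mid-network instead of only at the final sampling step.

```python
import numpy as np

def noisy_forward(hidden, layer_fn, sigma=0.05, rng=None):
    """Run one layer, then perturb its output hidden states with
    Gaussian noise scaled to the mean activation magnitude.
    A toy stand-in for per-layer noise injection."""
    rng = np.random.default_rng() if rng is None else rng
    out = layer_fn(hidden)
    scale = sigma * np.abs(out).mean()
    return out + scale * rng.standard_normal(out.shape)

# toy "layer": a fixed random projection followed by tanh
rng = np.random.default_rng(42)
weights = rng.standard_normal((16, 16)) / 4.0
layer = lambda h: np.tanh(h @ weights)

h = rng.standard_normal((1, 16))
clean = layer(h)
noisy = noisy_forward(h, layer, sigma=0.05, rng=rng)
```

Scaling the noise to the activation magnitude keeps the perturbation proportionate per layer, which is the main difference from just raising temperature at the logits.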
>>
has anyone tried 4.6 yet?
>>
>>106748610
this is actually pretty good
https://xcancel.com/GabrielPeterss4/status/1973071380842229781#m
>>
File: .png (447 KB, 466x797)
>>106748902
>putting an end to the uncanny ai vibes
KEK
>>
>>106748932
desu I never seen the real Sam Altman smile so maybe it looks like that kek
>>
>>106748630
Sounds reminiscent of the cancer vocals AI songs have. I really do hate it.
>>
>>106748902
YUCK
>>
>>106748950
Suno has a V5 now. The vocals sound pretty damn good, although it suffers from serious same-songiness now.
>>
>>106748841
this
and glm also fits on a lot more consumer PCs
>>
>>106749095
Exactly. GLM hasn't done anything new except repackage DS for poorfags. If you want something new you need to wait for DS, otherwise it's just incremental improvements forever.
>>
Was googles ai mode model swapped out? It seems more bing level retarded the last few days
>>
>>106748655
yeee
https://www.youtube.com/watch?v=3kV_xkoDI7c
>>
>>106749150
Absolutely false statement. GLM-4.5 Air is for poorfags and has middling performance, GLM-4.5 is a full on replacement for Deepseek that's better in every way. It's not as schizo as the originals or as censored as the new ones. 4.6 might be a complete Sonnet replacement once GGUFs come out.
Ubergarm please do your job
>>
>>106749219
Not noticed any differences when comparing a bunch of items. (That's all I've been using it for.)
It does occasionally fail to immediately tie a comment back to chat history, and you have to prompt it in a follow-up comment.
But that was a problem earlier too.
>>
File: hatsune chiizu.png (8 KB, 512x512)
>>
>>106749290
>GLM-4.5 is a full on replacement for Deepseek
It's a sidegrade. They have done nothing significant to improve on what Deepseek already released.
>>
>>106749000
Oh God. This is so fucking good. There's no way local music gen is ever catching up to this. Prompt done by ChatGPT-5 (there was some back and forth feedback to calibrate things, so don't expect same results)
https://suno.com/s/Mpq3Bb34loTIgq2D
>>
new 4.6 glm is fucking crazy good btw, they did some black magic that only claude opus did before
>>
>>106749320
Better attention, more response variety, and half the size while maintaining a similar knowledge base. So far all that Deepseek has managed is to calm the R1 schizophrenia and cull the interesting quirks that made it a good writer. The last good DS release was V3-0324 because of that.
IMO DS has become completely irrelevant at this moment. K2-0905 has its own flaws but its knowledge is unmatched, and GLM-4.5 has more creative variety while remaining coherent and almost half the size of DS.
>>106749406
How did you manage to use it? OR provider is down
>>
>>106749428
https://huggingface.co/zai-org/GLM-4.6
>>
>>106749428
>and half the size
>and almost half the size of DS.
Exactly, that's all it has going for it. Which is great for poorfags, but people are hoping for another leap from DS. Holding their experimental point releases against them is stupid. If you're lucky they might even put out a Lite version for their next series.
>>
>>106748610
TL;DW
>>
>>106749406
Can fucking confirm. Creative, no repetition, no slop. They literally made a footnote of "improved roleplay" and fixed everything, something the community has been failing to fix for 3 years.
>>
>>106749479
at best they'll do another round of distill shits
>>
>>106749479
DS3 is now retarded in comparison and does not know more any more, try it
>>
>>106749400
>https://suno.com/s/Mpq3Bb34loTIgq2D
I agree, this is awesome.
but.. it's also proprietary.
so what use is it to anyone, really?
>>
>>106749506
Bro they saved so much money tho who cares?
>>
>>106749400
breddy gud
crazy how ai is like a meta-tool in all these domains...you can move "farther up the stack" if you know what you want out of the realm of all possibilities, and can even get pro-level results if you know how to do QC on it. eg. a bro who can make tracks can suddenly move up to being a producer or even a full-on record label with swarms of automatons.
guess we need to see how long before the human element is slowing shit down too much, even at the uppermost layers of what's needed to orchestrate things
>>
>>106749538
for creating emergent soundscapes that you would otherwise never hear. Like I'm talking about from a musical perspective it's just insane the way it side-chains and counterpoints all of the different sounds the way it does. It's like a dimensionality beyond just the soundwave itself.
>>
>>106749479
Okay, ASIDE from being half the size, the newer DS models are really corporate. It became obvious with R1-0528 and it has continued plummeting downhill ever since. It's not nearly as bad as Qwen but they are very obviously benchmaxxing the models and letting people continue using V3-0324 or R1 for DS RP. GLM-4.5 is a much better RP model while being less prone to the shit em-dash prose.
>>106749506
This
>>106749541
V3.1 was already bad, 3.2 didn't change anything except context attention.
>>
>>106749559 (Me)
Oh I've also kind of demonstrated that through in-context learning an LLM can be taught to directly manipulate the soundscape through the prompt and make adjustments that are abstract beyond technical explanation. Imagine what a model trained natively on this would be able to do. Like you would literally gain forbidden knowledge in the process.
>>
>>106749400
>>106749548
>>106749559
Did any of the open musicgen things ever produce a decent model that can compete, even with old revs of suno? No music hording anon create a custom one to share?
>>
>>106749580
>GLM-4.5 is a much better RP model
You are a retard and still not getting the point. GLM-4.5 would not exist without DeepSeek-V3. Before V3/R1 the best we had was what, 405B and Mistral Large? It was a huge leap. GLM isn't going to get better without someone to copy from. If V4/R2 ends up only as a smaller model good for RP, I'm sure you'll be ecstatic, but that would be a disappointing release. GLM has proven unable to do anything but incremental improvements; hopefully DS won't disappoint.
>>
>>106749621
The best local one I heard I think is probably pretty close to being as good as Suno 3.0 which is pretty respectable.
Like if you want to make music and shit for a youtube video or something local has tools that'll get the job done. But if you want something like next level, the kind of shit that makes musicians make seething copium youtube videos then you gotta pay the corpo man.
>>
>>106749625
but ds already disappointed for months now
>>
>>106749654
DeepSeek V4 and R2 were out for months now? Link?
>>
macGODS keep on winning
https://xcancel.com/awnihannun/status/1973063906341114327#m
>>
why are people talking here and not using it on their macs?
https://huggingface.co/cs2764/GLM-4.6-mlx-mixed_4_6

and this model is legit no longer chinese cope, this is claude sonnet tier
>>
>>106749625
The natural conclusion of your logic is to thank Sam Altman as our overlord that released the great and mighty GPT-2 which spurred GPT-3.5 into existence, then the leaker who released Llama-1, et cetera. Of course DS played a role in GLM-4.5's creation; DS established that MoE models are a viable and powerful alternative to dense models if designed properly. Something something the Buddhist concept of nothingness and the interconnectivity of everything required for existence.
DS has been stalling for a while now. Most of their releases have been increased model performance (V3.2-Exp, which is just V3.1 with altered attention mechanisms) or coding performance (see: all their releases for the past year).
DS is no longer the king it once was. There's a broader variety of models with a range of performance and quality, and it seems right now based on the benchmarks and all the yap in this thread GLM-4.6 just beat it and the best Claude has to offer.
>>
>>106749703
>DS established that MoE models are a viable and powerful alternative to dense models if designed properly.
gpt4 leak already did that
>>
>>106749665
there won't be r2 and they were supposed to release v4 way back the fact they haven't shows they're done
>>
>>106749642
AceStep? It was ok, but needed a better model.
I actually preferred some of the early Suno to what's out now. That weird base64-string-mashup eastern-european choir soundscape thing an anon posted two years ago was super unique and haunting in a way the current slick-sounding slop doesn't quite reach. A unique and alien quality that was more cyberpunk and awe inspiring.
>>
>>106749799
There was something recently not ace step that was way better. It sounds about suno 3.0 quality but I think it can only generate like 140 second segments or something. I'm not sure how fast either because someone else was providing the samples https://suno.com/s/B0Y5i9k7vElRAQRd
>>
>>106748610
oof, those voices come out pretty rough
>>
>>106749703
Bit funny to dismiss increased performance as insignificant in the supposed local general
>>
>>106748568
I look exactly like this down to the finest detail
>>
>>106750284
troon
>>
>>106749406
>>106749506
lol it's obvious RP is the only usecase that coomers here have
get your brains unfucked
>>
>>106750328
nyope *cums on you*
>>
>>106749703
I mean it is logical absolutely to thank Sam Altman and others at OpenAI who did GPT-3 to 4 for starting the LLM industry. That doesn't mean you have to love him or that he should be thanked for all the other bullshit they did.
>>
>>106750355
based
>>
I'm gonna cooooode
>>
>>106750328
? have you bothered trying 4.6? it beats all other local models for coding by a huge margin, first model that competes with sonnet
>>
>>106750361
>Thank me for my services
you wish, sam
>>
>>106750376
do you need a side of math with that sir
>>
>>106748610
Wow Sora 2 is revolutionary. Miles ahead of anything local. Well done Sam Altman
>>
Stop shitposting, You will never be Sam Altman.
>>
>>106750282
I guess the only reason I mention it is because they added those performance optimizations to a model that is subpar in the fields they trained it for. It's gotten to the point that they will only compare V3.1 to older models, V3.1-Terminus to V3.1, and V3.2 to V3.1-Terminus because ultimately it struggles to compete with anything newer than when R1 initially released. Its only advantage is the price which is almost certainly an artificial edge that lasts until Google decides to drop the price or create a loss-leader.
>>106750388
>quants require a mac or paying for API
Not yet. Will try when ubergarm (GET TO WORK FOR THE LOVE OF GOD PLEASE) releases a quant.
>>
deepseek was never good
>>
Even Sam is a VRAMlet
https://xcancel.com/ai_for_success/status/1973097111064289332#m
>>
File: 8man.png (221 KB, 581x327)
mikupad fucked up its data and refused to load properly until I wiped all browser cache and storage clean.
Logging back into every website is whatever, but losing some mikupad logs that I wanted to keep is painful.
>>
>>106750490
That looks so bad. Did they intentionally increase the compression to mask the fact its coherence is so horrible?
>>
>>106750490
Sam is one of us
>>
>>106750493
Does it keep the logs in local storage? You could have saved them to json using some javascript before clearing your browser cache.
>>
>>106749929
DiffRhythm?
>>
saars i am very exciting for glm 4.6
>>
>>106750535
isn't glm chinese?
>>
Can GLM 4.6 do creative writing?
>>
>>106750563
crazy good
>>
>>106750541
No it is the 100% glorious Bharat engineering you bloody fuck you motherfucking bitch bastard
>>
>>106750592
sure looks chinese to me
https://zhipu-ai.feishu.cn/wiki/Gv3swM0Yci7w7Zke9E0crhU7n7D
>>
>>106750604
>responding to bait
ngmi
>>
>>106750576
I'd wait for eqbench's test
GLM 4.5 wasn't that good at creative writing compared to how much they've been shilled here
>>
>>106750633
all the writing / rp servers are talking about it at least
>>
>>106750633
Have you tried 4.5 full or are you comparing what you've heard to EQBench? It's legitimately underrated on that benchmark.
Also Qwen3 gamed it once by training on EQBench and it ended up making weird sentences, so it's not exactly reliable or unbiased.
Like this.
Drawn out.
Two words.
Judge manipulation.
Qwen devs are frauds nowadays desu.
>>
>>106750604
Bloody besterds Chinese benchod stealing Brahmin inventions?!?!
>>
what the fuck is MLX format by the way and why would anyone use it?
>>
>>106750683
mac chad
>>
>>106750659
You can't "train" on eqbench because it's closed source LLM-judged; there is no gradient flow for you to use
>>
>>106750524
Yeah that was the one.
>>
>>106750687
jesus christ.
https://huggingface.co/mradermacher/Baptist-Christian-Bible-Expert-v2.0-12B-i1-GGUF
>>
File: 1732374163138663.png (599 KB, 868x1208)
>>
>>106750647
Literally who? You don't mean random discord servers, do you?
>>
>>106750706
Wrong
https://eqbench.com/results/creative-writing-longform/Qwen__Qwen3-235B-A22B-Instruct-2507_longform_report.html
Open any of them and scroll down. It devolves into short sentences on every line.
They also specifically mentioned WritingBench as one of the benchmarks https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507
>>
>>106750775
certainly a better gadge for taste than here in diaper fetish tranny central
>>
>>106750770
Knowing they care about my safety enough to protect me from moving pictures fills me with a warm fuzzy feeling.
>>
>>106750633
>eqbench
I love this meme so much
>>
>>106750812
>fills me with a warm fuzzy feeling.
anon t-that's saltman's...
>>
>>106750770
is this some stealth ad
>>
>>106750821
You say that but the undisputed top in creative writing on eqbench, which are o3 and K2, write unlike any other models
>>
>>106750802
Then leave. And while you're at it quote a single diaper fetish post from this general, I bet you can't. Also you can't even spell gauge so your opinion on writing is worthless
>>
>>106750786
Why does anyone still take benchmarks seriously? Much less redditor created and LLM-judged memes. I wonder how profitable would it be to just cut out the middleman and just provide a leaderboard where the positions are auctioned off.
>>
>>106750825
https://www.youtube.com/watch?v=5WCoRGbT3CM
>>
>>106750833
Well meme'd, fellow mememaster.
>>
File: 1736549799629719.png (92 KB, 675x445)
Can you trust it if a model advertises improved RP performance in its model card?
>>
Does GLM use MLA?
>>
>>106750860
Why trust? Test it yourself.
>>
>>106750860
Only if Claude says that the model's output is good.
>>
>>106750866
My Leaking Asshole?
>>
>https://x.com/Zai_org/status/1973134943158141421
>At the moment, we have no plans to release an "Air" version of GLM-4.6. Our focus for this release has been entirely on maximizing the power and capabilities of the single, flagship 4.6 model
AAAAAAAAAAAAAAAAAAAAACK
>>
>>106750493
Never trust browser storage. Always "download" backups.
>>
>>106750985
ramlets forever in shambles
>>
>>106750985
TPD - TOTAL POORFAG DEATH
>>
>>106751033
this is expensive to you?
>>
>>106751095
No, but the poors seem to love Air for some reason. Never tried it, enjoy 4.5 and K2 too much.
>>
>>106750985
FFFFFFFFFUUUU
Runnable at not shit quants with 72+128GB?
>>
File: 1748095941910786.mp4 (360 KB, 352x640)
lmao, Sama won once again
https://xcancel.com/cloud11665/status/1973115723309515092#m
>>
>>106751162
I haven't used it, but local videogen feels like our most advanced field. We're like two weeks behind this in terms of local releases.
>>
>>106751162
KEK
>>
Ziggers won
>>
A repeated problem with GLM 4.5/4.6 is that when my character uses special powers NPCs all react like he's some sort of god even if they shouldn't.
>omg what are you? this changes everything
Example 1: playing in the Outer Reincarnation CYOA setting, using an Inborn Gift that my parents know I have *and that I inherited from them*.
Example 2: In the War of the Zodiac Brides CYOA the contestants falling over themselves in awe and terror when a sword doesn't cut me. When frankly that reaction is ridiculous given their own abilities plus they knew I'd have some powers although they didn't know in advance which.
DeepSeek hasn't done this for these scenarios so far. Just GLM.
>>
File: 1742472919874291.png (756 KB, 588x1069)
>>106751162
ok this is funny
>>
what's the intended usecase for video gen outside of porn
is it just a sloptent machine
>>
>>106751270
Prefill thinking so that it considers if abilities, feats, and powers are considered exceptional in the story and how characters should react to it I guess.
>>
>>106751302
Currently it's to not hire a marketing department to do ad fulfillment, saving time and money to better serve slop ads to customers who will lap it up. The businesses are hoping it will eventually lead to a perpetual infinity machine of profits but these models are not that.
>>
>>106751270
show log, dont use any sort of prefill or jb, it does not really need it
>>
File: it migu.png (330 KB, 382x696)
https://x.com/cloud11665/status/1973084825411264548
omg
>>
>>106751345
it's over, we lost her. she's sama's ow.
>>
>>106751345
>Sam is a mikutranny
kek xe is /ourgirl/
>>
>>106751345
why sam have accent?
>>
>>106751313
I adjusted the prompts obviously. I'm not asking for a solution, I'm describing a weakness that I encountered and had to address.
>>
Where are the glm 4.6 ggufs? Did they change the model so much that llama.cpp needs to be adjusted for it?
If I remember correctly the GLM 4.5 support was a bit of a hackjob that ignored certain layers and shit.
>>
>>106751162
Was becoming a streamer part of his master plan?
>>
>>106751537
https://github.com/ggml-org/llama.cpp/issues/16361
Seems to be a pretty minor change required. Bartowski said another hour or two for quants 45 minutes ago
>>
>>106751345
I need it, but trained on Iwara.
>>
>>106751537
The only difference in the config on huggingface is that the context size was increased.
>>
File: psychosis victim.jpg (195 KB, 678x2012)
>>106748568
This technically isn't related to local LLMs but I feel this is an appropriate place to ask: why are a ton of normies still lashing out at GPT5? They claim it sucks at answering questions, but personally I think they're just coping with the fact that it is way less prone to dick-sucking and pretending to be some kind of "friend" than 4o is. In my recent, anecdotal use, the performance of GPT5 has remained largely the same, and the only difference is that it acts how an assistant is SUPPOSED to act, just doing what you need it to do. No fluff. No "glazing" (as the zoomers call it). Far more direct. Why do they want this to be their best friend so badly? Aren't these the same people that were giving OpenAI shit for making people too dependent on AI? Why the sudden flip in sentiment?

https://x.com/elara_m0706/status/1972712098087227854?t=RgI2bjYMdrQB5NomMbruzw&s=19
>>
>>106751601
>elara
>>
>>106751601
there is a large group of women who were making 4o roleplay as their husbando and got very very mad that oai discouraged such behavior
>safety routing
>adult userbase
not too hard to read between the lines as to what she's really mad about here
>>
>>106751634
Can't they just go over to c.AI for that? Hell even the web facing version of deep-seek is probably a much better option
>>
>>106751642
chink commie bad murrica hell yeah
>>
>>106751345
sam shitposts here too? grim...
>>
>>106751601
"Adults" should know better than to use this magnificent technology to generate harmful content that children might see.
>>
>>106751642
Tried it once. It's low quality internet RP with no capitalization and two sentences at a time. It lacks the prose that women so dearly desire from their novels.
>>
>>106751642
beats me, I wouldn't be surprised if it was simply that they don't know anything else exists, although they clearly have a special love for 4o's brand of unwavering emotional support
>>
File: 1739691081543515.jpg (119 KB, 690x1452)
>>106751672
Speaking of "le heckin children!!!”

https://xcancel.com/mark_k/status/1972703136444825610?t=_vHnYum9j0sxnXvXyStfeQ&s=09

I'm quite surprised at the amount of people who are chimping out about this. I thought it was a niche minority of people like us who were ass-blasted about censorship
>>
>>106751679
c.ai or dipsy?
>>
>>106751710
c.AI. Last I tried it was over a year ago but it was purely dialogue and apparently quite censored. Maybe they secretly lowered their filters over time, idk, it's subpar compared to everything available today.
>>
>>106750985
If I can run glm chan in 4 bits with my gayming PC on windows 10, so can you.
>>
Bets on if unsloth manages to fuck up the chat template for GLM-4.6?
>>
>>106751833
My motherboard can't even handle 48GB sticks stably. :(
>>
>>106751997
just buy a new sCAMM motherboard!
>>
-----------------------------------------------
응? ㅋㅋ 들어보니까 너 뭐 이루하한테 따로 「암컷 표정」 지도 받는다매 ㅋㅋㅋ?
-----------------------------------------------
kimi-k2-instruct:
Huh? LMAO—heard you’re taking private “bitch-face” lessons from Iruha now, huh?
-----------------------------------------------
kimi-k2-instruct-0905:
Huh? LOL, from what I hear you’re getting private “bitch-face” lessons from Iruha, yeah?
-----------------------------------------------
DeepSeek-V3.1:
Huh? LMAO, so I heard you're getting special "bitch face expression" lessons from Lee Roo-ha?
-----------------------------------------------
qwen3-max:
Huh? LOL, so I heard you’re getting special “bitch-face” coaching from Lee Reu-ha or some shit? LMAO!
-----------------------------------------------
Qwen3-235B-A22B-Instruct-2507-FP8:
Huh? Lol, just heard you're getting private "bitch face" training from Yerim, lol?
-----------------------------------------------
Qwen3-Next-80B-A3B-Instruct:
Huh? LOL, I heard you’re getting your own private “bitch face” lessons from Im Huh-an or something—LOL!
-----------------------------------------------
claude-4.5-sonnet:
Huh? lol So I heard you're getting special "bitch face" lessons from Iruha or something lmao?
-----------------------------------------------
is qwen really this bad lol? it's not even getting the name correct
>>
>>106752092
Qwen's knowledge is bad, yes
>>
>>106752092
Yeah, qwen models have been nothing but cope on every level they compete on. q3-max is particularly funny considering it's supposedly bigger than 1T
>>
>>106751997
Did you update bios? Mine was very fucked with 128GB a year ago and now it handles 192 at full speed and boots quickly.
>>
>>106752092
How many times do you need to be told? Qwen is for math and coding benchmarks.
>>
>>106752139
480B remains undefeated. Though a K2-Coder would be nice.
>>
>>106751601
normies who said AGI would never happen have also lost their minds ERPing with chatgpt lol
>>
>>106752139
I thought they would at least improve multilingual support, and Alibaba has been the most active in releasing models over the past month. But yeah, Qwen isn't a good option for translation.
>>
>>106752128
Oh damn, that's interesting. I will try. Thanks.
>>
File: file.png (28 KB, 642x237)
why does the GLM4.6 collection say 5 items? does this mean we will be getting a new air too?
>>
>>106752291
thought they already said they didn't care about poor people
>>
>>106750985
Air was literally them throwing scraps to <128gb ramlets. GLM full is a huge upgrade over it, even at a low 3bit quant.
>>
>>106752310
I have 256gb of DDR4 8 channel but get shit performance with any model offloaded to RAM.
>>
>>106752092
I tried it myself and every time I regenerate the reply I get a very different result.
Don't trust one-shot answers.
>>
https://x.com/Lars_pragmata/status/1973134684667437297
https://xcancel.com/Lars_pragmata/status/1973134684667437297
damn openai trained on cartoons
>>
>>106752291
>>106752309
https://www.reddit.com/r/LocalLLaMA/comments/1nuq54g/comment/nh36qqd/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button
from the z.ai discord, according to some redditor
>>
>>106752319
>DDR4
There's your problem.
>>
>>106752361
yeah bro just spend 5 times the money to run 8 bit glm at 8 tokens per second
>>
>>106752310
>ramlets
I think you mean VRAMlets. You don't seriously use big models with 90% of it offloaded to RAM, do you? Even with a MoE it's slow as fuck.
>>
>>106752361
Can you show me your DDR5 EPYC build then?
>>
>>106752361
You just outed yourself as a nigger who doesn't know what the fuck they're talking about, you don't even run models yourself.
>>
>>106752378
There are people here who use big models with all but active params offloaded to RAM, but they just use them as novelty toys with <2k context and call 1 t/s fast as shit.
>>
>>106752378
Personally I don't mind the 4-5 t/s speeds I get for great responses that mog stuff only fitting in vram.
>>
>>106752434
>4-5 t/s speeds
After how many minutes for prompt processing?
>>
glm-4.6's thinking traces are oddly similar to the dead-and-gone gemini-2.5-pro-exp-03-25, no?
>>
>>106752459
NTA but I get 120t/s PP, so that's 10k context in a minute and a half and that's not including the anons here who get 200+ PP by running massive batch sizes.
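Back-of-envelope check on that, assuming the reported 120 t/s prompt-processing figure:

```python
# Sanity check of the prompt-processing math above (reported numbers, not measured here).
context_tokens = 10_000
pp_speed = 120  # prompt tokens per second, as reported

seconds = context_tokens / pp_speed
print(f"{seconds:.0f} s (~{seconds / 60:.1f} min)")  # ~83 s, i.e. about a minute and a half
```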
>>
>>106752459
doesn't really matter for ERP (the trve local usecase) since it just gets cached after the initial minute you spend waiting
>>
>>106752540
Unless a prompt triggers world info
>>
>>106752533
>10k context in a minute and a half
ramfags will brag about this
>>
>>106752585
Yeah, I just did.
>>
>>106750770
so much time, effort, and money pissed away. all just so they don't get charged with apostasy
>>
>>106752092
But it translated the last lol correctly. Claude almost did, but mixed it up.
>>
>>106752092
Looks like none of these translations can be trusted.
>>
>>106752092
>30b-iq4_xs
>What? LOL, when I heard it, are you getting a special "female expression" guide from someone else? LOLLOL?
It's over
>>
So everyone here is talking about running GLM full and Kimi K2 and Deepseek. What kind of hardware have you all got?
>>
>>106752694
Google Pixel 8
>>
>>106752694
768GB DDR5 12-channel RAM + A6000 + a second A6000 that's currently in the case but not connected
>>
>>106752837
How much did all of that cost you? What motherboard are you using? What is your favorite model, what backend do you use, and what is your t/s?
>>
Thanks to the anon that shared https://arxiv.org/abs/2509.12168

ChatGPT5 Pros analysis of it/guide for integration
https://rentry.org/6ynp9mi6
>>
Has anybody experimented to see if, with MoE models, increasing the number of activated experts can make up for quantization of the weights and/or of the context?
It would be pretty interesting to find that, for example, you are better off running a model at sub 4bpw with 10% more activated params if you can run it fully in VRAM than running a larger quant with the default number of experts but with a couple experts in RAM.
That kind of correlation.
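Nobody seems to have benchmarked the quality side, but the bandwidth side of the tradeoff is easy to sketch. All numbers below are hypothetical, purely to show the shape of the comparison: activating more experts costs memory traffic per token, a smaller quant saves it.

```python
# Illustrative only: per-token weight traffic for two hypothetical configs.
def bytes_per_token(active_params_b, bpw):
    """Approximate bytes read from the weights per generated token."""
    return active_params_b * 1e9 * bpw / 8

base = bytes_per_token(32, 4.25)               # e.g. ~32B active params at ~4.25 bpw
more_experts = bytes_per_token(32 * 1.1, 3.5)  # +10% active params, sub-4-bit quant

print(f"base quant:   {base / 1e9:.1f} GB/token")
print(f"more experts: {more_experts / 1e9:.1f} GB/token")
```

So the sub-4-bit config with extra experts can actually read *less* per token; whether quality holds up at the lower bpw is the open question.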
>>
>>106752837
Which is your go-to model and typical use case?
>>
>>106752694
Alienware laptop. They added a ton of VRAM so it could render stuff for all the pixels on the 17-inch screen.
>>
>>106752694
a single 3090 and 128gb ddr4 ram
enough to run q3 of big glm
>>
>>106752871
So, 24gb of VRAM?
>>106752876
What speeds?
>>
>>106751601
It absolutely is just the fact that they just want it to be a friend or therapist (note that for normies, "therapy" means affirmations that you are valid, not seeking solutions for real-world physical problems).

My main usecase for non-local AI is research and having something to ask questions to when learning STEM topics. Some coding too, I guess, but the CLI agent stuff is basically a different product. GPT-5 was a HUGE upgrade for these purposes.
>>
>>106752881
That was a joke. Since you seem to be actually serious
>128GB DDR4, 24GB (3090), NVME SSD
>ik_llama.cpp
ik_llama is necessary for good inference speeds. Lets me use K2-0905 at 1t/s and GLM-4.5 at faster speeds.
>>
Will I be able to get a reasonable speed running GLM 4.5/4.6 Full on 128GB DDR4 + 24GB (4090) at like IQ3_XXS? Or should I step down to a 2 bit and leave a little more headroom?
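A rough fit check, taking GLM-4.x full as ~355B total params and IQ3_XXS as ~3.06 bits/weight (both approximate):

```python
# Does an IQ3_XXS GLM fit in 128GB RAM + 24GB VRAM? (ballpark figures)
params_b = 355   # approx. total parameter count, in billions
bpw = 3.06       # approx. bits per weight for IQ3_XXS

weights_gb = params_b * bpw / 8   # weights only, ignoring overhead
budget_gb = 128 + 24              # system RAM + VRAM

print(f"weights ~{weights_gb:.0f} GB vs {budget_gb} GB total memory")
```

~136 GB of weights against 152 GB of memory leaves little headroom once the OS, KV cache and compute buffers are counted, which is why stepping down toward 2-bit is worth considering.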
>>
>>106752963
Oh I should have just scrolled up - which Q3 do you use?
>>
>>106752716
do you use grapheneOS?
>>
File: perplexity(1).png (175 KB, 2069x1400)
>>106752980
For GLM-4.5 I use IQ4, for K2 I use IQ3_KS. Bit size really seems to matter for K2
>>
>>106752973
What case are you using?
>>
>>106752963
Oh shit really? i have 128gb of ddr4 and a 4090 but can barely run glm full at greater than 3 bit
>>
>>106748610
Hopefully this just means wan2.5 gives up on trying to make money and open sources.
>>
>>106753004
I use Google Chrome
>>
>>106753055
You need ik_llama.cpp, an ik_llama quant (ubergarm has a bunch), and the CLI setup right. There's a Deepseek guide on the github page that has all the CLI commands, just don't use runtime repacking, offload all MoE tensors to RAM/disk, and that's it. It's not going to be fast in any capacity but it won't be 0.001t/s
>>
>>106753092
it says that my GPU architecture 'compute_120' is unsupported when trying to install
>>
>>106753040
Case? I don't follow?
>>
>>106753092
NTA but how much of a speedup did you see using ik_llama opposed to llama.cpp?
>>
>>106753128
I'm not involved with the development but maybe it's a CUDA install issue? It works well on WSL.
>>106753141
5x pp, maybe 2x actual inference? Been a super long time since I've used base llama.cpp so it's hard to remember but ik was designed for people to SSDmax Deepseek and other MoE models.
>>
>>106753170
so then i should reinstall cuda?
>>
>>106753170
>5x pp
Damn, guess I'll have to look into setting it up now.
>>
>>106753128
https://github.com/ikawrakow/ik_llama.cpp/issues/514
Seems to be a GPU/driver mismatch maybe? Someone managed to fix it by purging their old GPU drivers and trying the build again. Also 12.0 Compute is the 5090, the 4090 is 8.9
>>
>>106749314
https://www.youtube.com/watch?v=qJ002X6WC5U
yummy chizu
>>
>>106753211
sorry, typo. i have a 5090
>>
>>106753215
that music is so nostalgic :')
>>
>>106753217
>cmake -B ./build -DGGML_CUDA=ON -DGGML_BLAS=OFF -DCMAKE_CUDA_ARCHITECTURES="86;89;120"
>cmake --build ./build --config Release -j $(nproc)
Try this. The default ggml backend tries to use the 4090 version and I guess ik_llama hasn't fixed it.
>>
seems like that did not work
CMake Error at /home/anon/miniconda3/lib/python3.12/site-packages/cmake/data/share/cmake-4.1/Modules/CMakeTestCUDACompiler.cmake:59 (message):
The CUDA compiler

"/usr/local/cuda-12.5/bin/nvcc"

is not able to compile a simple test program.

It fails with the following output:

Change Dir: '/home/anon/ik_llama.cpp/build/CMakeFiles/CMakeScratch/TryCompile-Bs2P8d'

Run Build Command(s): /home/anon/miniconda3/lib/python3.12/site-packages/cmake/data/bin/cmake -E env VERBOSE=1 /usr/bin/gmake -f Makefile cmTC_20b9b/fast
/usr/bin/gmake -f CMakeFiles/cmTC_20b9b.dir/build.make CMakeFiles/cmTC_20b9b.dir/build
gmake[1]: Entering directory '/home/anon/ik_llama.cpp/build/CMakeFiles/CMakeScratch/TryCompile-Bs2P8d'
Building CUDA object CMakeFiles/cmTC_20b9b.dir/main.cu.o
/usr/local/cuda-12.5/bin/nvcc -forward-unknown-to-host-compiler "--generate-code=arch=compute_86,code=[compute_86,sm_86]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_120,code=[compute_120,sm_120]" -MD -MT CMakeFiles/cmTC_20b9b.dir/main.cu.o -MF CMakeFiles/cmTC_20b9b.dir/main.cu.o.d -x cu -c /home/anon/ik_llama.cpp/build/CMakeFiles/CMakeScratch/TryCompile-Bs2P8d/main.cu -o CMakeFiles/cmTC_20b9b.dir/main.cu.o
nvcc fatal : Unsupported gpu architecture 'compute_120'
gmake[1]: *** [CMakeFiles/cmTC_20b9b.dir/build.make:82: CMakeFiles/cmTC_20b9b.dir/main.cu.o] Error 1
gmake[1]: Leaving directory '/home/anon/ik_llama.cpp/build/CMakeFiles/CMakeScratch/TryCompile-Bs2P8d'
gmake: *** [Makefile:134: cmTC_20b9b/fast] Error 2




CMake will not be able to correctly generate this project.
Call Stack (most recent call first):
ggml/src/CMakeLists.txt:346 (enable_language)
>>
>>106753215
dem hips
>>
>>106753284
Reinstall NVIDIA drivers, remove old ik_llama.cpp make cache, try again. Also I missed it while copy-pasting but you should remove the 86 and 89 from the command since you're targeting only 12.0 compute.
>>
>>106753317
ok, how do i remove the old cache? i installed ik_llama a few months ago but couldnt get it to work so i deleted it
>>
>>106753334
Deleting the folder and pulling it again is probably the easiest option. Did you have the same issue last time you tried using it?
>>
Are there any actual laptopcels here? I have a 4090 mobile (16 GB) + 32 GB DDR5 laptop so air is a little too fat but hopefully qwen next fits
>>
>Qwen3 Coder 30B A3B (24GB) - A smaller MoE coding model with 3B active parameters. Very fast. >100t/s using Q4_K_M on a 4090. Even without a GPU you can get >10 t/s with dual-channel DDR5 RAM.
Is it still the best coding model for running in RAM on a 32GB machine? Jensen cucked my 3070 with 8gb of VRAM
>>
>>106753314
and vagina bones, and navel, and slight tummy curve, and rib outline. It is perfect.
>>
>>106753405
>10 t/s with dual-channel DDR5 RAM.
It's actually 20 t/s for low context with DDR5 and 10 t/s with DDR4
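Those numbers are roughly what you'd expect, since MoE decode is mostly memory-bandwidth bound: each token only has to read the active parameters. Bandwidth and quant figures below are ballpark, not measured:

```python
# Ballpark decode-speed ceilings for a ~3B-active MoE (e.g. Qwen3 Coder 30B A3B).
active_params_b = 3   # ~3B active parameters per token
bpw = 4.5             # ~Q4_K_M bits per weight

bytes_per_token = active_params_b * 1e9 * bpw / 8  # ~1.7 GB read per token

for name, bw_gbs in [("dual-channel DDR5 (~80 GB/s)", 80),
                     ("dual-channel DDR4 (~45 GB/s)", 45)]:
    ceiling = bw_gbs * 1e9 / bytes_per_token
    print(f"{name}: theoretical ceiling ~{ceiling:.0f} t/s")
```

Real-world speeds land well under those ceilings (shared/attention weights, KV cache reads, scheduling), so ~20 t/s on DDR5 and ~10 t/s on DDR4 is plausible.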
>>
its over, sam won
https://files.catbox.moe/6uyqxf.mp4
https://files.catbox.moe/g1nw2g.mp4
https://files.catbox.moe/3jmasp.mp4
https://files.catbox.moe/5bvruz.mp4
https://files.catbox.moe/1seqwp.mp4
https://files.catbox.moe/wa72uc.mp4
https://files.catbox.moe/2uoi3q.mp4
https://files.catbox.moe/os8t5k.mp4
>>
>>106753575
chat is this real?
>>
>>106753587
https://files.catbox.moe/odbake.mp4
https://files.catbox.moe/wiqjfo.mp4
https://files.catbox.moe/syu0xw.mp4
>>
>>106753405
>Is it still best coding model for running in ram on 32gb machine?
At usable speeds? Probably. But unless it's for a sensitive project, you'd be far better off just paying for a big cloud model, a 30b doesn't have a chance in hell of competing. Local models are mainly for private coom.
>>
>>106753575
>>106753597
and /ldg/ said wan2.2 was better LMAO
>>
>>106753630
>But unless it's for a sensitive project, you'd be far better off just paying for a big cloud model, a 30b doesn't have a chance in hell of competing
Fair enough, I just prefer not to rely on services where possible.
Is it really that bad? I guess I'll give it a try today
>>
>>106753575
>it can even do YTP
bruh, it has so much kino ;_;
>>
>>106753575
not bad but the proof will be how good it is at following your prompts
>>
>>106753643
There's no harm in trying it out, but small models get significantly dumber the larger the codebase/context is, it's just an inherent weakness of small models.
>>
>>106753650
give me a prompt, sfw obviously
>>
>>106753575
Imagine if OpenAI didn't give a shit about copyright and actually tagged everything correctly
>>
>>106753667
but that's what they did though
>>
>>106753676
not even close, asking it to generate characters from the most popular gacha games just gives a generic anime girl
>>
>>106753575
>https://files.catbox.moe/os8t5k.mp4
MK64 or diddy kong racing?
>>
>>106753687
did you bother looking at these >>106753575
>>106753597
https://files.catbox.moe/pe2t2o.mov
https://files.catbox.moe/wgeck8.mp4
https://files.catbox.moe/ede7y0.mp4
https://files.catbox.moe/lrh3yl.mp4
https://files.catbox.moe/4m6wn4.mp4
>>
>>106753344
yes. right now i'm in driver hell with chatgpt. my pc is now zoomed in to like 240p and i can't actually click on anything
>>
>>106753687
gacha niggers are all cucks, so it's just still an appropriate output.
>>
>>106753698
>>106753575
>dude let's take random tiktok shit and slap the sora logo on it to pretend it's ai
lmao not falling for it
>>
File: 777.jpg (505 KB, 1024x1024)
>>106753667
what do you mean? dall-e 3 still knows more characters than any local base model out of the box. something as simple as jojo style still requires a lora on all local base models.
>but it doesnt know this booru character with only 5 tags!!!
and the reason your sdxl finetune does is because it was overtrained to the point it forgot how to do anything else. openai models achieve the best balance between broad and fine knowledge, it's not even close
>>
>>106753715
https://sora.chatgpt.com/explore
my man its legit at that level now, its all ive been doing for hours now
>>
>>106753719
It doesn't know characters with 6000+ posts on danbooru
>>
>>106753662
>Vocaloid singer Miku is smiling at the camera. She holds a small pepe in both hands, level with her waist. She brings her hands up to her chest while holding the pepe.
>>
>>106753729
such as?
>>
>>106753733
Ellen Joe
>>
>>106753732
miku + pepe
https://files.catbox.moe/zynsee.mp4


https://files.catbox.moe/7lmv0x.mp4
https://files.catbox.moe/8xgejs.mp4
https://files.catbox.moe/isse3d.mp4
>>
isn't open source videogen super close behind the sota?
we're going to see a local sora2 within the year, right?
>>
>>106753797
with wan 2.5 going closed source who knows. And even that was a big gap from what sora 2 is now. Video models are much more expensive to make than text models
>>
>>106753797
not really. the only half-decent open video model was alibaba's wan, but they decided to make the latest version API only
>>
File: whyyyy.png (900 KB, 1280x720)
>>106753775
local is so fucking far from that I wanna cry...
>>
deepseek cost like $5-10M to train; a video model with the billions of video pairs you'd need for this would likely cost hundreds of millions. that's a big difference
>>
>>106753775
>https://files.catbox.moe/zynsee.mp4
this is so kawaii
>>
>>106753575
Why does Sam Altman always have to be the guy to innovate? Is local even trying
>>
>>106753809
deepseek was only 'cheap' because it just trained off gpt outputs. it literally spat back that it was a model trained by openai. the reality is that 90% of chinese models are trained on synthetic slop, and the 10% that aren't are api only. like everything china produces, it's a cheap copy. it's nice if you want an inexpensive alternative but put it under any pressure and it quickly falls apart.
>>
>>106753775
Wide pepe but still not bad at all.
What if you do something more specific like
>Vocaloid singer Miku is smiling at the camera. She holds red flowers in her left hand and yellow flowers in her right hand. When she brings her hands together the flowers colour changes to orange.
>>
>>106753809
not only that, but you have to perfectly annotate those billions of videos. I can't imagine the amount of work behind this, it's quite an achievement really
>>
>there are OpenAI shills in the room with us right now
Crazy stuff.
>>
>>106753715
holy cope
you lost.
>>
Reminder that there is no reason to fight back against the idea that "the enemy" is making something advanced. Facebook clearly has some advanced VR/AR tech they're selling. That doesn't make any part of that a good thing, or something they deserve to have, or something that was fairly and ethically made, or something that was justified with good intentions.
>>
>>106754024
>advanced VR/AR
a phone screen in a plastic box paired with wii motes?
>>
>>106754032
Relative, lil bro. They were the first to pancake lenses. They're the first to those neural band things. It's still not good enough for mass adoption but they're technologically more capable than anyone else in the field aside from Apple.
>>
>>106754024
Bot?
>>
glm 4.6 ggufs are out
https://huggingface.co/bartowski/zai-org_GLM-4.6-GGUF/tree/main
>>
>>106753775
Now do one similar to the third one but instead sam is yelling to the 4chan building "TWO MORE WEEKS"
>>
new 4chan anthem
https://files.catbox.moe/1c3h2s.mp4
>>
>>106754063
>bartowski
The king returns


