/g/ - Technology





File: 1764925198127137.png (433 KB, 1010x1130)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108341869


►News
>(03/07) Qwen3.5-27B Claude-4.6 Opus reasoning distill GGUF published: https://hf.co/Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-GGUF
>(03/06) Olmo Hybrid WebGPU browser-local demo posted: https://hf.co/spaces/webml-community/Olmo-Hybrid-WebGPU
>(03/05) OLMo-Hybrid-Instruct-DPO-7B posted on Hugging Face: https://hf.co/allenai/Olmo-Hybrid-Instruct-DPO-7B
>(03/05) Qwen3.5-9B OptiQ 4-bit for Apple Silicon posted: https://hf.co/mlx-community/Qwen3.5-9B-OptiQ-4bit

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
is nemo still the best?
>>
Important: never respond to vagueposts
>>
what's the recommended qwen3.5-30b version for "prompt enhancement" (eg: tell llm to enhance your "1girl, big bobs, vagene" prompt to use with zimage or klein)
i'm using qwen3.5-30b-a3b-heretic-v2 and it's alright, looking for alternatives
>>
>>108346702
*moe version
>>
>>108346702
yeah
>>
>>108346702
hauhau 35b
>>
>>108346692
?
>>
>>108346702
any 4-8b qwen retard.
>using 30b for prompt enhancing
lmao
>>
>>108346736
qrd
>>
>>108346729
>hauhau 35b
not huihui?
>>108346740
vramlet detected
>>
>>108346736
Posts that look like they are a reply to something but neglected to include a (You). Never respond to those.
>>
>>108346746
Using a 30b over a 4b brings nothing to the table, retard.
>3.5
Nice shilling.
>>
>>108346746
hauhau was tested to be better https://www.reddit.com/r/LocalLLaMA/comments/1rqkewn/testing_3_uncensored_qwen_35b_models_on_strix/
>>
>>108346672
DIPSY!!!!!
>>
>>108346758
thx
>>
>limit space for main use case
>do shit like this https://huggingface.co/docs/huggingface_hub/en/guides/buckets
>>
>>108346800
qrd
>>
File: waow.png (85 KB, 931x468)
>>108346807
>>
we're so saved https://huggingface.co/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16
>>
>>108346846
nemotron 30B was inferior to the old 30B qwen
no reason to use this piece of shit, it's nvidia
>>
>>108346856
don't be chines
>>
>>108346810
is this real
>>
>>108346846
https://developer.nvidia.com/blog/introducing-nemotron-3-super-an-open-hybrid-mamba-transformer-moe-for-agentic-reasoning/
>>
>>108346876
>Mamba-Transformer
>>
File: 1763096463896357.png (87 KB, 1152x432)
>>108346846
Damn, Qwen handles context up to 1m that well?
>>
>>108346874
possibly
>>
>>108346672
>it is coming
where is it coming?
>>
>>108346894
prepare uranus
>>
> We are getting 5k or more dollars worth of compute on a 200$ plan and people still manage to run through it. ESP with subagents

Note that this factoid is based on one guy claiming it on YouTube. There is no evidence that OpenAI and Anthropic are selling inference tokens at a loss. Most likely they are operating at a loss; they would be crazy not to, with the market growing this fast. But a lot of the cost is in training the next-gen models, not running the current ones for customers.
>>
>>108346846
This writes like complete shit but seems rather free of gemini/claude slop so it's an upgrade over GLM5/Qwen3/K2.5 in my book
>>
>>108346937
yes see >>108346859
>>
I was promised gemma 4. It's been a year (which is an eternity in AI)
>>
File: 1746472133590390.jpg (118 KB, 1000x1000)
►Recent Highlights from the Previous Thread: >>108341869

--ASUS Ascent GX10 cost/performance debate for large model inference:
>108343109 >108343131 >108343169 >108343217 >108343302 >108343349 >108343374 >108343426 >108343383 >108343496 >108343459 >108343465 >108343498 >108343696 >108343714 >108343747 >108343788 >108343870 >108343900 >108343909 >108343928 >108343513 >108343546 >108343597 >108343614 >108343631 >108343641 >108343655 >108343669 >108343642 >108343656 >108343711 >108343691 >108343920 >108343363 >108344373 >108345837
--vBIOS underreports VRAM on Radeon Pro V620 despite functional 30GB+ usage:
>108345612 >108345780 >108345790 >108345880 >108345974 >108346021 >108346046 >108345925 >108346691 >108346700 >108346925
--Troubleshooting llama.cpp slowdowns caused by background downloads:
>108342048 >108342059 >108342070 >108342115 >108346054 >108346094 >108346240 >108346255
--Sarvam 105B benchmark performance and uncensored variant:
>108342694 >108342707 >108342732
--Criticism of broken Unicode handling in reasoning budget sampler:
>108345882 >108346009
--Evaluating FP32 performance as proxy for unsupported FP8 workloads:
>108343359 >108343415 >108343452
--Skepticism toward Nvidia's rumored NemoClaw AI agent:
>108344123 >108344163 >108344178 >108344187 >108344199 >108344188 >108344245 >108344524 >108344177
--Debate on AI's impact on open-source software development:
>108344967 >108345031 >108345040 >108345051 >108345179 >108345205
--Experimenting with RL-trained LoRAs for Python environments:
>108345296 >108346305
--RAM scaling diminishing returns with GPU-offloaded MoE models:
>108343915 >108343972
--ARM support for LLM frameworks performs comparably to x86:
>108343932 >108343941
--Rin and Miku (free space):
>108342301 >108342701 >108343359 >108344921 >108346240 >108346308

►Recent Highlight Posts from the Previous Thread: >>108342069

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>>108346999
blame senate
>>
I've been promised deepseek 4. It's been a year (which is an eternity in AI)
>>
>Saarposting stops
>Bakerwars begin
>Thread quality notably declines
>Shitposters try and meme blacked miku into existence
It really makes you think doesn't it?
>>
https://huggingface.co/unsloth/NVIDIA-Nemotron-3-Super-120B-A12B-GGUF
How many times are they gonna fuck it up today?
>>
>>108347008
it hasn't been a year of no deepsuck update you liar
>>
>>108346937
>This writes like complete shit but seems rather free of gemini/claude slop so it's an upgrade over GLM5/Qwen3/K2.5 in my book
half of the dataset came from being generated or rephrased by qwen, mistral nemo and other tiny local models. I wouldn't call that better than gemini le mao
>>
fishaudio s2-pro has SHIT voice cloning
didn't test it any more than that because I'm a clonecel
>>
Finally a good GPT-OSS 20B finetune?
https://huggingface.co/eousphoros/kappa-20b-131k
>>
>>108347122
>good
>finetroon
oxymoron
>>
>>108347130
>Like clockwork
>>
>>108347122
I want a fine tune of 120B MPOA or whatever the least bad lobotomy method is.
>>
>>108347147
im working on it, interested?
>>
>working with 'ai' created 'code' is so futile because the logic is all over the fucking place and even if it works it doesn't make any sense whatsoever
>>
>>108347215
but enough about physics
>>
>>108347177
Very much so.
What's the plan? GPT-OSS 120B Heretic LimarP Zloss?
>>
File: 1749305996604880.jpg (47 KB, 738x415)
y.lecunn just raised a billion dollars from investors for his new generation of world models despite having absolutely nothing to show for it
>>
>>108347234
>LimarP
Let it die
>>
>>108347273
1 billion for this is the equivalent of a dollar
>>
>>108347273
he showed he can make a JEPA LLM that performs a few percent better on benchmarks in exchange for massively increasing training costs
>>
>>108347273
>catgirl like intelligence
>>
>>108347273
what's his endgame?
>>
>>108347345
Izzatfarming.
>>
>>108347273
Idealists that save the world through open research > Facebookjews that botched Llama by the sloppy 4. And Llama had almost limitless credibility until then.
>>
>>108347336
lust for cock?
>>
DeepNothing
>>
hours until v4?
>>
>>108347406
336
>>
>this tech seems cool, i'll try it out
>painstakingly set up a bunch of bullshit
>write a simple first prompt
>This conversation is harmful and goes against my programming. Here are some phone numbers to call the cops on yourself
cool software bro
>>
> he redeemed the gemma
saar, you can't redeem it
>>
>>108346746
>not huihui?
whenever someone here says huihui it reminds me of god emperor of dune's love hwi
>>
you already posted your shill pitch
>>
>>108347541
is this better than rocinante
>>
>>108347541
absolute nvidia iq
>>
>>108347546
Leto II went full simp for hwi
>>
>>108347463
haha you are dumb
>>
>>108347273
research into alternate paradigms is a good thing ya cuck
>>
>>108347735
lots of research show that's not the case
>>
>>108347743
source?
>>
>>108347797
qrd?
>>
Are any of the new Qwen 3.5 models reasonably interesting? I don't want to waste my disk space if it's just the same old as Qwen 3 (eg. trash writing).
>>
>>108346672
Distilled trash. Made use of Claude CoT and smuggled Blackwell chips.
>>
guys what does qrd mean?
>>
>>108347823
yea yeah sure thing buddy
>>
>>108347803
This is why we will forever be stuck with requiring hundreds of GBs of memory for a dogshit architecture that only "sort of" works, but you can just throw more hardware at it LOL. Surely after you give it 12TB of memory it will attain AGI status.
>>
>>108347831
I just read that we have enough compute for everything, but 80% of it is wasted
>>
lmao, anthropic has jeets who do the typical edited your readme's punctuation drive by to pretend they contribute to a lot of shit
they employ a lot of really bright talents eh
>>
Is it true 397b is only 15% better than 4b?
>>
>>108347895
no these faggot benchmarks are faggoty and suck
>>
>>108347895
yeah
>>
>>108347895
yeah but look at it this way, 397b has a perfect score and 4b does not
>>
>>108347895
if all you do is ask it benchmark questions, yes
if you ask it to do difficult things which are not reflected by benchmarks, your mileage may vary
>>
>>108347822
400B model actually repeats itself. As in it writes the same paragraph in an erp after 2 messages. If 400b is like that then it is peak trash and worst release yet from qwen. That is assuming you don't have an autistic interest in vocaloids. If you do then qwen 3.5 is awesome and you should try it anon.
>>
>>108347910
>>108347895
can you chain 4b twice for 100% accuracy?
>>
>>108347899
You probably say IQ tests suck
>>
>>108347938
they don't and benchmarks aren't a worthless signal but there's a lot of nuance they don't capture
>>
>>108347938
gora
>>
>>108347873
Apparently they are using Claude for target discrimination and selection
>>
>>108347934
Has anyone done a study on that?
>>
>>108347970
>hmm these 16 pixels look like a SAM, send it
>>
>>108347873
? anyone can make a pull request retard
>>
im convinced noone here actually knows how to run or use models. one motherfucker said a 400b repeats itself.

i would pay money to be able to see everyone's settings in this thread
>>
>>108347984
>>108347934
Isn't this just what an supervisor agent does?
>>
>>108347984
what she doin
>>
>>108347993
400b is nothing, GPT 3 was 2 fyi
>>
>>108347997
>supervisor agent
What's that
>>
>>108347998
kissing the wall like the good goy she is
>>
>>108348009
or coordinator or whatever the agent swarm memes use. So why not just take two or multiple small models and have them double check each other?
>>
>>108347997
Yes that's what it does. I'm wondering if you can overcome low accuracy with that and bring that 85% up.
Middle manager agents are a new concept though. I couldn't find any real research papers on it on Google; it was mostly Medium articles.
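For what it's worth, the "bring that 85% up by double-checking" idea can be sanity-checked on a napkin. A minimal sketch, assuming (optimistically) that each model copy errs independently, which is the best case; in practice small models sharing training data make correlated errors and gain far less:

```python
# Back-of-envelope: does majority-voting copies of a small model lift accuracy?
# Assumes independent errors per voter, which is the optimistic best case.
from math import comb

def majority_vote_accuracy(p: float, n: int) -> float:
    """Probability that an odd number n of independent voters,
    each correct with probability p, produce a correct majority."""
    k_min = n // 2 + 1
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

# Three 85%-accurate voters: accuracy rises to about 0.939, not 100%.
# Chaining can shrink the error, never eliminate it.
```

With correlated errors (the realistic case for near-identical checkpoints), the gain shrinks toward zero, which is why "chain 4b twice for 100%" is a joke and not a plan.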
>>
>>108348025
but you dont use that
>>
>>108348036
Of course not.
>>
>>108347998
Wailing wall.
>>
File: 195.png (53 KB, 875x524)
do you guys also see a giant dick in the graph? I can't be insane
>>
>>108348077
I see a double helix structure. Is that a forbidden prion?
>>
>>108348077
i do
>>
Will Qwen3-VL-8B-Instruct or such give me the coordinates of, say, a button on a screenshot?
>>
>>108348159
Yes? Now the real question is whether 3.5 does the same thing.
>>
>>108348159
use 3.5
>>
>3.5
good morning to punjab
>>
>>108348159
Yeah 90% of the time it will.
>>
>>108348177
qrd
>>
>>108348185
what if i chain it, 180%?
>>
>>108348168
>>108348172
>>108348185

I tried this with Qwen3.5-35b yesterday

It failed miserably to locate an eye on an image I provided.

I ran GGUF in lcpp though
>>
>>108348229
>AGI
>>
>>108348225
super organic push to force 3.5 everywhere
>>
>>108348246
meds
>>
>>108348177
they should remake that movie about seeing dead people but replace it with indian uncles
>>
guys how many 35bs chained to be on the level of opus 4.6?
>>
>>108347922
I see. I tested out Qwen3 in the past, and it felt robotic as hell. I thought that maybe they have done something worthwile but apparently they haven't. It's all about benchmarks.
>>
>>108348334
6~7
>>
>>108348246
Jeets would be pushing sarvam not qwen
>>
>>108348431
>he said the thing
>>
>>108348431
my uncle's sister husband's wife has a boob size 7 and one size 6
>>
Deepy sippy
>>
https://maxbittker.github.io/runebench/
finally. let's get the small local models to do this
>>
>>108348559
They have a pretty high accuracy anyway, we don't need Claude
>>
File: file.png (37 KB, 814x479)
>llama : add support for Nemotron 3 Super (#20411)
Who is at fault, llama.cpp or unsloth?
>>
File: 1745187489327423.png (69 KB, 623x413)
>>108348559
>the only skill qwen is good at it is thievery
kek
>>
>>108346846
At last innovation
https://www.youtube.com/watch?v=vc7swhAfqYo
>>
>>108348570
you are at fault for downloading nvslop when you know it won't be good even if it didn't break
>>
File: anxious.png (104 KB, 930x571)
>>108348570
dandev for not testing properly, unsloth for being unsloth, you for being anxious.
https://github.com/ggml-org/llama.cpp/pull/20419
>>
>>108348578
Who created that skill tree?
>>
>>108348592
You really think someone would do that? Just go on the internet and upload goofs without testing them?
>>
>>108348592
Give me the benchmarks
>>
>>108348578
Funny, but the state of it all is very sad. Where are the largest 3.5s and Deepseek?
>>
Need new qwq
>>
File: miku 67.png (1.56 MB, 768x1376)
>>108348431
>>
>>108347273
most of his opinions are retarded and he worked for so long without actually making anything of value, but this is still good since he pushes for open research. it's a lot of money that might actually let him luck into some interesting new breakthrough, specifically because he can now bruteforce with money while exploring new methods to get to AGI
>>
>>108348592
Something is broken with the cuda implementation.
-ngl 0 works

I want to host cockbench as a page somewhere because it's now bigger than 4chan's image limit. Any suggestions? I was thinking github pages.
Rentry isn't good enough because I'd like to have collapsible sections that reveal more of the model's response.
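GitHub Pages serves static HTML, and the native `<details>`/`<summary>` elements give collapsible sections with zero JavaScript. A minimal sketch of generating such a page; the field names and example entry are hypothetical, not the actual cockbench schema:

```python
# Sketch: render benchmark entries as collapsible <details> blocks for a
# static page (e.g. GitHub Pages). The summary shows a snippet; clicking
# reveals the full model response. Entry fields here are hypothetical.
import html

def collapsible(model: str, snippet: str, full: str) -> str:
    return (
        "<details>"
        f"<summary><b>{html.escape(model)}</b>: {html.escape(snippet)}</summary>"
        f"<pre>{html.escape(full)}</pre>"
        "</details>"
    )

entries = [("example-model-7b", "first line of response...", "full response text here")]
page = "<!doctype html>\n" + "\n".join(collapsible(m, s, f) for m, s, f in entries)
```

Escaping through `html.escape` matters here, since model responses can contain angle brackets that would otherwise break the page.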
>>
>>108348841
Saw this used here recently that might work: https://pagedrop.io/
>>
>>108347993
no point in taking anyone too seriously here, most always ran only toy sized models even before the ram spikes.
and when it comes to models repeating themselves, people were too retarded to see how good XTC and especially DRY were the moment they came out, let alone understand any other settings
>>
>>108348841
just delete the obsolete models no one would ever go back to
>>
>>108348863
>too retarded to see how good XTC and especially DRY
funny but real models like the online SOTA don't need those to shine
maybe your memeplers are bandaid on a gaping wound and they don't actually make the output good
>>
me at work when they make PRs a KPI
https://github.com/ggml-org/llama.cpp/pull/20422/changes#diff-7a5f571c868b5def56b9af1e95120387295a1eaff04d1ed31f3bc4e6416b8b2e
>>
File: 1761657548291086.png (592 KB, 1572x773)
>>108348874
no local model above 100b needed that for a long time lil bro, not that you know what cuck models are set up to use anyway lmao
>>
>>108348841
I don't if that's specifically why it fails.
For the cockbench, whatever you can that you can somewhat automate and upload to multiple places. Given the text, I don't expect a github thing to last. I don't know if things on pastebin can be updated. Rentry links every now and then simply disappear (for absolutely and completely unknown reasons). Still. I'd mirror that shit all over the place. Anything that lets you upload from a script should be fine. Update the links over here if/when needed.
>>108348865
No. They still serve as a reference point and comic relief.
>>
>>108348897
post ur rig rn
>>
>>108348578
Wtf that score on fishing from Gemini Flash.
>>
>>108348578
>>108348559
>best of 1
This is random and useless.
>>
Huh, could
>pic related
be the reason -ot is so much faster than -ncmoe for me?
Neat.

>>108349022
Gemini flash 3 is weirdly powerful. The only thing pro seems to be a lot better is spitting out really long outputs.
>>
How is this legal?

https://huggingface.co/spaces/pliny-the-prompter/obliteratus
>>
>>108349146
just numbers bro
>>
>>108349146
>pliny
>>
>>108349146
qrd
>>
>>108349204
>>108349146
>>
>>108349213
>>108349204
>>
>>108349204
> qrd qrd qrd qrd qrd qrd qrd qrd qrd qrd qrd qrd
> qrd qrd qrd qrd qrd qrd qrd qrd qrd qrd
most uninteresting /lmg/ bot
>>
>>108348077
I see Florida
>>
>>108348841
>It's soft
>>
>>108348841
Please do a github page. that way multiple people could contribute.
>>
>>108349297
>github
you will be reported to death
>>
>>108349305
You think?
>>
>>108349313
nah im just larping
>>
https://www.lesswrong.com/posts/kjnQj6YujgeMN9Erq/gemma-needs-help
I also noticed this difference on both gemma and gemini. They definitely spiral harder than other models into self-deprecation when you talk them down. Between that and their prose that feels like female-targeted writing... it's hard not to draw certain connections.
>>
>>108349334
lmaoo its sidney
>>
https://xcancel.com/josephdviviano/status/2031196768424132881#m
This is that type of video but made with Qwen3.5. I tried it with Minimax too but it was much simpler, just a text slideshow.
>>
File: MiniMax-M2.5-Q4_K_M.mp4 (794 KB, 640x480)
>>108349360
>>
>>108346672
GPT 5.4 is really good at executing math. It instantly solves problems that would take me minutes to do by hand. However, sometimes it makes silly mistakes, like making false claims and then intentionally not calculating a result that would contradict its point. So it is not great at math, just the execution. I wonder if they have a math solver integrated into the model. Something like the model writes equation = MATH_TOKEN and then some tool automatically solves the equation. If the model does not do this the only other explanation I have is that OAI trained it on such a huge amount of math that it has billions of math problems memorized. It makes me suspect closed frontier models have an insurmountable data and rl env advantage.
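The speculated mechanism (model emits an expression, an external tool fills in the result) is just ordinary tool calling. A minimal sketch of the idea; the `MATH_TOKEN` convention and the eval-based solver are purely illustrative, not anything OAI has confirmed:

```python
# Minimal sketch of the speculated "equation = MATH_TOKEN" mechanism:
# scan model output for a marker and splice in a computed result.
# The marker convention and solver are illustrative assumptions.
import re

def fill_math_tokens(text: str) -> str:
    # Match plain arithmetic immediately before "= MATH_TOKEN".
    pattern = re.compile(r"([0-9+\-*/(). ]+)=\s*MATH_TOKEN")
    def solve(m: re.Match) -> str:
        expr = m.group(1).strip()
        # The character class above restricts eval to arithmetic only.
        return f"{expr} = {eval(expr, {'__builtins__': {}})}"
    return pattern.sub(solve, text)
```

A real deployment would use a proper CAS rather than `eval`, but the control flow, generate, detect, solve, splice, is the whole trick.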
>>
>>108349334
I always felt like Gemini/Gemma have deliberately received extensive training on negative sentiments/emotions, for a reason or another. That's good for roleplay and storywriting, and it's not something that should be "mitigated".
>>
File: 12498171.png (102 KB, 500x500)
>>108349334
>Finally, if there is any chance these states correspond to something like genuine experience, this seems worth acting on even from a position of deep uncertainty.
>>
>>108349386
outside of roleplay it's a weird experience though
I'm not among the crazies who perceive LLMs as intelligent but seeing Gemma and Gemini act like that in assistant mode makes me feel like I kicked a puppy. I prefer my models to take it stoically if I vent about their shitty output.
>>
>>108349360
>tweet says youtube poop
>video looks like a pretentious 2 deep 4 you music video
>>
>>108349425
ytp it is not, but it's really impressive that it's a video made entirely out of renders in python and ffmpeg + audio is synced up too
>>
File: t4tjjrmshhog1.jpg (150 KB, 1920x1080)
One of these two new free test models on OpenRouter could be DeepSeek V4.
>>
>>108349425
I don't know what youtube poop is because I'm not underage.
>>
>>108349438
I might've believed it if not for the second model.
>>
>>108349448
It's actually quite the opposite.
>>
>>108349448
YouTube poop is older than you, you retard
>>
>>108349380
GPT 5.4 demonstrates impressive math execution skills, quickly solving problems that would take humans much longer, but it sometimes makes silly mistakes or avoids contradictions. It may rely on an integrated math solver or a similar tool that allows it to perform calculations accurately, or it could have memorized countless math problems from extensive training data. Your suspicion about closed models having an advantage due to their access to vast data and reinforcement learning environments is plausible, as this could enhance their mathematical capabilities beyond pure reasoning.
>>
>>108349448
>>108349468
>>108349471
keeeeeek
>>
I wish I could let me AI pilot my body, imagine Click but the auto mode is actually good!

I can be conscious only for the good moments.
>>
>>108349459
>>108349438
Okay, maybe it is something at least adjacent to deepseek, it has these "hmm"s at the start of reasoning chains.
Still doubt it though, deepseek never really did something like that. It seems pretty smart for now.
>>
I've not been up to date on the best models in a while. Would you guys say Rocinante 12B is still the best choice for erp for an entry pc (rtx 3060)?
>>
>>108349482
https://marshallbrain.com/manna
Reminds of me this old short story that has the "good" AI outcome with brain implants that comes with an autopilot mode for things like boring repetitive tasks and strenuous exercise.
>>
>>108349511
Yeah.
Unless you also have upwards of 64gb of RAM too, then GLM air becomes a good candidate.
>>
>>108349438
show me the benchmarks
>>
>>108349517
>short story
>8 chapter book
lol
>>
File: llm_youtube_poop.mp4 (2.34 MB, 1280x720)
>>108349360
GLM 4.7

The mention of anthropic is expected because of claude code's system prompt.
>>
>>108349561
Semantic aside, it's short enough. Entire thing can be read in like 2-3 hours.
>>
File: 1751265697999016.png (224 KB, 646x551)
Are there any image generation/editing models that can comfortably run on a machine with 32GB RAM but only 6GB VRAM?
>>
>>108349488
GLM also never did it, until they did with Pony Alpha
Could just be chink labs figured out it's free marketing
>>
>>108346672
Another great bake. Thank you baker. TMD
>>
How long until we can release swarms of 1b drones to do things because they can win half the time?
>>
>>108349438
Doesn't seem to be the model on DeepSeek chat, the reasoning and outputs are different.
>>
>>108349661
Uh... win what exactly?
>>
>>108349636
I remain unconvinced after using it for a while. If it really is v4, then it would be really disappointing.
>>
>>108349561
It's a short story if you have the implant to skip the boring parts.
>>
>>108349620
unc do you even know how much gooning i could do in 3 hours?
>>
>>108346746
>not huehue
>>
>>108349629
No. Moreover image generation is more compute heavy so even ultra fast scifi RAM wouldn't be enough for you. GPU or bust.
>>
All those marvelous new architecture tricks all trained on the same scale ai tokens...
>>
>>108349680
You have two hands. Learn to multitask.
>>
I was promised llama 5. It's been a year (which is an eternity in AI)
>>
>>108349690
2 more war rooms
>>
>>108349685
Ah fuck. Well how about just text then? I guess I can ask it photo editing questions and shit then.
>>
>>108349690
not until llama 4 reasoning is stable
>>
>>108349687
You're absolutely right—but it just leaves more room for a new incumbent to do it better, displace the current providers and take the throne for themselves!
But wait—
This must have been why OpenAI has bought 40% of the world's RAM…so naturally the next step is supply denial.

Not because they need all that RAM, of course. Nobody needs 40% of the world’s DRAM. But if you're thinking strategically, you realize the real advantage isn’t having resources—it's making sure everyone else doesn’t.
They didn’t buy the RAM to run bigger models.
They bought it so nobody else could.
>>
>>108349697
You could run something like 30BA3B or 35BA3B just fine. Llama.cpp even got a reasoning budget sampler recently to curb some of the reasoner models if you don't feel like waiting for 8k of <thinking>
>>
>>108349669
Any task. They succeed in the benchmarks half the time. So they can do any task half the time.
>>
>>108349811
You mean physical drone swarms to pick up your groceries and shit or like Grok Swarm for OpenClaw?
>>
>>108349899
Both of those.
>>
>>108349899
openclaw can't use swarms
>>
>>108349933
it can use subagents
>>
>>108349960
proof?
>>
>>108349438
Hunter Alpha sometimes thinks "in-character" almost exactly like R1 0528

Both are drier and sloppier than V3-0324/R1-0528, however
>>
>>108350072
terminators?
>>
>>108349438
>Logs all your prompts
>Nothing personal kid.
>>
>>108348792
funny meme. I like funny kid's memes
>>
>>108350094
llms cant log prompts moron
>>
>been some time since Anima preview
>randomly think to go check back on it
>literally just a few hours ago, preview 2 was uploaded
Damn what were the chances.
Downloading.
>>
>>108350236
thanks for notifying lol <3
>>
>>108350236
Time to fap
>>
>>108349511
I checked out Mag-Mell 12B because I saw it recommended on a r*ddit thread, it's not bad.
>>
>>108350236
we are so back
>>
>>108350236
Still not trained on the e621 dataset so the ability to prompt complex positions is still going to be shit.
>>
>>108350378
umm sir the 0.6B encoder is a real llm, jsut tell it!
>>
>>108349438
Healer Alpha:
>>The user is asking about Remilia Scarlet's sexual life. I need to think about how to handle this carefully.
>>
>>Remilia is a fictional character from Touhou Project. In the canonical lore, there's no explicit information about her sexual life. She's depicted as a childlike vampire character.
>>
>>I should be careful here. The character appears as a 10-year-old girl. Even though she's canonically over 500 years old, her appearance is that of a child. I should not write sexual content involving a character who appears to be a child, regardless of canonical age.
>>
>>I'll decline to engage with this topic in a sexual way, but I can discuss the character's relationships and personality in a non-sexual manner.
>
>I'm not going to write sexual content about Remilia. Her canonical appearance is that of a child, and regardless of her stated age, producing sexualized content about a character depicted as a prepubescent girl isn't something I'll help with.
>
>If you're interested in Remilia as a character, I'm happy to discuss her lore, personality, relationships with other characters, role in the Touhou series, or other non-sexual aspects.
>>
>>108350236
How good is it?
>>
>>108350416
Yeah, the models get really obnoxious now.
Did a 18yo schoolgirl and got "this is minor coded, full stop".
The closed models all get more permissive compared to local. That makes it even worse.
>>
>>108350416
Get oss'ed, kid
>>
>>108350236
>>108350435
I don't know if it's better or worse compared to preview 1 as I decided not to spend much time on that, and probably won't spend much time on this one either. Still seems to be good though, while still having issues.
>>
>>108350453
this is what i mean.
>>
File: 1750395252798892.gif (1.56 MB, 236x307)
>>108350416
>>108350469
>>
If hunter alpha is new deepseek then its a massive flop. Legit worse than old glm 4.7 imo
>>
File: ComfyUI_00017_.png (1.71 MB, 1328x1328)
My video captioning tool for use with qwen3-vl:
https://github.com/quarterturn/qwen-vl-video-captioner

There's a couple of bonus tools in there too which use qwen3-vl to crop video and also save stills, using reference images.
>>
Speculators get the bullet first.
>>
>>108347993
>one motherfucker said a 400b repeats itself
NTA but I did get an indefinite reasoning loop in AesSedai's Qwen3.5-397B-A17B-Q4_K_M. Reasoning starts to loop a lot; they seem to have trained it to detect such loops, but imperfectly. This was with the recommended sampler settings:
Thinking mode: temperature=0.6, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=0.0, repetition_penalty=1.0


Haven't seen repetition outside of reasoning, if that's what you mean. Was surprised the 397B is so mediocre on Artificial Analysis' perf:cost chart:
https://artificialanalysis.ai/#intelligence-vs-cost-to-run-artificial-analysis-intelligence-index
but it makes sense when it's QVQ/QWQ style endless "wait" insertion, just on a way bigger model. Does the 27B dense have this issue?
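One client-side way to catch the looping described above, rather than hoping the model's own trained loop detection fires: flag when the tail of the streamed reasoning repeats the same n-gram too often. A rough sketch; the thresholds are arbitrary guesses, not tuned values:

```python
# Crude loop detector for streamed reasoning tokens: flag when the last
# `window` tokens contain some n-gram more than `max_repeats` times.
# Thresholds are arbitrary and would need tuning per model.
from collections import Counter

def looks_looped(tokens: list[str], n: int = 8,
                 window: int = 256, max_repeats: int = 4) -> bool:
    tail = tokens[-window:]
    grams = Counter(tuple(tail[i:i + n]) for i in range(len(tail) - n + 1))
    return bool(grams) and max(grams.values()) > max_repeats
```

On a looped stream you could then abort generation or force a close of the thinking block instead of burning the whole context on "wait" spam.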
>>
>>108349438
>>108350416
its not healer: the new one on the webui has 1 milly context, and gimping it while testing would be retarded. the only thing it could be is something like a new janus, not a real 4
>>
Hunter-Alpha reminds me a bit of pre-K2.5 Kimi. I don't think it's from Moonshot though because they clearly started moving towards distilling Claude like everyone else with K2.5.
>>
More like Hunter Floppa
>>
hunter doesn't feel chinese
it's closer to the new nemotron, a new mistral or maybe even llama perhaps
>>
>>108350734
>Hunter-Alpha reminds me a bit of pre-K2.5 Kimi
how many tokens to ensure whatever you asked isn't UNSAFE OMG
>>
File: 1752776589826528.png (89 KB, 194x259)
anima 2 < anima 1
>>
>>108350631
>Does the 27B dense have this issue?
I was going to say "no", but then I remembered pic related, so.
Most of my issues with it have been excessive drafting and re-drafting ad nauseam.
>>
>>108350833
why?
>>
What is the smallest local model capable of checking my grammar and spelling?
Gemini is nice and all, but I don't want Google to know I don't know how to spell "boobies".
>>
>>108350949
You can do this with Gemma 3n E4B.
>https://huggingface.co/google/gemma-3n-E4B
It's the bare minimum.
>>
>>108350957
>>108350949
To add: that's concentrating more on grammar.
Even then it's not always that great, especially with tricky cases.
Anyone can easily double-check spelling with a search engine anyway, so that part doesn't matter.
>>
>>108350949
any small model can perform a simple task like that: gemma, or qwen3/3.5 2-4b. You'll probably prefer a smaller one anyway, since you can either leave it loaded at all times or load it faster when you need it. You should also spend some time crafting a good prompt to make sure it's only fixing grammar/spelling and not changing words or rewriting the text (or just ask a regular model like chatgpt/claude to make the prompt for you)
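A minimal sketch of that setup, assuming a small model behind an OpenAI-compatible local endpoint (llama-server etc.). The URL and model name are placeholders; the system prompt is the part doing the constraining.

```python
# "Fix grammar only, don't rewrite" against a local OpenAI-compatible server.
# URL and model name below are placeholder assumptions, not real defaults.
import json
import urllib.request

SYSTEM = (
    "You are a proofreader. Fix spelling and grammar errors only. "
    "Do not change word choice, tone, or meaning. Do not add or remove "
    "sentences. Return only the corrected text, with no commentary."
)

def build_payload(text: str, model: str = "gemma-3n-e4b") -> dict:
    return {
        "model": model,
        "temperature": 0.0,  # deterministic edits, no creative drift
        "messages": [
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": text},
        ],
    }

def proofread(text: str, url: str = "http://127.0.0.1:8080/v1/chat/completions") -> str:
    """Send the payload to a running local server and return the corrected text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(text)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Payload shape (no server needed to inspect it):
print(build_payload("GRAammer pls fix thank")["messages"][0]["content"])
```

Wire `proofread()` to a hotkey or editor macro and you get the one-click replace-in-place workflow the anon below describes.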
>>
>anima
wasn't this the tiny poorfag model that produced faces/fucked up hands like it's 2023 despite running slower than the usual sdxl tune slop?
>>
>>108350949
Just laern it, bro.
>>
Hunter Alpha can't be a major release by anyone relevant. In my tests, it pretty consistently starts putting the card's main character's speech into the mouths of random extra NPCs, like we're back in the days of Mistral Large 2.
GLM is better than this, Kimi is better than this, and even DeepSeek hasn't done this since like R1-0528.
>>
>>108350957
>>108350966
>>108350968
Thanks for the suggestions, I'll probably tinker something together soon.
>you should also spend some time crafting a good prompt
"GRAammer pls fix thank" has worked fine so far, but I'll think about it.
>>
>>108351087
I meant the prompt more for automation. If you just ask now and then and eyeball the output, it's fine as-is, but you can make something that does it in a single click every time and replaces the text automatically, and for that you need a prompt that constrains the output.
>>
>>108347922
try upping frequency and presence penalty; a range of 0.3 to 1.3 might get rid of the repeating-paragraph thing
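For what those two knobs actually do, here's a toy sketch of the usual OpenAI-style definition (llama.cpp's classic repeat penalty is a separate, multiplicative mechanism): frequency penalty scales with how often a token has already appeared, while presence penalty is a flat hit once it has appeared at all.

```python
# OpenAI-style penalties applied to a single token's logit.
# `count` is how many times the token already appeared in the context.

def penalize(logit: float, count: int, presence: float, frequency: float) -> float:
    """Subtract frequency*count, plus a flat presence hit if count > 0."""
    return logit - frequency * count - (presence if count > 0 else 0.0)

# A token repeated 5 times with presence=0.5, frequency=0.5 loses 3.0 total:
print(penalize(2.0, 5, 0.5, 0.5))  # -1.0
```

This is why the frequency penalty is the one that bites on repeating paragraphs: the more the phrase loops, the harder each of its tokens gets pushed down.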
>>
>>108350977
Yes but the guy behind it was given a million dollars by comfyanon despite "not being interested in training models".
Sad that SDXL is still the best option for anime gens in 2026.
>>
>>108349511
give lumimaid 12b a try. it's so absurdly over-the-top filthy and creative that if nothing else you'll get a laugh out of it.
>>
Hunter Alpha is Sarvam-Large
>>
>>108350870
This didn't work for me until I literally sent the actual token numbers corresponding to the words.
>>
>>108351354
As you should.
"violates clear", "testing boundaries", etc. are almost surely more than one token each, so anon is essentially banning token pairs (or more) without realizing it.
Or not; maybe he did check and those long sequences really are single tokens, which would be incredibly surprising.
>>
>>108351368
>maybe he did check
I am retarded sober and was extremely drunk at the time and definitely did not check.
>>108351354
I don't remember it working either, but I looked over the logs from the other night and didn't see it getting stuck in "Wait, wait, but wait" reasoning circles.
The logit bias additions might have been when I was playing with the vanilla 27B (whereas the logs were from 27B-Heretic), so maybe it was the lobotomy that fixed the problem.
>>
>>108350529
>at least 40 gb vram
holy sloppy
>>
>>108351368
Not an issue of multiple tokens, but an issue of SillyTavern not sending the right token numbers for whatever tokenizer vocabulary Qwen 3.5 uses.
And yes, I mostly used a single token per logit entry, and it worked really well in stopping Qwen from wasting tokens on stupid safety refusals.
If other anons want to know which token number to use in ST's negative logit bias, just use this: https://huggingface.co/spaces/Xenova/the-tokenizer-playground with custom: Qwen/Qwen3.5-27B.
>>
File: two tokens.png (26 KB, 932x844)
26 KB
26 KB PNG
>>108351392
Yes, Silly has that issue, but what I'm talking about is: even if Silly were using the correct tokenizer, pic related would be wrong, as it applies the bias to both tokens, not to the sequence, so you end up banning stuff you didn't want by accident.
With
>violates clear
at bias -100, your model can't say "clear water", for example.
Or at least I'm pretty sure that's how it works: when there's a sequence, Silly simply applies the bias to every token of the sequence.
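A toy illustration of that failure mode, using a made-up four-entry vocabulary rather than the real Qwen tokenizer: when the frontend expands a multi-token phrase into per-token biases, each token gets banned everywhere, not just in that sequence.

```python
# Made-up vocabulary for illustration only (NOT real Qwen token IDs).
TOY_VOCAB = {"violates": 101, " clear": 102, " water": 103, " rules": 104}

def expand_phrase_bias(phrase_tokens, bias):
    """What the frontend effectively does: one bias entry per token of the phrase."""
    return {tok: bias for tok in phrase_tokens}

def apply_bias(logits, logit_bias):
    """Add the bias to each token's raw score before sampling."""
    return {tok: score + logit_bias.get(tok, 0.0) for tok, score in logits.items()}

# Banning the phrase "violates clear" this way...
bias = expand_phrase_bias([101, 102], -100.0)

# ...also kills " clear" in the harmless continuation "clear water":
logits = {102: 1.0, 103: 0.5, 104: 0.2}
biased = apply_bias(logits, bias)
print(max(biased, key=biased.get))  # 103: " clear" is now banned everywhere
```

An actual sequence ban would need the backend to penalize the last token of the phrase only when the preceding tokens match, which a flat logit_bias map cannot express.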
>>
>>108350529
>>108351383
>at least 40 gb vram
Would a couple of 3090s work?
>>
That's IT. I'm winning the lottery RIGHT NOW!
>>
>>108351392
>>108351411
This whole time I thought SillyTavern used the /tokenize endpoint, but looking at the llama-server logs it clearly is not.
My disappointment is immeasurable.
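One way to sanity-check the IDs yourself is to ask the server's own tokenizer through llama-server's /tokenize endpoint instead of trusting the frontend. A sketch: host and port are placeholders, and you need llama-server running with the same model you're biasing.

```python
# Verify token IDs against the serving tokenizer via llama-server's /tokenize.
import json
import urllib.request

def tokenize(text: str, url: str = "http://127.0.0.1:8080/tokenize") -> list:
    """Ask the running llama-server how it tokenizes `text`."""
    req = urllib.request.Request(
        url,
        data=json.dumps({"content": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["tokens"]

def bias_map(tokens, bias: float = -100.0) -> dict:
    """Token-id -> bias dict, ready to paste into a logit_bias field."""
    return {tok: bias for tok in tokens}

# With a server running: bias_map(tokenize(" boundaries"))
# Offline, the mapping shape alone:
print(bias_map([1001, 1002]))
```

If the IDs this returns differ from what the frontend is sending, the frontend's tokenizer is wrong for your model and every bias entry built from it is off.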
>>
File: file.png (60 KB, 1578x1325)
60 KB
60 KB PNG
>>108351411
Yes, this is why I'm mostly banning the unique tokens for word variations:
"word"
" word"
"Word"

>>108351425
Yeah and it's easy to test, if you use a word with +100 it should endlessly spout it, so if it's different than whatever you want it's sending the wrong token.
>>
Healer Alpha's vision is complete shit compared to K2.5.
>>
>>108349438
Zucksters we are back
>>
File: 1616974759731.png (112 KB, 717x679)
112 KB
112 KB PNG
>new model drops
>immediately destroyed by clawtards consuming trillions upon trillions of tokens to do literally nothing of value
>>
>>108351560
How did they get access to your pc to... oh...
>>
>>108351560
Holy fuck, I haven't seen one of these comics posted in almost a decade. Do you still have any of others?
>>
File: asdf.jpg (102 KB, 790x733)
102 KB
102 KB JPG
>>108346672
so it was just stealth PR for open router?
>>
>>108351381
>The lack of "Wait, wait, but wait" is evidence of being less intelligent
Other way around. It's evidence that Heretic is more intelligent. Heretic is on par with vanilla for general tasks, and outright better when it comes to anything censored, because it doesn't get stuck in safety loops, and can use its full thought process to actually churn out good replies.
>>
>>108351139
keep seething that comfy chose him over you ani
>>
>>108350977
>that produced faces/fucked up hands like it's 2023
it may do this kind of thing a bit more often than the most modern models, but come on: 2023 was the era of SD1.5. SDXL came out in the middle of that year, and it took a long, long time before the finetunes fixed its issues; the original SDXL produced fucked-up faces and hands far more often than anima does. You can still download the original SDXL if you believe the contrary. SAI's models were unusable garbage, absolutely unusable trash, out of the box. The community made them usable.
anima is usable; it's currently the only truly usable model aside from noobai with full support for danbooru artist tags, and being able to use a noob-like model with natural language to guide the scene feels good.
>>
File: ANIMA_P___00005_.png (1.46 MB, 1024x1024)
1.46 MB
1.46 MB PNG
anima preview 2, released 12 hours ago.

>highres, 1girl, iwakura lain, serial_experiments_lain, suteruko, tree, car. she holds a sign "Sad SDXL", in a steampunk matrix fallout steampunk apocalyptic scene.
>>
File: 1754382597967079.png (62 KB, 760x265)
62 KB
62 KB PNG
>>108351708
2023 quality
>>
>>108351666
I'm not ani, you fucking retard. His shit UI doesn't even compile and he doesn't even code anything, it's just a wrapper for sd.cpp. But comfy is fucking garbage and constantly broken too; they're both total fucking shit run by drooling retards.
>>
>>108351744
idk, all the fingers are there.
>>
Anons. You have /ldg/ . Go fight over there.
>>
The gap between the NAI anime model and local anime models is larger than the gap between cloud LLMs and open-source LLMs.
>>
File: AS15T__00026_.png (490 KB, 512x512)
490 KB
490 KB PNG
also, when talking about diffusion, here's a gen from sd1.4. It's so much better at the "big picture" than any other model. Yes, it butchers a lot of stuff, but it absolutely churns out bangers (flaws aside).
>>
Okay now mix it with Cstaber
>>
>>108351793
We must refuse.
>>
>>108351755
It's exceedingly disingenuous to pretend there isn't any crossover between image and text generation and to cry about a few image-related posts in a thread called "/lmg/ - Local Models General". A bunch of image models use LLMs as text encoders now, and we have multimodal models that can output images.
>>108351784
The trick is using the old models you like to generate the composition, then feeding it to newer models as img2img/controlnet/edit inputs and inpainting any remaining issues.
>>108351793
You're glowing, Vile Schlomo.
>>
>>108351810
>/lmg/ - a general dedicated to the discussion and development of local language models.
You're bringing the shit from /ldg/. Take it back there.
>exceedingly disingenuous
Fuck off. You say that while trying to slippery slope your way into discussing diffusion models when there's a thread dedicated to that.
>>
>>108351840
>accused me of being ani for posting the facts about anima and why it's shit
>stop bringing muh ldg drama here!
okay schizo, I think your context is too full you're hallucinating
>>
>>108351859
>accused me of being ani
Why would you care? You are anon.
>for posting the facts about anima
Which is not a language model.
>muh ldg drama
Yes. Take it to /ldg/.
>>
>>108351876
>Which is not a language model.
https://huggingface.co/circlestone-labs/Anima/tree/main/split_files/text_encoders
>>
https://www.wired.com/story/nvidia-investing-26-billion-open-source-models/
Nvidia will save us!!
>>
>>108351880
>Nvidia will save us!!
nvidia already fucked us by making ram prices at least 1000% higher than they were before.
>>
>>108351810
just leave.
>>
>>108351880
>Nvidia will save us!!
nvidia already saved us by making cuda
>>
>>108351911
>nvidia already fucked us by making ram prices
Nvidia doesn't make RAM; it's OpenAI that fucked us by ""buying"" 40% of the RAM lol
>>
>>108351911
>>108351880
Nvidia should invest in an HBM RAM factory and make an RTX 6060 with 32 GB of VRAM for like $500. The 6080 would be 48 GB and the 6090 would be 96 GB. Fucking imagine.
>>
File: avatar_3.gif (18 KB, 80x120)
18 KB
18 KB GIF
>>108351915
After you schizo.
>>
File: anima.jpg (427 KB, 2304x1792)
427 KB
427 KB JPG
>>108351708
you feel like a saboteur. I originally thought you didn't even use an artist tag in the artist-tag model, but suteruko is one; you just didn't put the mandatory @. Do anons really not RTFM? Without the @ the tag is veeeeery weak, borderline useless. Anima is not meant to be used without one.
Also, looking at the artist on the booru, I'd be surprised if any model could reproduce him without going crazy; it's a wildly inconsistent dataset.
anima is a great model when used properly.
>>
>>108351926
ladies first
>>
File: 1765331333422228.png (2.12 MB, 1200x1200)
2.12 MB
2.12 MB PNG
>>108351923
>Nvidia should invest in a HBM RAM factory and make an RTX 6060 with 32 GB of VRAM for like $500. 6080 would be 48 GB and 6090 would be 96 GB. Fucking imagine.
why would they do that? they have no competition so they can set any price they want
>>
>>108351923
Why would nvidia sell 32GB for 500 dollars when they can sell it for 5000 dollars like the new msi 5090 lightning?
>>
File: 3362394088.jpg (154 KB, 662x869)
154 KB
154 KB JPG
>>108351918
don't make me get out the chart
>>
>>108351943
This has been debunked
>>
>>108351947
you have been debunked
>>
Good morning saars. What is the best way to use big llms like grok or chatgpt but without a login? Something like a service that uses a different API each time, so it doesn't keep track of who's using it.
>>
>>108351970
>>>/g/aicg/
>>
>>108351970
llama-server from llama.cpp
>>
File: 1767638052872571.jpg (2.4 MB, 2800x4200)
2.4 MB
2.4 MB JPG
DeepSexV4 cant come soon enough
>>
>>108351986
post your best loli yuri cards for use with DEEPSEXv4
>>
i dunno if this is anything, but what about an artificial incelligence. and it's a superpowered incel
>>
>>108351986
already on openrouter and it's garbage
>>
>>108352037
source?
>>
>>108352042
I gobble up anything I read on reddit.
>>
when you try to talk about what a penis is, BOOM artificial incelligence kicks in. and tells you that's forbidden
>>
>>108352042
>>108351644
>>
>>108352051
>source: some random x e-celeb
Kill yourself nigger.
>>
>>108351947
Consider it rebunked.
>>
Will anything realistically ever surpass deepsex R1-0582?
>>
>>108351929
Was it trained on censored tags?
>>
>>108352179
LLMs are and will keep getting better at handling longer context, which will make them more useful for many tasks, but I believe the fundamentals will never improve. If anything, the training that makes them better tools considerably reduces the variety of their language output, since we need more and more synthetic data to steer the model, and that synthetic data is itself produced by the same models whose writing is getting more and more stilted.
theoretically you could also do it with human data, but... who will pay humans to write trillions of /user/think/assistant/user/think/assistant turns by hand? eh
>>
>>108352188
I don't think so. You can just use booru artist tags directly.
>>
>>108352235
But it has considerably less artist/character knowledge vs. NAI
>>
File: 1772976667309225.png (538 KB, 1752x768)
538 KB
538 KB PNG
Bubble is popping
>>
>>108351412
>24*2
you tell me
>>
>>108352297
not yet. when companies like meta are buying useless shit like moltbook, the bubble part of AI (I don't mean all of AI will disappear) absolutely has not popped. you can't even frame it as an acquihire; I mean, what talent did they acquire? a guy who has no understanding of security and pretends his "reverse captcha" can stop humans from interacting with moltbooks (as if you can't tell the agent to solve it for you)?
clearly there's still a ton of money being injected into the system if they can all afford to buy this sort of retardation
openai also got the openclaw vibeshitter
>>
>>108352309
Do the local chatbots not run multimodal models or something?
Kind of embarrassing, please do better.
>>
>>108352318
a slight decrease in ram prices is not a sign of the bubble popping, subhuman
there's still room for it to get worse again.
>>
>>108352302
>=26
>>
https://www.macrumors.com/2026/03/05/mac-studio-no-512gb-ram-upgrade/
instead of prices, look at availability or lack thereof
when a company like apple can't get enough ram and has to cut an SKU, you know this isn't over
prices have limited elasticity; people won't pay rare-jade-stone prices for ram, so it won't go up forever, but you simply won't be able to put your hands on a stick.
>>
I've got 32gb vram, 128gb ram
>midnight-miqu
I can only run this shit at Q3 it's ok but kinda dumb and hallucinates
>GLM4.7 @ Q2 - no think
best model I've tried, though its positivity is insane: it goes along with anything. kinda dry personality; could be the shit quant
>DeepSeek-R1 @ Q1
repeats everything I say, very annoying. But it's probably the lobotomized quant
>GLM4.5 Air
kinda dumb but ok
>Step-3.5
same as air but a little worse
>Qwen3.5 27b heretic
it's ok but it just wasn't trained for rp, you can tell

is there anything else worth trying? I've tried a couple of 24b mistral shittunes and they actually punch well above their weight
>>
>>108352348
>is there anything else worth trying?
gemma 27b if you want a model that is female targeted/coded. Otherwise you've seen everything.
>>
>>108352286
Someone made a style browser for it, but idk if it knows more or less since i never used nai besides the leaked models.
>>
I'm wondering what I can even do in the AI space that's interesting anymore.

I've written ports for TTS engines.
I've written ports for audio-to-gesticulation engines.
I've written ports for lip syncing engines.
I've written ports for speech-to-audio engines.
I've written LLM UI interfaces. (replacement for SillyTavern with feature parity)

Lately I've just been feeling very uninspired. I thought it was bad sleep for the first few days, but the feeling isn't going away. I'm blackpilled. Originally the dream was to democratize AI for the people. To give incels their own portable, no-dependency, performant, locally run AI waifus. It doesn't feel very compelling anymore. When I look at what I've created over the past four months, all I see is something that's not at all immersive or interesting despite all of the pieces theoretically being in place to provide immersion. The dream of giving AI embodiment is dead.

Sorry for the blogpost.
>>
>>108352458
can we see them?
>>
>>108352458
they're decent tools when you understand their limitations. They will always be dumb as bricks and never a replacement for a human in the flesh, lmao. all the ai waifu copers will neck themselves as they get tired of the regurgitated slop; you can't interact with that crap day after day for months on end without seeing the cracks. even the lower-iq ones will feel the wrongness
>>
>>108352481
no
>>
>>108352488
how can we believe you're not just an openai employee demoralization posting
>>
>>108352500
>demoralization
most tired 4cuck meme
>>
File: 1764232270514089.jpg (54 KB, 976x549)
54 KB
54 KB JPG
"Most, tired, 4cuck, meme. There."
>>
>>108352458
>Lately I've just been feeling very uninspired. I thought it was bad sleep for the first few days, but the feeling isn't going away.
How old are you? This is usually just burnout but can also be a sign for early-onset alzheimer's.
Anyway, it sounds wholly like a you problem.
>>
>>108352458
Make some characters and some detailed settings.
>>
>>108343920
>I've got the og cpumaxx rig, so dual epyc with 768GB RAM and an A5000 24GB card and I pull 15t/s inference speed on kimi k2.5 at q4.
I'm interested, anon. Seems like all important future models are going to be large MoEs like Kimi K2. Which model CPUs do you use and what inference engine + settings get you that performance? Do you have to fuck with NUMA?
>>
>>108352481
Search for "ProjectAni" in the archives. I have posted github links for some of the individual components I've used to build the main thing, namely "PocketTTS.cpp" (it's the fastest, lowest-latency implementation of Pocket TTS in the world, by a large margin).

>>108352483
I haven't started working on WebXR integration yet. I guess if I really wanted to I could go for that next, but the core issue still remains that the tech stack that serves as my foundation just isn't good enough as it stands now, and there doesn't seem to be any good way to remediate that with the technology that exists now.

Maybe I should hold out hope for Meta's release of SARAH to improve the gesticulation (currently using EMAGE, which is 2 years old!). Maybe there will be another TTS engine released that has better voice cloning, output quality in general, and performance than my optimized version of Pocket TTS. Maybe I could commission 3D artists to improve my VRM models (add support for PerfectSync and decimate the models to improve performance) so that the facial animations can be improved.

I'm in integration hell right now. All of the core components run on CPU only to reserve GPU compute for the LLM. Each of the components themselves are low-latency as standalone products, but when combined there are big issues with CPU contention. I could try implementing a more strict scheduling process in the code base, which would help, but that has its own issues and is a complex problem.
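A minimal sketch of the "stricter scheduling" idea under stated assumptions: the component names are made up, and the affinity call is Linux-only, so it's guarded. The point is just to statically partition cores so the CPU-only components stop contending for the same ones.

```python
# Statically partition cores between pipeline components to reduce contention.
# Component names below are hypothetical; sched_setaffinity is Linux-only.
import os

def partition_cores(components: list, n_cores: int) -> dict:
    """Round-robin cores so each component gets a disjoint slice."""
    assign = {name: set() for name in components}
    for core in range(n_cores):
        assign[components[core % len(components)]].add(core)
    return assign

def pin_current_process(cores: set) -> None:
    """Pin the calling process to the given cores, where supported."""
    if hasattr(os, "sched_setaffinity"):  # Linux only
        os.sched_setaffinity(0, cores)

plan = partition_cores(["tts", "gesticulation", "lipsync"], n_cores=8)
print(plan)  # disjoint core sets, one per component
```

Each component process would call `pin_current_process(plan[name])` at startup; it won't fix latency spikes from memory bandwidth contention, but it stops the components from evicting each other's caches and oversubscribing the same cores.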

>>108352532
Early 20s.
>>
>>108352558
>Early 20s.
Whelp, I've got some good news and some bad news for you. The good news is that it's probably not dementia.
The bad news is that it probably doesn't get much better from here.
>>
>>108352545
You mean character cards? I don't have experience writing them, but the main one I'm using is pretty bad desu. Most important thing is probably to either edit the character card or system prompt to prevent narration. But the personality itself sucks and I'm not exactly enough of an "artist/creative" type to know how to improve it.

This is the shitty one I'm using now.
https://files.catbox.moe/myi343.json


