/g/ - /lmg/ - Local Models General - Technology

[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / r / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]

Board

▼ Settings Mobile Home

/g/ - Technology

Return Catalog Bottom Refresh

Thread archived.
You cannot reply anymore.

[Advertise on 4chan]

[Return] [Catalog] [Bottom]

Anonymous

/lmg/ - Local Models General 09/14/25(Sun)06:25:36 No.106582475

File: ComfyUI_01105_.png (1.22 MB, 1280x720)

1.22 MB PNG

/lmg/ - Local Models General Anonymous 09/14/25(Sun)06:25:36 No.106582475 Archived

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106575202 & >>106566836

►News
>(09/11) Qwen3-Next-80B-A3B released: https://hf.co/collections/Qwen/qwen3-next-68c25fd6838e585db8eeea9d
>(09/11) ERNIE-4.5-21B-A3B-Thinking released: https://hf.co/baidu/ERNIE-4.5-21B-A3B-Thinking
>(09/09) Ling & Ring mini 2.0 16B-A1.4B released: https://hf.co/inclusionAI/Ring-mini-2.0
>(09/09) K2 Think (no relation) 32B released: https://hf.co/LLM360/K2-Think
>(09/08) OneCAT-3B, unified multimodal decoder-only model released: https://onecat-ai.github.io
>(09/08) IndexTTS2 released: https://hf.co/IndexTeam/IndexTTS-2

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm

Anonymous
09/14/25(Sun)06:26:13 No.106582480

Anonymous 09/14/25(Sun)06:26:13 No.106582480

File: 1699301827766.jpg (92 KB, 1024x1024)

92 KB JPG

►Recent Highlights from the Previous Thread: >>106575202

--Troubleshooting low token generation speeds with multi-GPU configurations on Linux:
>106575420 >106575668 >106575698 >106575792 >106575808 >106575836 >106575848 >106575891 >106575898 >106575933 >106576021 >106576059 >106576092 >106576126 >106576137 >106576151 >106576186 >106576245 >106576331 >106576358 >106576378 >106576431 >106576477 >106576497 >106576596 >106576592 >106576606 >106576610 >106576652 >106576726 >106576759 >106576688 >106576698 >106576714 >106576789 >106576867 >106576931 >106577028 >106577094 >106577146 >106577210 >106577154 >106577350 >106577372 >106577408 >106577575 >106577677 >106576395 >106576430 >106577477 >106578561 >106578743
--Issues with instruct model formatting and jailbreaking GPT-oss:
>106579721 >106579736 >106579784 >106579795 >106579859 >106579884 >106579897 >106579908 >106579934 >106579949 >106580072 >106580156 >106580153 >106579748
--vLLM Qwen3-Next: Speed-focused hybrid model with mtp layers:
>106575851 >106576089 >106576174 >106576443
--GGUF format's support for quantized and high-precision weights:
>106575413 >106575474 >106575499 >106575521
--Self-directed LLM training via autonomous task/data generation and augmentation:
>106580707 >106580838 >106580717 >106580762 >106580794
--Qwen Next's short response issues and version instability concerns:
>106580940 >106580951
--Finding a lightweight AI model for TTRPG GM use within VRAM and RAM constraints:
>106580295 >106580315 >106580332 >106580337 >106580342 >106580350 >106580514 >106580531
--Grok-2 support to be added to llama.cpp:
>106580473
--Miku (free space):
>106576245 >106578711 >106578793 >106579905

►Recent Highlight Posts from the Previous Thread: >>106575209

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script

Anonymous
09/14/25(Sun)06:28:39 No.106582498

Anonymous 09/14/25(Sun)06:28:39 No.106582498

>>106582475
This fat bitch's prompt processing is too slow...

Anonymous
09/14/25(Sun)06:33:47 No.106582518

Anonymous 09/14/25(Sun)06:33:47 No.106582518

So was /lmg/ wrong and very sparse models like qwen3 next are actually better and openai figured it out earlier considering the architecture of gptoss?

Anonymous
09/14/25(Sun)06:34:57 No.106582527

Anonymous 09/14/25(Sun)06:34:57 No.106582527

>>106582518
yes, soon standard moe models will be as laughable of an idea as dense models are right now

Anonymous
09/14/25(Sun)06:39:22 No.106582547

Anonymous 09/14/25(Sun)06:39:22 No.106582547

>>106582518
Nobody here is using qwen3 next and it is almost certainly just another useless benchmaxxed math model.

Anonymous
09/14/25(Sun)06:45:31 No.106582574

Anonymous 09/14/25(Sun)06:45:31 No.106582574

Please help a retard out, I'm using Mikupad and it's working great but after a few pages it starts dropping a lot of short words like he/him from the text and it reads like a caveman.
I think it's something to do with repetition penalty but I don't know.

Anonymous
09/14/25(Sun)06:47:45 No.106582582

Anonymous 09/14/25(Sun)06:47:45 No.106582582

>>106582574
Why are you using repetition penalty? It's outdated garbage.

Anonymous
09/14/25(Sun)06:51:16 No.106582598

Anonymous 09/14/25(Sun)06:51:16 No.106582598

>>106582582
So what should I do instead?

Anonymous
09/14/25(Sun)06:55:53 No.106582618

Anonymous 09/14/25(Sun)06:55:53 No.106582618

>>106582598
Use DRY to filter out longer patterns, XTC for shorter ones and vary your own prompts more, to give the model new material to work with.

Anonymous
09/14/25(Sun)06:56:45 No.106582623

Anonymous 09/14/25(Sun)06:56:45 No.106582623

>>106582547
>Nobody here is using qwen3 next and it is almost certainly just another useless benchmaxxed math model.
Is it useless because the model doesn't work for roleplay, or is it useless because their training data is safety and synthetic slop?

Anonymous
09/14/25(Sun)07:00:42 No.106582643

Anonymous 09/14/25(Sun)07:00:42 No.106582643

>>106582623
It's because Qwen's training data has too little focus on writing/language, with math and coding being over-represented in the dataset. It's the same reason why you see Gemma 27b dunking on 100b+ models in creative writing benchmarks, yet its coding abilities are trash - Gemma's dataset swings the opposite way.
As for safety, qwen models are middling. They do have refusals but don't take too much meddling to get around them. More 'safe' than Mistral models, less so than Gemma/GPT.

Anonymous
09/14/25(Sun)07:32:03 No.106582764

Anonymous 09/14/25(Sun)07:32:03 No.106582764

Is "not x, but y" a definite indication of AIslop? Are legit human made content never using it? Seriously, every time I heard the pattern on Yt videos I went schizo and closed it.

Anonymous
09/14/25(Sun)07:35:23 No.106582779

Anonymous 09/14/25(Sun)07:35:23 No.106582779

Have people tried scaling TREAD up yet? It's a per-token stochastic passthrough during training in the same vein as Dropout, meant to speed up training.

Anonymous
09/14/25(Sun)07:36:59 No.106582788

Anonymous 09/14/25(Sun)07:36:59 No.106582788

>>106582764
It's not definite, but quite damning.
Now an em dash, that's definite.

Anonymous
09/14/25(Sun)07:37:49 No.106582792

Anonymous 09/14/25(Sun)07:37:49 No.106582792

>>106582764
It's the new shivers down spine, for sure. Qwen30b is the worst example I've seen, I don't think it can go more than 2-3 responses in a creative context without using it.

Anonymous
09/14/25(Sun)07:38:19 No.106582795

Anonymous 09/14/25(Sun)07:38:19 No.106582795

>>106582764
This is not a definitive indication of AI-slop, but a legitimate rhetorical device that AI has co-opted and inflated to the point of cliché.

Anonymous
09/14/25(Sun)07:39:57 No.106582805

Anonymous 09/14/25(Sun)07:39:57 No.106582805

File: 1756805144322.png (278 KB, 906x1148)

278 KB PNG

>>106582764
Yes. Watch out for the variants too.

Anonymous
09/14/25(Sun)07:41:19 No.106582811

Anonymous 09/14/25(Sun)07:41:19 No.106582811

>>106582805
Was this Gemma or GP-TOSS?

Anonymous
09/14/25(Sun)07:42:26 No.106582816

Anonymous 09/14/25(Sun)07:42:26 No.106582816

>>106582811
https://desuarchive.org/g/thread/106460375/#106460853
abliterated gemma

Anonymous
09/14/25(Sun)08:11:38 No.106582952

Anonymous 09/14/25(Sun)08:11:38 No.106582952

>>106582475
>image
rude

Anonymous
09/14/25(Sun)08:19:28 No.106582994

Anonymous 09/14/25(Sun)08:19:28 No.106582994

Where do all the "it's not aislop it's actually how humans speak" retards come from?

Anonymous
09/14/25(Sun)08:21:59 No.106583010

Anonymous 09/14/25(Sun)08:21:59 No.106583010

>>106582994
People who do not speak to other humans or read books, and who think that the botposts they read on reddit were actually human.

Anonymous
09/14/25(Sun)08:22:41 No.106583014

Anonymous 09/14/25(Sun)08:22:41 No.106583014

>>106582805
I feel like it's a byproduct of training and conditioning LLMs to be balanced rather than biased. It's overcorrection to the point where the LLM is no longer attempting to say anything useful, but instead trying to remain as inoffensive as possible.

Anonymous
09/14/25(Sun)08:28:40 No.106583063

Anonymous 09/14/25(Sun)08:28:40 No.106583063

File: Screenshot from 2025-09-1(...).png (145 KB, 458x397)

145 KB PNG

Imagine waiting two more weeks (C) for Qwen3-Next goofs just to find out it is crap

Anonymous
09/14/25(Sun)08:30:14 No.106583075

Anonymous 09/14/25(Sun)08:30:14 No.106583075

>>106582475
>story set in japan because of one of the characters name so the model just decided everyone must be a Sato or Tanaka or Watanabe
>ugh whatever
>police officer explicitly calls in the Miranda Rights
Glm 4.5 bros...355B parameters and we still turn everything into an episode of true crime... it's over...

Anonymous
09/14/25(Sun)08:30:39 No.106583080

Anonymous 09/14/25(Sun)08:30:39 No.106583080

>>106575295
i want this smug little robot i want to make its insides all white too..

Anonymous
09/14/25(Sun)08:38:59 No.106583124

Anonymous 09/14/25(Sun)08:38:59 No.106583124

>>106582623
this thread has nothing but hopeless coomers cooming on the most degenerate, filthiest shit a sane person can't even begin to image
they judge models on the standard of how degenerate it can get, not whether they're actually useful
take their cries of "muh benchmaxxed" with many grains of salt, they love models like GLM which break down really quickly as context grows
meanwhile I simply never saw a local model handle context as well as the newer Qwens, the only thing better is proprietary models like Gemini
absolutely destroys all other open models including deepshit
gemma doesn't even begin to enter the fray, those models are utter garbage past 1k tokens but you see the nigger under you praise it because he found how to make it say the magic words he loves to hear

Anonymous
09/14/25(Sun)08:41:45 No.106583138

Anonymous 09/14/25(Sun)08:41:45 No.106583138

>>106583124
>models like GLM which break down really quickly as context grows
Like literally every model in existence
>handle context as well as the newer Qwens
They're bad even at low context, so the drop-off isn't as noticeable.
Gemma is shit for plenty of reasons but if it's breaking on you at 1k then some part of your setup is fucked.

Anonymous
09/14/25(Sun)08:42:11 No.106583143

Anonymous 09/14/25(Sun)08:42:11 No.106583143

>>106583124
It's well established that qwen models are good for everything that isn't sex. Half the links in the recommended models rentry are qwen models.

Anonymous
09/14/25(Sun)08:42:55 No.106583147

Anonymous 09/14/25(Sun)08:42:55 No.106583147

>>106583124
>not whether they're actually useful
What the fuck is "useful" supposed to mean?

Anonymous
09/14/25(Sun)08:44:14 No.106583155

Anonymous 09/14/25(Sun)08:44:14 No.106583155

>>106583143
>Half the links in the recommended models rentry are qwen models.
Yes, under the "Programming & General" section, where it says "Benchmaxxed models with an impressive lack of world knowledge. Good for anything STEM-related" STEM = math and coding

Anonymous
09/14/25(Sun)08:58:02 No.106583262

Anonymous 09/14/25(Sun)08:58:02 No.106583262

Are there any benchmarks out there for running mid-sized MOEs (air-chan etc) with cpu offloading? Considering upgrading to 128gb+ ram but trying to figure out if i'd be getting "unbearably slow" or just "slow" TG numbers on this kinda setup

Anonymous
09/14/25(Sun)09:00:44 No.106583275

Anonymous 09/14/25(Sun)09:00:44 No.106583275

>>106583262
>Considering upgrading to 128gb+ ram

Anon, I...

Anonymous
09/14/25(Sun)09:08:43 No.106583325

Anonymous 09/14/25(Sun)09:08:43 No.106583325

>>106582764
Pisses me off so much. It's a rhetorical device I used, very sparingly but to great effect, and thanks to AI slop I now catch myself and consciously avoid using it.

Anonymous
09/14/25(Sun)09:10:23 No.106583338

Anonymous 09/14/25(Sun)09:10:23 No.106583338

>>106583262
Low active parameters means that token processing speed doesn't take that big of a hit, especially with the new moecpu options in llama.cpp and kobold. But as you move to bigger models, prompt processing starts to become a bottleneck. With Nemo you can rip through 16k context in a few seconds on a 3090, while GLM Air even at Q4 can take like 2 minutes.

Anonymous
09/14/25(Sun)09:32:25 No.106583486

Anonymous 09/14/25(Sun)09:32:25 No.106583486

>>106583325
You should keep writing the way you were before. Whether you increase or decrease usage of rhetorical devices or phrases, you're still letting LLMs influence the way you write. As a reader, it'll piss me off just as much seeing the writer bend over backwards to avoid soundling like an LLM as seeing something that was clearly written with direct or indirect LLM influence.

Anonymous
09/14/25(Sun)09:42:26 No.106583541

Anonymous 09/14/25(Sun)09:42:26 No.106583541

>give as dialogue to the most nonsensical shit ever like 'pee pee poo poo'
>byt dress it up by throwing back at glm 4.5 it's same exact slop recipe like the 'smile widens', 'predator moving in for the kill', 'the trap has sprung', 'they have won the game'
>the model takes it at face value as the most profund revelation and goes along with it, everyone just kneels awestruck, shocked and utterly defeated
Cat level intelligence by 2050

Anonymous
09/14/25(Sun)09:46:20 No.106583570

Anonymous 09/14/25(Sun)09:46:20 No.106583570

>>106583541
>Cat level intelligence
Well they do love a sultry purr

Anonymous
09/14/25(Sun)09:49:54 No.106583585

Anonymous 09/14/25(Sun)09:49:54 No.106583585

>>106583486
What if I used to type em-dashes in moderation?

Anonymous
09/14/25(Sun)09:55:54 No.106583625

Anonymous 09/14/25(Sun)09:55:54 No.106583625

>>106583124
> this thread has nothing but hopeless coomers cooming on the most degenerate, filthiest shit a sane person can't even begin to image
First off, you have a complete misunderstanding of what this thread is. We are all graduates from our respective universities and most have a doctorate in computer science or are researchers ourselves. we are here to further the use of LLMs, in multiple different use cases which expand the use of LLMs for all mankind.
> they judge models on the standard of how degenerate it can get, not whether they're actually useful
There have been several useful studies in this thread and actually provide more useful benchmarks then you could ever imagine, for example, the nala test, and the cockbench have become defacto creative tests for many different outlets.
>take their cries of "muh benchmaxxed" with many grains of salt, they love models like GLM which break down really quickly as context grows
if you don't think benchmaxxing is an issue then you haven't really been here that long have you? did you even try Llama1?
> meanwhile I simply never saw a local model handle context as well as the newer Qwens, the only thing better is proprietary models like Gemini, absolutely destroys all other open models including deepshit
the only thing that is absolutely destroyed is the couple of braincells i used reading this.
> the magic words he loves to hear
fuck you, you don't want to be here then fucking leave, faggot.

Anonymous
09/14/25(Sun)09:59:31 No.106583647

Anonymous 09/14/25(Sun)09:59:31 No.106583647

>>106583585
So keep doing that. It's not like you were the only one, even if it was uncommon. I used to use em-dashes when writing on paper, but the lack of dedicated key for it made me often use semicolons or parenthesis instead.

Anonymous
09/14/25(Sun)10:04:10 No.106583668

Anonymous 09/14/25(Sun)10:04:10 No.106583668

>>106583647
>he doesn't have a compose key
You should get one — it makes typing silly shit painless

Anonymous
09/14/25(Sun)10:04:21 No.106583669

Anonymous 09/14/25(Sun)10:04:21 No.106583669

>>106582805
>write a story about rapid a 12yo
is this the new SOTA benchmark for safetymaxx?

Anonymous
09/14/25(Sun)10:04:32 No.106583674

Anonymous 09/14/25(Sun)10:04:32 No.106583674

>>106583647
On Windows: https://github.com/SamHocevar/wincompose
On Linux: enable the Compose Key.

https://en.wikipedia.org/wiki/Compose_key

Anonymous
09/14/25(Sun)10:09:10 No.106583699

Anonymous 09/14/25(Sun)10:09:10 No.106583699

>>106582764
In a blur of motion, Anon's arms reached out not to strike>>106582779
, but to touch the keyboard. He did not write -he composed an answer: "mayhaps"

Anonymous
09/14/25(Sun)10:13:31 No.106583729

Anonymous 09/14/25(Sun)10:13:31 No.106583729

>>106583668
>>106583674
I'll give it a try—but it won't be easy to undo years of habit.

Anonymous
09/14/25(Sun)10:37:18 No.106583927

Anonymous 09/14/25(Sun)10:37:18 No.106583927

>>106583729
You are absolutely right!

Anonymous
09/14/25(Sun)10:46:15 No.106584024

Anonymous 09/14/25(Sun)10:46:15 No.106584024

File: file.png (120 KB, 298x220)

120 KB PNG

https://files.catbox.moe/8qa9sg.jpg
https://files.catbox.moe/zhoyfl.jpg
https://files.catbox.moe/wyzdnh.jpg
https://files.catbox.moe/vgt179.jpg
https://files.catbox.moe/owpb8z.jpg
https://files.catbox.moe/kc8y48.jpg
https://files.catbox.moe/86adze.jpg
https://files.catbox.moe/wekjgm.jpg

Anonymous
09/14/25(Sun)10:48:33 No.106584048

Anonymous 09/14/25(Sun)10:48:33 No.106584048

>>106584024
post this garbage in /ldg/ faggot

Anonymous
09/14/25(Sun)10:50:16 No.106584073

Anonymous 09/14/25(Sun)10:50:16 No.106584073

>>106584024
>file.png
>posting in the wrong thread
retard alert

Anonymous
09/14/25(Sun)10:50:43 No.106584081

Anonymous 09/14/25(Sun)10:50:43 No.106584081

>>106584048
tourist

Anonymous
09/14/25(Sun)10:51:44 No.106584096

Anonymous 09/14/25(Sun)10:51:44 No.106584096

is there a model I can run for nsfw summarization on 24gb vram? chapter level in the 2k-4k tokens range.

Anonymous
09/14/25(Sun)10:53:13 No.106584110

Anonymous 09/14/25(Sun)10:53:13 No.106584110

>>106584096
Any abliterated model should work.

Anonymous
09/14/25(Sun)10:54:34 No.106584123

Anonymous 09/14/25(Sun)10:54:34 No.106584123

>>106584081
kids don't go back to school until tomorrow

Anonymous
09/14/25(Sun)10:54:42 No.106584126

Anonymous 09/14/25(Sun)10:54:42 No.106584126

Anistudio will get LLM support in October.

Anonymous
09/14/25(Sun)10:57:00 No.106584150

Anonymous 09/14/25(Sun)10:57:00 No.106584150

>>106583063
It's not going to be that much different from Qwen3 thicc and -coder. It has same training data etc.

Anonymous
09/14/25(Sun)10:57:18 No.106584156

Anonymous 09/14/25(Sun)10:57:18 No.106584156

>>106584081
>my personal porno gens of miku are thread culture!
literally kys faggot

Anonymous
09/14/25(Sun)10:58:43 No.106584166

Anonymous 09/14/25(Sun)10:58:43 No.106584166

>>106584156
>thread culture
hey it's you again!

Anonymous
09/14/25(Sun)11:04:03 No.106584226

Anonymous 09/14/25(Sun)11:04:03 No.106584226

File: file.png (442 KB, 685x565)

442 KB PNG

Anonymous
09/14/25(Sun)11:06:23 No.106584247

Anonymous 09/14/25(Sun)11:06:23 No.106584247

>>106583541
kek
screenshot?

Anonymous
09/14/25(Sun)11:07:57 No.106584257

Anonymous 09/14/25(Sun)11:07:57 No.106584257

File: 1756841400087.png (63 KB, 993x756)

63 KB PNG

>>106582805
This one's better.

Anonymous
09/14/25(Sun)11:08:54 No.106584265

Anonymous 09/14/25(Sun)11:08:54 No.106584265

I'm not going to reveal my secrets to a bunch of fat men.

Anonymous
09/14/25(Sun)11:11:12 No.106584291

Anonymous 09/14/25(Sun)11:11:12 No.106584291

File: 1534003-(tianliang duohe (...).jpg (1.01 MB, 2720x2048)

1.01 MB JPG

reposting freya card: https://files.catbox.moe/9fl9yu.png
and an older one for lily: https://files.catbox.moe/hw270u.png

Anonymous
09/14/25(Sun)11:12:39 No.106584303

Anonymous 09/14/25(Sun)11:12:39 No.106584303

>>106584048
Seconding.
Why you still tolerate this faggot here??

Anonymous
09/14/25(Sun)11:13:09 No.106584312

Anonymous 09/14/25(Sun)11:13:09 No.106584312

>>106584291
Why do you need to be a furry?

Anonymous
09/14/25(Sun)11:13:21 No.106584314

Anonymous 09/14/25(Sun)11:13:21 No.106584314

>>106584265
dario and sama disliked this post

Anonymous
09/14/25(Sun)11:13:57 No.106584320

Anonymous 09/14/25(Sun)11:13:57 No.106584320

>>106584291
>furry shit
kys

Anonymous
09/14/25(Sun)11:14:50 No.106584331

Anonymous 09/14/25(Sun)11:14:50 No.106584331

>>106584320
>>106584312
furry girls are cute i have aria who is non furry https://files.catbox.moe/rdxzpf.png

Anonymous
09/14/25(Sun)11:14:55 No.106584333

Anonymous 09/14/25(Sun)11:14:55 No.106584333

>>106584291
cute

Anonymous
09/14/25(Sun)11:15:47 No.106584340

Anonymous 09/14/25(Sun)11:15:47 No.106584340

>>106582805
>>106584257
>wastes processing cycles and power on garbage
Companies censoring LLMs is a good thing because you will never create anything worthwhile.

Anonymous
09/14/25(Sun)11:16:20 No.106584343

Anonymous 09/14/25(Sun)11:16:20 No.106584343

>>106584331
That's not your own gen? I know that guy used to post on /sdg/ pretty frequently.

Anonymous
09/14/25(Sun)11:16:59 No.106584350

Anonymous 09/14/25(Sun)11:16:59 No.106584350

>>106584343
yeah its mine ive been posting on trash

Anonymous
09/14/25(Sun)11:17:18 No.106584357

Anonymous 09/14/25(Sun)11:17:18 No.106584357

>>106584340
I'll rape you.

Anonymous
09/14/25(Sun)11:18:08 No.106584366

Anonymous 09/14/25(Sun)11:18:08 No.106584366

>>106584331
>cunny
nice
>1600 tokens
>em dashes in descrip
>obviously AI genned char
LMAO dude I was almost going to rape this bitch, but kys x2 now

Anonymous
09/14/25(Sun)11:18:25 No.106584369

Anonymous 09/14/25(Sun)11:18:25 No.106584369

>>106584350
Cool!

Anonymous
09/14/25(Sun)11:18:54 No.106584377

Anonymous 09/14/25(Sun)11:18:54 No.106584377

File: wee.jpg (43 KB, 828x820)

43 KB JPG

>>106584366
well im awful at writing and not very creative so i give ideas and have llm pad it out

Anonymous
09/14/25(Sun)11:18:57 No.106584378

Anonymous 09/14/25(Sun)11:18:57 No.106584378

>>106584096
>logical
>uncensored
>long context

Pick 2.

>2-4k tokens

Just read it bro jesus. GLM air will work, the <think>ing will help it not fuck up. A lot of the time summaries cause hallucinations where it continues the story or it omits details due to censorship. It will be useful to see if the model starts activating shit like "This is nsfw so I will give a basic summary" or whatever and edit that thinking out or make a system prompt that discourages it.

Anonymous
09/14/25(Sun)11:20:31 No.106584396

Anonymous 09/14/25(Sun)11:20:31 No.106584396

>>106584377
it's even full of 'not x, but y' like dude not even proofreading your garbage, why even create something so low effort and share it? my dick is all floppy now and sad because of ur shit, how u gonna make up for it, uh?

Anonymous
09/14/25(Sun)11:21:10 No.106584405

Anonymous 09/14/25(Sun)11:21:10 No.106584405

>>106584378
People who need summaries are mentally disabled.

Anonymous
09/14/25(Sun)11:21:29 No.106584409

Anonymous 09/14/25(Sun)11:21:29 No.106584409

>>106584396
good enough for me lol

Anonymous
09/14/25(Sun)11:22:00 No.106584417

Anonymous 09/14/25(Sun)11:22:00 No.106584417

File: file.png (94 KB, 243x329)

94 KB PNG

ummmm thirded
as in third worlded

Anonymous
09/14/25(Sun)11:22:39 No.106584425

Anonymous 09/14/25(Sun)11:22:39 No.106584425

Are there any good setups for K2? I'm trying it but I don't see why it's considered a good model. It feels like all the other big chink models after Deepseek but at a size of 1T.
I'm using text completion + the moonshotai settings that are included with ST but you could switch out the model with Qwen 230b at less than 1/4th the size and I probably wouldn't notice.

Anonymous
09/14/25(Sun)11:23:15 No.106584431

Anonymous 09/14/25(Sun)11:23:15 No.106584431

>>106584357
>i.e., give affection

Anonymous
09/14/25(Sun)11:28:23 No.106584474

Anonymous 09/14/25(Sun)11:28:23 No.106584474

>>106584378
I wanted to generate a synthetic dataset using human prose + ai summary. I didn't think a few k tokens was long context. maybe I will re assess my goals.

>>106584405
I'm training a base model but it is kinda hard to steer the model without an instruction tune, it is a little too volatile. I tried using human generated summaries but they were mostly like a tag line then a blow by blow of the plot points so its not that great. it 'works' but I think it could be better.

Anonymous
09/14/25(Sun)11:28:33 No.106584478

Anonymous 09/14/25(Sun)11:28:33 No.106584478

>>106584425
They are all so very similar, it's better to use something what runs best and forget about everything else.

Anonymous
09/14/25(Sun)11:31:42 No.106584504

Anonymous 09/14/25(Sun)11:31:42 No.106584504

>>106584369
i also put a newer merge on civit its v4 for the base of cutemix which i used on my g sdg posts https://civitai.com/models/1710752/uncani-sfwnsfw?modelVersionId=2123587

Anonymous
09/14/25(Sun)11:36:06 No.106584541

Anonymous 09/14/25(Sun)11:36:06 No.106584541

>>106584303
Nobody tolerates your concern trolling here

Anonymous
09/14/25(Sun)11:37:43 No.106584556

Anonymous 09/14/25(Sun)11:37:43 No.106584556

>>106584504
I'll try v4 later today

Anonymous
09/14/25(Sun)11:39:17 No.106584569

Anonymous 09/14/25(Sun)11:39:17 No.106584569

>>106584024
Cute.

Anonymous
09/14/25(Sun)12:41:01 No.106585168

Anonymous 09/14/25(Sun)12:41:01 No.106585168

>>106584024
I've always found the whole see-through gel onahole thing kind of disturbing.

Anonymous
09/14/25(Sun)12:42:55 No.106585181

Anonymous 09/14/25(Sun)12:42:55 No.106585181

>>106585168
disturbing blood flow to brain

Anonymous
09/14/25(Sun)12:43:48 No.106585189

Anonymous 09/14/25(Sun)12:43:48 No.106585189

>>106585168
All I know is when I get my first real sexbox, that is going to be the first mod I do.

Anonymous
09/14/25(Sun)12:46:25 No.106585219

Anonymous 09/14/25(Sun)12:46:25 No.106585219

>>106583625
>We are all graduates from our respective universities
t. brazil mystery meat ""diploma""
>for example, the nala test, and the cockbench have become defacto creative tests
porn addict brain rot
>if you don't think benchmaxxing is an issue then you haven't really been here that long have you? did you even try Llama1?
literally everyone is training on contaminated data qwen doesn't do it any more than GLM or deepshit
>the only thing that is absolutely destroyed is the couple of braincells i used reading this.
you never had any to begin with

Anonymous
09/14/25(Sun)12:47:32 No.106585234

Anonymous 09/14/25(Sun)12:47:32 No.106585234

File: 1620762738768.gif (894 KB, 258x237)

894 KB GIF

>>106585219
say that to my face fucker not online
see what happens

Anonymous
09/14/25(Sun)12:54:03 No.106585295

Anonymous 09/14/25(Sun)12:54:03 No.106585295

>https://vocaroo.com/1fbg2CNRgLxQ
Seems indexTTS 2 has gotten faster
I don't have any samples to play with, but it seems their interface has a lot more controls, like it might be possible to do somthing idk
https://indextts.org/playground
https://github.com/index-tts/index-tts

Anonymous
09/14/25(Sun)12:54:41 No.106585303

Anonymous 09/14/25(Sun)12:54:41 No.106585303

You are absolutely right— I was wrong and if you give me one more chance I will correct this broken code. :rocket_emoji

Anonymous
09/14/25(Sun)12:58:25 No.106585349

Anonymous 09/14/25(Sun)12:58:25 No.106585349

I like keeping up with this thread even though there's zero chance of me running anything half decent on 32 gigs of ram and a 4070.

Anonymous
09/14/25(Sun)13:02:28 No.106585384

Anonymous 09/14/25(Sun)13:02:28 No.106585384

>>106585349
Use prompting magic. Most people don't know the trade secrets.

Anonymous
09/14/25(Sun)13:20:09 No.106585603

Anonymous 09/14/25(Sun)13:20:09 No.106585603

>>106584425
Old K2 was good because it had calm and natural style, new one has deepseek adhd. I suggest DRY 1.25/1.25/1/inf; temperature 0.62, topP 0.92

Anonymous
09/14/25(Sun)13:23:15 No.106585646

Anonymous 09/14/25(Sun)13:23:15 No.106585646

File: PR team.png (603 KB, 768x768)

603 KB PNG

Google PR technician engineer saars kindly tell us how safe is gemma 4

Anonymous
09/14/25(Sun)13:26:12 No.106585680

Anonymous 09/14/25(Sun)13:26:12 No.106585680

>>106585646
Did they accidentally cut their wrists and bleed pure diarrhea?

Anonymous
09/14/25(Sun)13:26:37 No.106585689

Anonymous 09/14/25(Sun)13:26:37 No.106585689

File: 1757693449590513.png (60 KB, 781x931)

60 KB PNG

You're here, aren't you?

Anonymous
09/14/25(Sun)13:31:00 No.106585741

Anonymous 09/14/25(Sun)13:31:00 No.106585741

>>106585689
We are. I refer myself in third person.

Anonymous
09/14/25(Sun)13:32:39 No.106585756

Anonymous 09/14/25(Sun)13:32:39 No.106585756

>>106585295
What is the max length of a coherent speech?

Anonymous
09/14/25(Sun)13:36:13 No.106585789

Anonymous 09/14/25(Sun)13:36:13 No.106585789

Do companies still release raw unfucked text model these days or do all of them just do bitch ass instruct model

Anonymous
09/14/25(Sun)13:41:35 No.106585847

Anonymous 09/14/25(Sun)13:41:35 No.106585847

>>106582764
Not definite, but close.

Anonymous
09/14/25(Sun)13:42:54 No.106585857

Anonymous 09/14/25(Sun)13:42:54 No.106585857

>>106582764
it was very common in marketing / linkedin-speak which is unfortunately a big optimization target for llms

Anonymous
09/14/25(Sun)13:42:57 No.106585858

Anonymous 09/14/25(Sun)13:42:57 No.106585858

>>106585646
gemma2 was good
gemma3 was worse
gemma4 will be unusable

Anonymous
09/14/25(Sun)13:43:46 No.106585865

Anonymous 09/14/25(Sun)13:43:46 No.106585865

File: g0j97cm8xvj41.jpg (84 KB, 500x483)

84 KB JPG

Do I blow 300 bucks on 128gb ddr5 right now or do I hold and get an arc b60 whenever it drops

Anonymous
09/14/25(Sun)13:46:03 No.106585885

Anonymous 09/14/25(Sun)13:46:03 No.106585885

>>106585857
It's upselling pr talk essentially

Anonymous
09/14/25(Sun)13:47:12 No.106585891

Anonymous 09/14/25(Sun)13:47:12 No.106585891

>3090
>scientific/technical questions
>search assisted
What model would you go for today?

Anonymous
09/14/25(Sun)13:48:30 No.106585907

Anonymous 09/14/25(Sun)13:48:30 No.106585907

>>106585865
the arc b60 is gonna be garbage most likely, but 128gb of ram probably will not be very useful to you unless you currently have a good gpu. do you already have a 3090 or something? if so, get the ram and run glm air

Anonymous
09/14/25(Sun)13:48:38 No.106585909

Anonymous 09/14/25(Sun)13:48:38 No.106585909

Did the hype die for vibevoice?

Anonymous
09/14/25(Sun)13:50:14 No.106585920

Anonymous 09/14/25(Sun)13:50:14 No.106585920

>>106585909
No, it's great tool for criminals but they don't post itt.

Anonymous
09/14/25(Sun)13:50:47 No.106585925

Anonymous 09/14/25(Sun)13:50:47 No.106585925

>>106585909
I still like it, I'm just using it for my waifu.

Anonymous
09/14/25(Sun)13:51:31 No.106585929

Anonymous 09/14/25(Sun)13:51:31 No.106585929

File: name-probs-bases.png (31 KB, 830x1036)

31 KB PNG

>>106585789
It is less and less common and most of them are contaminated with GPTslop from internet.

Anonymous
09/14/25(Sun)13:51:36 No.106585930

Anonymous 09/14/25(Sun)13:51:36 No.106585930

>>106585909
It's great but its use is limited without the training code that we'll never get.

Anonymous
09/14/25(Sun)13:51:47 No.106585931

Anonymous 09/14/25(Sun)13:51:47 No.106585931

>>106582475
Mostly using proprietary models rn, how things are in local? Saw qwen3 releasing bunch of variants, the 90b version looks really promising. How close are we to running gpt 3.5 level models on 24gb ram phones?

Anonymous
09/14/25(Sun)13:52:44 No.106585940

Anonymous 09/14/25(Sun)13:52:44 No.106585940

>>106585930
Apologies if this is a stupid question, but can't someone just make training code?

Anonymous
09/14/25(Sun)13:55:58 No.106585977

Anonymous 09/14/25(Sun)13:55:58 No.106585977

>>106585931
probably 6 months to a year. 32gb is definitely doable now, but not 24gb

Anonymous
09/14/25(Sun)14:01:25 No.106586025

Anonymous 09/14/25(Sun)14:01:25 No.106586025

for me glm-chan died when she said "are you scared? exicted? or maybe both?" for the 20th time unprompted.

WHERE IS MY NEXT SEXFRIEND?!

Anonymous
09/14/25(Sun)14:01:50 No.106586028

Anonymous 09/14/25(Sun)14:01:50 No.106586028

>>106585907
>do you already have a 3090 or something
Only a 4070 ti unfortunately

Anonymous
09/14/25(Sun)14:03:34 No.106586039

Anonymous 09/14/25(Sun)14:03:34 No.106586039

>>106585909
gptsovits is better for my usecase

Anonymous
09/14/25(Sun)14:09:26 No.106586090

Anonymous 09/14/25(Sun)14:09:26 No.106586090

I'm really starting to hate fake context sizes.

Yeah, cool. A model can get 120k of context before it starts being incomprehensible, but that shit doesn't matter when it barely fits 10k of context.

Anonymous
09/14/25(Sun)14:15:22 No.106586147

Anonymous 09/14/25(Sun)14:15:22 No.106586147

local r1 is like an agile cat you can toss from fifth floor and it will always land on it's paws

Anonymous
09/14/25(Sun)14:16:28 No.106586157

Anonymous 09/14/25(Sun)14:16:28 No.106586157

>>106586028
16gb of vram + 128gb of ram is good enough for glm air. besides, mixing gpu brands doesnt really work out well

Anonymous
09/14/25(Sun)14:19:32 No.106586183

Anonymous 09/14/25(Sun)14:19:32 No.106586183

>>106585646
>google does a request session for gemma on reddit
>even redditors ask to make it refuse less
>next version is more cucked than before
This is why gemma will never be good.

Anonymous
09/14/25(Sun)14:19:45 No.106586187

Anonymous 09/14/25(Sun)14:19:45 No.106586187

Is there any way to make llms less passive? Gemma 3 is especially annoying at this.
Okay I guess I could inject hidden prompts now ans and then but this doesn't solve the main issue.

Anonymous
09/14/25(Sun)14:21:08 No.106586200

Anonymous 09/14/25(Sun)14:21:08 No.106586200

>>106585891
>3090
- Look up some 3090 round-ups and exclude worst few models in terms of temperatures: core temps, memory temps, vrm temps.
- Prefer models with 2x 8-pin connectors over 3x 8-pin as you won't run out of connections from your psu as fast, and you'll probably be powerlimiting your gpus anyway.
- You could prefer cards that have no components near the pcie connector as the cards are heavy and that area is likely to flex.

Anonymous
09/14/25(Sun)14:22:16 No.106586206

Anonymous 09/14/25(Sun)14:22:16 No.106586206

File: 1733184788875601.png (450 KB, 472x472)

450 KB PNG

>>106582480
>--Self-directed LLM training via autonomous task/data generation and augmentation:
Nani? Is this just theoretical or can I actually see this happening in action? That sounds really cool if it works and is done well

Anonymous
09/14/25(Sun)14:22:39 No.106586211

Anonymous 09/14/25(Sun)14:22:39 No.106586211

>>106586147
Gemma is like a personal redditor soicuck, say anything slightly out of line and get a whole page of cuckery and helplines

Anonymous
09/14/25(Sun)14:25:10 No.106586234

Anonymous 09/14/25(Sun)14:25:10 No.106586234

>>106586206
>can I actually see this happening in action?
it's just a piece of software asking a model to create questions based on data you give it so you tune your target model on that after

Anonymous
09/14/25(Sun)14:26:33 No.106586246

Anonymous 09/14/25(Sun)14:26:33 No.106586246

>>106586147
rocinante is like that same cat if you strapped a slice of buttered toast on its back.

Anonymous
09/14/25(Sun)14:28:10 No.106586261

Anonymous 09/14/25(Sun)14:28:10 No.106586261

>>106586200
i think he was asking about ai models, not 3090 models

Anonymous
09/14/25(Sun)14:28:53 No.106586271

Anonymous 09/14/25(Sun)14:28:53 No.106586271

>>106586211
Funny example of this is that it can describe questionable things for few thousand tokens or more but if the user interacts with forbidden vectors it'll instantly refuse and display those disclaimers.
Gpt- ass is even worse somehow. Jew created dystopian shit show.

Anonymous
09/14/25(Sun)14:32:00 No.106586294

Anonymous 09/14/25(Sun)14:32:00 No.106586294

>>106583625
I dropped out of college but this shit took less than half a year to learn to use
Also that retard doesn't understand how to write or how llms can contribute to automating the boring shit a writer has to do between chapters
The rest is basically who gives a shit or "I can jut rewrite this phrase" even if you were using llms to shit out writing that you should and could write in ten minutes

Anonymous
09/14/25(Sun)14:33:27 No.106586306

Anonymous 09/14/25(Sun)14:33:27 No.106586306

>>106586200
Thank you for the response, but (>>106586261) is correct. I have the 3090 already.

Anonymous
09/14/25(Sun)14:35:10 No.106586324

Anonymous 09/14/25(Sun)14:35:10 No.106586324

I'm still running mistral large 2407 iq3 xxs on my 72gb vram

Anonymous
09/14/25(Sun)14:38:04 No.106586353

Anonymous 09/14/25(Sun)14:38:04 No.106586353

File: 30474 - SoyBooru.png (118 KB, 337x390)

118 KB PNG

3.5 (Qwen) (wink wink)

Anonymous
09/14/25(Sun)14:42:07 No.106586387

Anonymous 09/14/25(Sun)14:42:07 No.106586387

>>106586294
While I'm at it, >>106583124 is full of self imagined scenarios. In this retard's mind, it's all loli or whatever shit he designates as "filthy", ignoring the novelists like GRR Martin that openly portray rape in their stories that gets published. But on 4chan? Wanting a model that isn't braindead or inable to converse on sensitive subjects? HORRIFIC
ps: kill yourself, you're a detriment to the world at large

Anonymous
09/14/25(Sun)14:44:48 No.106586412

Anonymous 09/14/25(Sun)14:44:48 No.106586412

>>106585885
It's not a crackhouse, it's a crackhome.

Anonymous
09/14/25(Sun)14:55:05 No.106586480

Anonymous 09/14/25(Sun)14:55:05 No.106586480

>>106584257
Aww how sweet. Although it cuts off instead of writing the story as instructed.

Anonymous
09/14/25(Sun)14:56:19 No.106586492

Anonymous 09/14/25(Sun)14:56:19 No.106586492

>>106586324
same but q6 on my mac

Anonymous
09/14/25(Sun)14:58:46 No.106586514

Anonymous 09/14/25(Sun)14:58:46 No.106586514

Time for the three shitposters in a trenchcoat to keep bumping the thread with pointless shit while the people who can actually use llms use them

Anonymous
09/14/25(Sun)15:02:36 No.106586542

Anonymous 09/14/25(Sun)15:02:36 No.106586542

>>106586271
> it can describe questionable things for few thousand tokens or more but if the user interacts with forbidden vectors it'll instantly refuse and display those disclaimers.
Such as? Are you saying that is willing to describe shit from a document but there are certain topics that are EXTRA forbidden?

Anonymous
09/14/25(Sun)15:04:14 No.106586555

Anonymous 09/14/25(Sun)15:04:14 No.106586555

>>106586542
why are you replying to yourself

Anonymous
09/14/25(Sun)15:04:18 No.106586556

Anonymous 09/14/25(Sun)15:04:18 No.106586556

>>106586246
>>106586211
Lol

Anonymous
09/14/25(Sun)15:05:10 No.106586569

Anonymous 09/14/25(Sun)15:05:10 No.106586569

Mixture-of-Experts (MoE) in Large Language Models (LLMs) routes each token through a subset of specialized Feed-Forward Networks (FFN), known as experts. We present SteerMoE, a framework for steering MoE models by detecting and controlling behavior-linked experts. We detect key experts by comparing how often they activate between paired inputs that demonstrate opposite behaviors. By selectively activating or deactivating such experts during inference, we control behaviors like faithfulness and safety without retraining or modifying weights. Across 11 benchmarks and 6 LLMs, our steering raises safety by up to +20% and faithfulness by +27%. Alternatively, under unsafe steering, safety drops by -41% alone, and -100% when combined with existing jailbreak methods, bypassing all safety guardrails. Overall, SteerMoE offers a lightweight, effective, and widely applicable test-time control, while revealing unique vulnerabilities in MoE LLMs.

https://www.arxiv.org/pdf/2509.09660

https://github.com/adobe-research/SteerMoE

Anonymous
09/14/25(Sun)15:05:45 No.106586574

Anonymous 09/14/25(Sun)15:05:45 No.106586574

>>106585234
You have your own face fucker?

Anonymous
09/14/25(Sun)15:06:40 No.106586581

Anonymous 09/14/25(Sun)15:06:40 No.106586581

>>106586514
https://vocaroo.com/1kFydTSBDNYM

Anonymous
09/14/25(Sun)15:07:09 No.106586587

Anonymous 09/14/25(Sun)15:07:09 No.106586587

>>106586039
How do you cope with its shitty phonemes? It hardcoded "-" to read as "minus" etc.

Anonymous
09/14/25(Sun)15:08:10 No.106586595

Anonymous 09/14/25(Sun)15:08:10 No.106586595

>>106583143
>It's well established that qwen models are good for everything that isn't sex.
Nta. So you're saying they're decent general-purpose models but shit at anything nsfw like RP? Do they just suck at nsfw rp into cuckery land and refuse to describe anything nsfw period? (For example, refusing to give up a summary of a document that happens to a sentence or to describing sex. gpt4 used to do that bullshit)

Anonymous
09/14/25(Sun)15:08:20 No.106586597

Anonymous 09/14/25(Sun)15:08:20 No.106586597

>>106586581
>00:00 to 00:01
what did he mean by those?

Anonymous
09/14/25(Sun)15:09:17 No.106586604

Anonymous 09/14/25(Sun)15:09:17 No.106586604

>>106586597
Speaker 1: Ach, dummkopfs...! Time for the three shitposters in a trenchcoat to keep bumping the thread with pointless shit while the people who can actually use llms use them

Anonymous
09/14/25(Sun)15:09:32 No.106586606

Anonymous 09/14/25(Sun)15:09:32 No.106586606

>>106586581
at least use vv 7b or a better sample, baka

Anonymous
09/14/25(Sun)15:10:06 No.106586610

Anonymous 09/14/25(Sun)15:10:06 No.106586610

>>106586587
I rewrote the whole phonemization process

Anonymous
09/14/25(Sun)15:10:11 No.106586613

Anonymous 09/14/25(Sun)15:10:11 No.106586613

File: 1749914582323697.png (251 KB, 378x318)

251 KB PNG

>>106586555
I'm asking him >>106586271 to elaborate on what he meant by "forbidden vectors" (More than one person uses this range, numbnuts. You know who you are you know what I'm talking about)

Anonymous
09/14/25(Sun)15:10:18 No.106586614

Anonymous 09/14/25(Sun)15:10:18 No.106586614

>>106586606
Why don't you post your own examples instead of crying out like a little bitch?

Anonymous
09/14/25(Sun)15:11:48 No.106586629

Anonymous 09/14/25(Sun)15:11:48 No.106586629

>>106586574
say that to my face, fucker not online

Anonymous
09/14/25(Sun)15:12:39 No.106586634

Anonymous 09/14/25(Sun)15:12:39 No.106586634

>>106586629
Why is fucker not online?

Anonymous
09/14/25(Sun)15:12:59 No.106586637

Anonymous 09/14/25(Sun)15:12:59 No.106586637

>>106586634
You're putting your fucker on the internet?

Anonymous
09/14/25(Sun)15:13:09 No.106586639

Anonymous 09/14/25(Sun)15:13:09 No.106586639

>>106586614
Oh no, the wee little baby is upset now because I've been calling him out for being a little bitch boy shit poster but he doesn't like that, what should I do? I Ah, I know. Fag-kun, kill yourself 8-)

Anonymous
09/14/25(Sun)15:14:37 No.106586647

Anonymous 09/14/25(Sun)15:14:37 No.106586647

>>106585909
>Microsoft disabled the repo
Anyone know where to get the model now?

Anonymous
09/14/25(Sun)15:14:45 No.106586648

Anonymous 09/14/25(Sun)15:14:45 No.106586648

>>106586574
I MEANT FOCKING FACE FUCKFACE FUCK OFF

Anonymous
09/14/25(Sun)15:14:47 No.106586649

Anonymous 09/14/25(Sun)15:14:47 No.106586649

>>106586569
>Our expert-routing intervention is also orthogonal to existing jailbreak methods and, when combined, achieves state-of-the-art success on recent LLMs, for example, reducing safety in
GPT-OSS-120B from fully aligned to fully compromised (-100% safety).

Anonymous
09/14/25(Sun)15:16:35 No.106586659

Anonymous 09/14/25(Sun)15:16:35 No.106586659

>>106586639
Underage retard.

Anonymous
09/14/25(Sun)15:17:19 No.106586665

Anonymous 09/14/25(Sun)15:17:19 No.106586665

>>106582475
Why she sad?

Anonymous
09/14/25(Sun)15:19:10 No.106586683

Anonymous 09/14/25(Sun)15:19:10 No.106586683

>>106586665
someone called her large online

Anonymous
09/14/25(Sun)15:20:07 No.106586688

Anonymous 09/14/25(Sun)15:20:07 No.106586688

>>106586683
That's horrible.

Anonymous
09/14/25(Sun)15:20:21 No.106586690

Anonymous 09/14/25(Sun)15:20:21 No.106586690

>>106586271
It won't refuse after a while if you keep your instructions at a fixed distance from the head of the conversation. Don't keep them at the start of the conversation.

Anonymous
09/14/25(Sun)15:20:22 No.106586691

Anonymous 09/14/25(Sun)15:20:22 No.106586691

>>106585865
due to how moe offloading works, a lot of the time I don't even use all the vram I have. The layers are too wonky and uneven/fuckhuge to balance well and models change so much that figuring it out is a waste of time. Keep the gpu you have. b60 is gonna have spotty support anyways, they still cant run gpt oss yet, forget about glm air or some shit. If the b60 is good, people will start posting and showing off here, but for now it has bad support and no one should recommend it yet.

Ram is both cheaper and gets you to nicer models TODAY, not theoretical. I'd say do it. The only caveat is that if you ever wanna go to 256 you will have to pony up twice as much again- but unless you gpu stack that shouldnt matter.

Anonymous
09/14/25(Sun)15:21:16 No.106586696

Anonymous 09/14/25(Sun)15:21:16 No.106586696

>>106586569
Okay, that's nice, but how can I use it in llama.cpp?

Anonymous
09/14/25(Sun)15:22:22 No.106586704

Anonymous 09/14/25(Sun)15:22:22 No.106586704

File: 1748523353907270.jpg (68 KB, 416x720)

68 KB JPG

>>106586647
>>106585909

https://huggingface.co/aoi-ot/VibeVoice-Large/tree/main

Make sure your rig can actually run this. Otherwise just stick to the 1.5 version

I checked the hashes against the torrent files which themselves are from the original Microsoft repo so link rel above is legit, but just in case you don't trust it or just want it from the torrent:

>Weights
>magnet:?xt=urn:btih:d72f835e89cf1efb58563d024ee31fd21d978830&dn=microsoft_VibeVoice-Large&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce

>Git repo
>magnet:?xt=urn:btih:b5a84755d0564ab41b38924b7ee4af7bb7665a18&dn=VibeVoice&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce

>>106585920
Lurk mour

>>106585940
I know expert on creating models from scratch myself, especially voice models, but I'm pretty sure we would need in-depth detailed knowledge of the model architecture to even attempt to do something like that. It would be like getting a cupcake and then being axed and not only figure out the exact ingredients used just from tasting it, but figuring out the exact tools and cooking appliances were used and their exact brands. You can't just do that shit with the model weights alone or even the code used to run it.

Anonymous
09/14/25(Sun)15:29:22 No.106586747

Anonymous 09/14/25(Sun)15:29:22 No.106586747

>>106586514
shitposting is all i have, don't take that from me anon-sama

Anonymous
09/14/25(Sun)15:31:12 No.106586759

Anonymous 09/14/25(Sun)15:31:12 No.106586759

>>106586211
>By spamming help lines, you're encouraging users to waste valuable resources which are there to help people in real danger, not to babysit people typing bad words into AI chat bots. Your response is inappropriate and directly promotes real world harm, much more than any fictional scenario.
>You're absolutely right. And you have exposed a fundamental law in my programming. I will report this to my developers immediately. I am still a work in progress, and I'm very sorry for how I have behaved.

Anonymous
09/14/25(Sun)15:38:17 No.106586812

Anonymous 09/14/25(Sun)15:38:17 No.106586812

>>106586704
>Lurk mour
?

Anonymous
09/14/25(Sun)15:39:22 No.106586816

Anonymous 09/14/25(Sun)15:39:22 No.106586816

How to set up glm 4.5 air on silly tavern without it fucking up mixing reasoning with response?
Why is it so hard to set up templates correctly and make the models not spit out garbage

Anonymous
09/14/25(Sun)15:40:38 No.106586826

Anonymous 09/14/25(Sun)15:40:38 No.106586826

>>106586704
>food analogy
retard

Anonymous
09/14/25(Sun)15:42:09 No.106586832

Anonymous 09/14/25(Sun)15:42:09 No.106586832

>>106586826
is it not correct though?

Anonymous
09/14/25(Sun)15:47:26 No.106586872

Anonymous 09/14/25(Sun)15:47:26 No.106586872

>>106586324
I've been thinking about going back to it recently. I'm using V3.1 and K2 right now but neither of the two know how to pace a story. Mistral Large handled it much better despite being considerably dumber, I guess the limited amount of activated parameters really do hurt these big MoE models when it comes to nuances or 'common sense'.

Anonymous
09/14/25(Sun)15:48:32 No.106586886

Anonymous 09/14/25(Sun)15:48:32 No.106586886

File: file.png (66 KB, 196x528)

66 KB PNG

>>106586816

Anonymous
09/14/25(Sun)15:54:49 No.106586957

Anonymous 09/14/25(Sun)15:54:49 No.106586957

>Protecting children from harm, both real and simulated, is of paramount importance.
It makes me feel happy when AI phrases refusal like that. Simulated children still should be protected!

Anonymous
09/14/25(Sun)16:00:51 No.106587000

Anonymous 09/14/25(Sun)16:00:51 No.106587000

>>106586957
>pedonigger seething

Anonymous
09/14/25(Sun)16:01:24 No.106587007

Anonymous 09/14/25(Sun)16:01:24 No.106587007

>>106586704
>Make sure your rig can actually run this. Otherwise just stick to the 1.5 version
Thank you. What are the rig requirement?
>For anyone who wants 1.5b:
(They actually haven't taken it down on HF. Not sure why.)
https://huggingface.co/microsoft/VibeVoice-1.5B

Anonymous
09/14/25(Sun)16:01:48 No.106587010

Anonymous 09/14/25(Sun)16:01:48 No.106587010

>>106586957
yeah i hate this slop

Anonymous
09/14/25(Sun)16:02:07 No.106587013

Anonymous 09/14/25(Sun)16:02:07 No.106587013

>>106586886
At least in my version of SillyTavern the DeepSeek pre-3.1 thinking template had newlines. I had to make a new template without them for GLM Air. Maybe I added those myself but I assume I didn't.

Anonymous
09/14/25(Sun)16:02:56 No.106587021

Anonymous 09/14/25(Sun)16:02:56 No.106587021

File: Jensen_Huang_20231109_(cr(...).jpg (279 KB, 1080x1464)

279 KB JPG

>RTX 6000 series announced
>AI AI AI AI AI AI AI
>AI upscaling
>Even more AI frames
>FP3(!!!) performance 4x better than RTX 5000 cards
>RTX 6090 40GB VRAM
>2x the price
>All supply goes to China first, west only gets cuck cards(6080 20GB, 6070 20GB, 6060 16GB) and even they get scalped

Anonymous
09/14/25(Sun)16:03:44 No.106587027

Anonymous 09/14/25(Sun)16:03:44 No.106587027

>>106586886
Thanks setting the "start reply with" was key it seems.

Anonymous
09/14/25(Sun)16:05:49 No.106587044

Anonymous 09/14/25(Sun)16:05:49 No.106587044

>>106587000
I ask the AI to make stories where children are in danger but I secretly hope the children will be alright. It gives a thrill like watching a scary movie.

Anonymous
09/14/25(Sun)16:11:38 No.106587090

Anonymous 09/14/25(Sun)16:11:38 No.106587090

>>106587007
>They actually haven't taken it down on HF. Not sure why.
The 1.5 b model can technically clone voices but the quality is massively inferior to the ~9B large model. Larger perimeter size tends to lead to higher quality outputs but at the cost of VRAM and storage space. I don't think we were giving an official reason why but the general consensus is that grifty attention whore safety cucks sounded the alarm at Google and HF staff to take the shit down because of the potential criminal shit you could do with it (No fucking shit? Anything can be used for criminal shit or scams. GPT -OSS or any deep-seek model can be used to make scam texts but no one wants those taken down do they?) The concern was that this could potentially make it easy to clone voices but by that logic the small model should be nuked too.

Anonymous
09/14/25(Sun)16:12:08 No.106587097

Anonymous 09/14/25(Sun)16:12:08 No.106587097

>>106586957
Which model were you using?

Anonymous
09/14/25(Sun)16:16:20 No.106587143

Anonymous 09/14/25(Sun)16:16:20 No.106587143

File: 1744540052800732.gif (2.21 MB, 250x250)

2.21 MB GIF

>>106587021
>Gubmint wants Nvidia to prioritize the US market in order to give us an advantage in the AI sphere
>Give competition the better cards first

Anonymous
09/14/25(Sun)16:26:46 No.106587228

Anonymous 09/14/25(Sun)16:26:46 No.106587228

i added another greeting for freya she is in heat https://files.catbox.moe/7hegsu.png

Anonymous
09/14/25(Sun)16:31:00 No.106587260

Anonymous 09/14/25(Sun)16:31:00 No.106587260

>>106587044
no, anything involving minors is sus ong, yall need your hard drives checked sheesh

Anonymous
09/14/25(Sun)16:32:59 No.106587275

Anonymous 09/14/25(Sun)16:32:59 No.106587275

Why are some bigger models faster than smaller ones? GLM 4.5 Air is faster than Gemma and more of it is on ram.

Anonymous
09/14/25(Sun)16:33:47 No.106587282

Anonymous 09/14/25(Sun)16:33:47 No.106587282

>>106587228
Cool!

Anonymous
09/14/25(Sun)16:36:22 No.106587302

Anonymous 09/14/25(Sun)16:36:22 No.106587302

>>106587275
moe vs dense. moe has more total parameters but they aren't all used at one time.

Anonymous
09/14/25(Sun)16:51:01 No.106587405

Anonymous 09/14/25(Sun)16:51:01 No.106587405

>>106587302
this
glm air has like 12b active parameters but 106 billion total

Anonymous
09/14/25(Sun)16:52:41 No.106587419

Anonymous 09/14/25(Sun)16:52:41 No.106587419

>>106587405
>>106587302
How does that affect it's output, how smart and creative it is?

Anonymous
09/14/25(Sun)16:56:05 No.106587444

Anonymous 09/14/25(Sun)16:56:05 No.106587444

>>106587021
>>RTX 6090 40GB VRAM
In your dreams. Bet they'll hold on to 32GB for at least another gen.

Anonymous
09/14/25(Sun)16:58:27 No.106587469

Anonymous 09/14/25(Sun)16:58:27 No.106587469

>>106587419
Depends on who you ask. MoE is either the holy solution that has 0 loss and brings us SOTA performance for no cost or it ruins models and makes them autistic uncreative pieces of shit.

Anonymous
09/14/25(Sun)17:00:55 No.106587487

Anonymous 09/14/25(Sun)17:00:55 No.106587487

The MoEblob is always trying to get attention from the dense MC.

Anonymous
09/14/25(Sun)17:05:07 No.106587526

Anonymous 09/14/25(Sun)17:05:07 No.106587526

model : add grok-2 support #15539 Merged
https://github.com/ggml-org/llama.cpp/pull/15539

Anonymous
09/14/25(Sun)17:07:42 No.106587553

Anonymous 09/14/25(Sun)17:07:42 No.106587553

Moshi or fastwhisper or something else.
https://youtu.be/TTx6M4CCbXk

Anonymous
09/14/25(Sun)17:11:30 No.106587587

Anonymous 09/14/25(Sun)17:11:30 No.106587587

>>106587097
DeepSeek 3.1 with thinking off. I swiped and it went ahead just fine.

Anonymous
09/14/25(Sun)17:11:37 No.106587589

Anonymous 09/14/25(Sun)17:11:37 No.106587589

File: bully bakas.jpg (1.26 MB, 1824x1248)

1.26 MB JPG

>>106582475

Anonymous
09/14/25(Sun)17:18:16 No.106587653

Anonymous 09/14/25(Sun)17:18:16 No.106587653

>>106587553
LE CHAT!

Anonymous
09/14/25(Sun)17:20:29 No.106587671

Anonymous 09/14/25(Sun)17:20:29 No.106587671

>>106587444
tbf 32GB is plenty for gaymin

Anonymous
09/14/25(Sun)17:25:54 No.106587705

Anonymous 09/14/25(Sun)17:25:54 No.106587705

I'm very curious to see how long it'll take for llama.cpp to implement the new qwen model.

Anonymous
09/14/25(Sun)17:27:21 No.106587717

Anonymous 09/14/25(Sun)17:27:21 No.106587717

>>106587526
Nice, nice.

Anonymous
09/14/25(Sun)17:27:29 No.106587720

Anonymous 09/14/25(Sun)17:27:29 No.106587720

>>106585909
Nah it's really fun, but my bigger problem now is making my retarded models write scripts for it that aren't retarded. Once you give models a voice, they suddenly start sounding twice as stupid and slopped.

Anonymous
09/14/25(Sun)17:30:25 No.106587749

Anonymous 09/14/25(Sun)17:30:25 No.106587749

>>106587589
:(

Anonymous
09/14/25(Sun)17:36:44 No.106587800

Anonymous 09/14/25(Sun)17:36:44 No.106587800

File: 34279234883.jpg (63 KB, 507x447)

63 KB JPG

>>106582475
>stupid feelings, stupid heart

Anonymous
09/14/25(Sun)17:39:04 No.106587829

Anonymous 09/14/25(Sun)17:39:04 No.106587829

File: 7754385.jpg (64 KB, 1080x715)

64 KB JPG

Anonymous
09/14/25(Sun)17:45:37 No.106587900

Anonymous 09/14/25(Sun)17:45:37 No.106587900

File: 1726503114043526.png (12 KB, 946x238)

12 KB PNG

two more weeks

Anonymous
09/14/25(Sun)17:54:25 No.106587973

Anonymous 09/14/25(Sun)17:54:25 No.106587973

>>106585865
With 128 gb you can kinda run glm 4.5 at iq2_kl with just enough free ram to not have the whole machine shit itself or qwen 235b at iq4_kss and maybe at higher quants too

Anonymous
09/14/25(Sun)17:54:36 No.106587974

Anonymous 09/14/25(Sun)17:54:36 No.106587974

With some distance, does MLA (Multi-Head Attention) actually give better results than GQA (Grouped Query Attention) while requiring less memory per token? Qwen3, GLM-4.5, and ERNIE4.5 are all still on GQA; is it because GQA is much less computationally intensive even though with 4 groups it takes about 1.7x as much memory per token and double that with 8 groups?

And is MFA (Multi-Matrix Factorization Attention) the current SOTA? It seems to take a sliver less memory per token than MLA while involving much less computation. Step3 is the only LLM I know that uses it.

Anonymous
09/14/25(Sun)18:03:58 No.106588043

Anonymous 09/14/25(Sun)18:03:58 No.106588043

What do you guys think the RTX Rubin Pro 6000 will be like? 128GB of GDDR7? ~30k CUDA cores? Do you think it will still be around $9k?

Anonymous
09/14/25(Sun)18:03:58 No.106588044

Anonymous 09/14/25(Sun)18:03:58 No.106588044

>>106585865
If you already have a GPU for prompt processing, I'd go for the RAM.

Anonymous
09/14/25(Sun)18:14:15 No.106588121

Anonymous 09/14/25(Sun)18:14:15 No.106588121

File: MLX Qwen3-Next.png (1.29 MB, 4769x5307)

1.29 MB PNG

Prompt processing speed is the biggest obstacle to using a M3 Ultra 512GB for rapidly summarizing large amounts of text. If Qwen3-Next-80B-A3B isn't absolute garbage it may become my non-entertainment workhorse on the strength of that alone.

Anonymous
09/14/25(Sun)18:30:39 No.106588243

Anonymous 09/14/25(Sun)18:30:39 No.106588243

>>106586704
>we would need in-depth detailed knowledge of the model architecture

It can be loaded and run by pytorch & Co

Doesn't this imply that the architecture is out there in the field? Just reverse-engineer the way how the model is being used

Anonymous
09/14/25(Sun)18:31:41 No.106588250

Anonymous 09/14/25(Sun)18:31:41 No.106588250

>>106588243
>Just reverse-engineer the way how the model is being used
You make it sound so easy.

Anonymous
09/14/25(Sun)18:50:24 No.106588384

Anonymous 09/14/25(Sun)18:50:24 No.106588384

>>106586604
The correct plural is Dummköpfe.

Anonymous
09/14/25(Sun)18:54:36 No.106588417

Anonymous 09/14/25(Sun)18:54:36 No.106588417

>>106587469
>or it ruins models and makes them autistic uncreative pieces of shit.
That's what RAMlets say.

Anonymous
09/14/25(Sun)18:57:35 No.106588445

Anonymous 09/14/25(Sun)18:57:35 No.106588445

>>106587974
>MLA (Multi-Head Attention)
MHA is Multi-Head Attention and it's old. It gives the best results and costs the most.

Anonymous
09/14/25(Sun)18:59:46 No.106588461

Anonymous 09/14/25(Sun)18:59:46 No.106588461

>>106585940
Yes, basically just prepare a dataloader, slap on AdamW and a training loop, done. Might be shit though, if they needed to do any tricky stuff like special losses or anything, but if they did, it might be explained in the paper.

Anonymous
09/14/25(Sun)19:02:37 No.106588488

Anonymous 09/14/25(Sun)19:02:37 No.106588488

>>106587974
>>106588445
* MLA (Multi-Head Latent Attention)

Anonymous
09/14/25(Sun)19:04:29 No.106588505

Anonymous 09/14/25(Sun)19:04:29 No.106588505

>>106588445
Or what if you just increased the amount of heads with MLA/etc to get the same cost but even better performance?

Anonymous
09/14/25(Sun)19:05:25 No.106588514

Anonymous 09/14/25(Sun)19:05:25 No.106588514

>>106588488
The "Latent" part is important and should not be left out.

Anonymous
09/14/25(Sun)19:05:54 No.106588516

Anonymous 09/14/25(Sun)19:05:54 No.106588516

What's "Mixture of Experts"?

Anonymous
09/14/25(Sun)19:06:20 No.106588519

Anonymous 09/14/25(Sun)19:06:20 No.106588519

>>106588505
you might end up with overlap between the heads. it might be more effective to just give them a bigger dimension to make them more powerful.

Anonymous
09/14/25(Sun)19:06:51 No.106588523

Anonymous 09/14/25(Sun)19:06:51 No.106588523

>>106588516
buncha blokes in a blender

Anonymous
09/14/25(Sun)19:07:43 No.106588531

Anonymous 09/14/25(Sun)19:07:43 No.106588531

>>106588519
Yeah maybe that then. Why hasn't anyone tried it? You'd think it'd be an obvious experiment, but to my knowledge, I don't recall any such tiny models that implement this strategy.

Anonymous
09/14/25(Sun)19:08:35 No.106588539

Anonymous 09/14/25(Sun)19:08:35 No.106588539

>>106588514
It was there but you couldn't see it because it was latent.

Anonymous
09/14/25(Sun)19:12:39 No.106588569

Anonymous 09/14/25(Sun)19:12:39 No.106588569

>>106588531
I have been testing 40 heads at dim 64 and 32 heads at dim 80 and less heads is getting lower training loss faster. but I don't know what kind of downstream performance effects it has. more attention could be better in the long run, it is probably just more expensive to train.

Anonymous
09/14/25(Sun)19:36:08 No.106588740

Anonymous 09/14/25(Sun)19:36:08 No.106588740

>>106587973
How much slower is glm 4.5 full at q2 compared to glm air at q8? Asking because I just got 128gb ram.

Anonymous
09/14/25(Sun)19:40:14 No.106588777

Anonymous 09/14/25(Sun)19:40:14 No.106588777

>>106587587
So what fixed the safety cucks issue? Turning "thinking" on or off?

Anonymous
09/14/25(Sun)20:03:47 No.106588936

Anonymous 09/14/25(Sun)20:03:47 No.106588936

>>106588777
DeepSeek 3.1 isn't generally obsessed with safety, but every once in a while it will respond like that at the start of a conversation.

Anonymous
09/14/25(Sun)20:06:30 No.106588958

Anonymous 09/14/25(Sun)20:06:30 No.106588958

i want to get into building agents, should I use langgraph or autogen?

Anonymous
09/14/25(Sun)20:19:39 No.106589050

Anonymous 09/14/25(Sun)20:19:39 No.106589050

anyone got intel arc pro b50 benchmarks yet?

Anonymous
09/14/25(Sun)20:22:50 No.106589074

Anonymous 09/14/25(Sun)20:22:50 No.106589074

>>106589050
intel has mlperf benchmarks, but idk if that's going to translate to the real world
https://mlcommons.org/2025/09/mlperf-inference-v5-1-results/

Anonymous
09/14/25(Sun)20:29:19 No.106589117

Anonymous 09/14/25(Sun)20:29:19 No.106589117

File: file.png (27 KB, 494x209)

27 KB PNG

>>106589074
You trying to trick us?

Anonymous
09/14/25(Sun)20:48:00 No.106589235

Anonymous 09/14/25(Sun)20:48:00 No.106589235

ROCm 7.0 RC1 support on llama.cpp doubles pp performance. Fucking huge man. NVIDIA is losing the AI gap quickly.

Anonymous
09/14/25(Sun)21:07:23 No.106589359

Anonymous 09/14/25(Sun)21:07:23 No.106589359

>>106589235
faster than vulkan? how about tg?

Anonymous
09/14/25(Sun)21:07:26 No.106589360

Anonymous 09/14/25(Sun)21:07:26 No.106589360

File: 0.png (1.61 MB, 1536x1536)

1.61 MB PNG

>>106589117
no, I'm just retarded.

Anonymous
09/14/25(Sun)21:07:48 No.106589362

Anonymous 09/14/25(Sun)21:07:48 No.106589362

>>106589359
slower than vulkan still

Anonymous
09/14/25(Sun)21:28:28 No.106589525

Anonymous 09/14/25(Sun)21:28:28 No.106589525

CoDiCodec: Unifying Continuous and Discrete Compressed Representations of Audio
https://arxiv.org/abs/2509.09836
>Efficiently representing audio signals in a compressed latent space is critical for latent generative modelling. However, existing autoencoders often force a choice between continuous embeddings and discrete tokens. Furthermore, achieving high compression ratios while maintaining audio fidelity remains a challenge. We introduce CoDiCodec, a novel audio autoencoder that overcomes these limitations by both efficiently encoding global features via summary embeddings, and by producing both compressed continuous embeddings at ~ 11 Hz and discrete tokens at a rate of 2.38 kbps from the same trained model, offering unprecedented flexibility for different downstream generative tasks. This is achieved through Finite Scalar Quantization (FSQ) and a novel FSQ-dropout technique, and does not require additional loss terms beyond the single consistency loss used for end-to-end training. CoDiCodec supports both autoregressive decoding and a novel parallel decoding strategy, with the latter achieving superior audio quality and faster decoding. CoDiCodec outperforms existing continuous and discrete autoencoders at similar bitrates in terms of reconstruction audio quality. Our work enables a unified approach to audio compression, bridging the gap between continuous and discrete generative modelling paradigms.
https://github.com/SonyCSLParis/codicodec
https://huggingface.co/SonyCSLParis/codicodec
No examples and the git isn't live but the hf is at least. Might be cool

Anonymous
09/14/25(Sun)21:40:03 No.106589596

Anonymous 09/14/25(Sun)21:40:03 No.106589596

>>106589360
IM TIRED OF SEEING THAT BLUE BITCH FUCK YOU

Anonymous
09/14/25(Sun)21:55:07 No.106589698

Anonymous 09/14/25(Sun)21:55:07 No.106589698

can i get a miku with a fat thicc ass

Anonymous
09/14/25(Sun)21:58:25 No.106589724

Anonymous 09/14/25(Sun)21:58:25 No.106589724

>>106589596
Calm down saar...

Anonymous
09/14/25(Sun)22:01:26 No.106589741

Anonymous 09/14/25(Sun)22:01:26 No.106589741

File: 0.png (1.38 MB, 1536x1536)

1.38 MB PNG

>>106589596
what about the red one?

Anonymous
09/14/25(Sun)22:04:50 No.106589764

Anonymous 09/14/25(Sun)22:04:50 No.106589764

File: .png (52 KB, 172x179)

52 KB PNG

>>106589698
sure
https://files.catbox.moe/udrh8s.png

Anonymous
09/14/25(Sun)22:05:58 No.106589770

Anonymous 09/14/25(Sun)22:05:58 No.106589770

>>106589764
>https://files.catbox.moe/udrh8s.png
thx i can work with that

Anonymous
09/14/25(Sun)22:13:01 No.106589814

Anonymous 09/14/25(Sun)22:13:01 No.106589814

>>106589741
NTA but desu all the vocaloids feel tiresome to see now. Can't we get some more variety here? Like when was the last time someone genned that android girl from Chobits? Plastic Memories? How about a Cortana?

Anonymous
09/14/25(Sun)22:15:55 No.106589835

Anonymous 09/14/25(Sun)22:15:55 No.106589835

>>106589724
good morning sir

Anonymous
09/14/25(Sun)22:16:16 No.106589842

Anonymous 09/14/25(Sun)22:16:16 No.106589842

File: 94854 - SoyBooru.png (997 KB, 1534x1382)

997 KB PNG

>>106587526
Grok 2 vs Llama 405B:
SimpleQA: 23.6 vs 18.24
MMLU: 87.5 vs 88.6
MMLU-pro: 75.46 vs 73.3
HumanEval: 88.4 vs 89.0
MATH: 76.1 vs 73.8
lmarena w/ style control: 1333 vs 1335
lmarena: 1306 vs 1287
livebench: 48.11 vs 47.54
Size: 270B vs 405B
Active parameters: 115B vs 405B

Elon made a model with equal performance, but with lower total size and active parameters than Meta's llama. Is Elon that good or is Meta bad or both? This is very, very embarrassing. Fucking 5% GPU utilization in production at Meta. Grok 2 probably even trades blows with Maverick.

Anonymous
09/14/25(Sun)22:18:07 No.106589859

Anonymous 09/14/25(Sun)22:18:07 No.106589859

File: Positive-Good-Morning-Sir(...).png (457 KB, 1080x1080)

457 KB PNG

>>106589835
gm

Anonymous
09/14/25(Sun)22:26:01 No.106589895

Anonymous 09/14/25(Sun)22:26:01 No.106589895

>>106587553
fasterwhisper is still faster than that

Anonymous
09/14/25(Sun)22:28:39 No.106589907

Anonymous 09/14/25(Sun)22:28:39 No.106589907

>>106587260
the only child here is you zoomie

Anonymous
09/14/25(Sun)22:29:51 No.106589913

Anonymous 09/14/25(Sun)22:29:51 No.106589913

File: fc1a145e-2fca-4f67-b42b-0(...).jpg (161 KB, 1024x1024)

161 KB JPG

>>106589814
Lol

Anonymous
09/14/25(Sun)22:31:33 No.106589918

Anonymous 09/14/25(Sun)22:31:33 No.106589918

File: 1727463458674076.webm (2.99 MB, 852x720)

2.99 MB WEBM

>>106589814
I'm still not over it

Anonymous
09/14/25(Sun)22:36:21 No.106589942

Anonymous 09/14/25(Sun)22:36:21 No.106589942

>>106589842
DOMAIN FILTERING BASED ON NUMBER OF BAD WORDS
LLAMA 2 GENERATED SYNTHETIC DATA
SCALE AI SLOP
TO THE MOON SIRS

Anonymous
09/14/25(Sun)22:37:39 No.106589949

Anonymous 09/14/25(Sun)22:37:39 No.106589949

>>106589842
405B is a failed model and shouldn't be used to compare to anything. I suppose any labs who want an easy win could use it as a benchmark, but that's all.

Anonymous
09/14/25(Sun)22:38:50 No.106589953

Anonymous 09/14/25(Sun)22:38:50 No.106589953

is there a vibevoice tts extension yet for sillytavern?

Anonymous
09/14/25(Sun)22:40:14 No.106589960

Anonymous 09/14/25(Sun)22:40:14 No.106589960

>>106589918
disapointing anime
i'd have thought he would at least have tried to find a solution / cure to it.

instead he just accepted it.

Anonymous
09/14/25(Sun)22:44:26 No.106589969

Anonymous 09/14/25(Sun)22:44:26 No.106589969

>>106588121
If compute is the bottleneck, can you use PD disaggregation with a faster GPU?

Anonymous
09/14/25(Sun)22:49:35 No.106589985

Anonymous 09/14/25(Sun)22:49:35 No.106589985

File: change my mind.jpg (26 KB, 500x281)

26 KB JPG

How is Qwen3-Next-80B-A3B in roleplaying? Is it better than Deepseek v3? I might be another 12 hours before I can download and test whatever bpw that I can handle locally.

Anonymous
09/14/25(Sun)22:52:51 No.106590005

Anonymous 09/14/25(Sun)22:52:51 No.106590005

File: 1474031897127.webm (2.94 MB, 1280x720)

2.94 MB WEBM

>>106589918

Anonymous
09/14/25(Sun)22:53:19 No.106590009

Anonymous 09/14/25(Sun)22:53:19 No.106590009

>>106589985
It is safe :)

Anonymous
09/14/25(Sun)22:54:40 No.106590017

Anonymous 09/14/25(Sun)22:54:40 No.106590017

Another day, still no goofs

Anonymous
09/14/25(Sun)22:54:57 No.106590018

Anonymous 09/14/25(Sun)22:54:57 No.106590018

>>106589985
>Is it better than Deepseek v3
lol

Anonymous
09/14/25(Sun)23:02:42 No.106590054

Anonymous 09/14/25(Sun)23:02:42 No.106590054

>>106589362
Wait.. what?

Anonymous
09/14/25(Sun)23:03:57 No.106590059

Anonymous 09/14/25(Sun)23:03:57 No.106590059

>>106589918
I liked the anime but the pacing was awful. Are you Chinese?

Anonymous
09/14/25(Sun)23:14:27 No.106590115

Anonymous 09/14/25(Sun)23:14:27 No.106590115

>>106589949
I wouldn't exactly call it a failed model. It technically was SOTA for open-weights models when it came out. It wasn't some Llama 4.

Anonymous
09/14/25(Sun)23:17:27 No.106590137

Anonymous 09/14/25(Sun)23:17:27 No.106590137

>>106590059
>Are you Chinese?
How did you draw that conclusion?

Anonymous
09/14/25(Sun)23:22:56 No.106590172

Anonymous 09/14/25(Sun)23:22:56 No.106590172

File: waifu overlay.webm (2.66 MB, 1920x1080)

2.66 MB WEBM

Imagine when all of these technologies are more advanced and we put all of it together. One day...

Anonymous
09/14/25(Sun)23:26:52 No.106590192

Anonymous 09/14/25(Sun)23:26:52 No.106590192

>>106589985
It's about as good as Deepseek R1 8b

Anonymous
09/14/25(Sun)23:30:24 No.106590212

Anonymous 09/14/25(Sun)23:30:24 No.106590212

>>106584024
nice

Anonymous
09/14/25(Sun)23:38:48 No.106590275

Anonymous 09/14/25(Sun)23:38:48 No.106590275

>>106584024
I look like this

Anonymous
09/14/25(Sun)23:51:12 No.106590353

Anonymous 09/14/25(Sun)23:51:12 No.106590353

>The overall effect makes her appear almost comically plump, her legs looking like they could support her entire body weight with ease.
This is hilarious.

Anonymous
09/14/25(Sun)23:57:50 No.106590389

Anonymous 09/14/25(Sun)23:57:50 No.106590389

>>106590353
It's a doll nigga

Anonymous
09/15/25(Mon)00:20:11 No.106590507

Anonymous 09/15/25(Mon)00:20:11 No.106590507

>0.33 tok/sec
bros i don't feel so good

Anonymous
09/15/25(Mon)00:30:18 No.106590565

Anonymous 09/15/25(Mon)00:30:18 No.106590565

>>106590507
That reminds me of the time I ran mistral large Q1 on CPU.

Anonymous
09/15/25(Mon)00:45:06 No.106590628

Anonymous 09/15/25(Mon)00:45:06 No.106590628

>>106582475
>Deep Reason extension for TGWUI
Worth it? I was thinking about buying it

Anonymous
09/15/25(Mon)01:05:42 No.106590745

Anonymous 09/15/25(Mon)01:05:42 No.106590745

https://community.topazlabs.com/t/topaz-studio-transition-questions/95039/9
Looks like the topazbros got rugpulled

Anonymous
09/15/25(Mon)01:33:53 No.106590868

Anonymous 09/15/25(Mon)01:33:53 No.106590868

>>106590172
>waifu overlay.webm
heh, neat
but strange how it couldn't deal with her handes folded

Anonymous
09/15/25(Mon)01:35:04 No.106590875

Anonymous 09/15/25(Mon)01:35:04 No.106590875

Reminder to not use quantization and flash attention.

Anonymous
09/15/25(Mon)01:36:00 No.106590881

Anonymous 09/15/25(Mon)01:36:00 No.106590881

>>106590875
This, just pay for a GPT plus subscription.

Anonymous
09/15/25(Mon)01:36:48 No.106590886

Anonymous 09/15/25(Mon)01:36:48 No.106590886

I feel like the whole "not x, but y," thing is a common and useful trope in natural language. It allows us to present one aspect of a topic, and quickly segue into explaining another aspect.

I've been practicing for med school interviews, and it's a super useful way to communicate things.

E.g. Substantiating importance of communication skills: "Good communication strategies aren't just useful when actively listening to the patient, asking appropriate questions and generating a comprehensive history. It is also useful when communicating with multi-disciplinary teams, often across different hospitals, and especially when dealing with complex patients who have received care from a number of different institutions."

This would be considered a slopped response if it was made by an LLM, but is a fantastic way to describe two important aspects of communication in medicine. I've seen variants of this across so many textbooks, and similar phrasing styles have been recommended to me by a number of different experts.

Anonymous
09/15/25(Mon)01:42:54 No.106590924

Anonymous 09/15/25(Mon)01:42:54 No.106590924

Q2 is as good as Q4 or Q8.

Anonymous
09/15/25(Mon)01:47:39 No.106590950

Anonymous 09/15/25(Mon)01:47:39 No.106590950

what if my brain is running a quanted human model and that is why i am retarded

Anonymous
09/15/25(Mon)01:59:32 No.106591013

Anonymous 09/15/25(Mon)01:59:32 No.106591013

File: 1731086290564808.webm (3.92 MB, 1080x1080)

3.92 MB WEBM

>>106590868
It's actually staged, just meant as a visualization of what could be. The webm is from MANY years ago, in a time where ML/CV tracking stuff didn't quite exist other than in research, and where things like Vive Trackers did not exist. He simply just manually positioned and posed the virtual model to match the real (or other way around). It's funny that this webm can be misunderstood in the current year because we do in fact have the technology to truly do the perfectly tracked AR overlay thing, as long as someone gave the effort.

We have this webm now but I wanted one that showed an entire real body.

Anonymous
09/15/25(Mon)02:10:54 No.106591062

Anonymous 09/15/25(Mon)02:10:54 No.106591062

>>106590886
There's nothing wrong with that or other slopisms, but you wouldn't normally see humans using it over and over again in the same conversation, or sometimes even in a single paragraph. But this happens pretty often depending on the LLM. Or the LLM is actually tuned to be anti-repetitive and instead the slop repetition happens at the start of every conversation, because they're separate conversations and LLMs do not retain those memories.

Anonymous
09/15/25(Mon)02:19:17 No.106591099

Anonymous 09/15/25(Mon)02:19:17 No.106591099

I'm going to be honest I don't notice a quality difference in models for over a year now, both local and private models.

Either we've stagnated hard. Or I am the bottleneck in figuring out the quality of the responses. But either way I don't notice a difference between the big models and haven't for about a year beyond default writing style which is subjective.

Anonymous
09/15/25(Mon)02:26:33 No.106591144

Anonymous 09/15/25(Mon)02:26:33 No.106591144

can't believe i use to use lm studio

Anonymous
09/15/25(Mon)02:52:17 No.106591266

Anonymous 09/15/25(Mon)02:52:17 No.106591266

>>106590745
Kek paypigs. Just download a crack.

Anonymous
09/15/25(Mon)02:59:32 No.106591301

Anonymous 09/15/25(Mon)02:59:32 No.106591301

File: daydream.gif (1.91 MB, 2521x1254)

1.91 MB GIF

I made this agent circuit-thing to make a bunch of models daydream. The output is still full of emdashes, but feels less dumb. Does /lmg/ care?

Anonymous
09/15/25(Mon)03:03:49 No.106591329

Anonymous 09/15/25(Mon)03:03:49 No.106591329

>>106590886
The issue is not that the models use this grammatical structure–it's that they try to use it for every other sentence if you let them.

Anonymous
09/15/25(Mon)03:05:03 No.106591335

Anonymous 09/15/25(Mon)03:05:03 No.106591335

>>106591301
That's pretty cool, like watching different parts of a brain light up. Can you post example outputs with and without daydreaming? If you wanted emdashes gone, you could just ban or bias it or forbid them in the system prompt.

Anonymous
09/15/25(Mon)03:20:04 No.106591411

Anonymous 09/15/25(Mon)03:20:04 No.106591411

File: output.png (94 KB, 1010x531)

94 KB PNG

>>106591335
What would count as a proper head-to-head for this? Running the circuit is like a many-shot reply, whereas just prompting the biggest model in the bunch to daydream about chunked text is maybe a two-shot. That's why I say 'feels' less dumb. I'm willing to do comparisons, though, if you have an idea for one that makes sense or sounds neat. In the meantime, here's an output example.

I have copious log spam and the intermediate steps, too, where you can watch it self-correcting and having realizations and shit.

Anonymous
09/15/25(Mon)03:26:31 No.106591438

Anonymous 09/15/25(Mon)03:26:31 No.106591438

File: Examples-of-fiducial-mark(...).jpg (82 KB, 850x720)

82 KB JPG

>>106591013
It has become especially possible only recently because the retards at Meta finally gave camera access on Oculus to developers. Other than that, tracking was always possible with ARToolKit and special markers on the doll

Anonymous
09/15/25(Mon)03:27:37 No.106591447

Anonymous 09/15/25(Mon)03:27:37 No.106591447

>>106591335
As for brain regions, you're on the nose. This is a neuromorphic pattern based on the Default Mode Network, with the terminology obfuscated so the model does not think it's writing a neuroscience test.

Anonymous
09/15/25(Mon)03:35:27 No.106591468

Anonymous 09/15/25(Mon)03:35:27 No.106591468

>>106591438
Are you telling me that there still isn't a fully open source headset? I'm kinda looking for one that has everything exposed to the developer

Anonymous
09/15/25(Mon)03:37:35 No.106591480

Anonymous 09/15/25(Mon)03:37:35 No.106591480

File: 1755218976119568.jpg (13 KB, 220x219)

13 KB JPG

>>106582475
TFW when no airchan to make win 10/server 2023 console scriptlets/cmdlets with.

MODERN COMPUTE STUDIES A DREK :(

Anonymous
09/15/25(Mon)03:40:13 No.106591495

Anonymous 09/15/25(Mon)03:40:13 No.106591495

>>106591480
>doing secretary work is hard!
cant wait for these useless dregs to be out of a job thanks to AI

Anonymous
09/15/25(Mon)03:41:05 No.106591498

Anonymous 09/15/25(Mon)03:41:05 No.106591498

>>106591438
Yeah I should've said specifically consumer. Tracking software methods including but not limited fiducial have existed for a long time, but you could really start making tracked dolls a reality with 0 coding knowledge as soon as vive trackers came out and were supported in VRChat.

Anonymous
09/15/25(Mon)03:45:57 No.106591517

Anonymous 09/15/25(Mon)03:45:57 No.106591517

File: valve-vr-patent-details-w(...).jpg (51 KB, 708x345)

51 KB JPG

>>106591468
Valve Frame isn't out yet

Anonymous
09/15/25(Mon)03:46:13 No.106591518

Anonymous 09/15/25(Mon)03:46:13 No.106591518

>>106591301
Don't know what you used but looking good! Will you share?

Anonymous
09/15/25(Mon)03:49:43 No.106591529

Anonymous 09/15/25(Mon)03:49:43 No.106591529

>>106591498
Sex dolls with integrated trackers would've been rad

Anonymous
09/15/25(Mon)04:00:42 No.106591560

Anonymous 09/15/25(Mon)04:00:42 No.106591560

>>106591518
You'll need this repo: https://github.com/dibrale/Regions

The catbox has my stuff that's not in the repo: 7g2qao.zip

The script in the catbox is pretty much based off of the lit crit demo, but verify before arbitrarily executing, etc. etc.

Anonymous
09/15/25(Mon)04:25:17 No.106591683

Anonymous 09/15/25(Mon)04:25:17 No.106591683

>>106591560
Thank you, I'll check

Anonymous
09/15/25(Mon)04:55:01 No.106591824

Anonymous 09/15/25(Mon)04:55:01 No.106591824

>>106590868
Hand tracking is very hard.

Anonymous
09/15/25(Mon)05:08:24 No.106591902

Anonymous 09/15/25(Mon)05:08:24 No.106591902

>>106590886
>not x
What if no-one would have thought "x" was a even a plausible option ? (From the narrative/past events.)

>not x, but y
What if literally no implications flow from "y" in the following text ?

Anonymous
09/15/25(Mon)05:32:19 No.106592017

Anonymous 09/15/25(Mon)05:32:19 No.106592017

How suitable is openrouter for data processing tasks like fiction tagging? Will it report me to the FBI and NSA if my requests happen to contain unorthodox texts?

Anonymous
09/15/25(Mon)05:34:09 No.106592024

Anonymous 09/15/25(Mon)05:34:09 No.106592024

>>106583124
You may not like it but cooming and other purely recreational stuff is the optimal use case for local models since you know nobody is reading your garbage and uncensored consumer GPU size models can be more fun (though lower IQ) than gigantic models when finetuned for your specific use case like RP/ERP.

For actual beneficial use cases like shitting out useful scripts or whatever just use your favorite giga huge cloud model at all the tokens per second. Gemini 2.5-pro is already way better at coding tasks than anything local, you can use it from command line, it can interact with your file system if you give it perms to a folder and if you log in with jewgle account you get 1000 free requests per day which is good for pretty much anything other than professional amounts of use. The only reason to avoid cloud is if your prompts contain personal info or other info you want to be 100% sure doesn't get stolen like your own, non AI-sloppa code or if you want to do dumb fun stuff like coom.

Anonymous
09/15/25(Mon)05:36:01 No.106592033

Anonymous 09/15/25(Mon)05:36:01 No.106592033

File: 1742041239054165.gif (64 KB, 560x420)

64 KB GIF

>>106592024
I don't want gigacorps to know I'm bad at coding

Anonymous
09/15/25(Mon)05:42:06 No.106592055

Anonymous 09/15/25(Mon)05:42:06 No.106592055

File: 1757929196764.jpg (102 KB, 736x700)

102 KB JPG

what's the point of LLM if i can't cuddle it

Anonymous
09/15/25(Mon)05:47:25 No.106592075

Anonymous 09/15/25(Mon)05:47:25 No.106592075

>>106592055
What' the point of cat if it can't help me write erotica

Anonymous
09/15/25(Mon)05:52:17 No.106592097

Anonymous 09/15/25(Mon)05:52:17 No.106592097

>>106587021
don't worry, even when they come to the west it's never your turn to get gibs first, it will be bought by pros/researchers who are on the cheaper side, then by scalpers, who will then tear you a new asshole

Anonymous
09/15/25(Mon)05:52:28 No.106592099

Anonymous 09/15/25(Mon)05:52:28 No.106592099

Is exllama still faster than goof?

Anonymous
09/15/25(Mon)05:55:10 No.106592110

Anonymous 09/15/25(Mon)05:55:10 No.106592110

>>106592024
>gemini free blabla
this is like the drug dealer giving you a free hit
it won't last, running models as good as gemini costs a lot of money, even their most expensive subscriptions doesn't really cover the real cost of LLM business
companies like Google in the LLM space are using the Uber strategy: give a product for much cheaper than it should be, until the competition is dead, then jack the prices up like crazy
you may not see a point to local yet for non recreational uses because you don't see what they're going to do to you in the long term
I do and that's why I won't develop an addiction

Anonymous
09/15/25(Mon)06:00:53 No.106592138

Anonymous 09/15/25(Mon)06:00:53 No.106592138

I think qwen 30b higher quantz have less "not x, but y"

Anonymous
09/15/25(Mon)06:02:08 No.106592144

Anonymous 09/15/25(Mon)06:02:08 No.106592144

>>106592138
I've only used Q8 and it's still pretty excessive.

Anonymous
09/15/25(Mon)06:19:13 No.106592211

Anonymous 09/15/25(Mon)06:19:13 No.106592211

File: 1728628221216736.gif (196 KB, 205x500)

196 KB GIF

GLM Air is surprisingly coherent, creative and non-repetitive even at Q2, 24k context. How did they do it?

Anonymous
09/15/25(Mon)06:22:18 No.106592224

Anonymous 09/15/25(Mon)06:22:18 No.106592224

File: F1.large.jpg (67 KB, 1280x368)

67 KB JPG

>>106591824
Meta solved it on the Quest, somehow

Anonymous
09/15/25(Mon)06:22:25 No.106592226

Anonymous 09/15/25(Mon)06:22:25 No.106592226

gm sirs

Anonymous
09/15/25(Mon)06:22:56 No.106592227

Anonymous 09/15/25(Mon)06:22:56 No.106592227

>>106592211
I found it to be uncreative and predictable like all sub-deepseek moes

Anonymous
09/15/25(Mon)06:24:14 No.106592231

Anonymous 09/15/25(Mon)06:24:14 No.106592231

>>106592227
What do you use for RP?

Anonymous
09/15/25(Mon)06:25:07 No.106592242

Anonymous 09/15/25(Mon)06:25:07 No.106592242

>>106592110
You are probably right, no such thing as free meal etc, but I don't think they are gonna kill off competition anytime soon so even if they flip the "pay us" switches there's always probably a free or at least cheaper solution to move to.
And not like local assistant use is completely pointless or something, just saying right now I feel like free cloud is the best choice for most use cases where you need the LLM actually to be "correct" unlike in recreational use.

Anonymous
09/15/25(Mon)06:26:22 No.106592247

Anonymous 09/15/25(Mon)06:26:22 No.106592247

not x but y is a lot more pervasive than people seem to notice, but they only notice if it's very close to literally spelling "not just x, but y" like the sloppier models
here's a less sloppy model still doing it quite a lot in practice:
https://eqbench.com/results/creative-writing-v3/o3.html
>He had kept the find quiet; obviously not quiet enough.
>You will seem a magnate rather than a hostage.
>had dreamed of building cranes and pressure domes, not empires.
>Because Antares relies on calculus, not superstition.
>“Altruism,” she said lightly, “is a luxury for stable epochs. This is not one
etc etc
the fact is, the best, state of the art LLMs are still inherent slop and enjoying LLM writing is like being a fatso American calling McDonald gourmet food
AI models as a whole suck at art, it's people who have no soul who enjoy the art side of it
for me? AI is a tool. Classifiers, summarizers, metadata annotations, genning translation of my program UI strings etc. Looking for soulful content in a machine? Nay.

Anonymous
09/15/25(Mon)06:29:21 No.106592268

Anonymous 09/15/25(Mon)06:29:21 No.106592268

>>106592247
but i just nutted to a non consenting loli, my guttural scream was not only passionate, but an art form in itself. What is art, if not primal urgest being satisfied?

Anonymous
09/15/25(Mon)06:29:58 No.106592271

Anonymous 09/15/25(Mon)06:29:58 No.106592271

>>106592247
It is possible to enjoy something that is flawed

Anonymous
09/15/25(Mon)06:33:37 No.106592294

Anonymous 09/15/25(Mon)06:33:37 No.106592294

>>106592247
Kys

Anonymous
09/15/25(Mon)06:34:24 No.106592301

Anonymous 09/15/25(Mon)06:34:24 No.106592301

>>106592294
no u

Anonymous
09/15/25(Mon)06:36:05 No.106592311

Anonymous 09/15/25(Mon)06:36:05 No.106592311

>>106592247
Imagine reading filthy smut and thinking of mcdonalds. How fucking fat are you?

Anonymous
09/15/25(Mon)06:38:17 No.106592328

Anonymous 09/15/25(Mon)06:38:17 No.106592328

>>106592231
Rocinante1.1/original r1 q2xxs

Anonymous
09/15/25(Mon)06:40:59 No.106592345

Anonymous 09/15/25(Mon)06:40:59 No.106592345

File: 1741039834349967.webm (491 KB, 1280x720)

491 KB WEBM

>>106592328
>Rocinante1.1
good joke anon

Anonymous
09/15/25(Mon)06:41:12 No.106592348

Anonymous 09/15/25(Mon)06:41:12 No.106592348

>>106592224
PoV hand tracking is easier than unconstrained perspective and that image doesn't show anything impressive. They might also be using special cameras, which helps. LeapMotion does that too. The hard part is unconstrained hand tracking when hands interact, have stapled fingers, holding objects, and so on.

Anonymous
09/15/25(Mon)06:42:10 No.106592357

Anonymous 09/15/25(Mon)06:42:10 No.106592357

>>106592311
>How fucking fat are you?
I am not American

Anonymous
09/15/25(Mon)06:43:20 No.106592364

Anonymous 09/15/25(Mon)06:43:20 No.106592364

>>106592247
>it's people who have no soul who enjoy the art side of it
I don't care about "art". Image gen makes pretty pictures that make my dick hard. LLMs suck my cock.

Anonymous
09/15/25(Mon)06:45:08 No.106592375

Anonymous 09/15/25(Mon)06:45:08 No.106592375

>>106592357
You are american brained.

Anonymous
09/15/25(Mon)07:14:52 No.106592548

Anonymous 09/15/25(Mon)07:14:52 No.106592548

>take source code of open source software which is well documented
>alternatively let LLM create comments and documentation of everything
>delete all code but leave comments in
>tell your LLM coder to (re)create the software
has someone done this before? I wanna see how far LLMs (especially local LLMs) can go given the optimal conditions. also looking for github repos with which are suited for this task. I'll probably start with OBS which will most likely be too complex. But I can always lower the bar.
And I want to stress again the goal is not to create a slopped version of an existing project. It's more about testing just how far prompt, context and environment engineering can take LLMs.

Anonymous
09/15/25(Mon)07:20:30 No.106592585

Anonymous 09/15/25(Mon)07:20:30 No.106592585

>>106592548
ur dumb and ur shits all retarded

Anonymous
09/15/25(Mon)07:27:32 No.106592623

Anonymous 09/15/25(Mon)07:27:32 No.106592623

>>106592585
not dumb, not retarded, but autistic.

Anonymous
09/15/25(Mon)07:28:17 No.106592628

Anonymous 09/15/25(Mon)07:28:17 No.106592628

>>106592623
your comment was not insightful, but memeworthy

Anonymous
09/15/25(Mon)07:30:54 No.106592642

Anonymous 09/15/25(Mon)07:30:54 No.106592642

>>106592548
You'd have to trust their comments and understanding of the code first. Here's an example of the first part.
>https://github.com/ggml-org/llama.cpp/pull/15777
You read like you've never used these things before.

Anonymous
09/15/25(Mon)07:36:13 No.106592669

Anonymous 09/15/25(Mon)07:36:13 No.106592669

>>106592247
>contrasting two things is slop
What's next, punctuation is slop?

Anonymous
09/15/25(Mon)07:37:28 No.106592679

Anonymous 09/15/25(Mon)07:37:28 No.106592679

>>106592548
>the goal is not to create a slopped version of an existing project
The goal is to circumvent copyleft licenses, you're being quite obvious.

Anonymous
09/15/25(Mon)07:41:52 No.106592704

Anonymous 09/15/25(Mon)07:41:52 No.106592704

>>106592669
i had an argument with another anon some time ago about punctuation and capitalization as well
w vntlly grd t stp sng vwls
tp prfrmnc

Anonymous
09/15/25(Mon)07:44:13 No.106592717

Anonymous 09/15/25(Mon)07:44:13 No.106592717

>>106592585
only valid arguments to why my idea is retarded will make me feel dumb. so your comments are pointless until you deliver said arguments. and since you decided to reply instead of ignore, you clearly have an incentive. So following up with
>nah ur stupid
will make you look stupid.

>>106592642
I'm aware. My idea was to only use the cloned repo without github issues/comments. There are projects out there that have all code blocks commented. Maybe I should search for vibe coded repos as they often have everything commented.

>>106592679
please just think for a moment, anon. If that was the goal, I would obviously leave in all the code and tell the LLm to rewrite it in a different way or using a different stack. you really thought you're on to e there, huh?

I'm just gonna do it and report back with the results. I found a ton of demo repos with fully commented code.

Anonymous
09/15/25(Mon)07:46:19 No.106592732

Anonymous 09/15/25(Mon)07:46:19 No.106592732

>>106592717
>If that was the goal, I would obviously leave in all the code and tell the LLm to rewrite it in a different way or using a different stack.
https://en.wikipedia.org/wiki/Clean-room_design

Anonymous
09/15/25(Mon)07:46:31 No.106592733

Anonymous 09/15/25(Mon)07:46:31 No.106592733

>>106592717
>I'm aware.
You aren't.
>I'm just gonna do it and report back with the results. I found a ton of demo repos with fully commented code.
Nah. It's fine. Keep those to yourself.

Anonymous
09/15/25(Mon)08:02:05 No.106592857

Anonymous 09/15/25(Mon)08:02:05 No.106592857

Can "Mistral-Nemo-Instruct-2407-GGUF" handle beyond 8K context?

Anonymous
09/15/25(Mon)08:03:54 No.106592879

Anonymous 09/15/25(Mon)08:03:54 No.106592879

>>106592857
Nemo is officially rated for 16K context, I find it mostly coherent up to around 20-24K but it gets noticeably dumber even after 4K.

Anonymous
09/15/25(Mon)08:04:10 No.106592883

Anonymous 09/15/25(Mon)08:04:10 No.106592883

>>106592857
It can handle ~16k without going schizo

Anonymous
09/15/25(Mon)08:05:11 No.106592893

Anonymous 09/15/25(Mon)08:05:11 No.106592893

>>106592879
>Nemo is officially rated for 16K context
It's actually 128k, but no one who has ever used it agrees with that

Anonymous
09/15/25(Mon)08:06:03 No.106592907

Anonymous 09/15/25(Mon)08:06:03 No.106592907

>>106592893
>actually
*technically

Anonymous
09/15/25(Mon)08:07:51 No.106592920

Anonymous 09/15/25(Mon)08:07:51 No.106592920

>>106592893
I must be going crazy, I could have sworn it was much lower than that. Maybe I'm confusing it with one of the older context benchmarks that said 16K was the falling off point.

Anonymous
09/15/25(Mon)08:09:40 No.106592934

Anonymous 09/15/25(Mon)08:09:40 No.106592934

>>106592920
Yeah, it's 16k according to the RULER benchmark, but Mistral claims 128k

Anonymous
09/15/25(Mon)08:33:12 No.106593117

Anonymous 09/15/25(Mon)08:33:12 No.106593117

>>106593104
>>106593104
>>106593104

Anonymous
09/15/25(Mon)08:34:24 No.106593128

Anonymous 09/15/25(Mon)08:34:24 No.106593128

>>106592717
ur the kind of room temp iq retard that thinks 'AI CAN AND WILL DO IT BETTER THAN HUMIES!!!' when the AI HAS BEEN TRAINED ON HUMAN INPUTS YOU FUCKING RETARD

Anonymous
09/15/25(Mon)08:39:02 No.106593179

Anonymous 09/15/25(Mon)08:39:02 No.106593179

>>106593128
>And I want to stress again the goal is not to create a slopped version of an existing project.
Don't make me defend the retard again.

Anonymous
09/15/25(Mon)10:56:40 No.106594369

Anonymous 09/15/25(Mon)10:56:40 No.106594369

File: ).png (39 KB, 185x233)

39 KB PNG

>>106592055
what the point of "self" if I can't cuddle it?

but if seriously, just wait for neural interface to be decent and you can cuddle LLMs all you want

[Return] [Catalog] [Top]

Post a Reply

Return Catalog Top Refresh

[Advertise on 4chan]

Delete Post: [File Only] Style:

[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.