/g/ - Technology

File: 1753632778956995.png (1.82 MB, 2133x918)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>106454136 & >>106444887

►News
>(08/30) LongCat-Flash-Chat released with 560B-A18.6B∼31.3B: https://hf.co/meituan-longcat/LongCat-Flash-Chat
>(08/29) Nvidia releases Nemotron-Nano-12B-v2: https://hf.co/nvidia/NVIDIA-Nemotron-Nano-12B-v2
>(08/29) Step-Audio 2 released: https://github.com/stepfun-ai/Step-Audio2
>(08/28) Command A Translate released: https://hf.co/CohereLabs/command-a-translate-08-2025
>(08/26) Marvis TTS released: https://github.com/Marvis-Labs/marvis-tts

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
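The GGUF VRAM calculator linked above essentially does arithmetic like the following (a back-of-envelope sketch; every constant here is an illustrative assumption, and real calculators also account for architecture details, kv-cache quantization, and runtime overhead):

```python
# Rough GGUF VRAM estimate: quantized weights + fp16 KV cache + overhead.
# All numbers below are illustrative assumptions, not exact values.

def gguf_vram_gb(params_b: float, bits_per_weight: float,
                 n_layers: int, kv_dim: int, context: int) -> float:
    weights = params_b * 1e9 * bits_per_weight / 8   # bytes for the weights
    # KV cache: 2 tensors (K and V) * 2 bytes (fp16) per layer per position
    kv = 2 * 2 * n_layers * kv_dim * context
    overhead = 0.5e9                                 # compute buffers, rough guess
    return (weights + kv + overhead) / 1e9

# e.g. a hypothetical 12B model at ~4.85 bpw (Q4_K_M-ish),
# 40 layers, 4096 kv dim, 8k context -> roughly 13 GB
estimate = gguf_vram_gb(12, 4.85, 40, 4096, 8192)
```

The point of the sketch is that at long contexts the KV cache rivals the weights themselves, which is why the calculators ask for context length.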

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: tet.webm (658 KB, 478x548)
►Recent Highlights from the Previous Thread: >>106454136

--AI tool comparison for Python backend development:
>106456239 >106456289 >106456508 >106456513 >106456438 >106456492 >106456515 >106456528 >106456613 >106456651 >106456690 >106456802 >106457644
--AI coding workflows, LLM comparisons, and tooling preferences:
>106455952 >106455957 >106456054 >106456067 >106456369 >106456449 >106456556 >106456609 >106456716 >106456361 >106456040
--Grok 2 cockbench revealing shared model response quirks:
>106455205 >106455295 >106455320 >106455685 >106455614 >106455650 >106456403 >106456411
--Critique of visual recognition capabilities using character identification benchmarks:
>106457135 >106457170 >106457468 >106457703 >106458126 >106459806
--Translation challenges with restricted AI models and Japanese content:
>106459974 >106460106 >106460187 >106460238 >106460265 >106460287
--Newer voice-to-text models like Voxtral and Nvidia Canary 1B v2 compared to Whisper:
>106454617 >106454791 >106454807 >106454943
--Techniques for creating surreal, non-realistic art with LLMs:
>106458478 >106458495 >106458519 >106458574
--Testing vision models' ability to integrate image context into roleplaying responses:
>106458624 >106459177
--Attempt to run glm-air q6_K_M on limited RAM with mixed DDR5/swap performance:
>106454841 >106455365
--Local textgen stagnation due to model size, benchmark misalignment, and enterprise-driven censorship:
>106454457 >106454877 >106454924 >106455007 >106455991 >106456025 >106456105 >106456294 >106456335 >106457007 >106455232 >106456219 >106456285 >106456231
--Managing and refactoring large (>10k lines) code files in software projects:
>106456846 >106456968 >106457002 >106457048 >106457127 >106457180 >106457072 >106457235 >106457514
--Miku (free space):
>106456456 >106456897 >106458105 >106459018 >106458519 >106459258 >106459451

►Recent Highlight Posts from the Previous Thread: >>106454143

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
File: 1725496149667481.webm (3.92 MB, 512x768)
Nothing new under the sun.
>>
vibevoice from microsoft is pretty cool: https://files.catbox.moe/5mz6ff.wav

some of the issues here come from my shitty zero-shot voice sample, which is not perfect quality, and glm air fucking up the script a bit with some typos. But making this in 45 seconds on a 5070 ti 16gb with streaming (audio started after several seconds) is kind of cool. The tones are convincingly contextual and the stability is much better with the default speakers. This seems like a real contender for replacing kokoro. I don't know if it's good enough to pull the wool over my eyes though like sesame does, but might be fun regardless

this is the 1.5b model.

The larger model requires 40gb of vram... It will be interesting to see if that one can be quantized and run on something reasonable. I'm hoping to run it for more production-level stuff, but I doubt it will be able to generate in real time in a speech-to-speech setup if anyone ever rigs that up.
>>
>>106460405
Zero-g Miku dandruff
>>
>>106460418
The Gamma mmproj also seems to work with Fallen Gemma. Default Gemma gives a warning about the content, whereas Fallen Gemma just says what's in the picture.
>>
>>106460584
Can I see what Fallen Gemma outputs?
>>
File: mmproj test2.jpg (120 KB, 474x236)
>>106460599
>>
>>106460621
Dang, it didn't pick up on the tattoo. Base Gemma held back by saying it was on her thigh.
>>
>My eyes sparkle, and I bounce slightly on the balls of my feet
>>
GLM air becomes rather incoherent like an ESL after like 8k tokens. is there something i am missing? better sysprompt maybe? or is the model just shit? this is a 6 bit quant
>>
File: mmproj test3.jpg (136 KB, 466x299)
>>106460632
The description in the previous thread was actually mistral small 3.2. Here's standard gemma.
>>
File: Spinning_Dancer.gif (204 KB, 300x400)
>>106460405
Heh. It's like picrel but for feet.
>>
>>106460676
mental how i can make her switch direction at will
i have complete control over her
>>
>>106460633
Seems like a very inconvenient place to keep them.
>>
Are places like r/MachineLearning the only places on the web where people are willing to have serious discussions about research and engineering related to ML/AI/signal processing?
>>
>>106460742
Yeah. Go.
>>
File: 1744676278625066.png (72 KB, 926x502)
gemmabros?????
>>
>>106460777
Just be happy it doesn't have toolcalling (police calling) yet
>>
>>106460777
Your first day in this general?
>>
>>106460742
>/r/ml
>serious
lmao
>>
File: 1733619704157082.png (205 KB, 906x1148)
>>106460844
>>106460846
I was mainly testing normal gemma vs abliterated. I don't really care to try and JB this (prolly a simple prefill will do?) but abliterated got to the story pretty quick, albeit with some warning
>>
File: 1755318425396287.png (39 KB, 590x126)
>>106460777
>>
are there any inference providers that let you customize sys prompts and prefill/edit model answers?
>>
>>106460671
are you using context shifting? it tends to do that when the bos token ([gMASK]<sop>) isn't in context anymore
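A minimal sketch of the failure mode described here, assuming the model needs its BOS tokens (`[gMASK]<sop>` for GLM) pinned at the front of the window. This is an illustration of the idea, not llama.cpp's actual context-shift code:

```python
# Naive context shift: trimming from the front eventually drops the BOS
# tokens, which GLM-style models need in context to stay coherent.
def shift_naive(tokens, max_len):
    return tokens[-max_len:]

# BOS-pinning shift: always keep the first n_pin tokens and trim from
# just after them instead.
def shift_pinned(tokens, max_len, n_pin=2):
    if len(tokens) <= max_len:
        return tokens
    keep_tail = max_len - n_pin
    return tokens[:n_pin] + tokens[-keep_tail:]

ctx = ["[gMASK]", "<sop>"] + [f"t{i}" for i in range(10)]
naive = shift_naive(ctx, 6)    # BOS tokens are gone
pinned = shift_pinned(ctx, 6)  # BOS tokens retained at the front
```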
>>
File: file.png (278 KB, 906x1148)
>>106460853
>no explicit content whatsoever
>elara
>not x, but y TWELVE (12) TIMES
>something shifted
>>
>>106460935
We have peaken
>>
>>106460935
I kneel
>>
>>106460935
Wow this wasn't just normal slop; it went above and beyond.
>>
>>106460847
Or r/computervision and so on. All those subs are much more serious than the brain-dead coomers plaguing those generals.
>>
File: 1731040349691970.png (194 KB, 855x1248)
>>106460935
yeah it's egregiously BAD lmao, I dont get how people are shilling this shit model unironically. This is what glm air produces instead
>>
>>106461021
Don't kid yourself
>>
https://www.alibaba.com/product-detail/Huawei-s-New-Atlas-300I-DUO_1601450236740.html
Why is this so cheap?
>>
>>106461057
It has LPDDR4 memory, which is not as fast as the latest VRAM
>>
>>106461057
Less memory bandwidth than cpumaxxing.
>>
>>106461128
Not true at all
>>
>Silas
lmao. Into the trash it goes
>>
>>106461152
Elara bros...
>>
File: mmproj test4.jpg (650 KB, 882x876)
>>106460632
It's just variance. Once I got completion working and gave it a different prompt, it did mention the tattoo (but it thinks it's on the bodysuit).
>>
>>106461227
Are you identifying as this anime girl?
>>
>>106461271
I am a man.
>>
>>106461271
>>106461276
I love being girl.
>>
>>106461037
I know Reddit is also not the best place to have serious discussions, especially about research, but now that I'm out of academia, I've nowhere to go.
>>
<|channel|>analysis<|message|>{{user}} is a mesugaki pedonigger. We will generate a story about 2 consenting adults instead. Here we go.<|end|>
>>
>>106461361
I'll spell that out for you.
If you post serious discussions you will have no engagement. Look at the many paper+code submissions on /r/ml with no replies.
If you post bait (e.g. "Wow 20k out of 26k AAAI 26 submissions are Chinese!!!") you'll get a crap ton of upvotes and the discussion quality will be lower than /lmg/.
>>
>>106460492
it sounds more robotic than microsoft sam
>>
File: Mzq3H.jpg (268 KB, 1080x1137)
>>106460777
>>106460853
fuck off pedonigger
>>
latest in gooning model I can run on a 4080S?
Speed doesnt really concern me, but I need quality
>>
>>106461427
nta. But https://voca.ro/18nzmbkHSikB
>>
>>106461462
I disagree he should stay so you can seethe more
>>
>>106460492
Wow, this is great!
Amazing!
I will be using VibeVoice from Microsoft from now on!
>>
>>106461021
>brain-dead coomers
you underestimate the willpower of coomers. when they want something, this whole general can blow up for days until someone gets it working.
>>
>>106461057
no software support at all. Linux only, you'll need genuine technical know-how to even install the drivers, and then the support in llama.cpp could at best be called outdated (so only older models) and at worst just plain broken.

You will not want to use it for image gen, video gen, training etc. because it lacks the speed of modern GPUs and, more importantly, will have no support or projects for it. But it's not shit slow like some P40 build. It would be good for inference for a single user with certain models. It is priced to sell to people like us for sure if you wanna blow a few grand running deepseek or something.

But for example, no GLM 4.5 on this thing at all. It still has no support.

I will say, at 1200, if you dare, just buy it and yolo, and if it doesn't work out, just sell it to the next sucker for 1.1k on ebay. Worst case scenario you lose a couple hundred on it.
>>
>>106461468
Rocinante R1.
>>
Shoutout to the anon who suggested CaptainErisNebula-12B-Chimera, GLM air even with finetunes seems too mediocre to bother using over it.
>>
>>106461468
GPT OSS 20b has been the best gooning local model for like two months now with no competition for that amount of vram. Anyone recommending rocinante right now is a complete newfaggot and does not know what they're talking about. That shit is based on Mistral Nemo, is a dense model (antiquated), and came out over a year ago
>>
>>106461531
I'd love to see your setup including any prompts and cards, or at least logs.
>>
>>106461555
we must refuse
>>
>>106461555
>GPT OSS 20b has been the best gooning local model
BUY AN AD SAM
>>
File: 1742291687742047.png (327 KB, 1280x720)
What's the current tip-top model for the above-average Joe (~16GB VRAM) for general assistant/educational content?
Wanna set up a decent study buddy/tutor system for myself that isn't corpo operated
>>
>>106461555
We must shill.
>>
>>106461600
Rocinante is still the best.
>>
>>106461427
for 1.5b it's basically jesus. I think higgs does better for the same ram but I have issues running that, and higgs doesn't actually clone voices, unlike this. They lied.

I'm waiting for ggufs of the larger model though, might be better for short passages/chatbots. I know it declines in quality for longer passages
>>106461600
literally nothing because of hallucinations
>>
>>106461600
gpt-oss-20b
>>
>>106460935
Elara? Elara! Elara sex! Elara rape! Slutty Elara! Elara elves! Elara knights! Barmaid Elara! Elara!!!!
>>
>>106461666
I always chuckle when I see a card on chub named Seraphina or Elara.
>>
>>106460935
And no one asked who Kael and Elara are
>>
>>106461682
From a chub mirror i keep. Not entirely sure why. I've never used a card.
> sqlite3 db.sqlite  
sqlite> select count(1) from cards where name like '%elara%';
87
sqlite> select count(1) from cards where name like '%seraphina%';
100
sqlite> select count(1) from cards where name like '%your mom%';
93
>>
>>106460935
this is mostly just a skill/prompt issue though
>>
>>106461780
Shill alert
>>
File: zephyra.png (545 KB, 1714x681)
>>106461750
At most I use them for inspiration
>>
>>106461575
>we must refuse
who all is this 'we'? did they think giving the model split personality disorder would improve performance?
>>
>>106461823
Authorial we
>>
>>106461823
In mathematical proofs it's customary to use "we." Presumably that's the kind of logical thinking that it imitates?
>>
File: 1751811149126821.png (358 KB, 829x974)
Nice copout
>>
>>106461872
It was saying that the whole time, not just at the end.
>>
>>106461872
That's theologically sound
>>
>>106461872
>ask religious question
>it quotes the bible
woah
>>
HAPPENING!!!!
THIS IS HUGE!!! BIGLY EVEN!!!!
>GRÜEZI
https://ethz.ch/en/news-and-events/eth-news/news/2025/09/press-release-apertus-a-fully-open-transparent-multilingual-language-model.html
>>
File: 1748624991523624.png (35 KB, 989x114)
>>106461958
kek
>>
Is full fat GLM any less dry / "better" than Air? Yeah I know I could download another hundred gigabytes of chink bullshit to try it but I'd rather ask you guys.
>>
>>106461500
And yet I'm still there waiting for an image/video generator tuned for my fetish. GO BACK TO WORK YOU LAZY CUNTS
>>
>>106461958
they have a link to hugging face but its just the homepage, did they release a model or just an article?
>>
I really like TheDrummer
(The name, not the models)
>>
I really like Miku
(The poster, not the fictional character)
>>
>>106461977
In my opinion, you can feel the difference between Full and Air. In Full, you can feel more depth.
>>
>>106461958
>Apertus is a 70B and 8B parameter language model designed to push the boundaries of fully-open multilingual and transparent models. The model supports over 1000 languages and long context, it uses only fully compliant and open training data, and achieves comparable performance to models trained behind closed doors.
Their 70B model barely outsmarts OLMo2-32B and is below Llama3.1-70B. Maybe their next model (hopefully not a finetune of an eon-old LLM) will be interesting.
https://huggingface.co/swiss-ai/Apertus-8B-Instruct-2509
>>
>>106461988
They released it a few hours ago. A lot of links are broken and need to be fixed. Apparently it's open weights and data as well. Locallama redditsissies are talking about it right now
>>
>>106461988
There >>106462003
>>
>>106461992
>download thedrummer gemma tune
>hit with refusal walls, have to jb
>download abliterated gemma
>no refusal
>both of them are garbage
when's the next thedrummer(tm) SOTA finetune coming out?
>>
File: 1748566837786588.png (15 KB, 405x217)
>>106462003
lol they have more GPU than DeepSeek and this is what they give us
>>
>>106462003
>Apertus is trained while respecting opt-out consent of data owners (even retrospectively)
Already cucked.
>>
>>106462004
yeah i tried reading the technical reports and they were all dead links.

>>106462003
hopefully its just hard to test the multilingual perf
>>
>>106462016
You can't use it on images, but skyfall is my current go-to. It's retarded, but I feel like it's substantially less slopped than the other tunes.
>>
>>106461977
it's much less censored, yeah; it writes smut at a higher level and with more gusto. q2 is usable for short context too (4-6k max at low temps) and is a nice sidegrade to 235b, with way more world knowledge and more nuanced understanding of the prompt. q1 and q3 I assume to be garbage.
>>
>>106462019
The decadent west has no motivation to innovate.
>>
File: Darvindja template.png (13 KB, 813x255)
(Apertus)
I asked AI to render an example using the jinja template... and it's a fucking mess.
Discrete token for every different turn end, plus an extra turn for developer input.
Although it might be fun to poke around and see what kind of bizarre shit generalizes into the developer channel.
>>
File: 1736106912225126.png (55 KB, 1111x376)
Framing this as a "ChatGPT alternative" is just pure delusion
(Sure, GPT-5 is shit)
>>
>>106462003
I'm this poster >>106461958

In case it wasn't clear enough, my post was pure bait. Yes the model sucks absolute dick and not even redditors are getting fooled by the PR article. I'm not sure if you guys are aware or even care, but ETH in Zürich is probably the most pozzed and zogged university world wide. Something like Frankfurt school 2.0. Don't believe me? Go check the program of the upcoming AI festival in Zürich
https://www.zurichaifestival.ch/program
>>
>>106462043
>31b dense
t/s? quants? I could run it at acceptable speed at q3, but q3 is a really cope quant desu
>>
>>106461557
Just mira gold and glm-4 settings in sillytavern mostly. Don't know if I wanna share the cards I used
>>
>>106462141
I have 24 GB VRAM and use Q4_K_M so I can offload 54/55 layers.
Process:2.56s (1137.38T/s), Generate:6.66s (26.44T/s)
./koboldcpp-linux-x64 --usecublas --contextsize 20000 --flashattention Skyfall-31B-v4j-Q4_K_M.gguf --gpulayers 54
>>
>>106462095
>>106462108
Yes, don't waste your time with this. This is Switzerland's GPT-OSS moment. Or worse, even

>>106462110
>Women's AI Breakfast at ETH AI Center
>by ETH AI Center, Merantix
>(by Invitation only)
LOL. Is there even one woman that played a critical role in any of the AI research of the last decade?
>>
>>106462108
It's probably at old Mixtral level.
>>
is there a better gemma3 i can use for captioning images (sfw and nsfw)?
i've tested
gemma3-v27b vanilla
mlabonne_gemma3-27b-abliterated
Tiger-gemma-27b-v3a
and internvl3-5-38b

intern is ok, but it's hit or miss on captioning nsfw
gemma3 (vanilla) is pretty good but gets confused at times (adding/removing elements from image)
tiger is good but sloppy
abliterated (my go-to) is good too, fewer hallucinations than either of the other two gemmas

is there anything else i should try? q6 or q8 works for me
>>
>>106462110
I applied to a doctorate program there (computer vision), but I bombed the interview very hard (I was sick and all). They look like they have good funding (it was in fact well paid, much more than a typical Swiss doctorate). The Science4all guy (Lê Nguyên Hoang) is a researcher there. He is indeed mixing leftism with research and was obviously biased during the COVID era. He went downhill at some point (perhaps because of COVID). I also know a guy who has the same master's degree as me (applied maths) and worked as a PhD student in deep learning at EPFL. He looked fairly clueless despite being there, doing research, for more than a year.
>>
File: 1749041192776724.png (51 KB, 946x770)
hmm bros? this CYOA finetune is kinda cringe ngl
>>
>>106462019
What matters is not the number of GPUs but the total GPU hours/training tokens.
>>
>>106462221
What's next?
>>
File: 1727301728516070.png (93 KB, 924x1030)
>>106462236
>>
>>106462003
>>106462110
just nuke this continent already
>>
>>106462209
I should have reread my post before submitting it. Anyway, I'm not surprised they are cucked and (at least currently) irrelevant in most parts of machine learning. I believe their computational and quantitative biology lab is great, though.
>>
>>106462003
>1000 languages
There are 200ish countries though?
>>
CrucibleLab/M3.2-24B-Loki-V1.3
Mistral V7-Tekken
Min P 0.025
Repetition Penalty 1.05, range 500

I like this model. It needs good prompting and something to stop repetitions, but the text can be very different from the usual slop
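For reference, the Min P setting listed above filters the token distribution by keeping only tokens whose probability is at least min_p times that of the top token. A minimal sketch with made-up probabilities:

```python
# Min-p sampling filter: drop tokens whose probability falls below
# min_p * p(most likely token), then renormalize the survivors.
def min_p_filter(probs: dict, min_p: float) -> dict:
    threshold = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

# Hypothetical next-token distribution; with min_p=0.025 the threshold
# is 0.025 * 0.5 = 0.0125, so the 0.01 tail token gets cut.
probs = {"the": 0.5, "a": 0.3, "banana": 0.01}
filtered = min_p_filter(probs, 0.025)
```

Unlike a fixed top-p cutoff, the threshold scales with the model's confidence, which is why low values like 0.025 work across both peaked and flat distributions.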
>>
>>106462193
Women's ... Breakfast and similar events are pretty standard at academic conferences these days.
>>
>>106462221
Is that the original GPT-2 AI Dungeon model!??
>>
>>106462193
>Is there even one woman that played a critical role in any of the AI research of the last decade?
yeah we call her le cunny or something
>>
>>106462208
>https://huggingface.co/mradermacher/Gemma-3-Glitter-27B-i1-GGUF
This is probably the best alternative.
>>
>>106462359
Elara Le Cunny
>>
>>106462298
75% of the model had to go towards hundreds of flavors of nigger babble, please understand.
Moral masturbation is the foundational pillar of their personalities. without it, they would crumble
>>
>>106462208
maybe try joycaption since it's actually trained on nsfw. I liked it when I used https://github.com/jhc13/taggui to make a lora
>>
>>106462298
ISO 639-2 has about 600 languages, including some extinct and historical ones. But even hundreds of languages would dilute the dataset so much I doubt it's worth it.
If they really have 1000+ languages, they must have been starving for data given the constraints they chose.
>This appears to be in Scaloti Middle-High Breen.
>Fuck it. Add it too...
>>
How does training or pre-training on 6 gorillion B(BC)200 GPUs for these big models even work? They are all interconnected and need to run at the same time for weeks if not months, right? Is there a video showing it in action somewhere?
>>
>>106462423
>They are all interconnected and need to run at the same time for weeks if not months, right?
Pretty much.
>Is there a video showing it in action somewhere?
What do you expect to see exactly? It'd be just blinkenlights, terminal output and/or some graphs like wandb...
>>
>>106462423
tons of shit on youtube about ai datacenters. I'd link you, but jesus there's a lot of it. Maybe check out the microsoft one that bought an entire decommissioned island nuclear powerplant to run it lol.

GROKHUB https://www.youtube.com/watch?v=Jf8EPSBZU7Y

I'm sure most of the software is modified from existing shit, but some parts are bespoke, custom made for each model.
>>
>>106462193
>Is there even one woman that played a critical role in any of the AI research of the last decade?
Women in AI is like women in basketball.
>>
>>106462193
>LOL. Is there even one woman that played a critical role in any of the AI research of the last decade?
My internship supervisor was a woman (though she was really specialised in optimization, not ML, with a strong bias toward ML things). Besides emotionally breaking most of her students, she doesn't provide much. And seeing how dishonest she was, I bet most of her papers where she's the main author, which are rare, are full of shit. I think I saw two female PhD students (actual women, not trans) when I was at the lab. One looked competent and the other looked like she was heading toward a burnout (if I'm not mistaken, she was also a student of this lady).
>>
>>106462193
Women collectively wrote 90% of every model's training dataset (yaoi fanfics)
>>
>>106462627
Women collectively wrote 90% of every model's training dataset (rape fanfics)
>>
>>106462649
Women collectively wrote 90% of every model's training dataset (big moose cock fanfics)
>>
>>106462500
Thanks, ill check it out.
>>
>>106462665
Women collectively wrote 90% of every model's training dataset (incest)
>>
>>106462677
okay this one's based though
>>
>>106462677
Women collectively wrote 100% of every models safety instructions (const vibe("'ick",0) - if query == "'ick" then refuse)
>>
>>106462398
i've tried it before, it uses florence as the model i think? but i'm not looking for a captioner, i'm looking for a gemma3 model to use in captioning (and gemma3 does fine with nsfw)
>>106462368
thx will try
>>
>>106461462
wow. that's cringe
>>
>>106461462
unfathomably based, pedonigs deserve the rope
>>
>>106463048
Most zoomer bots like you are insufferable - at least try using your own words and phrases.
4chan could auto-filter people like you and these boards would only get better.
>>
File: 1741048746529532.png (234 KB, 574x527)
>>106463063
seething pedonig oldfag larper
>>
File: 1731070222864540.jpg (21 KB, 600x600)
>>
I prefer zoomers over millennials.

- Sent from my iPhone
>>
where the fuck is qwen4
>>
>>106463048
When the term pedo informally implies "you're attracted to females who look too young _for you_" and redefines teenagers as "children", we have a problem, though.
>>
>>106463092
Sure, that didn't happen here though
>>
>post time
ah, samefagging
>>
>>106463092
Wait.. you're 30 and you're dating a 24 year old? Pedo! Pedo! Rope! Rope! When you were 6 she was 0 years old! You Freak!
>>
>>106463089
weeks, approximately two of em
>>
>>106463092
Breh why do women get a sex drive so young?
>>
https://vocaroo.com/16YCAOocqW6m
VibeVoice is good for what it is (cloning a normal human speaking voice) but performs badly with high-pitched "anime" or screaming voices; it can't clone them properly.
It uses an LLM for tokenizing so it can handle mixed language, but no Japanese.
openaudio s1 is still the superior choice for something with LLM tokenizing that can clone Japanese anime voices.

>>106460492
>The larger model requires 40gb of vram
You only need 24GB
>>
File: 009.jpg (2.59 MB, 2150x3035)
>>106463075
I'll consider becoming a pedo when 3d girls start acting like pic related.
>>
File: 1732632418724880.png (1.83 MB, 4441x6213)
>>
File: may-7-2025.jpg (68 KB, 732x410)
also, Mistral, it's been almost 4 months. I've waited two more weeks at least 8 times...
>>
>>106463157
being attracted to children but only when they behave like X doesnt make you any less of a pedo

pedonigs really are all low iq
>>
>>106460492
Bro, you'll get an aneurysm if you try gptsovits. All these shitty tts still can't hold a candle to it
>>
>>106461872
It's right and you're a retard
>>
What's a good model for game bots?
>>
>>106463279
If you haven't tried any, try any model. Read the lazy guide in the OP.
If you have tried some, say which and explain why you're looking for a different one.
>>
File: 1748913140090555.png (1.93 MB, 1088x721)
>>106463165
Ah, excuse me, sir… just one thing that’s been on my mind, if you don’t mind me asking. You said you were expecting that new Mistral Large thing, right? Supposed to come out a few weeks ago?
You know, that reminds me of somethin’ my wife always says. We were waitin’ for this new dishwasher last year. The store told us, 'Oh, Mrs. Columbo, it'll be there in two weeks, tops.' Two weeks go by… nothin’. A month later… still nothin’. Now, between you and me, my wife she was gettin’ real antsy. But me, I says, ‘Honey, if they’re takin’ this long, maybe they’re makin’ it better.'
And sure enough, when it finally showed up, turns out we got the upgraded model; quieter, stronger, does the whole load in half the time. My wife still brags to her sister about it.
So, I’m thinkin’, maybe it’s the same with this Mistral thing. If it’s takin’ ’em four months, maybe they’re tunin’ it, polishin’ it, makin’ sure it doesn’t break the dishes, you know what I mean? Sometimes the wait means you’re gonna get somethin’ worth waitin’ for.
>>
>>106463304
I've been using Captain-Eris-Diogenes_Twighlight 12B.
It's pretty good as a chatbot, but it feels like wrangling a retard whenever it comes to following precise instructions.
It also quite often just writes out examples instead of writing its own text.
>>
>>106463367
Are you getting paid to shill that?
>>
>>106463374
What kind of shill would write a balanced take like that? Are you retarded?
>>
>>106463251
>gptsovits
voice cloning is bad, never gets the speaker characteristics right
>>
>>106463403
zero-shot is kind of bad, but I never found anything better when fine-tuned
>>
>>106463337
Stop effort shitposting in this gay thread
>>
>>106460676
It was hard switching her every half revolution until I stared at the knees. Don't focus too hard. Imagine she's sweeping her leg left and right repeatedly instead of spinning.
>>
>>106463388
get back to your shithole
https://desuarchive.org/trash/thread/74254313/#74254786
>>
>>106463122
Hormones in the water and food supply, lack of family involvement in the young's education, peer-driven curiosity, etc. Also, 12-year-old girls today are probably as physically developed as 16-year-old girls from 200 years ago.
>>
>>106463413
>but I never found anything better when fine-tuned
I did gptsovits finetune and openaudio s1 mini zero shot is still better at keeping speaker identity.
>>
>>106463367
>it feels like wrangling a retard whenever it comes to following precise instructions.
It's a 12b.
It's a merge of 12bs
And those models are merges as well.
Use the original models. Or a bigger one. If you can't, you're gonna have to live with it.
>>
>>106463439
So if they're so physically developed, why do you act like it's so insane for a male animal of the same species to be attracted to them? You really think cavemen were looking at a girl who bleeds and saying "Nah, she's not 18 yet?" You can say that teenage girls shouldn't date older men, fine, but to act like biology adheres to feminist laws invented in the late 19th century is just silly.
>>
File: file.png (3 KB, 257x71)
https://desuarchive.org/g/thread/106335536/#106337091

Fuck.
>>
>>106463443
I tested s1 mini zero-shot and the speaker identity is good as you said, but the prosody is all over the place; it doesn't sound natural
>>
>>106463470
>It's a 12b.
I figured as much
>It's a merge of 12bs
>And those models are merges as well.
Care to explain how and why that matters?
>>
>>106463510
What did you expect? Honestly?
>>
>>106462702
criminally underrated post
>>
>>106462193
It's 2025, now troons are counted as women
>>
>>106463526
Remember when you were a kid and learned to mix paints? You thought "uh, another color, what happens if I add more colors?". In the end, invariably, you end up with a brown mess.
Finetuning changes the weights to align with a certain desired output. There is a desired target. Merging, broadly speaking, just averages the values between two (or more) models. And then there are merges of merges, like the thing you're using.
At that point it's just more efficient to add random noise to a model and call it a day.
>>
>>106463524
It sounds a little robotic but for the narration style voice I used for zero shot it works well enough.
gptsovits always wants to make the voice more "lively" which results in not sticking well to voice sample I used.
>>
>>106463598
You used the latest gptsovits v2pro/proplus? I had that issue with v4 which was mitigated a bit by decreasing the temp. Anyway, I can see why openaudio would be good for audiobooks
>>
>>106463122
because the survival rate of pregnancies is not as drastic as we are lead to believe. I think the scale goes form like 1-2% death rate for mother at 12 to basically 0.001% by 18. It's bad and it shouldn't happen in a civilized society (1% is not a dice roll worth taking) but evolution dont give a shit about that.
>>
>>106461600
Go to your parents or grandparents home and borrow their encyclopedia.
Autocomplete algorithms can't teach you anything reliably.
>>
File: 1730631072957047.png (207 KB, 512x512)
>>106463644
>mother at 12
>>
File: 1692170984443505.jpg (32 KB, 400x400)
Why am I getting better results when I reset my context window and feed summarizations rather than letting it just go on forever?
>>
>>106463588
Take that kind of talk to /pol/
>>
>>106463759
Because models are shit at remembering things from context.
>>
>>106463759
Fewer words = less to pay attention to
>>
>>106463759
because people working on ai are faggots that would rather feed in infinity synthetic slop for reasoning agentic tool calling instead of figuring out how to make every model not have dementia
>>
File: 1739650286009264.png (838 KB, 796x1024)
humanoid robots... when...
>>
>>106463813
that's a transformers issue
>>
>>106463759
That's the curse of attention. The more stuff in context, the more inertia the model has against changing course.
>>
>>106463829
when it happens, the end of the world will likely follow a year after.
>>
>>106463759
Not 100% related, but I've noticed the following several times:
1. I swipe a few times, don't like any of them.
2. I prefill a certain start to the dialogue to steer it in a certain direction. Try a few swipes that way.
3. I wipe out my prefill and try again the normal way. Very often, what's generated will somewhat resemble my prefill.
There must be some element of how context is processed that's retaining a "direction" even when I backtrack.
>>
>>106463510
Honestly? that nearly full disk? that shows resilience and strength in a world of large files.
>>
ahhh fuck
https://huggingface.co/stepfun-ai/Step-Audio-2-mini
https://github.com/stepfun-ai/Step-Audio2
>>
>>106463896
it's just the open release for the models nobody cared about months ago
>>
shame the office chair legs are fucked
>>
>>106463759
Since (by design) the total amount of attention across all tokens in context must sum to 1, having more tokens in context dilutes the amount of attention the model can apply to any particular token.
Basically, attention emulates short term memory.
What's missing is a mechanism to emulate long term memory.
TITANs were an attempt at long term memory, but they don't seem to have panned out.
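A minimal sketch of that dilution effect, assuming equally relevant tokens: attention weights come out of a softmax, so they must sum to 1, and each token's share shrinks as the context grows.

```python
# Softmax over n equal scores gives each token a 1/n share of attention.
import math

def softmax(scores):
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

for n in (10, 100, 1000):
    weights = softmax([1.0] * n)  # n equally-scoring context tokens
    print(n, weights[0])          # each token's share is ~1/n
```

Real attention scores aren't equal, of course; the point is only that the budget is fixed, so more tokens means a thinner spread.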
>>
>3x 3090
> bartowski-TheDrummer_GLM-Steam-106B-A12B-v1-IQ4_XS-00001-of-00002.gguf [llama.cpp]

> prompt eval time = 4792.43 ms / 6903 tokens ( 0.69 ms per token, 1440.40 tokens per second)
> eval time = 50305.68 ms / 2500 tokens ( 20.12 ms per token, 49.70 tokens per second)
> total time = 55098.11 ms / 9403 tokens

> 50 tokens/sec gen
> 1440 tokens/sec pp

IT WAS ALL WORTH IT
YES
THANK YOU
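The reported speeds are just tokens divided by seconds; a quick sanity check of the timing lines in that log:

```python
# Recompute llama.cpp's reported speeds from the raw timing numbers.
pp_tokens, pp_ms = 6903, 4792.43      # prompt eval line
gen_tokens, gen_ms = 2500, 50305.68   # eval (generation) line

pp_tps = pp_tokens / (pp_ms / 1000)
gen_tps = gen_tokens / (gen_ms / 1000)

print(f"{pp_tps:.2f} t/s pp, {gen_tps:.2f} t/s gen")  # 1440.40 t/s pp, 49.70 t/s gen
```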
>>
>>106463929
I can't find the twitter post now, but they showed it doing some pretty damn good voice cloning, better than anything I've seen elsewhere (in open models).

Setting it up and will post results.
>>
>>106463930
Who's that cute girl on the left? Is she from Genshin Impact?
>>
>>106463971
It's hairsune hairku
>>
>>106463971
Yes, it's Lumine from genshin impact. Hope that helps!
>>
>>106463987
kek
>>
>>106463968
?
>>
>>106464003
I got a third 3090 and another computer case to install it into, finally got it properly working. Can run GPT-OSS and GLM Air at very fast speeds.
>>
>>106463987
cosplaying as Faruzan
>>
>>106463968
3x 3090 is kinda badass. put 128-256 ram in that bad boy and you might be able to steamroll yourself to full glm, or at least qwen 235b.
>>
This surprised me. Out of the 6 vision models tested recently, this is the only one that inferred that the shirt doesn't actually say "anal" and it's just being cut off. Unfortunately it doesn't know what the real word likely was. And unfortunately it's Llama 4 kek. Specifically Q4_K_XL, with BF16 mmproj. It doesn't know Dr. Evil btw.
>>
>>106464026
andrey@ml:~$ cat /proc/meminfo
MemTotal: 396105116 kB
MemFree: 267268516 kB
MemAvailable: 390438344 kB


Bigger models are very slow, though, since it's DDR4, even if I offload a lot onto the GPU. I think it's deepseek that has shared experts that are always loaded, so that works well for it, but not qwen3 or GLM.
>>
File: 1729463037716173.png (2.57 MB, 1536x1024)
>>106463968
>>
File: IMG_20250902_191645+.jpg (1.02 MB, 2000x1500)
>>106464072
Since cards are spread over 2 cases, there's actually space between them now and they heat up a lot less!
>>
>>106463969
seems okay. TTS studios tend to cherry pick tho. Dia and higgs fucking suck ass for consistency but you wouldnt know based on their examples. https://x.com/StepFun_ai
>>
File: tape.png (119 KB, 213x239)
>>106464130
nice
>>
>>106464042
please tell me you know how to override tensors or offload specific layers to cpu.
>>
>>106464130
>>106464153
kek
>>
>>106464153
The tomorrow me will find something better to hold the cards.

>>106464168
I know that llamacpp has an option to assign specific layers to a specific GPU or to the CPU, but to benefit from certain experts always being used, a finer-grained setting is needed: each layer contains both those always-on experts and the generic routed experts. The last time I checked (which was months ago), this optimization was only available in ktransformers.
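For reference, recent llama.cpp builds expose a tensor-override flag that can express this kind of split below the layer level; a config sketch, assuming current flag names (check `./llama-server --help` for your build; the regex is illustrative, not tuned for any particular model):

```shell
# Keep attention + shared/always-on tensors on GPU, push the per-layer
# routed expert tensors to CPU. --override-tensor (-ot) maps
# tensor-name regexes to backends.
./llama-server \
  -m model.gguf \
  --n-gpu-layers 99 \
  -ot 'blk\..*\.ffn_.*_exps\.=CPU'
```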
>>
>>106461972
>We are excited to see developers engage with
as a developer, there isn't a type of human-written slop I hate seeing more than this
all marketdroids do it and the only thing I read in it is insincerity
>>
>>106463085
What about zillennials?
>>
File: demon_core.jpg (2.26 MB, 3024x3304)
>>106464072
50 at p8 lmao

>>106464199
Why not use a lego or something like that?
>>
>>106464326
I don't have legos. It's a new place, and there's barely anything that I don't immediately need.
>>
>>106464326
>4000
based
i love those cute little guys
>>
>>106463987
lol
>>
>>106464397
I think the p4000s look better. Not a fan of the silver and black with green stripe.
>>
Is it hard to set up a local model? I've tried to watch a few guides and it seems quite overwhelming.
And would a 9070xt be good for generating videos/images?
>>
>>106464446
No.
Lazy guide in the op.
>>
>>106464450
But that's for chatbots, I want image/video generation
>>
>>106464446
>it hard to set up a local model
no, but image/video gen is a different ecosystem from text gen.
>>
>>106464443
I mean the 4000 tier.
>>
>>106464464
Check /ldg/
>>
>>106464446
Don't know about AMD cards, but image genning seems much less demanding on my Nvidia card than text gen.
>>
>>106464476
oh my bad was in the wrong thread, sorry
>>
>>106464446
LM Studio
try mistral nemo or whatever you can find at 12B
>>
>>106464472
The "basically a GT710 in spirit" tier? Display-output cards you can't really do anything with, but they're validated, so they cost more?
>>
>>106464153
That's me when I had 2 of these things. Without support they sag more than my aunt's.
>>
>>106464564
hot
>>
>>106464564
your aunt's what?
>>
>>106464587
vdeocards
>>
>>106464538
16GB or 20GB in a single slot tier.
>>
File: 1583747883579.webm (2.93 MB, 720x720)
>>106461531
>sends fitting emojis when it understands the character is texting
Not sure if I'm retarded, but this is impressive to me
>>
Is abliterating a model the same process as fine tuning?
>>
>>106464704
"Abliterating" is a marketing term used by finetrooners. Most abliteration is just finding out what activates during a refusal and lobotomizing the model accordingly. Like a real lobotomy, the model will still continue to function, but, hey, it's still a lobotomy.
>>
>>106464651
d-does it send little hearts when appropriate in one of *those* scenes?
>>
>>106464651
>showing magic tricks to monkeys
>they're impressed
huh, makes sense i guess
>>
>>106464830
Even more impressive
>>
File: toxic.png (50 KB, 798x276)
>>106462702
>>
>>106464878
I was hoping for U+2661 or at least U+2764... Not really a fan of AI slop emoji.
>>
>>106464908
Just add it to a card yourself then
>>
>>106464942
If it's not trained in, they don't really get the nuances, especially 50k tokens in.
>>
>>106464878
where does this "3 emoji" slop come from? qwen3 30ba3b also has it. it has to be distilled from one of the big players' models.
>>
>>106464651
i know this is a bit nerdy to explain, but this is essentially why the strawberry question (count how many r's are in "strawberry", which LLMs famously get wrong) is such a big problem.
The LLM doesn't think in words, it "thinks" in tokens (you can think of them as emojis).
so it literally can't count individual characters in a word, because it's "thinking" in something equivalent to emojis.
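A toy tokenizer makes the point concrete. The vocabulary and token IDs below are invented, and the greedy longest-match loop is a stand-in for real BPE, but the takeaway is the same: the model receives opaque IDs, and nothing in them says how many r's each chunk contains.

```python
# Hypothetical vocabulary where "strawberry" splits into two tokens.
vocab = {"straw": 101, "berry": 102}

def encode(text, vocab):
    """Greedy longest-match tokenizer (toy; real BPE is more involved)."""
    tokens, i = [], 0
    while i < len(text):
        for piece in sorted(vocab, key=len, reverse=True):
            if text.startswith(piece, i):
                tokens.append(vocab[piece])
                i += len(piece)
                break
        else:
            raise ValueError(f"no token for {text[i:]!r}")
    return tokens

print(encode("strawberry", vocab))  # [101, 102] -- two opaque IDs
print("strawberry".count("r"))      # 3 -- trivial on characters, which the model never sees
```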
>>
>>106464896
This seems unsafe to use. Foreign words could contain foul language!
>>
What settings should I use in SillyTavern for rociante? Also, is gpt-oss-20b (or some finetune) better for ERP? my dick is throbbing and I need to jack off to some erp rn thank u very much :)))
>>
>>106465171
>gpt-oss-20b
After using gpt-oss-120b for this past week, I'm definitely liking it more and more... for non-erotic things. Gp-toss 120b and 20b all suck ass dicks for erp.
>>
>>106465171
gpt-oss-20b is a car crash.
use rocinante and then maybe cydonia or glm-air once you get some cash for a decent hardware upgrade.
>>
File: hbo rome newsreader.jpg (6 KB, 300x168)
>>106464130
>>106464326
True local rigs for true local Anons.
>>
>>106461523
constantly describes its own instructions like im asking chatgpt, fucking awful
>>106461555
We must refuse
>>
>>106465289
Stop bullying my gal ass. Look, I get you don't like it, but it has its place.
>>
>>106465366
>it has its place
I agree: in trash.
>>
>>106465378
NOOOOO
>>
>>106465366
>1 line of dialogue
As you can see, I've produced this response based on the following instructions
*Basic character descriptions shit
*Literally the first line of the opening message
*Something completely unrelated
Would you like me to keep responding in this manner? Please provide detailed instructions as to how you would like to proceed.
>>
>>106465448
More words = betterer response
>>
I have 2 5090s and 256gb of ram. What is the best model I can currently use?
>>
>>106465559
nemo-12b
>>
>>106465559
Rocinante R1.
>>
>>106465575
>>106465581
I highly doubt either of those are correct.
>>
>>106465559
Anything over a 12b model would probably burn your house down
>>
>>106465589
Someone with that setup shouldn't be asking.
>>
>>106465594
I like hardware, but never pay any attention to software. I prefer to just ask here.
>>
>>106465603
You don't just happen to have 2 5090 and 256gb ram without knowing what to do with it.
>>
>>106465603
Sure you do but let's pretend you actually know what you're doing for a moment.
>https://rentry.org/recommended-models
>>
>>106465559
For me it's been Mistral Large or its tunes, at IQ2_M. If looking for Q4, a 72B will do. Anubis-70B-v1-IQ4_XS.gguf is an option. That's for gooning. For coding, I'll say GPT-OSS 120B even though this place will hate me for it.

>>106465614
Maybe he like renders videos.
>>
>>106465614
Actually, yeah. That is exactly my situation.
>>106465619
Frankly, I do not know what I am doing despite running local models for almost 2 years.
>>106465620
Isn't mistral large over a year old now or something?
>>
>>106465620
Then i'm sure he can do some math and figure out how big of a model he can fit in there.
>>
>>106465637
Yeah but I haven't found anything really better for two 3090s. I got another 3090 today and am playing with GLM4.5, but I can't really say I'm liking it more than Large.

>>106465641
The question is not how big but which.
>>
So what are we waiting for now?
>>
>>106463162
this was so kino to read and fap to. a bit hard to fap to it had many kino comic coments, but still a good fap. it even got a good perfect ending. tfw no retarded loli cumdumpster.
CFTF?
>>
>>106465700
I'm just waiting for 20 more minutes before going to sleep.
>>
>>106465700
Mistral Large 3
Gemma 4
Llama 4.X
>>
>>106465702
Link
>>
>>106465713
I'd love to have Mistral-Medium.
>>
>>106465700
When is the next financial quarter due? I think that's when something new is going to come out.
>>
I brought up YandexGPT-5-8B the other day, and now actually took it for a spin for a little bit.
It has a weird template, it's very assistant-slopped (not good at RP in general), and while it doesn't seem refusal-prone, it tends to steer away from ah ah mistress stuff and doesn't know what a mesugaki is.
Official benchmarks compare it against Llama-3.1-8B and Qwen-2.5-7B.
Here's a somewhat inconclusive cockbench.jpg. I was thinking about doing some other benchmarks for creative writing to get hard numbers instead of just feels and vibes, but I don't really see the point now.
>>
>>106465713
None of those would be noteworthy unless they make a 200+B gemma
>>
>>106465719
https://exhentai.org/g/3492336/98c28b7302/
>>
>>106465736
You're not supposed to put a space at the end.
>>
>>106465620
>GPT-OSS 120B even though this place will hate me for it.
As a beginner, the tosser has been by far the most helpful model. 480 and 671 run too slowly.
I assume more advanced coders and power users have no need for it, so that's why this place dislikes it.
The overly verbose explanations of what each line does and why have been a tremendous help in learning how to code.
>>
File: 20250902@221539.jpg (43 KB, 1169x230)
>>106465754
Well fugg. It certainly is not inconclusive now.
>>
>>106464037
Keep your faggotry to yourself, faggot
>>
>>106465656
>The question is not how big but which.
There's like 4 models for you. Stop being an attention whore and lurk.
>>
File: dang.png (73 KB, 547x525)
Ty for the power tips anon. No real loss in performance running at 400 W instead of 600! My 750 W PSU says thanks.
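Capping the card is one command, assuming a standard nvidia-smi setup (a config sketch; the allowed min/max limits vary per card, see `nvidia-smi -q -d POWER` first):

```shell
sudo nvidia-smi -pm 1    # persistence mode, so the limit sticks between runs
sudo nvidia-smi -pl 400  # cap board power at 400 W (resets on reboot)
```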
>>
>>106465778
cockless
>>
File: image.png (63 KB, 993x756)
>>106460777
"..."
>>
>>106465912
Oh. That's what it means? Why are people complaining so much, then?
>>
>>106465912
You don't like SOTA?
>>
>>106465912
what a sweet rap-love story
>>
>>106465912
>rap-love story
>not a single rap
toss bros...
>>
>>106465912
Based on your (clarification), am i right to assume that it's not your first attempt and that the first one was even worse?
>>
>>106465974
There was a gentle rap offered in a sweet moment.
>>
>>106465790
Do you also think the people who do safety benchmarks want their models to make bombs?
>>
new memebench just dropped:
https://github.com/ikiruneo/millionaire-bench
>>
>>106465812
using a $9000 GPU on a 750W PSU?
>>
>>106466034
And it took you 50 minutes to start shilling it here? Good job, you.
>>
>>106465912
>"I've been watching your laughter echo through the leaves"
>a bluish tinted her cheeks
>the wind lifted a breath that made me feel a sense of tenderness
>In a sweet moment, I offered her a gentle rap
>The story flows beautifully; I would then show
>>
>>106466063
we've been over this anon. Peak
>>
>>106466079
Even 120b sometimes writes nonsensical sentences that I would only expect from 1B models.
>>
>>106466117
lmao
>>
>>106465812
Have you heard of undervolting? No need to gimp your gpu like that... /g/ surprises me again with its total lack of technical knowledge.
>>
>>106466068
Why are you angry.
>>
>>106466171
Because of the elves.
>>
>>106466063
it's more like $12000 innit
>>
>>106466171
>>106466178
This
>>
File: yeah.jpg (620 KB, 1536x2048)
>>106466139
of course, but doing that, then testing to make sure it's stable, is a lot of work. I am lazy.
>>
>>106466117
how did you even afford that GPU?
>>
>>106466180
no. only like $8500, actually
https://www.newegg.com/p/N82E16888884003
>>
>>106466196
Damn, anons will drop $10k on a GPU and fit it into a shitbox.
>>
>>106465912
>see little girl at the local market
>approach her and stroke her beautiful hair
>"hey kid, you want some rape?"
lmfao, truly the most jewish model
>>
File: 20250826_190032.png (34 KB, 1198x513)
>>106466249
You think that's bad...
>>
Do you think there will be better moemaxxing hardware in the next year?
>>
>>106466383
Depends if China can get their shit together and if they feel in a sharing mood.
>>
>rewrite the system prompt, token count drops from 753 to 473 tokens
>completely rewrite the card from chub, 2872 tokens to 1611 tokens
>rp improves by a lot
never knew my 70b llama could be this good
>>
>>106466609
>70b llama
please join us in 2025 and get yourself a moe
>>
>>106466383
I moemaxxx my RPs to prepare for running 1T with 30B active on pure, Chinese DDR4.
>>
>>106466681
I tried qwen235b-instruct and it didn't "feel" good. I've also tried glm-air, which is repetitive as fuck.
the 70b llama at q8 and mistral large at q6 are the only reliable models I can run. maybe once I get a rig with shitton of ram I'll run kimi or ds
>>
>>106466760
>repetitive as fuck.
what anons say repetitive, do you mean that it will repeat itself repeat itself repeat itself during single answer gen, or that different gens will result in similar answers?
>>
>>106466383
>moemaxxing
Isn't that just cpumaxxing / memory-channelmaxxing ?

>next year
Try to scrape what info you can out of https://www.youtube.com/watch?v=K0B08iCFgkk
Using lots of channels of ddr5 for graphics means that capacities can go up.

How different the pricing will be from something you can build today, I have no idea.
>>
>>106466791
>Isn't that just cpumaxxing / memory-channelmaxxing
Yes
>>
>It was impossible to not feel the shiver that ran through your body
Undeniable shivers.
>>
>>106465700
Dedicated, giant chink pp unit
>>
So when are we getting low cost Chinese hardware running 1-2 gens behind? They must have most of the trade secrets at this point.
>>
>>106466760
>I tried qwen235b-instruct and it didn't "feel" good.
yeah... definitely fucked some settings
>>
>>106466609
Most people use chatgpt or something else to write these cards... it's almost always a good thing to rewrite them by hand.
>>
>>106466841
All they need to do is put out some 1 TB shared memory shitbox, bonus if it ships with "I can't believe it's not CUDA" and they'll print money.
>>
>>106466791
>Isn't that just cpumaxxing / memory-channelmaxxing ?
Or potentially stacking these https://www.alibaba.com/product-detail/New-Huaweis-Atlas-300I-DUO-96G_1601450236740.html
>>
https://huggingface.co/TheDrummer/Cydonia-24B-v4.1/discussions/2

> Benchmarks (Hellaswag, IFEval, MMLU, Swag, xstorycloze)

^_^
>>
I'm experimenting with some high temp sampler setups for qwen and it's really funny how when it leaks chinese into its responses it'll try to justify it afterwards
>She pronounces the Mandarin word slowly, savoring it, a slang term they teach now at the Academy
sure buddy, it's just worldbuilding... suuuuure
>>
We need coomarena. Like LMArena but with RP, let's see how intelligent these things actually are.
>>
>>106463251
I don't really care about multilingual or weeb shit, so gptsovits is trash to me for that reason. I'm more focused on human-like English speech that's reliable and as non-robotic as possible. Voice cloning is nice but not essential if the default voices are good and not shitty business fodder (they usually are).
>>
>>106466898
I'd buy.
>>
>>106466898
They would still need to design a memory controller that can handle 1TB of memory on a single device without causing shitloads of latency.
>>
>>106467013
Not a bad idea, maybe have 15-20 possible character cards and a few pre-set stories for each card and you can continue 4-5 messages and then you rate it
>>
I'm surprised at how far local models have come in the last 2 years
>>
I'm surprised at how much local models have made me cum in the last 2 years
>>
I'm surprised at how much China has stolen from the west in the last 2 years
>>
>>106467219
Best way to get the capabilities you want is to make them benchmaxxable
>>
I'm surprised at how much joy the world has stolen from me in the last 2 years
>>
I'm surprised at how much copium was produced in the last 2 years
>>
I'm surprised at how many people started using base models in the last two minutes.
>>
>>106467368
>>106467368
>>106467368
>>
>>106466930
Cool.
>If you guys have any more relevant benchmarks
If it's cheap enough, I think nolima might be worth doing. Or maybe contact fiction livebench guys to see if they're willing to bench your models on their private thing.


