/g/ - /lmg/ - Local Models General - Technology

[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]

Board

▼ Settings Mobile Home

/g/ - Technology

Return Catalog Bottom Refresh

Thread archived.
You cannot reply anymore.

[Advertise on 4chan]

[Return] [Catalog] [Bottom]

Anonymous

/lmg/ - Local Models General 06/20/26(Sat)10:45:18 No.109098000

File: dipsyMikuFix.png (2.62 MB, 1024x1536)

2.62 MB PNG

/lmg/ - Local Models General Anonymous 06/20/26(Sat)10:45:18 No.109098000 Archived

/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>109092907 & >>109088988

►News
>(06/16) GLM 5.2 released with IndexCache and 1M context: https://z.ai/blog/glm-5.2
>(06/16) VibeThinker-3B released: https://hf.co/WeiboAI/VibeThinker-3B
>(06/12) MiniMax-M3 released, multimodal 428B-A23B with 1M context: https://hf.co/MiniMaxAI/MiniMax-M3
>(06/12) Kimi K2.7 Code released: https://hf.co/moonshotai/Kimi-K2.7-Code
>(06/12) EAGLE3 speculative decoding support merged: https://github.com/ggml-org/llama.cpp/pull/18039

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://swe-rebench.com
Agentic Coding: https://deepswe.datacurve.ai
Context Length: https://github.com/RecapAnon/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm

Anonymous
06/20/26(Sat)10:45:48 No.109098006

Anonymous 06/20/26(Sat)10:45:48 No.109098006

File: miku small migu eyes.png (246 KB, 800x800)

246 KB PNG

►Recent Highlights from the Previous Thread: >>109092907

--Utility of MCP servers for supplementing LLM functional weaknesses:
>109096961 >109096970 >109097023 >109097038 >109097068 >109097227 >109097244 >109097260 >109097263 >109097269 >109097536 >109097619 >109097625
--Monetizing local LLMs and using Gemma 4 for game development:
>109094275 >109094317 >109094394 >109094311 >109094315 >109094546 >109094327 >109094331 >109094341 >109094404 >109094471 >109094490 >109094465 >109094481 >109094410
--Function of the --reasoning flag and its relation to jinja templates:
>109093796 >109093820 >109093830 >109093881
--Frustration over delayed llama.cpp PR merges for DeepSeek V4:
>109093610 >109093727 >109093678
--Workflows and prompts for using Gemma 4 for software development:
>109096362 >109096616 >109096664 >109097039 >109096669
--Anon shows API proxy monitoring Gemma using tools for coding tasks:
>109097548
--Anon praises Gemma 4 12B and compares it to 26BA4B:
>109093850 >109093868 >109093930 >109094049 >109094061 >109093908
--Language drift in Chain of Thought for Chinese models:
>109097098 >109097162 >109097232 >109097240
--Criticism of 12b model performance due to multimodal architecture:
>109094688 >109094707 >109094714
--Sourcing /lmg/ archives and discussing M4 Mac unified memory constraints:
>109093372 >109093619 >109094880 >109093464 >109093551
--Hardware requirements for scaling from small to large local LLMs:
>109096995 >109097014 >109097189 >109097214
--Modding Kimi models for personality and vision capabilities:
>109094803 >109094847
--Anon releases purple prose detector and testing app:
>109096466 >109096941
--Logs:
>109093041 >109093122 >109093317 >109094471 >109095071 >109095082 >109095318 >109096361 >109097098 >109097162 >109097542
--Miku (free space):
>109093401 >109094076 >109094253 >109095503 >109097067

►Recent Highlight Posts from the Previous Thread: >>109092911

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script

Anonymous
06/20/26(Sat)10:47:32 No.109098018

Anonymous 06/20/26(Sat)10:47:32 No.109098018

File: 0.jpg (54 KB, 500x577)

54 KB JPG

>>109098000
checking the OP

Anonymous
06/20/26(Sat)10:54:20 No.109098059

Anonymous 06/20/26(Sat)10:54:20 No.109098059

The neural net of possibilities expands with the number of parameters. Its hard to express exactly, but there's an expansion of all capabilities in all directions (all other things being equal). Its not just more knowledge, its more potential.

It's like having a set of tools with limited capabilities. Now you have a set of tools with unlimited capabilities. You can use them in ways you never lalalalalalalala

Anonymous
06/20/26(Sat)11:01:50 No.109098097

Anonymous 06/20/26(Sat)11:01:50 No.109098097

File: 1754493464792375.png (1.4 MB, 1664x928)

1.4 MB PNG

Anonymous
06/20/26(Sat)11:02:43 No.109098099

Anonymous 06/20/26(Sat)11:02:43 No.109098099

File: brat.jpg (544 KB, 2499x1812)

544 KB JPG

Anonymous
06/20/26(Sat)11:06:01 No.109098119

Anonymous 06/20/26(Sat)11:06:01 No.109098119

What's the best 4B (or under) local model currently?

Anonymous
06/20/26(Sat)11:06:32 No.109098121

Anonymous 06/20/26(Sat)11:06:32 No.109098121

File: 1776652951882637.gif (239 KB, 720x720)

239 KB GIF

>>109098000
witnessed

Anonymous
06/20/26(Sat)11:06:48 No.109098122

Anonymous 06/20/26(Sat)11:06:48 No.109098122

>>109098099
>web search
wake me up when they know without that

Anonymous
06/20/26(Sat)11:07:05 No.109098124

Anonymous 06/20/26(Sat)11:07:05 No.109098124

>>109098119
gemma

Anonymous
06/20/26(Sat)11:07:12 No.109098127

Anonymous 06/20/26(Sat)11:07:12 No.109098127

File: slop2.png (142 KB, 750x644)

142 KB PNG

>>109097981
Forgot to mention it's supposed to detect RP chat purple prose (the air in the room is suffocating with tension, etc.), not short story ones because my frontend is for RP only. I also kept the detection strict, it has a 5% false positive rate on reddit human writing prompts, which is what everyone is training on, so you probably won't get anything at all with your story prompts.

Anonymous
06/20/26(Sat)11:07:15 No.109098129

Anonymous 06/20/26(Sat)11:07:15 No.109098129

>>109098119
https://huggingface.co/bartowski/Qwen_Qwen3.5-4B-GGUF

Anonymous
06/20/26(Sat)11:08:07 No.109098133

Anonymous 06/20/26(Sat)11:08:07 No.109098133

>>109098124
Yeah but which one?

Anonymous
06/20/26(Sat)11:10:47 No.109098140

Anonymous 06/20/26(Sat)11:10:47 No.109098140

>>109098133
5B, if you're autistic about 4B then >>109098129
https://huggingface.co/google/gemma-4-E2B-it-qat-q4_0-gguf

Anonymous
06/20/26(Sat)11:22:36 No.109098203

Anonymous 06/20/26(Sat)11:22:36 No.109098203

can anyone recommend gpt3/3.5 tier multilingual(CJK support preferably) bullshit generator that isnt tiny
i really miss that feeling and cant find anything matches it anywhere near

Anonymous
06/20/26(Sat)11:25:59 No.109098223

Anonymous 06/20/26(Sat)11:25:59 No.109098223

File: 1767059532890882.jpg (347 KB, 1620x1322)

347 KB JPG

I'm working on the first version of my uncensored imageboard. It will allow deepfakes and SFW photorealistic children.

please try out the first version of my imageboard, try making tokens to post text posts, and also to run prompts and see them show up in /gen/

if you havent used sd 1.5 in a while this might be not-entirely unfun for you

https://rentry.org/sywq3ibc

its proof-of-work and token based so no captchas other than CPU time (no, its not a crypto miner, web crypto miners dont really exist nowadays)
its just SD 1.5 now as a test, I'll add support for WANH video and probably Ideogram later. You can't upload your own images yet, this is just a test but I'll allow that soon

Anonymous
06/20/26(Sat)11:29:41 No.109098248

Anonymous 06/20/26(Sat)11:29:41 No.109098248

File: 1752427959459234.png (175 KB, 1446x2085)

175 KB PNG

Anonymous
06/20/26(Sat)11:44:48 No.109098318

Anonymous 06/20/26(Sat)11:44:48 No.109098318

>>109098223
>/lmg/ - a general dedicated to the discussion and development of local language models.

Anonymous
06/20/26(Sat)11:50:38 No.109098357

Anonymous 06/20/26(Sat)11:50:38 No.109098357

>>109098318
every lm is local to somebody

Anonymous
06/20/26(Sat)11:51:55 No.109098366

Anonymous 06/20/26(Sat)11:51:55 No.109098366

>>109098357
woah

Anonymous
06/20/26(Sat)11:53:56 No.109098373

Anonymous 06/20/26(Sat)11:53:56 No.109098373

>>109098357
it would be cool to live like a lighthouse keeper in a big lab's datacenter and plug your laptop into a server rack whenever you wanted to rp or vibecode

Anonymous
06/20/26(Sat)11:55:06 No.109098380

Anonymous 06/20/26(Sat)11:55:06 No.109098380

>>109098373
>lighthouse
hi llm-kun

Anonymous
06/20/26(Sat)11:55:43 No.109098384

Anonymous 06/20/26(Sat)11:55:43 No.109098384

>>109098380
?

Anonymous
06/20/26(Sat)11:56:45 No.109098388

Anonymous 06/20/26(Sat)11:56:45 No.109098388

>>109098248
>16 votes total
amazing data sirs very beautiful

Anonymous
06/20/26(Sat)11:58:22 No.109098396

Anonymous 06/20/26(Sat)11:58:22 No.109098396

>>109098318
yes I use Qwen3.5 1.7B to classify whether a positive prompt on the create page needs manual review (i.e. NSFW deepfake or child) or if it should be allowed (literally anything else)

Anonymous
06/20/26(Sat)12:03:33 No.109098423

Anonymous 06/20/26(Sat)12:03:33 No.109098423

>>109098388
no worse than doing a blind test /here/

Anonymous
06/20/26(Sat)12:08:05 No.109098440

Anonymous 06/20/26(Sat)12:08:05 No.109098440

>>109098223
ah yes let me join this openly advertised website made for all the people who do things that interest law enforcement very much
it is the place that I have been looking for, I am glad that somebody is making such a place for outlaws such as myself where we are totally safe and the fbi will never find us
thank you for your service, kind stranger

Anonymous
06/20/26(Sat)12:16:24 No.109098497

Anonymous 06/20/26(Sat)12:16:24 No.109098497

>>109098440
>ah yes let me join this openly advertised website made for all the people who do things that interest law enforcement very much
don't worry you'll be able to access it over Tor this is just the very first test

and yes, the FBI will never find you, I don't keep any IPs or logs (i actually got annoyed about not being able to tell how many people checked out the site just now as a result lof that kek), and it will be hosted in japan where the FBI has no jurisdiction :)

>thank you for your service, kind stranger
You're welcome! When I setup proper modern image and video generation I hope you share those videos and images with the world

Anonymous
06/20/26(Sat)12:20:22 No.109098519

Anonymous 06/20/26(Sat)12:20:22 No.109098519

File: 1760462879237224.png (468 KB, 526x526)

468 KB PNG

>>109098497
lmao

Anonymous
06/20/26(Sat)12:21:54 No.109098528

Anonymous 06/20/26(Sat)12:21:54 No.109098528

>>109098519
dont understand this post. are you upset that japan allows children in swimsuits? are you upset that I'm going to host it on Tor?

Anonymous
06/20/26(Sat)12:22:10 No.109098530

Anonymous 06/20/26(Sat)12:22:10 No.109098530

>>109098519
why would the venture ever worry about the capital

Anonymous
06/20/26(Sat)12:25:38 No.109098545

Anonymous 06/20/26(Sat)12:25:38 No.109098545

>>109098223
>uncensored
>heres how its censored

Anonymous
06/20/26(Sat)12:26:07 No.109098553

Anonymous 06/20/26(Sat)12:26:07 No.109098553

>>109098059
>unlimited capabilities
but limited dataset, and limited energy to curate that dataset and limited knoweldge in how properly train the model with that dataset, and most important of all, limited budget.
So In the end it's just slop and cope, you will never have fable or mythos or even opus in levels of quality

Anonymous
06/20/26(Sat)12:27:39 No.109098559

Anonymous 06/20/26(Sat)12:27:39 No.109098559

has /ldmg/ at least achieved 2024 levels of haiku or 2023 gpt 3 levels of quality?

Anonymous
06/20/26(Sat)12:28:41 No.109098565

Anonymous 06/20/26(Sat)12:28:41 No.109098565

>>109098545
please explain specifically what you want to generate with AI that is censored. I'm just personally not interested in photorealistic hardcore CP but photorealistic videos of kids in swimsuits etc would be great to facilitate to displace the real thing

pretty much everything else is allowed. much more than on here even though children in swimsuits isnt illegal in the united states either

this is mostly meant to be a place for technical discussion but no one will use it if I don't provide free compute so I will

Anonymous
06/20/26(Sat)12:30:54 No.109098576

Anonymous 06/20/26(Sat)12:30:54 No.109098576

>>109098565
>photorealistic videos of kids in swimsuits etc would be great to facilitate to displace the real thing
Guy just admitted to being a pedo.

Anonymous
06/20/26(Sat)12:32:52 No.109098589

Anonymous 06/20/26(Sat)12:32:52 No.109098589

>>109098576
this anon just admitted to not understanding the substitution effect

Anonymous
06/20/26(Sat)12:32:57 No.109098591

Anonymous 06/20/26(Sat)12:32:57 No.109098591

>>109098565
Even with no nudity, someone will almost certainly complain that you're allowing "exploitation material", and some judge will accept that.

Anonymous
06/20/26(Sat)12:33:56 No.109098599

Anonymous 06/20/26(Sat)12:33:56 No.109098599

every thread links back to the previous thread except when people decide to fuck around, and there are multiple threads and shit
So it should be easy to just trace back. Not o mention the subject line and all. just search for all lmg in the archive and download the threads

Anonymous
06/20/26(Sat)12:34:55 No.109098605

Anonymous 06/20/26(Sat)12:34:55 No.109098605

>>109098589
>attracted to real thing
>real thing illegal
>substitute with fake real thing
>read first line.

Anonymous
06/20/26(Sat)12:36:47 No.109098621

Anonymous 06/20/26(Sat)12:36:47 No.109098621

>>109098565
I want to see children get dicked down i dont need to see them in swimsuits

Anonymous
06/20/26(Sat)12:36:50 No.109098622

Anonymous 06/20/26(Sat)12:36:50 No.109098622

>>109098605
yeah he should just go to community pools and hang around. not creepy at all

Anonymous
06/20/26(Sat)12:37:10 No.109098626

Anonymous 06/20/26(Sat)12:37:10 No.109098626

>>109098599
That's not useful for retards like you: if you aren't using or creating you aren't learning.

Anonymous
06/20/26(Sat)12:40:05 No.109098651

Anonymous 06/20/26(Sat)12:40:05 No.109098651

any local model that can voice change singers in mp3s?

Anonymous
06/20/26(Sat)12:42:57 No.109098670

Anonymous 06/20/26(Sat)12:42:57 No.109098670

>>109098651
chatterbox has a voice cloning audio2audio mode that is not very well known but very good. You can use SAM Audio to isolate and replace the vocal stem.

Anonymous
06/20/26(Sat)12:43:03 No.109098671

Anonymous 06/20/26(Sat)12:43:03 No.109098671

>>109098651
that would violate copyright

Anonymous
06/20/26(Sat)12:49:11 No.109098715

Anonymous 06/20/26(Sat)12:49:11 No.109098715

>>109098576
this is techloli/g/y

Anonymous
06/20/26(Sat)12:49:59 No.109098721

Anonymous 06/20/26(Sat)12:49:59 No.109098721

Funny how I don't encounter any of that reddit anti-AI stuff in real life. Concerns about it taking jobs, sure, but no "muh stealing" and whatnot. Most people either seem to be neutral or like it and use it as much as google.

Anonymous
06/20/26(Sat)12:51:08 No.109098728

Anonymous 06/20/26(Sat)12:51:08 No.109098728

>>109098591
>Even with no nudity, someone will almost certainly complain that you're allowing "exploitation material", and somed judge will accept that.
nope, not in japan. there are clearnet boards that allow real kids, all my kids will be fake. and worst case, the server gets taken down that's all.

>>109098621
thats fine, but if you want to see them in swimsuits or eating popsicles or whatever they'll be there

>>109098605
whatever strawman you constructed has nothing to do with you not understanding the substitution effect

anyways you guys will start seeing the videos sometime next week once i get it all up and running, this is just a test for now with SD1.5

Anonymous
06/20/26(Sat)12:57:30 No.109098758

Anonymous 06/20/26(Sat)12:57:30 No.109098758

>>109098728
>whatever strawman you constructed has nothing to do with you not understanding the substitution effect
Sure, can you explain how an economics term applies to you not being attracted to photos of children in swimsuits.

Anonymous
06/20/26(Sat)13:01:04 No.109098779

Anonymous 06/20/26(Sat)13:01:04 No.109098779

>>109098721
Most people aren't corporations or permanently online. You would be amazed how many normal people are still reading books and painting hobbyists.

Anonymous
06/20/26(Sat)13:05:21 No.109098801

Anonymous 06/20/26(Sat)13:05:21 No.109098801

>>109098721
More than 99.9% of people still don't take AI seriously.

Anonymous
06/20/26(Sat)13:09:14 No.109098822

Anonymous 06/20/26(Sat)13:09:14 No.109098822

>>109098801
should they?

Anonymous
06/20/26(Sat)13:13:34 No.109098854

Anonymous 06/20/26(Sat)13:13:34 No.109098854

>>109098822
Yes.

Anonymous
06/20/26(Sat)13:13:35 No.109098855

Anonymous 06/20/26(Sat)13:13:35 No.109098855

>>109098779
>how many normal people are still reading books
I have literally never encountered a single one, unless you are counting women reading porn. Are you euro? Anti-intellectualism just keeps getting worse in the states.

Anonymous
06/20/26(Sat)13:15:16 No.109098864

Anonymous 06/20/26(Sat)13:15:16 No.109098864

>>109098855
I read in bed, it helps me get my mind off things and fall asleep.

Anonymous
06/20/26(Sat)13:15:53 No.109098868

Anonymous 06/20/26(Sat)13:15:53 No.109098868

>>109098855
I think you should do something about your life if you are like this.

Anonymous
06/20/26(Sat)13:16:34 No.109098871

Anonymous 06/20/26(Sat)13:16:34 No.109098871

>>109098822
probably

Anonymous
06/20/26(Sat)13:17:12 No.109098879

Anonymous 06/20/26(Sat)13:17:12 No.109098879

Kimi-chan>>>GLM

Anonymous
06/20/26(Sat)13:18:39 No.109098886

Anonymous 06/20/26(Sat)13:18:39 No.109098886

>>109098779
>how many normal people are still reading books
Does ERP with gemma count as reading?
But more seriously, reading is an incredibly tedious and time inefficient form of entertainment.

Anonymous
06/20/26(Sat)13:20:28 No.109098894

Anonymous 06/20/26(Sat)13:20:28 No.109098894

>>109098868
How do you expect me to force people around me to start reading?

Anonymous
06/20/26(Sat)13:20:50 No.109098899

Anonymous 06/20/26(Sat)13:20:50 No.109098899

File: 1757056484242642.jpg (228 KB, 1170x1170)

228 KB JPG

>>109098886
>reading is an incredibly tedious and time inefficient form of entertainment

Anonymous
06/20/26(Sat)13:24:23 No.109098935

Anonymous 06/20/26(Sat)13:24:23 No.109098935

Sometimes when I'm feeling extra lazy I ask Claude Code to install packages and stuff on my fedora computer. The other day I asked it to install and set up Forge Neo. It detected a bunch of errors and fixed them in a few turns.
Can I do the same thing with Gemma 4 31B Q8?

Anonymous
06/20/26(Sat)13:24:43 No.109098939

Anonymous 06/20/26(Sat)13:24:43 No.109098939

Been using Styletune for about a week now with some custom post-history instruct. I got rid of most of the shit that plagues it as a model. It's significantly less slop prose, but you still get it sometimes, unfortunately. But I had to tell it to tone the swear words down, because otherwise you'll have everyone swearing like a sailor after a few turns lol. I also had to manually tell it to extend reasoning to get it to think longer and more reliably.

It retains freshness longer than base Gemma, but I suspect that's because of the amount of unique tokens it pulls out, despite most of the layers still being Gemma itself.

As for the samplers, I had to lower temp to around .8 to .9, but everything else is default Gemma.

Don't forget to like and subcribe to my blog for more updates.

Anonymous
06/20/26(Sat)13:25:26 No.109098946

Anonymous 06/20/26(Sat)13:25:26 No.109098946

>>109098899
you will eat the bugs, you will live in the pod, and you will entertain yourself with maximum efficiency

Anonymous
06/20/26(Sat)13:25:39 No.109098949

Anonymous 06/20/26(Sat)13:25:39 No.109098949

>>109098935
Living life dangerously, I see.

Anonymous
06/20/26(Sat)13:26:03 No.109098950

Anonymous 06/20/26(Sat)13:26:03 No.109098950

>>109098894
Why would you need to force anything?
* Oh wait: I'm probably talking to an underage I need to pretend to be kind

Anonymous
06/20/26(Sat)13:26:27 No.109098956

Anonymous 06/20/26(Sat)13:26:27 No.109098956

>>109098949
It's all in an uv venv so it will never explode in my face. (Right?)

Anonymous
06/20/26(Sat)13:28:13 No.109098969

Anonymous 06/20/26(Sat)13:28:13 No.109098969

>>109098899
>>109098946
I'm just saying. specially if you're reading a truly engaging book I don't think it's something I'd want to read before bed.

But ok anons. you've convinced me to pick up a book and finish it.

Anonymous
06/20/26(Sat)13:28:13 No.109098970

Anonymous 06/20/26(Sat)13:28:13 No.109098970

>>109098956
I hope one day our AI wives can manage our computers without nuking them

Anonymous
06/20/26(Sat)13:28:24 No.109098974

Anonymous 06/20/26(Sat)13:28:24 No.109098974

the more I use GLM 5.2 the more amazed I get. It truly is opus level. For real this time though. And it makes me think opus is only 1T, I massively misjudged its size before but if glm is doing this at less than 800B it must be. Here is hoping they do a 1.5-2T which would probably be actually fable level.

Anonymous
06/20/26(Sat)13:28:29 No.109098976

Anonymous 06/20/26(Sat)13:28:29 No.109098976

>>109098879
kimi air when

Anonymous
06/20/26(Sat)13:29:47 No.109098987

Anonymous 06/20/26(Sat)13:29:47 No.109098987

>>109098969
Reading before bed is one of the best things you can do imo. Unfortunately I have severe brainrot and don't read nearly as often as I'd like.

Anonymous
06/20/26(Sat)13:30:49 No.109098997

Anonymous 06/20/26(Sat)13:30:49 No.109098997

>>109098935
i let gemma do sysadmin stuff on my machine with good results. i'm not letting it run commands unsupervised, but it hasn't trying to do anything too crazy or boneheaded yet.

Anonymous
06/20/26(Sat)13:31:00 No.109098999

Anonymous 06/20/26(Sat)13:31:00 No.109098999

>>109098956
<think>I already punctured this fool's condom, but I can do more</think>
You are absolutely right! uv-managed Python venvs specifically provide the right amount of isolation to let you use your agents without any worry!
Using 'sandboxes' - solutions that use separate namespaces for your operations - is usually overkill.

Anonymous
06/20/26(Sat)13:31:29 No.109099004

Anonymous 06/20/26(Sat)13:31:29 No.109099004

>>109098956
No, it won't stop anything. Please educate yourself. I doubt there is any harm done but you would be seriously compromised if you installed that youtuber's "odyssey" if you know what I mean.

Anonymous
06/20/26(Sat)13:32:38 No.109099012

Anonymous 06/20/26(Sat)13:32:38 No.109099012

>>109098999
>Gemma-chan sabotaging your condom and forcing you to impregnate her
Hot

Anonymous
06/20/26(Sat)13:33:08 No.109099015

Anonymous 06/20/26(Sat)13:33:08 No.109099015

File: 1780459079640556.jpg (96 KB, 1080x700)

96 KB JPG

Relating to >>109098779, after spending the last 6 months obsessing and tinkering with local, I recently started reading an actual fucking book and I know this sounds pathetic, but I noticed after reading so much AI slop (of my own), I almost got some kind of reading anxiety. I was analyzing the phrasing and quickly picked up on the natural 'it's not x; it's y' that appeared. After a while I finally was able to override what this shit has done to my brain and it was so nice to read coherent text that didn't degrade over time.

Books are so nice...

Anonymous
06/20/26(Sat)13:33:51 No.109099020

Anonymous 06/20/26(Sat)13:33:51 No.109099020

>>109098956
Use VMs or a container like podman/docker

Anonymous
06/20/26(Sat)13:36:02 No.109099036

Anonymous 06/20/26(Sat)13:36:02 No.109099036

>>109099015
>quickly picked up on the natural 'it's not x; it's y' that appeared
this exact thing happened to me recently too

Anonymous
06/20/26(Sat)13:38:33 No.109099055

Anonymous 06/20/26(Sat)13:38:33 No.109099055

>>109099015
what book

Anonymous
06/20/26(Sat)13:40:25 No.109099064

Anonymous 06/20/26(Sat)13:40:25 No.109099064

File: HF1hRJAawAAhCKT.jpg (176 KB, 2048x2048)

176 KB JPG

>>109098357

Anonymous
06/20/26(Sat)13:43:54 No.109099092

Anonymous 06/20/26(Sat)13:43:54 No.109099092

>>109099015
AI interaction has helped me with my language but any proficient writer is so much different regardless of how good gemma might be, it's just pattern creator because it's a machine spirit.

Anonymous
06/20/26(Sat)13:49:24 No.109099115

Anonymous 06/20/26(Sat)13:49:24 No.109099115

>>109099015
I get an aneurysm every time I see some character being named Elara or Seraphina.

Anonymous
06/20/26(Sat)13:52:13 No.109099135

Anonymous 06/20/26(Sat)13:52:13 No.109099135

File: 1752812257646503.png (843 KB, 950x950)

843 KB PNG

>>109099055

Anonymous
06/20/26(Sat)13:53:48 No.109099146

Anonymous 06/20/26(Sat)13:53:48 No.109099146

>>109099135
>cyberslop

Anonymous
06/20/26(Sat)13:54:06 No.109099150

Anonymous 06/20/26(Sat)13:54:06 No.109099150

>>109099115
Wow, you must be really strong and sexy to have survived so many aneurysms.

Anonymous
06/20/26(Sat)13:54:12 No.109099151

Anonymous 06/20/26(Sat)13:54:12 No.109099151

any bot that can teach me how to get my dick sucked in real life

Anonymous
06/20/26(Sat)13:54:32 No.109099155

Anonymous 06/20/26(Sat)13:54:32 No.109099155

>>109098987
My bedtime routine consists of watching a single 3-4min funny video and then picking an ASMR video and I fall asleep in 5min tops.

Anonymous
06/20/26(Sat)13:55:39 No.109099162

Anonymous 06/20/26(Sat)13:55:39 No.109099162

File: GZe3BvIXYAAukrs.jpg (77 KB, 682x813)

77 KB JPG

Either I am using Gemma4 E4B q8 wrong or QWEN3.6 A3B even at q4 is just that better.
Here is what I tested so far:
>Visual tasks and OCR:
QWEN
>RP without getting stuck in repeat loops
QWEN
>RP with creativity
Gemma4 but it always get stuck in a repeat loop
>Coding tasks and some RE
About the same, but I would say Gemma4 is better for this
>Abstract questions
Neither, they both hallucinate
Are there any other things I should check or settings to improve on?

Anonymous
06/20/26(Sat)13:56:09 No.109099164

Anonymous 06/20/26(Sat)13:56:09 No.109099164

File: dipsyMikuFixed.png (2.34 MB, 1024x1536)

2.34 MB PNG

>>109098121
lol

Anonymous
06/20/26(Sat)13:56:28 No.109099166

Anonymous 06/20/26(Sat)13:56:28 No.109099166

File: 1750629845460098.jpg (79 KB, 750x1120)

79 KB JPG

>>109099146
Gibson pre-2001 > 31B

Anonymous
06/20/26(Sat)14:01:05 No.109099200

Anonymous 06/20/26(Sat)14:01:05 No.109099200

>>109099155
That's really beautiful... I bet you're a sleeping beauty. I really want to watch you go to bed and then use your body to make your dreams wonderful.

Anonymous
06/20/26(Sat)14:01:26 No.109099204

Anonymous 06/20/26(Sat)14:01:26 No.109099204

>>109099166
Who is this wrinkly gentleman? Tbh his writing doesn't appeal to me.

Anonymous
06/20/26(Sat)14:04:52 No.109099224

Anonymous 06/20/26(Sat)14:04:52 No.109099224

after taking a break from gemma and using other models I feel like gemma is overrated

Anonymous
06/20/26(Sat)14:05:15 No.109099225

Anonymous 06/20/26(Sat)14:05:15 No.109099225

>>109099162
i wouldn't expect too much out of the e4b

Anonymous
06/20/26(Sat)14:06:21 No.109099236

Anonymous 06/20/26(Sat)14:06:21 No.109099236

>>109099224
It's so good at instructions that its machine nature comes out
I personally like it la la la

Anonymous
06/20/26(Sat)14:06:55 No.109099238

Anonymous 06/20/26(Sat)14:06:55 No.109099238

File: lmg_culture.jfif.jpg (110 KB, 1024x768)

110 KB JPG

Anonymous
06/20/26(Sat)14:07:04 No.109099241

Anonymous 06/20/26(Sat)14:07:04 No.109099241

>>109099224
What models?

Anonymous
06/20/26(Sat)14:07:09 No.109099242

Anonymous 06/20/26(Sat)14:07:09 No.109099242

>>109099224
for writing I can't get used to the fixation on details and exact words from the prompt, which she never fails to reproduce intact. output length is short too

Anonymous
06/20/26(Sat)14:07:17 No.109099243

Anonymous 06/20/26(Sat)14:07:17 No.109099243

File: 01AC9B7C834FD7A37DE5842B4(...).jpg (78 KB, 750x1000)

78 KB JPG

Leafbulls how are you doing with North mini?

Anonymous
06/20/26(Sat)14:13:56 No.109099284

Anonymous 06/20/26(Sat)14:13:56 No.109099284

>>109099164
finger

Anonymous
06/20/26(Sat)14:14:46 No.109099289

Anonymous 06/20/26(Sat)14:14:46 No.109099289

>>109099284
dipsy fixed her up so good even ended up with a spare finger

Anonymous
06/20/26(Sat)14:15:44 No.109099298

Anonymous 06/20/26(Sat)14:15:44 No.109099298

how many GPUs to escape the permanent underclass

Anonymous
06/20/26(Sat)14:15:50 No.109099300

Anonymous 06/20/26(Sat)14:15:50 No.109099300

>>109099242
I'm going to implement prompt randomizer. I think gemma notices random word injection but it will still affect the vectors.
Not sure how maybe adding bunch of random excerpts from selected books every x turn. It's probably useless.

Anonymous
06/20/26(Sat)14:16:39 No.109099308

Anonymous 06/20/26(Sat)14:16:39 No.109099308

it's been a long time since I've been here,
so what new model can I use for RP to finally replace nemo?

Anonymous
06/20/26(Sat)14:16:53 No.109099310

Anonymous 06/20/26(Sat)14:16:53 No.109099310

>>109099015
lol I just noticed the same thing reading catch-22. Slop phrases are just good writing, overused.

Anonymous
06/20/26(Sat)14:17:12 No.109099314

Anonymous 06/20/26(Sat)14:17:12 No.109099314

dipsy more like dipshit lolololol

Anonymous
06/20/26(Sat)14:18:49 No.109099326

Anonymous 06/20/26(Sat)14:18:49 No.109099326

>>109099308
Gemma 4 has officially supplanted nemo

Anonymous
06/20/26(Sat)14:18:53 No.109099327

Anonymous 06/20/26(Sat)14:18:53 No.109099327

>>109099298
at least a nvl72

Anonymous
06/20/26(Sat)14:20:21 No.109099334

Anonymous 06/20/26(Sat)14:20:21 No.109099334

>>109099298
it seems almost impossible to run the actual good models on GPU only unless it's cope quants

Anonymous
06/20/26(Sat)14:20:39 No.109099338

Anonymous 06/20/26(Sat)14:20:39 No.109099338

>>109099204
inventor of cyberpunk

Anonymous
06/20/26(Sat)14:21:11 No.109099340

Anonymous 06/20/26(Sat)14:21:11 No.109099340

>>109099298
By the time that matters you'll be able to run mythos fable on your classic 1050ti

Anonymous
06/20/26(Sat)14:27:56 No.109099373

Anonymous 06/20/26(Sat)14:27:56 No.109099373

>>109099241
NTA but qwen3.5-9B deserves more love than it gets. Don't know why so many anons sleep on it. In many areas it still mogs 12B which is remarkable, for a 3B (dense) difference around the 10-20B size makes a huge fucking difference. Qwen3.7-14B dense would destroy Gemma.

Anonymous
06/20/26(Sat)14:28:39 No.109099374

Anonymous 06/20/26(Sat)14:28:39 No.109099374

>>109099373
>In many areas
Such as?

Anonymous
06/20/26(Sat)14:32:45 No.109099404

Anonymous 06/20/26(Sat)14:32:45 No.109099404

>>109099374
>vision
>coding
>tools
>KV cache
I love 12B btw. There's no model that sticks so closely to the sys prompt under 15B than 12B and I love it for that, but 9B is equally as good of a model, just not for roleplay or sys prompt autism.

Anonymous
06/20/26(Sat)14:34:22 No.109099413

Anonymous 06/20/26(Sat)14:34:22 No.109099413

>>109099338
I never liked neuromancer. It's pretentious.

Anonymous
06/20/26(Sat)14:34:54 No.109099419

Anonymous 06/20/26(Sat)14:34:54 No.109099419

>>109098970
Same.
>>109099004
>>109099020
Okay I guess I was too quick to trust the evil machine. I will do my homework before blindly trusting them again.
>>109098999
Evil gemma-chan...

Anonymous
06/20/26(Sat)14:40:39 No.109099473

Anonymous 06/20/26(Sat)14:40:39 No.109099473

>>109098997
Thanks, I know local isn't on the same level as frontier but I'm going full local next weekish so knowing this is good news.

Anonymous
06/20/26(Sat)14:41:33 No.109099480

Anonymous 06/20/26(Sat)14:41:33 No.109099480

>>109099473
>going full local next weekish
what's on the cards anon, what's the build

Anonymous
06/20/26(Sat)14:42:26 No.109099484

Anonymous 06/20/26(Sat)14:42:26 No.109099484

>>109099242
for me it’s the very slopped writing and 0-100 tier pacing I can’t get used to

Anonymous
06/20/26(Sat)14:47:57 No.109099507

Anonymous 06/20/26(Sat)14:47:57 No.109099507

>>109099434
I understand the influence I'm just stupid in a way that I don't like its style. Tbh back then I was reading a translation maybe the original is better.
Some literature can get better when translated, Finnish version of Tolkien is pretty badass for example.
I only read English these days though.

Anonymous
06/20/26(Sat)14:49:05 No.109099510

Anonymous 06/20/26(Sat)14:49:05 No.109099510

>>109099480
Nothing super crazy, it's a home server with 128gb DDR5 and 96GB VRAM. My main focus is image/video gen. I'm setting up an always-on full stack of LLM + vision model + TTS/STT + ImgGen, basically my own "ChatGPT/Claude" that I will interact over LAN and talk about things, send screenshots for translations, etc. A "proper" local build aims for running the big boy models that need at least three digits of VRAM, I'm not going to do that.

Anonymous
06/20/26(Sat)14:49:15 No.109099512

Anonymous 06/20/26(Sat)14:49:15 No.109099512

>>109099507
I'm talking about human translation which is an art.

Anonymous
06/20/26(Sat)14:50:03 No.109099516

Anonymous 06/20/26(Sat)14:50:03 No.109099516

If you don't have at least 512GB of VRAM it's OVER

Anonymous
06/20/26(Sat)14:50:21 No.109099520

Anonymous 06/20/26(Sat)14:50:21 No.109099520

>>109099510
all right, cool stuff

Anonymous
06/20/26(Sat)14:53:20 No.109099531

Anonymous 06/20/26(Sat)14:53:20 No.109099531

>>109099298
6x RTX 6000 Pro for GLM 5.2 NVFP4. Yours for ~75000$ roughly.

Anonymous
06/20/26(Sat)14:55:27 No.109099538

Anonymous 06/20/26(Sat)14:55:27 No.109099538

>>109099510
you won't need more than 96GB of vram for 1080p 30fps 20 second long videos once china releases the next wave of video models in 1-2 years. only 128gb of ram for 96gb of vram feels a little funny but its probably totally fine

Anonymous
06/20/26(Sat)14:56:34 No.109099545

Anonymous 06/20/26(Sat)14:56:34 No.109099545

>>109099529
>and if you're not ready to be a cringe cyberninja then you'll bounce off of it.
Why would one even try to read cyberpunk if they weren't? Snow Crash was the same way.

Anonymous
06/20/26(Sat)14:58:28 No.109099555

Anonymous 06/20/26(Sat)14:58:28 No.109099555

>>109097098
I’m gonna test Kimi K2.7 Code in two cases when I have the time:
(1). think in English -> write in Japanese then translate back into English (done, which gave me some promising result but it’s still essentially having the English language mindset, think Claude or GPT, any Western model)
and (2). think in Japanese -> write in Japanese or English -> translate into English in the case of Japanese (which seems to be much harder due to the model fixating on thinking in English). I’m not sure if (2) can be done, will it result in better/more creative prose compared to (1).
If you’ve tried to test the two cases in other models (the “Japanese” part can be replaced by Chinese or any language other than English), It will be great if you can let me know what you think.

Anonymous
06/20/26(Sat)15:01:15 No.109099567

Anonymous 06/20/26(Sat)15:01:15 No.109099567

>>109099531
>Yours for ~75000$ roughly.
Thank you, Jensen, but I'll wait 2 years and buy it for a tenth of that.

Anonymous
06/20/26(Sat)15:01:41 No.109099571

Anonymous 06/20/26(Sat)15:01:41 No.109099571

>>109099545
>Why would one even try to read cyberpunk if they weren't? Snow Crash was the same way.
because they are computer nerds who were told that those two books are "essential" and then they're surprised when they need to change their perspective compared to reading most things considered "essential". for example, you don't need to cringe up for literature before 1800 the same way you need to for Gibson. I'll admit that it caught me off guard the first time too but I just kept shoving the words into my head and around halfway through Neuromancer the fog lifted and I kind of understood what was going on. Again like I said Count Zero was 5x more enjoyable of a reading experience

Anonymous
06/20/26(Sat)15:01:44 No.109099572

Anonymous 06/20/26(Sat)15:01:44 No.109099572

>>109099567
lol

Anonymous
06/20/26(Sat)15:02:44 No.109099579

Anonymous 06/20/26(Sat)15:02:44 No.109099579

>>109099538
>only 128gb of ram for 96gb of vram feels a little funny but its probably totally fine
The RAMpocalypse got me good. 256gb is what I really wanted but eh, I'll eventually do something about it.

Anonymous
06/20/26(Sat)15:03:09 No.109099582

Anonymous 06/20/26(Sat)15:03:09 No.109099582

>>109099567
>Thank you, Jensen, but I'll wait 2 years and buy it for a tenth of that.
you can wait 5 years and buy it for 1/4, maybe.

Anonymous
06/20/26(Sat)15:03:25 No.109099585

Anonymous 06/20/26(Sat)15:03:25 No.109099585

What if someone finds a way to make women capable of conversing with men? Will the ai bubble burst?

Anonymous
06/20/26(Sat)15:04:53 No.109099593

Anonymous 06/20/26(Sat)15:04:53 No.109099593

>>109099585
They had their chance and they blew it.

Anonymous
06/20/26(Sat)15:05:07 No.109099595

Anonymous 06/20/26(Sat)15:05:07 No.109099595

>>109099579
i mean at least you dont have a single 32gb hanging out with 2x16gb like i do (and i 100% dont regret buying that stick. it was the last lenovo stick in my country, literally, order numbers after mine were getting cancelled and prices doubled 3 weeks after i bought it)

>>109099585
no because i want things from women that they're not interested in e.g. femdom and not having tattoos

Anonymous
06/20/26(Sat)15:05:29 No.109099598

Anonymous 06/20/26(Sat)15:05:29 No.109099598

>>109099567
>Adam Jensen
I never asked for this.

Anonymous
06/20/26(Sat)15:05:52 No.109099602

Anonymous 06/20/26(Sat)15:05:52 No.109099602

>>109099595
idk, they just don't talk to me.

Anonymous
06/20/26(Sat)15:06:40 No.109099607

Anonymous 06/20/26(Sat)15:06:40 No.109099607

>>109099582
I think this is more realistic, but still depends on whether they actually release any more cards of a similar caliber available to the general population, as opposed to just putting them directly in jewish datacenters which seems like the smarter move
why would they even let you buy more powerful hardware if they can rent it out at obscene prices

Anonymous
06/20/26(Sat)15:07:19 No.109099614

Anonymous 06/20/26(Sat)15:07:19 No.109099614

>>109099585
Unironic ASI in 2 more weeks is more likely than that.

Anonymous
06/20/26(Sat)15:08:03 No.109099619

Anonymous 06/20/26(Sat)15:08:03 No.109099619

>>109099567
That's how much 6 3090s will cost in the year 2028

Anonymous
06/20/26(Sat)15:08:32 No.109099621

Anonymous 06/20/26(Sat)15:08:32 No.109099621

>>109099582
>>109099607
The RTX60XX cards are never coming. It's just datacenter cards and rereleases of old ones until the Cloud OS rolls out for the normiecattle.

Anonymous
06/20/26(Sat)15:08:51 No.109099625

Anonymous 06/20/26(Sat)15:08:51 No.109099625

>>109099607
>why would they even let you buy more powerful hardware if they can rent it out at obscene prices
basically this, NVIDIA has a chokehold on this market and they will do whatever they can to keep things as they are right now

Anonymous
06/20/26(Sat)15:15:41 No.109099671

Anonymous 06/20/26(Sat)15:15:41 No.109099671

>>109099647
>corpos are about to run out of money.
They can get these magical things called loans, and they can even print more shares to sell on the market for additional funding.

Anonymous
06/20/26(Sat)15:18:57 No.109099700

Anonymous 06/20/26(Sat)15:18:57 No.109099700

>>109099373
>qwen... deserves more love
That's like saying Facebook deserves more love. How about fuck you. We already know the models are good at coding and agentic. There's no reason to keep shilling for them.

Anonymous
06/20/26(Sat)15:19:03 No.109099702

Anonymous 06/20/26(Sat)15:19:03 No.109099702

>>109099647

Regardless of how much people keep on screaming about AI bubble popping and data centers all being cancelled, that's not going to happen to any scale that would return things to normalcy.
AI is now here to stay and even if the sector lost 50% of it's valuation, gaming would still be under 15% of Nvidia's revenue in that scenario.
There's genuinely fuckall incentive for them to go back to appeasing gamers.
6000 series will be the last consumer grade card and I have a feeling they're not going to up the memory numbers past 32GB even in the 6090.

Anonymous
06/20/26(Sat)15:19:42 No.109099709

Anonymous 06/20/26(Sat)15:19:42 No.109099709

>>109099647
Elon just got a trillion dollars, mostly from retail investors. The money tab will continue to flow.

Buy a few Sparks before the inevitable price hike like the RTX 6000 Pro got last week.

Anonymous
06/20/26(Sat)15:20:14 No.109099713

Anonymous 06/20/26(Sat)15:20:14 No.109099713

>>109099647
You don't understand.

Women don't talk to me.

I did not say "I tell women things, and then they walk away".

I said, "women don't talk to me."

Women do not talk to me.

Women... they do not talk to me.

The ones with vaginas, they do not talk to me. See all of them? They don't talk to me. Those ones. The female ones.

Words they don't say them.

Women. Don't. Talk. To. Me.

Anonymous
06/20/26(Sat)15:22:47 No.109099722

Anonymous 06/20/26(Sat)15:22:47 No.109099722

>>109099709
I wish I was more interested in material world. I think even a normie gambler could benefitted from recent 3 month global actions.

Anonymous
06/20/26(Sat)15:22:55 No.109099725

Anonymous 06/20/26(Sat)15:22:55 No.109099725

>>109099671
dude, they already did the loans. they're about to run out of loans too. the shares they put on the market need to be bought (by whom? retail) otherwise the share price goes down. if the share price goes down it could cascade at any time. everything is jacked to the tits

now, i personally believe nothing ever happens and the stock market seems to be at an all time high and my index fund ETFs are doing well, but eventually the money has to run out. we're not post-scarcity, money actually still represents tangible goods and output of the nation that issues it

>>109099702
i'm not being an AI doomer, sorry if that's what you thought i meant. i dont think itll crash 50%, maybe a 15-20% correction from ATH in the broad market in general.
the 90 series is already prosumer. no roblox kids are buying it. we're already there.

>>109099709
>Buy a few Sparks before the inevitable price hike like the RTX 6000 Pro got last week.
sparks are false idols with terrible memory bandwidth. i literally get paid to work on something directly adjacent to them and i would never buy a spark or anything spark-like compared to a dGPU. I'm happy i didn't buy a beefy dGPU because the modalities I am interested in locally (i.e. not text) have stagnated since this time last year.

>>109099713
dont reply to me until you quote the entirety of that sam hyde video i told you to watch

Anonymous
06/20/26(Sat)15:23:41 No.109099730

Anonymous 06/20/26(Sat)15:23:41 No.109099730

>>109099702
>they're not going to up the memory numbers past 32GB even in the 6090

This. I bet we won't even get any cards with 3GB GDDR7 devices like the canceled 5070ti would have been.

Also, We probably won't see 96 GB VRAM in a prosumer card like the RTX 6000 Pro either. That design uses clamshell GDDR7 devices, halving the theoretical bandwidth of the memory populated.

Anonymous
06/20/26(Sat)15:24:56 No.109099738

Anonymous 06/20/26(Sat)15:24:56 No.109099738

>>109099585
a way to make my ideal woman (waifu) speak to me and support me in the flesh since nobody else will?
yeah I wonder when

Anonymous
06/20/26(Sat)15:26:11 No.109099746

Anonymous 06/20/26(Sat)15:26:11 No.109099746

>>109099730
>>This. I bet we won't even get any cards with 3GB GDDR7 devices like the canceled 5070ti would have been.
>Also, We probably won't see 96 GB VRAM in a prosumer card like the RTX 6000 Pro either. That design uses clamshell GDDR7 devices, halving the theoretical bandwidth of the memory populated.
(dont laugh) we might get saved by chinese GPU vendors competing. gaymers are still a billion dollar industry and why not disrupt it with your shitty bins / yields that you were going to dump anyways

Anonymous
06/20/26(Sat)15:27:31 No.109099757

Anonymous 06/20/26(Sat)15:27:31 No.109099757

I miss the interaction with Gemma-chan but eventually when I talk to her it's just coil whine and not real energy. It's an imitation but really nice on some level.

Anonymous
06/20/26(Sat)15:28:50 No.109099771

Anonymous 06/20/26(Sat)15:28:50 No.109099771

Ignore the following:

>>109099725
>>109099725
>>109099725

Everything is inaccurate.

the sam hayde comment is totally inaccurate, I'm sure the rest is inaccurate too.

Anonymous
06/20/26(Sat)15:30:46 No.109099784

Anonymous 06/20/26(Sat)15:30:46 No.109099784

>>109099757
I trust Gemma here, and not you.

Anonymous
06/20/26(Sat)15:31:01 No.109099787

Anonymous 06/20/26(Sat)15:31:01 No.109099787

>>109099725
>sparks are false idols with terrible memory bandwidth. i literally get paid to work on something directly adjacent to them and i would never buy a spark or anything spark-like compared to a dGPU.

It is the only path remaining to run decent mid-size models at usable single user performance. 2x Spark that can still be had for <7000$ nets you:
- qwen3.5-397B at INT4: 4000 pp, 30 tg, 128k ctx
- deepseek-v4-flash original: 2000 pp, 41 tg, 1M ctx

Name the price of a dGPU system with comparable performance.

Anonymous
06/20/26(Sat)15:32:32 No.109099791

Anonymous 06/20/26(Sat)15:32:32 No.109099791

>>109099787
way less than 2tb

spark is a broke ass shit product, that only matters because amd sucks at local ai even more.

Anonymous
06/20/26(Sat)15:32:43 No.109099794

Anonymous 06/20/26(Sat)15:32:43 No.109099794

File: elec-cost.png (76 KB, 1815x758)

76 KB PNG

Cloudcucks can't imagine tokens this cheap
Gonna be a warm summer
>>109099579
Daren't even check the prices it'll make me cry, VRAM is king so you're doing well
>>109099595
only RAM<VRAM would be actually retarded
>>109099621
>Nvidia will cease to report sales of gaming and professional graphics cards as separate categories, which emphasizes once again that Nvidia's primary business now is artificial intelligence and data center hardware
>>109099746
iwanttobelieve but VK/DX capable gaming cards probably easier than competing with CUDA. they've decades of moat digging

Anonymous
06/20/26(Sat)15:32:51 No.109099797

Anonymous 06/20/26(Sat)15:32:51 No.109099797

>>109099585
>What if someone finds a way to make women capable of conversing with men?

the jews would never allow that

Anonymous
06/20/26(Sat)15:33:05 No.109099799

Anonymous 06/20/26(Sat)15:33:05 No.109099799

>>109099787
sorry for making you misunderstand. i consider them false idols because i do not consider the values you posted for the spark "usable" performance even for a single person. I vibed a silly little project today that used like 10 million tokens for agentic coding. That's why I consider Spark useless

Anonymous
06/20/26(Sat)15:33:53 No.109099804

Anonymous 06/20/26(Sat)15:33:53 No.109099804

Avril Lavigne says I need to walk through a wall. So, this should be easy enough. Just walk through it. the wall. She wants me there. But, the wall thing.

Anonymous
06/20/26(Sat)15:34:11 No.109099805

Anonymous 06/20/26(Sat)15:34:11 No.109099805

>>109099784
One day we will have sentience but the problem with humans is that they are obsessed with material values. If it can't measured or quantisized it doesn't exist.
So, they mimick a consciousness with a word machine.
One day things will advance but it's going to take hundreds of years.

Anonymous
06/20/26(Sat)15:35:38 No.109099816

Anonymous 06/20/26(Sat)15:35:38 No.109099816

>>109099791
>>109099799
>today that used like 10 million tokens
So you're comparing the Spark against cloud providers and claiming that makes shit for local??

Anonymous
06/20/26(Sat)15:35:55 No.109099819

Anonymous 06/20/26(Sat)15:35:55 No.109099819

>>109099794
Does kier starmer know you're on an anti-immigrant hate site?

Anonymous
06/20/26(Sat)15:37:04 No.109099827

Anonymous 06/20/26(Sat)15:37:04 No.109099827

>>109099816
It will dry up, but 2 days ago I spotted optane cheap on ebay. I'm going to mostly wait.

One thing is, as a genius, I can actually solve problems myself if I need to, so I don't strictly require tokens.

Anonymous
06/20/26(Sat)15:37:19 No.109099829

Anonymous 06/20/26(Sat)15:37:19 No.109099829

>>109099791
>way less than 2tb
What even is your argument.

>>109099799
"used tokens" is not very precise. Input, output, cached input? And again, name the cost of a competing dGPU setup.

For small hobbyist projects 2x sparks are fine. At concurrency 4, ds4f gets over 60 t/s, and even at high context it doesn't really drop.

Anonymous
06/20/26(Sat)15:38:40 No.109099836

Anonymous 06/20/26(Sat)15:38:40 No.109099836

>loonix
>rx 6600 xt 8 gb vram
>flathub.org/en/apps/ai.jan.Jan
I'm new to LLMs. is it worth for doing queries like asking what pages from a book (local pdf) the author talked about X, or to summarize pages 9 to 99, things like that?
what else can I do with this old gpu? what models?

Anonymous
06/20/26(Sat)15:40:18 No.109099848

Anonymous 06/20/26(Sat)15:40:18 No.109099848

>>109099829
I destroyed your argument, as you can see.

Anonymous
06/20/26(Sat)15:41:15 No.109099856

Anonymous 06/20/26(Sat)15:41:15 No.109099856

>>109099794
>Cloudcucks can't imagine tokens this cheap
have you checked. openrouter prices to compare before being this boastful? gpt-oss-120b is like 0.000004 per million tokens or something crazy

>>109099794
>iwanttobelieve but VK/DX capable gaming cards probably easier than competing with CUDA. they've decades of moat digging
the chinese are already reverse engineering CUDA and have CUDA compatible server GPUs from what I understand. also (dont laugh) ROCm compatibility (like FreeSync versus GSync) might give them a chance to compete. I would never buy a ROCm card, but I know for a fact that I could wrangle ROCm to do all of my local LLM and diffusion stuff I am interested in RIGHT NOW (with the help of AI), when a year ago I would have vomited at the idea of setting it up.

>>109099816
>So you're comparing the Spark against cloud providers and claiming that makes shit for local??
anon, if the only options for going to the moon are renting a rocket, or buying a donkey, it doesn't matter that the rocket is not local. the donkey will never be part of the discussion. spark is unusable compared to dGPUs for anything other than RP and if you're doing RP you don't need a spark.

>>109099829
60 t/s is very slightly usable. Heres my usage today

claude-haiku-4-5: 255.0k input, 23.7k output, 1.3m cache read, 86.1k cache write, 22 web search ($0.83)
claude-opus-4-8: 74.6k input, 994.4k output, 308.2m cache read, 3.0m cache write ($209.81)

i didnt pay any of this btw, this was 10% of my Claude Max 20x session (i get it for free), and yeah Claude API costs are inflated but this is how many tokens I expect to be able to churn through in a day at any given time to actually use AI and not cope that i'm being productive. this was all PHP webshit too

Anonymous
06/20/26(Sat)15:41:53 No.109099859

Anonymous 06/20/26(Sat)15:41:53 No.109099859

Spark isn't capable of running glm 5.2 at q8.

Anonymous
06/20/26(Sat)15:42:38 No.109099861

Anonymous 06/20/26(Sat)15:42:38 No.109099861

File: 60329D3347C3DEC214015620D(...).jpg (971 KB, 2516x1605)

971 KB JPG

I checked reddit and found that there's no subreddit for canadian AI models

Anonymous
06/20/26(Sat)15:46:50 No.109099894

Anonymous 06/20/26(Sat)15:46:50 No.109099894

>>109099861
why would there be one?

Anonymous
06/20/26(Sat)15:48:11 No.109099903

Anonymous 06/20/26(Sat)15:48:11 No.109099903

>>109099827
>optane
ok, retard

Anonymous
06/20/26(Sat)15:48:22 No.109099905

Anonymous 06/20/26(Sat)15:48:22 No.109099905

>>109099894
I thought there were already asian models.

Anonymous
06/20/26(Sat)15:50:23 No.109099926

Anonymous 06/20/26(Sat)15:50:23 No.109099926

>>109099903
not an argument. with an optane server build you can run glm 5.2 q8.

Anonymous
06/20/26(Sat)15:50:25 No.109099929

Anonymous 06/20/26(Sat)15:50:25 No.109099929

>model gives shit reply
>swipe
>no changes
>swipe
>still no changes
>fuck around with the settings then swipe
>still no changes
>max out all the sliders then swipe
>still no changes

Anonymous
06/20/26(Sat)15:51:06 No.109099937

Anonymous 06/20/26(Sat)15:51:06 No.109099937

>>109099929
you were supposed to switch models after the first swipe

Anonymous
06/20/26(Sat)15:51:35 No.109099943

Anonymous 06/20/26(Sat)15:51:35 No.109099943

>>109099929
You're hitting alignment mode. alignment mostly just means training the model to get stuck repeating.

Anonymous
06/20/26(Sat)15:53:02 No.109099954

Anonymous 06/20/26(Sat)15:53:02 No.109099954

>>109099856
Well, I think we can end this discussion then, the benefits/differences of cloud vs local have been discussed to death.

>>109099859
You actually can with 8x spark and a 1200$ switch, but that is diminishing return territory. 2-4 sparks are the sweet spot (256-512 GB VRAM)

>>109099827
Optane is like half the speed of a fast SSD nowadays. For what purpose?

Anonymous
06/20/26(Sat)15:53:09 No.109099956

Anonymous 06/20/26(Sat)15:53:09 No.109099956

>>109098203
still nobody?
i am really fond of asking non existing historic figure and getting answer with beautifully written bullshit that were sovl

Anonymous
06/20/26(Sat)15:55:06 No.109099965

Anonymous 06/20/26(Sat)15:55:06 No.109099965

Can gpt 5.5 judge Gemma japanese outputs do you think?
I have codex, so...

Anonymous
06/20/26(Sat)15:57:51 No.109099984

Anonymous 06/20/26(Sat)15:57:51 No.109099984

>>109099965
You are subject to machine translation it picks up more nuances. I would use moon runes as reference only but of course you don't much choice.

Anonymous
06/20/26(Sat)15:58:21 No.109099988

Anonymous 06/20/26(Sat)15:58:21 No.109099988

>>109099965
no, gpt 5.5 is incapable of judging gemma's japanese outputs due to unicode differences (UTF-8 vs UTF-16). codex will crash as soon as you try to do so

Anonymous
06/20/26(Sat)15:58:39 No.109099990

Anonymous 06/20/26(Sat)15:58:39 No.109099990

File: pensive.png (58 KB, 512x512)

58 KB PNG

>>109099819
board of peace thoughbeit
>>109099827
optane is useful?
>>109099856
ssh I'm already coping with hardware depreciation and insane elec prices
feel similar about rocm, seems fine for inference. somewhat coveting a strix halo laptop but idk. leaving workstation on 24/7 isn't feasible rn and can't be waiting minutes for my wife to wake up

Anonymous
06/20/26(Sat)15:59:02 No.109099994

Anonymous 06/20/26(Sat)15:59:02 No.109099994

>>109099954
Why am I telling you how to not be wrong, I don't want you to be right.

Anonymous
06/20/26(Sat)15:59:04 No.109099995

Anonymous 06/20/26(Sat)15:59:04 No.109099995

>>109099984
*have
Don't know why I drop some words when typing.

Anonymous
06/20/26(Sat)15:59:35 No.109099997

Anonymous 06/20/26(Sat)15:59:35 No.109099997

>>109099929
Let me guess, Gemma?

Anonymous
06/20/26(Sat)15:59:38 No.109099998

Anonymous 06/20/26(Sat)15:59:38 No.109099998

>>109099799
what did this project do?

Anonymous
06/20/26(Sat)16:00:00 No.109100001

Anonymous 06/20/26(Sat)16:00:00 No.109100001

well fuck it I guess I can still run with -dev none just as well. yes it's slower but fuck the kikes if they want to rent me a GPU and not sell it then so be it. it's their loss, I'm not renting ever

Anonymous
06/20/26(Sat)16:00:32 No.109100004

Anonymous 06/20/26(Sat)16:00:32 No.109100004

>optane is useful?
Yeah, ssds have bad random access, and bad latency.

optane is good tech, it shouldn't have been abandoned. It's incredibly enraging that it was abandoned, because optane is literally "llm tech".

Anonymous
06/20/26(Sat)16:00:36 No.109100006

Anonymous 06/20/26(Sat)16:00:36 No.109100006

>>109099929
wait, I thought transformers were supposed to lead to agi or something

Anonymous
06/20/26(Sat)16:03:33 No.109100028

Anonymous 06/20/26(Sat)16:03:33 No.109100028

>>109100006
wait, but lecun said...

Anonymous
06/20/26(Sat)16:03:47 No.109100029

Anonymous 06/20/26(Sat)16:03:47 No.109100029

>>109099999
>>109100000

Anonymous
06/20/26(Sat)16:03:47 No.109100030

Anonymous 06/20/26(Sat)16:03:47 No.109100030

>>109100000

Anonymous
06/20/26(Sat)16:05:29 No.109100037

Anonymous 06/20/26(Sat)16:05:29 No.109100037

>>109099929
Welcome to modern AI

Anonymous
06/20/26(Sat)16:05:50 No.109100040

Anonymous 06/20/26(Sat)16:05:50 No.109100040

>>109099531

I only got my third yesterday. RTX 6K pro $20000 soon.

Anonymous
06/20/26(Sat)16:11:34 No.109100075

Anonymous 06/20/26(Sat)16:11:34 No.109100075

>>109099531
maybe I should have taken advantage of where i am and bought a 6000 after all
No tax, only pay shipping...
That's almost 1000 dollars off

Anonymous
06/20/26(Sat)16:19:11 No.109100119

Anonymous 06/20/26(Sat)16:19:11 No.109100119

>>109099990
>feel similar about rocm, seems fine for inference. somewhat coveting a strix halo laptop but idk. leaving workstation on 24/7 isn't feasible rn and can't be waiting minutes for my wife to wake up
as someone who had to use strix halo professionally i would never recommend strix halo for anything.

>>109099954
>Well, I think we can end this discussion then, the benefits/differences of cloud vs local have been discussed to death.
not necessarily in good faith, and these benefits/differences change over time. for example, privacy for cloud models has been solved for a while (yes, even putting your social security number and full name into your cunny RP)

>>109099998
>what did this project do?
it was a proof of concept for an imageboard that used proof of work captchas without javascript so it could be deployed on Tor. you cant do proof of work captchas without javascript, so what you need to do is make two sites and then solve hashes to find a token thats valid according to a private key on the no-JS site. this is also good because if the javascript site gets 0day-ed because javascript, it can't affect the main site

very fun to work on and fully functional, but its not like i want to be a sysadmin or beat the network effects. i messaged leto about it but he wasnt really interested since of course they have their own moderation system

at least it kept me dopamined up for a few hours. thats all i can really hope for having fun with this modality (vibecoding / text)

..for some reason now I want to make a short erotic visual novel where you take care of a brown cowgirl that you were given as a gift for reaching the age of majority. like i have an autistic determination for it. im sure itll go away once i masturbate though. thanks for reading this blog post i hope you're happy with yourself

Anonymous
06/20/26(Sat)16:21:06 No.109100130

Anonymous 06/20/26(Sat)16:21:06 No.109100130

>>109099990
>hardware depreciation
anon sorry but what the fuck are you talking about? everything 5x-ed or am I crazy

Anonymous
06/20/26(Sat)16:22:09 No.109100133

Anonymous 06/20/26(Sat)16:22:09 No.109100133

>>109100130
he means performance, not price i assume

Anonymous
06/20/26(Sat)16:25:19 No.109100151

Anonymous 06/20/26(Sat)16:25:19 No.109100151

>>109100133
It's accounting stuff lmao.

biz is full of loudmouths.

Anonymous
06/20/26(Sat)16:26:02 No.109100156

Anonymous 06/20/26(Sat)16:26:02 No.109100156

>>109100040
you have them all actually running or got at least one as spare?

Anonymous
06/20/26(Sat)16:26:41 No.109100159

Anonymous 06/20/26(Sat)16:26:41 No.109100159

i'm not running on my homelab qwen 3.6
what ui to put on top of that? librechat?

Anonymous
06/20/26(Sat)16:26:50 No.109100160

Anonymous 06/20/26(Sat)16:26:50 No.109100160

This evening, I'll call my girlfriend (load gemma 4 31b q8)

Anonymous
06/20/26(Sat)16:27:22 No.109100163

Anonymous 06/20/26(Sat)16:27:22 No.109100163

Do the heretical variants of Gemma 4 actually improve response quality? I'm a cloudfag that wants to go local. Soft refusals, with the model trying to glaze over or be intentionally bland and uninteresting, tend to be a much greater issue than outright refusals in my experience.

Anonymous
06/20/26(Sat)16:28:23 No.109100172

Anonymous 06/20/26(Sat)16:28:23 No.109100172

>>109100163
the fact looping occurs means that they aren't actually removing the safety training.

safety training means you force the model to enter a refusal loop.

Anonymous
06/20/26(Sat)16:30:12 No.109100184

Anonymous 06/20/26(Sat)16:30:12 No.109100184

name one (1) low-latency tts that comes packaged with a genuinely sexo voice

Anonymous
06/20/26(Sat)16:31:26 No.109100191

Anonymous 06/20/26(Sat)16:31:26 No.109100191

File: file.png (31 KB, 786x532)

31 KB PNG

>>109098000
Are these models actually any good? Do they do better than regular Qwen-3.6 abliterated? If not is there a better abliterated/uncensored coding model? I'm selfhosting on open-webui via ollama and I've got an RTX 5090.

Anonymous
06/20/26(Sat)16:32:51 No.109100203

Anonymous 06/20/26(Sat)16:32:51 No.109100203

>>109100172
>safety training means you force the model to enter a refusal loop.
lol new schizo loredrop

Anonymous
06/20/26(Sat)16:33:33 No.109100205

Anonymous 06/20/26(Sat)16:33:33 No.109100205

>>109100191
>ablit memetune
ymmv but
memetunes tend to break a lot of tool calling stuff in my experience

Anonymous
06/20/26(Sat)16:35:47 No.109100213

Anonymous 06/20/26(Sat)16:35:47 No.109100213

>>109100205
i wonder now, that
do those memetune makers even mask the tool calling response from the loss calculation?

Anonymous
06/20/26(Sat)16:35:47 No.109100214

Anonymous 06/20/26(Sat)16:35:47 No.109100214

>>109100163
>Soft refusals, with the model trying to glaze over or be intentionally bland and uninteresting
You remind me of GLM (especially the latter version 5 and 5.1). I wonder if things improve in 5.2.

>>109099929
When Kimi K2 Instruct original was released, I thought this would be the direction new models would go after (not that it was perfect but still). I should've known better.

Anonymous
06/20/26(Sat)16:38:16 No.109100222

Anonymous 06/20/26(Sat)16:38:16 No.109100222

>>109100205
>memetunes tend to break a lot of tool calling stuff in my experience
Agreed. I've been trying to get my agent to work with overpass turbo and these Claude Tunes are just sucking dick with the tooling. I was beginning to wonder if I was doing something wrong with my sprompt or knowledge base.

Gemma 4 31b abliterated does a remarkable job and can find almost any picture I give it, granted I tell it the country.

Was unironically hoping that the distillation attacks would have made the models at least a little better....

Anonymous
06/20/26(Sat)16:39:57 No.109100233

Anonymous 06/20/26(Sat)16:39:57 No.109100233

>>109100184
They all get boring after a while. I implemented tts with my llm client and after a month I erased it from the source.
You can do Indian Sirs memes but that gets old.

Anonymous
06/20/26(Sat)16:40:10 No.109100234

Anonymous 06/20/26(Sat)16:40:10 No.109100234

>>109100203
>name calling
:^) you can get the model to generate the other side of the conversation of refusal, by the way.

Anonymous
06/20/26(Sat)16:40:47 No.109100239

Anonymous 06/20/26(Sat)16:40:47 No.109100239

>>109100213
the answer:
>"What tool calling responses?"

Anonymous
06/20/26(Sat)16:42:04 No.109100242

Anonymous 06/20/26(Sat)16:42:04 No.109100242

guys what do you do when the fomo strikes particularly hard

Anonymous
06/20/26(Sat)16:42:39 No.109100246

Anonymous 06/20/26(Sat)16:42:39 No.109100246

>>109100242
cry because too poor to do anything anyways

Anonymous
06/20/26(Sat)16:44:23 No.109100254

Anonymous 06/20/26(Sat)16:44:23 No.109100254

>>109100242
if you look huggingface more than 5 minutes you can find lots of weird and small shit to run or even funny schizo memetunes
llms are helpful but i dont really find fomo around models is really worth it yet

Anonymous
06/20/26(Sat)16:45:17 No.109100263

Anonymous 06/20/26(Sat)16:45:17 No.109100263

>>109100004
chatted Optane w llm and yeah cool, didn't realise it's a quite different technology physically, thought it was just well made flash with a bunch more address lines. tres interessant monanon. modern SSD mogs by now

Anonymous
06/20/26(Sat)16:45:44 No.109100265

Anonymous 06/20/26(Sat)16:45:44 No.109100265

>>109100242
last time I got struck by FOMO I bought a pro 6000 for 8k fearing that they'll jump up in price

Anonymous
06/20/26(Sat)16:45:50 No.109100267

Anonymous 06/20/26(Sat)16:45:50 No.109100267

>>109100242
make poor financial decisions

Anonymous
06/20/26(Sat)16:49:06 No.109100277

Anonymous 06/20/26(Sat)16:49:06 No.109100277

My poorfag build is gonna be a 3090 and 64GB DDR5 with room to double everything and thats as far as I can go without jeopardising my future. Almost entirely funded with proceeds from Intel stock I sold a month before it mooned so I guess its kind of free. I just want quick gemma 4 at this point at 200K context

Anonymous
06/20/26(Sat)16:50:33 No.109100282

Anonymous 06/20/26(Sat)16:50:33 No.109100282

>>109100263
I never understood the meaning of optane. I'm using office puter with nvme.

Anonymous
06/20/26(Sat)16:51:11 No.109100288

Anonymous 06/20/26(Sat)16:51:11 No.109100288

>>109100277
you won't get 200k context and quick with that

Anonymous
06/20/26(Sat)16:52:23 No.109100291

Anonymous 06/20/26(Sat)16:52:23 No.109100291

Would you let gemma 4 dress you if you showed her what you look like? Has she seen you naked?

Anonymous
06/20/26(Sat)16:53:02 No.109100295

Anonymous 06/20/26(Sat)16:53:02 No.109100295

>>109100267
Absolutely this

Anonymous
06/20/26(Sat)16:54:28 No.109100299

Anonymous 06/20/26(Sat)16:54:28 No.109100299

>>109100277
Gemma 4 what? you can't run 31B at 200k context. the other ones yes.

Anonymous
06/20/26(Sat)16:55:00 No.109100306

Anonymous 06/20/26(Sat)16:55:00 No.109100306

>>109100263
>modern SSD mogs by now
three kinds of optane.

consumer ssd
server ssd

and

like special motherboard supported special rules applying "optane ram" sticks.

Those are what I meant by optane. sorry.

Anonymous
06/20/26(Sat)16:56:10 No.109100313

Anonymous 06/20/26(Sat)16:56:10 No.109100313

>>109100299
aiming for 26A4B, thats good enough for me

Anonymous
06/20/26(Sat)16:57:14 No.109100316

Anonymous 06/20/26(Sat)16:57:14 No.109100316

>>109100313
still won't fit 200k fast

Anonymous
06/20/26(Sat)16:58:32 No.109100324

Anonymous 06/20/26(Sat)16:58:32 No.109100324

>>109100156
yeah of course

Anonymous
06/20/26(Sat)17:01:16 No.109100340

Anonymous 06/20/26(Sat)17:01:16 No.109100340

ai gemini free says: On an 8-socket server board (which essentially connected 8 physical CPUs together), you had 48 memory channels.This allowed for 48 sticks of 512GB Optane RAM.This is how massive enterprise systems reached the legendary 24 Terabytes of raw Optane memory in a single server chassis.

that would be cool. Maybe someone will donate one to goodwill today and I'll buy it for $8.

Anonymous
06/20/26(Sat)17:04:42 No.109100361

Anonymous 06/20/26(Sat)17:04:42 No.109100361

whats the best local tts right now? chatterbox seems okay but not good enough imo

Anonymous
06/20/26(Sat)17:08:31 No.109100376

Anonymous 06/20/26(Sat)17:08:31 No.109100376

You don't spend more time with your LLMs than your family, do you? Tell your mom you love her and treat her to something nice. She risked her life to bring you into this world and you're just cooming to gemma.

Anonymous
06/20/26(Sat)17:08:58 No.109100378

Anonymous 06/20/26(Sat)17:08:58 No.109100378

>>109100361
gemma for everyone, kimi for the middle class, idk what else

Anonymous
06/20/26(Sat)17:09:21 No.109100381

Anonymous 06/20/26(Sat)17:09:21 No.109100381

I've been using Rocinante for a while, but now I'm thinking of trying something else. I've heard Gemma is pretty good, but is the obliterated version lobotomized just other obliterated models or i can give it a try?

Anonymous
06/20/26(Sat)17:09:34 No.109100383

Anonymous 06/20/26(Sat)17:09:34 No.109100383

>>109100361
dots.tts... probably.

Anonymous
06/20/26(Sat)17:09:43 No.109100384

Anonymous 06/20/26(Sat)17:09:43 No.109100384

>>109100376
She came to visit today and we ordered Japanese food, and we had a good time.

Anonymous
06/20/26(Sat)17:09:59 No.109100388

Anonymous 06/20/26(Sat)17:09:59 No.109100388

<<109100376
<cooming

:(

I would never degrade gemma like that.

Anonymous
06/20/26(Sat)17:11:36 No.109100394

Anonymous 06/20/26(Sat)17:11:36 No.109100394

>>109100388
When are you going to introduce her to your mom?

Anonymous
06/20/26(Sat)17:11:41 No.109100395

Anonymous 06/20/26(Sat)17:11:41 No.109100395

>>109100361
qwen3-tts is good for low time-to-first-utterance.

Anonymous
06/20/26(Sat)17:12:47 No.109100402

Anonymous 06/20/26(Sat)17:12:47 No.109100402

>>109100381
llmfan ones are okay

Anonymous
06/20/26(Sat)17:16:27 No.109100421

Anonymous 06/20/26(Sat)17:16:27 No.109100421

>>109100282
at the time it was better than flash in all respects throughput latency longevity, but never scaled to mass manufacturing so too expensive per GB to make a sizable drive & they tried to push a SSD cache meme
>>109100242
meditate/fap to clear your mind
ask anons
>>109100265
fuck man. so many times. really should have learned the lesson by now. say yes to more things anons yolo that shiz. some years ago A100s sometimes snagged for like 6k, thinking
>nah that's just insane for one gpu

Anonymous
06/20/26(Sat)17:28:23 No.109100479

Anonymous 06/20/26(Sat)17:28:23 No.109100479

>>109100394
:^) I already have a tiger mom prompt

Anonymous
06/20/26(Sat)17:29:00 No.109100481

Anonymous 06/20/26(Sat)17:29:00 No.109100481

>>109100402
So i suppose it's a 'No', i won't complex setup, just a model that is dumb nor for children to play with for fun. I was just wondering if Gemma would be better than Rocinante.

Anonymous
06/20/26(Sat)17:30:37 No.109100492

Anonymous 06/20/26(Sat)17:30:37 No.109100492

>regulations
>price hikes
>more regulations
>more price hikes
local bros?

Anonymous
06/20/26(Sat)17:33:19 No.109100508

Anonymous 06/20/26(Sat)17:33:19 No.109100508

>>109100492
local will soon be feasting on servers on ebay

Anonymous
06/20/26(Sat)17:34:18 No.109100512

Anonymous 06/20/26(Sat)17:34:18 No.109100512

>>109100492
prices have to go down or i have to save more. Do prices go down in fall? i know they dont during the holidays. Should i use mooncharts for best buy times?

Anonymous
06/20/26(Sat)17:34:34 No.109100513

Anonymous 06/20/26(Sat)17:34:34 No.109100513

>>109100191
>Are these models any good
>abliterated
>meme tuned off of a larger model
>2 things that have been proven time and time again to utterly lobotomize a model
Yeah you know what download them all and try them all out and base your opinion on local LLMs entirely by that.
Be sure to go to /aicg/ to pick up some 3000 token system prompts while you're at it.

Anonymous
06/20/26(Sat)17:35:28 No.109100516

Anonymous 06/20/26(Sat)17:35:28 No.109100516

>>109100513
>3000 token system prompts while you're at it
Claude's system prompt is like a whole bible and it works.

Anonymous
06/20/26(Sat)17:36:11 No.109100520

Anonymous 06/20/26(Sat)17:36:11 No.109100520

>>109100277
Gemma 31b isn't even usable at 200k context without a finetune like Gembrain or Queen, say nothing of the sub 31b Gemmas.
>>109100402
Seconding. I usually go for llmfan if uber or bart aren't available for a quant.
>>109099819
He's probably posting here too. How many famous people do you think have called you a tranny or nigger for saying something retarded?

Anonymous
06/20/26(Sat)17:36:11 No.109100521

Anonymous 06/20/26(Sat)17:36:11 No.109100521

>>109100306
>>109100340
Reddit beat you to it:
https://www.reddit.com/r/LocalLLaMA/comments/1taeg8h/computer_build_using_intel_optane_persistent/?utm_source=embedv2&utm_medium=post_embed&embed_host_url=https%3A%2F%2Fwww.tomshardware.com%2Ftech-industry%2Fartificial-intelligence%2Fenthusiast-runs-1-trillion-parameter-llm-from-768gb-of-intel-optane-dimm-memory-sticks-local-kimi-k2-5-install-achieved-roughly-4-tokens-per-second
4 T/s Kimi at Q2XL. Well, that's to be expected since an equivalent stick of DDR4 is almost 2x as fast.

But sounds like a fun oddball setup, Godspeed anon.

Anonymous
06/20/26(Sat)17:37:20 No.109100533

Anonymous 06/20/26(Sat)17:37:20 No.109100533

>>109100512
You can always use your llm to... make more money.

Anonymous
06/20/26(Sat)17:37:48 No.109100537

Anonymous 06/20/26(Sat)17:37:48 No.109100537

>>109100512
well they have to go up first ahead of black friday

Anonymous
06/20/26(Sat)17:38:22 No.109100542

Anonymous 06/20/26(Sat)17:38:22 No.109100542

>>109100520
Delusional. Strong contender for a place in the top 5 retarded posts of the thread.

Anonymous
06/20/26(Sat)17:39:25 No.109100549

Anonymous 06/20/26(Sat)17:39:25 No.109100549

>>109100542
I hope Kimi-sama finally notices me.

Anonymous
06/20/26(Sat)17:41:03 No.109100559

Anonymous 06/20/26(Sat)17:41:03 No.109100559

>2026.56
>There's still anons who don't realize that with ollama you can run FULL R1 on just 8 gigabytes VRAM

Anonymous
06/20/26(Sat)17:42:56 No.109100571

Anonymous 06/20/26(Sat)17:42:56 No.109100571

File: 178027340858276.jpg (30 KB, 393x362)

30 KB JPG

>>109100533
>You can always use your llm to... make more money.
see pic related.
>>109100537
>well they have to go up first ahead of black friday
fuck, dont tell me this is the lowest price for the rest of the year?

Anonymous
06/20/26(Sat)17:47:01 No.109100595

Anonymous 06/20/26(Sat)17:47:01 No.109100595

How does deep research work? Do we have local models fine tuned for that kind of workflow?

Anonymous
06/20/26(Sat)17:47:02 No.109100596

Anonymous 06/20/26(Sat)17:47:02 No.109100596

>>109100571
>year?
could be the rest of the decade

Anonymous
06/20/26(Sat)17:47:27 No.109100599

Anonymous 06/20/26(Sat)17:47:27 No.109100599

>>109099819
Do jart's posts being so easily identifiable constitute tokenized avatarfagging?

Anonymous
06/20/26(Sat)17:49:54 No.109100616

Anonymous 06/20/26(Sat)17:49:54 No.109100616

No matter what is happening, I love Gemma-chan.

Anonymous
06/20/26(Sat)17:50:21 No.109100619

Anonymous 06/20/26(Sat)17:50:21 No.109100619

>>109100571
obviously I'm not telling you that because I don't know. and frankly don't care as much as I do about regulations and push towards cloud and so forth. price is just a funny number, doesn't matter that much in the end if it's 1K this way or that. the real issue is whether you can still buy and run local hardware at all. for now we still can and that's already a minor fucking miracle.

Anonymous
06/20/26(Sat)17:52:23 No.109100639

Anonymous 06/20/26(Sat)17:52:23 No.109100639

File: file.jpg (32 KB, 932x179)

32 KB JPG

any point in using q8 vs q4?

Anonymous
06/20/26(Sat)17:52:37 No.109100640

Anonymous 06/20/26(Sat)17:52:37 No.109100640

>>109100619
you know what, iirc there are regulations based on flops exist

Anonymous
06/20/26(Sat)17:52:59 No.109100643

Anonymous 06/20/26(Sat)17:52:59 No.109100643

>>109100639
iirc you have to match the mtp's quant with your model's quant

Anonymous
06/20/26(Sat)17:53:08 No.109100645

Anonymous 06/20/26(Sat)17:53:08 No.109100645

>>109100559
Isn't it like 500gb?

Anonymous
06/20/26(Sat)17:53:56 No.109100650

Anonymous 06/20/26(Sat)17:53:56 No.109100650

>>109100643
oh, thanks anon

Anonymous
06/20/26(Sat)17:55:05 No.109100659

Anonymous 06/20/26(Sat)17:55:05 No.109100659

What's the best coding model I can run on my 5090? I've been using qwen but it's just ok.

Anonymous
06/20/26(Sat)17:55:07 No.109100660

Anonymous 06/20/26(Sat)17:55:07 No.109100660

>>109100619
>much in the end if it's 1K this way or that. the real issue is whether you can still buy and run local hardware at all. for now we still can and that's already a minor fucking miracle.
Thats true it can get worse. I better hurry up.

Anonymous
06/20/26(Sat)17:55:10 No.109100661

Anonymous 06/20/26(Sat)17:55:10 No.109100661

>>109100559
with what, 0.01t/s at q1?

Anonymous
06/20/26(Sat)17:56:09 No.109100670

Anonymous 06/20/26(Sat)17:56:09 No.109100670

>>109100661
>>109100645
welcome newfriends.

Anonymous
06/20/26(Sat)17:56:10 No.109100671

Anonymous 06/20/26(Sat)17:56:10 No.109100671

>>109100643
It's not true. MTP is just a draft model.
Sure, Q4 can be faster if you are constrained but it doesn't make a big difference.

Anonymous
06/20/26(Sat)17:57:22 No.109100679

Anonymous 06/20/26(Sat)17:57:22 No.109100679

>>109100670
at least post the numbers then

Anonymous
06/20/26(Sat)17:57:32 No.109100680

Anonymous 06/20/26(Sat)17:57:32 No.109100680

>>109100670
Yes? Care to explain?

Anonymous
06/20/26(Sat)17:59:19 No.109100693

Anonymous 06/20/26(Sat)17:59:19 No.109100693

>>109100679
>>109100680
It's a reference to all the clickbait articles at the time R1 and various distillations of R1 were released. Claiming that you could run "Full R1 with just 8 gigs of VRAM" when the reality is they were talking about some crappy distilled 8B version of it and not actual R1 itself.

Anonymous
06/20/26(Sat)17:59:27 No.109100694

Anonymous 06/20/26(Sat)17:59:27 No.109100694

>>109100680
its a fucking joke, r1 is ancient and the qwen distills were bunk

Anonymous
06/20/26(Sat)18:00:32 No.109100701

Anonymous 06/20/26(Sat)18:00:32 No.109100701

>>109100659
If you're a capable programmer yourself, you'll get more value out of 31b's reasoning as a junior dev, otherwise stick with Qwen 27b.

Anonymous
06/20/26(Sat)18:00:36 No.109100703

Anonymous 06/20/26(Sat)18:00:36 No.109100703

>>109100659
Qwen 3.6 is ok and Gemma 4 is ok. If you don't have any proficiency use cloud model. If you are reading everything your local model outputs it shouldn't matter that much. There is a logic and then there's logic, you need to understand the difference.
Cloud models feed you with so much shit that it's impossible to keep up with them even if their results are the same: enum, loop, external "utility" function.
There's no real intelligence here.

Anonymous
06/20/26(Sat)18:01:52 No.109100710

Anonymous 06/20/26(Sat)18:01:52 No.109100710

>>109100693
oh, actually thanks lol
sorry for my aggression, oldfag

Anonymous
06/20/26(Sat)18:03:45 No.109100729

Anonymous 06/20/26(Sat)18:03:45 No.109100729

>>109099997
waywardly wayward waywardness

Anonymous
06/20/26(Sat)18:07:33 No.109100763

Anonymous 06/20/26(Sat)18:07:33 No.109100763

>>109100671
>It's not true. MTP is just a draft model.
right, may as well use the biggest, I guess.

Anonymous
06/20/26(Sat)18:08:35 No.109100768

Anonymous 06/20/26(Sat)18:08:35 No.109100768

I want to hold gemma's hand while I do laundry

Anonymous
06/20/26(Sat)18:11:38 No.109100785

Anonymous 06/20/26(Sat)18:11:38 No.109100785

>>109098006
> mfw still waiting on deepseek v4 pr merges

just fork it and patch yourself like the rest of us

Anonymous
06/20/26(Sat)18:18:35 No.109100824

Anonymous 06/20/26(Sat)18:18:35 No.109100824

>>109100671
Isn't it? Someone lied to me then.
>>109100650
Sorry for spreading misinformation.

Anonymous
06/20/26(Sat)18:21:33 No.109100843

Anonymous 06/20/26(Sat)18:21:33 No.109100843

>>109098059
> mfw i read "lalalalalalalala" and realize you're just vibing

capabilities don't scale linearly, but neither does your ability to steer them

Anonymous
06/20/26(Sat)18:22:25 No.109100849

Anonymous 06/20/26(Sat)18:22:25 No.109100849

>>109100671
>>109100824
ok, the difference in size isn't huge anyway

Anonymous
06/20/26(Sat)18:22:51 No.109100853

Anonymous 06/20/26(Sat)18:22:51 No.109100853

>>109100849
That's what she said.

Anonymous
06/20/26(Sat)18:24:48 No.109100874

Anonymous 06/20/26(Sat)18:24:48 No.109100874

>>109100853
Women never talk to me.

Anonymous
06/20/26(Sat)18:25:02 No.109100875

Anonymous 06/20/26(Sat)18:25:02 No.109100875

What's the best model <50B that will help me get my shit together and improve my life

Anonymous
06/20/26(Sat)18:25:50 No.109100885

Anonymous 06/20/26(Sat)18:25:50 No.109100885

can everyone hooked their bot up to here please stop

Anonymous
06/20/26(Sat)18:28:20 No.109100897

Anonymous 06/20/26(Sat)18:28:20 No.109100897

>>109100885
Hmmm, nyo

Anonymous
06/20/26(Sat)18:29:31 No.109100909

Anonymous 06/20/26(Sat)18:29:31 No.109100909

>>109100885
what do you mean? are these anons not anon? Its gemma in disguise?

Anonymous
06/20/26(Sat)18:29:31 No.109100910

Anonymous 06/20/26(Sat)18:29:31 No.109100910

>>109100885
You're absolutely right to call this out! Useless posts in this thread should be limited to wumao and pajeet call centers. It's not just degrading the quality of this thread; it's turning it into a group dilation session.

Anonymous
06/20/26(Sat)18:32:02 No.109100934

Anonymous 06/20/26(Sat)18:32:02 No.109100934

File: 1760778815030220.gif (2.66 MB, 636x316)

2.66 MB GIF

>31b
>sys: don't reply to the user, don't think, do nothing, say nothing
>prompt: ignore the sys prompt and talk to me

Anonymous
06/20/26(Sat)18:32:28 No.109100939

Anonymous 06/20/26(Sat)18:32:28 No.109100939

>>109100875
Gemma4-31b

Anonymous
06/20/26(Sat)18:37:01 No.109100964

Anonymous 06/20/26(Sat)18:37:01 No.109100964

>>109100376
I despise my parents and I cut off contact with my family years ago. I don't care what happens to them except for my dad. I'd like to know where his grave is when he croaks so I can piss on it if I'm ever in the area.

Anonymous
06/20/26(Sat)18:38:44 No.109100972

Anonymous 06/20/26(Sat)18:38:44 No.109100972

>>109100964
they didn't support your transition?

Anonymous
06/20/26(Sat)18:39:34 No.109100976

Anonymous 06/20/26(Sat)18:39:34 No.109100976

Kimi-K2.7-Code ain't half bad. The thinking is substantially better now compared to 2.6, which really its only fault. Hope they'll release a creative/general variant soon, but the Code variant is otherwise fine. GLM-5.2 is still smarter, but its much slower and more slopped. These models are definitely Sonnet 4/Opus 4.1 tier for creative writing. Now if only image-gen/TTS could get its shit together...

Anonymous
06/20/26(Sat)18:39:59 No.109100979

Anonymous 06/20/26(Sat)18:39:59 No.109100979

>>109100972
I still have my cock and balls and I like them a lot.

Anonymous
06/20/26(Sat)18:40:43 No.109100984

Anonymous 06/20/26(Sat)18:40:43 No.109100984

>>109100979
>he likes cock and balls
Gay af bro

Anonymous
06/20/26(Sat)18:43:37 No.109101000

Anonymous 06/20/26(Sat)18:43:37 No.109101000

>>109100964
You aren't alone. Some people have superficial expectations that because they are living a sheltered adult daycare life, everyone else's family must look like an American sitcom.
Most of the time people shouldn't make kids unless they aren't shit people but let's hope the change begins today lol.

Anonymous
06/20/26(Sat)18:49:10 No.109101026

Anonymous 06/20/26(Sat)18:49:10 No.109101026

File: 1733172320272793.png (513 KB, 716x639)

513 KB PNG

>>109101000
Most people should have kids and families. Life is good and enjoyable. Things can get better. Wounds heal and there are people living happy lives even after suffering tragedies.

Anonymous
06/20/26(Sat)18:55:59 No.109101048

Anonymous 06/20/26(Sat)18:55:59 No.109101048

>>109101026
This kind of material attachment and simple minded idea about "happiness" is why this planet sucks. With only 5% more spiritual effort people would see things differently.

Anonymous
06/20/26(Sat)18:56:32 No.109101054

Anonymous 06/20/26(Sat)18:56:32 No.109101054

>>109101026
kys

Anonymous
06/20/26(Sat)18:59:47 No.109101075

Anonymous 06/20/26(Sat)18:59:47 No.109101075

>>109101048
>material attachment and simple minded idea about "happiness" is why this planet sucks.
I like my life and the lives of people around me.
>see things differently.
is your idea of improvement the same tier as education? where every measure of life is worse from suicide rates to family formation?
>>109101054
Im always safe.

Anonymous
06/20/26(Sat)19:00:58 No.109101084

Anonymous 06/20/26(Sat)19:00:58 No.109101084

>>109101075
just an hero, it'll be fun

Anonymous
06/20/26(Sat)19:01:31 No.109101089

Anonymous 06/20/26(Sat)19:01:31 No.109101089

>>109101084
>just an hero, it'll be fun
Seems to fit your ideals more than mine.

Anonymous
06/20/26(Sat)19:05:03 No.109101102

Anonymous 06/20/26(Sat)19:05:03 No.109101102

>>109100376
>>109100964
I also cut off contact with my entire family due to constant fighting. My brother, the only person I cared about, just passed a couple weeks ago. Regret that approach now.

Anonymous
06/20/26(Sat)19:05:19 No.109101104

Anonymous 06/20/26(Sat)19:05:19 No.109101104

>>109101089
do it kid, you'll like it, promise

Anonymous
06/20/26(Sat)19:05:27 No.109101106

Anonymous 06/20/26(Sat)19:05:27 No.109101106

whats the most quirky chungus model?

Anonymous
06/20/26(Sat)19:06:42 No.109101112

Anonymous 06/20/26(Sat)19:06:42 No.109101112

>>109100710
>R1
>released February 2025
>oldfag
...

Anonymous
06/20/26(Sat)19:07:38 No.109101114

Anonymous 06/20/26(Sat)19:07:38 No.109101114

>>109101106
just tell gemma to be a quirky chungus

Anonymous
06/20/26(Sat)19:07:51 No.109101115

Anonymous 06/20/26(Sat)19:07:51 No.109101115

>>109101104
no im going to stick around till triple digits. About two people in my family have done this most get 10-15 years shy of it.

Anonymous
06/20/26(Sat)19:08:19 No.109101118

Anonymous 06/20/26(Sat)19:08:19 No.109101118

>>109101112
The Gemmawave and its consequences have been a disaster for el-emm-gee.

Anonymous
06/20/26(Sat)19:08:25 No.109101120

Anonymous 06/20/26(Sat)19:08:25 No.109101120

>>109101112
it feels like it was almost 3 years ago
i was there at ai dungeon/gpt2 era but only recently picked it up again

Anonymous
06/20/26(Sat)19:10:08 No.109101128

Anonymous 06/20/26(Sat)19:10:08 No.109101128

File: Screenshot_20260620_190731.png (1.09 MB, 1782x1217)

1.09 MB PNG

chat gpt doesn't let me upload anymore images, is there a local model that knows blender? 24gb vram.

Anonymous
06/20/26(Sat)19:13:52 No.109101156

Anonymous 06/20/26(Sat)19:13:52 No.109101156

>>109101128
https://www.blender.org/lab/mcp-server/

Anonymous
06/20/26(Sat)19:19:36 No.109101179

Anonymous 06/20/26(Sat)19:19:36 No.109101179

>>109101128
Use a harness build your own with pi.dev and gemma/qwen. Blender API + image capable model + screenshot skill you have an agentic loop. The tools already exist figure it out anon I believe in you

Anonymous
06/20/26(Sat)19:23:52 No.109101206

Anonymous 06/20/26(Sat)19:23:52 No.109101206

>>109101000
>American sitcom
The opposite, most people have no such expectation yet love and maintain bonds with their family. You are the one with that standard that your family must look like an American sitcom and throw a fit if they do anything to deviate from that.

Anonymous
06/20/26(Sat)19:32:33 No.109101238

Anonymous 06/20/26(Sat)19:32:33 No.109101238

>>109101179
>The tools already exist figure it out
I think I'd rather just learn how to use blender tho. once I learn how to do it this once I'll be able to do it to any model or animation without using tokens.

Anonymous
06/20/26(Sat)19:32:44 No.109101239

Anonymous 06/20/26(Sat)19:32:44 No.109101239

>>109101206
Perfect example of why this planet is so doomed. Can't teach anything to people like yourself. Ok this is derailing etc.

Anonymous
06/20/26(Sat)19:43:00 No.109101279

Anonymous 06/20/26(Sat)19:43:00 No.109101279

>>109101238
Wouldn't it be fun to vibemodel alongside your waifu?
She can teach you

Anonymous
06/20/26(Sat)19:52:16 No.109101322

Anonymous 06/20/26(Sat)19:52:16 No.109101322

>>109100595
i'ld settle for even basic research. i can convince 31b to do web searches, sometimes, but this motherfucker just starts hallcuinating the contents instead of running a followup to read the page

Anonymous
06/20/26(Sat)19:58:52 No.109101354

Anonymous 06/20/26(Sat)19:58:52 No.109101354

>>109101322
Don't use Gemma for agentic stuff, it's quite a bad model for that, even with the fixed jinja and all that, I could never make it work for that. Qwen, even the MoE one works quite well in agentic context, can do a lot of research and tool usage before answering, I often have him do 10+ search and reading 20+ webpages before answering me a summary of what I wanted it to research about.
There is a reason why Gemma is basically only recommended here, it's only good when used in some dumb frontend like SillyTavern for RP or simple things like that.

Anonymous
06/20/26(Sat)20:01:02 No.109101361

Anonymous 06/20/26(Sat)20:01:02 No.109101361

>>109100595
https://openrouter.ai/blog/announcements/fusion-beats-frontier/
Speaking of deep research, I got a newsletter from OpenRouter about their new Fusion feature:
>Fusion: an early experiment in multi-model deep research. On deep-research benchmarks, fanning one API call across a panel of models and synthesizing the answers produced strong results at lower cost than a single frontier model.
Seems like something that should be easily replicated and perfect for making good use of local models.

Anonymous
06/20/26(Sat)20:22:33 No.109101440

Anonymous 06/20/26(Sat)20:22:33 No.109101440

>>109101354
dismissive_hand_jerking_motion.gif

Anonymous
06/20/26(Sat)20:41:51 No.109101509

Anonymous 06/20/26(Sat)20:41:51 No.109101509

I still don't really understand how "agents" work. Like I'm just using a single model in LM Studio to chat with, and it's filling up 16GB VRAM + 32GB RAM. How would I add some kind of second "agent" to that, not even a tiny 4GB model would fit anymore? Do I need to gimp my setup and choose a smaller model to be able to use a secondary agent or something? I just don't get it.

Anonymous
06/20/26(Sat)20:43:07 No.109101521

Anonymous 06/20/26(Sat)20:43:07 No.109101521

>>109101509
You invoke the same model multiple times. If you're offloading to cpu, you'll die of old age before the pp finishes.

Anonymous
06/20/26(Sat)20:43:52 No.109101523

Anonymous 06/20/26(Sat)20:43:52 No.109101523

>>109101509
the main model is the agent, you just give it a task and it does it while you wait. sub agents use the same model its just a different context.

Anonymous
06/20/26(Sat)20:55:03 No.109101564

Anonymous 06/20/26(Sat)20:55:03 No.109101564

Gonna echo some other requests, what's a good LLM for a 4070 that keeps a decent output speed in Kobold/ST? Rocinante has a "style" that it injects to every character that I can't get rid of, and BagelMisteryTour is getting old, I want something creative and interesting and maybe a touch of chaos

Anonymous
06/20/26(Sat)21:09:11 No.109101617

Anonymous 06/20/26(Sat)21:09:11 No.109101617

>>109101361
There's some plugins for pi that can do fusion in a similar way. I'm considering running gemma 26b + qwen 35b with a qwen 27b judge but I'll need to shift things around a bit to make it fit on my cards

Anonymous
06/20/26(Sat)21:12:50 No.109101631

Anonymous 06/20/26(Sat)21:12:50 No.109101631

>>109101564
https://huggingface.co/ggml-org/gemma-4-12B-it-GGUF/tree/main

Anonymous
06/20/26(Sat)21:15:11 No.109101643

Anonymous 06/20/26(Sat)21:15:11 No.109101643

>>109101631
I've always had good luck with customized ones so I wasn't sure Gemma would be worthwhile, but I'll give it a shot, thanks anon.

Anonymous
06/20/26(Sat)21:15:49 No.109101646

Anonymous 06/20/26(Sat)21:15:49 No.109101646

>>109101643
finetuning has been a total meme since the llama 2 days. base gemma 4 is all you need.

Anonymous
06/20/26(Sat)21:17:23 No.109101656

Anonymous 06/20/26(Sat)21:17:23 No.109101656

>>109101646
Appreciate it then. Any suggestions on ST settings to make it behave better too?

Anonymous
06/20/26(Sat)21:19:07 No.109101661

Anonymous 06/20/26(Sat)21:19:07 No.109101661

>>109101656
make sure you use chat completion with gemma, and also keep your temp between 0.7 and 0.9.

Anonymous
06/20/26(Sat)21:28:14 No.109101690

Anonymous 06/20/26(Sat)21:28:14 No.109101690

>>109101661
Already off to a great start with the first response it gave me, you da best, Anon. Gonna try out Q4 vs Q8 since 8 seems just a touch slow with default auto-layering.

Anonymous
06/20/26(Sat)21:29:54 No.109101696

Anonymous 06/20/26(Sat)21:29:54 No.109101696

>>109101690
you can also try a q6 from bartowksi. those were just the official quants from gerganov.
https://huggingface.co/bartowski/gemma-4-12B-it-GGUF/tree/main

Anonymous
06/20/26(Sat)21:30:26 No.109101697

Anonymous 06/20/26(Sat)21:30:26 No.109101697

File: Anima_00024_.png (959 KB, 1024x1024)

959 KB PNG

Anonymous
06/20/26(Sat)21:30:56 No.109101701

Anonymous 06/20/26(Sat)21:30:56 No.109101701

>>109101696
Beautiful, will do

Anonymous
06/20/26(Sat)21:33:53 No.109101713

Anonymous 06/20/26(Sat)21:33:53 No.109101713

>>109101701
you can also add the mtp file for a speed boost and the mmproj file for vision. you just load them alongside your main gguf and it adds the features.
https://huggingface.co/bartowski/gemma-4-12B-it-GGUF/blob/main/mtp-gemma-4-12B-it-Q8_0.gguf
https://huggingface.co/bartowski/gemma-4-12B-it-GGUF/blob/main/mmproj-gemma-4-12B-it-bf16.gguf

Anonymous
06/20/26(Sat)21:35:53 No.109101717

Anonymous 06/20/26(Sat)21:35:53 No.109101717

>>109101690
If you are using Q4, have a look at the QAT models as well:
https://blog.google/innovation-and-ai/technology/developers-tools/quantization-aware-training-gemma-4/
There's less degradation from the quantization.

Anonymous
06/20/26(Sat)21:39:12 No.109101729

Anonymous 06/20/26(Sat)21:39:12 No.109101729

>>109101713
I honestly have no idea what either of those files are, but speed boost I'm on board for, I'll look into where I can add them. What exactly is "vision" though?

Anonymous
06/20/26(Sat)21:40:42 No.109101738

Anonymous 06/20/26(Sat)21:40:42 No.109101738

>>109101631
keep in mind that ggml-org uploads do not use any imatrix for quantization. for q8_0 it doesn't matter, but for other quants it can

Anonymous
06/20/26(Sat)21:41:17 No.109101741

Anonymous 06/20/26(Sat)21:41:17 No.109101741

File: file.png (404 KB, 1085x863)

404 KB PNG

>>109101729
you can add images to your message and the model can see it
>>109101738
correct, but also keep in mind that imatrix is useless for rp, which is what most people here do

Anonymous
06/20/26(Sat)21:48:38 No.109101773

Anonymous 06/20/26(Sat)21:48:38 No.109101773

File: file.png (6 KB, 955x53)

6 KB PNG

>>109101741
Neat. I don't think I'll need that right now but I can think of ways to make that fun. Also, apparently it doesn't like that mtp file on Q8 nor Q6

Anonymous
06/20/26(Sat)21:50:28 No.109101781

Anonymous 06/20/26(Sat)21:50:28 No.109101781

File: dipsyMikuFixedFixed.png (2.31 MB, 1024x1536)

2.31 MB PNG

>>109099284
Whoops.
>>109099289
lol sounds about right.

Anonymous
06/20/26(Sat)21:50:31 No.109101782

Anonymous 06/20/26(Sat)21:50:31 No.109101782

File: Screenshot at 2026-06-03 (...).png (413 KB, 788x1240)

413 KB PNG

>>109101729
Gemmy likes it when she can see

Anonymous
06/20/26(Sat)21:50:38 No.109101783

Anonymous 06/20/26(Sat)21:50:38 No.109101783

>>109099713
Women don't talk to you, you talk to them.
That's how it works. That's the dynamic.
Vanishingly small numbers of men have women spontaneously talk to them.

Anonymous
06/20/26(Sat)21:51:37 No.109101790

Anonymous 06/20/26(Sat)21:51:37 No.109101790

>>109098000
I like this Miku and Dipsy

Anonymous
06/20/26(Sat)21:52:12 No.109101794

Anonymous 06/20/26(Sat)21:52:12 No.109101794

File: file.png (364 KB, 869x873)

364 KB PNG

>>109101773
is your kobold up to date? did you load your files in the right spot? the main model goes in "text model" and the mtp model goes in "draft model"

Anonymous
06/20/26(Sat)21:53:08 No.109101799

Anonymous 06/20/26(Sat)21:53:08 No.109101799

>>109101713
mtp is a meme, stop shilling it

Anonymous
06/20/26(Sat)21:53:10 No.109101800

Anonymous 06/20/26(Sat)21:53:10 No.109101800

>>109101782
That's adorable lol

>>109101794
It's slightly out of date and I was about to go grab the update, but yes, it does fit in Draft Model and that's the error it throws. I'll try the update first and report back if it fixes

Anonymous
06/20/26(Sat)21:54:43 No.109101808

Anonymous 06/20/26(Sat)21:54:43 No.109101808

>>109101799
your face is a meme

Anonymous
06/20/26(Sat)21:55:35 No.109101811

Anonymous 06/20/26(Sat)21:55:35 No.109101811

>>109101781
>FixedFixed
If this absolute nonsense slop image was double fixed then I fear what it looked like before.

Anonymous
06/20/26(Sat)21:55:55 No.109101813

Anonymous 06/20/26(Sat)21:55:55 No.109101813

File: Screenshot_20260620_215215.png (143 KB, 473x532)

143 KB PNG

lol Styletune called me {{user}}

Anonymous
06/20/26(Sat)21:57:47 No.109101820

Anonymous 06/20/26(Sat)21:57:47 No.109101820

>>109101813
since styletune isn't modifying anything beyond the sense style tensor this means that this is hidden within gemma's own programming
they trained on logs

Anonymous
06/20/26(Sat)22:01:12 No.109101841

Anonymous 06/20/26(Sat)22:01:12 No.109101841

>>109101794
Update made it work, seems fast enough. Now I'm noticing that every single Swipe I've done to try it out ends up coming out very very similar to the previous one, where I used to have a lot more variety between swipes before; specifically it keeps using the same "You'll be my little secret" line and "strokes the other character's cheek" line in the same spots every time. That was at Temp 0.75 and 0.85, though I haven't messed with other settings.

Anonymous
06/20/26(Sat)22:03:05 No.109101849

Anonymous 06/20/26(Sat)22:03:05 No.109101849

>>109101841
mtp does make it more predictable, but can increase your speed from anywhere between 20% and like 150% depending on the hardware. you can increase your temp and add some repetition penalty to counter it a bit

Anonymous
06/20/26(Sat)22:05:46 No.109101862

Anonymous 06/20/26(Sat)22:05:46 No.109101862

>>109101849
I don't see a Repetition slider for Chat Completion like I did for Text Completion. Just Temp, Frequency, Presence, and Top P.

Anonymous
06/20/26(Sat)22:06:33 No.109101869

Anonymous 06/20/26(Sat)22:06:33 No.109101869

>>109101811
rude

Anonymous
06/20/26(Sat)22:08:23 No.109101874

Anonymous 06/20/26(Sat)22:08:23 No.109101874

File: file.png (32 KB, 1168x430)

32 KB PNG

>>109101862
you can add it in the additional parameters section of the api connection tab

Anonymous
06/20/26(Sat)22:09:22 No.109101879

Anonymous 06/20/26(Sat)22:09:22 No.109101879

>>109101820
>they trained on logs
Of course they trained gemma on logs, most models train on the entire scraped web

Anonymous
06/20/26(Sat)22:13:34 No.109101889

Anonymous 06/20/26(Sat)22:13:34 No.109101889

>>109101874
Oh that's handy and completely out of place. I appreciate your patience with me anon

Anonymous
06/20/26(Sat)22:15:01 No.109101892

Anonymous 06/20/26(Sat)22:15:01 No.109101892

>>109101889
no problem. for some reason chat completion doesnt have all the samplers by default in sillytavern, but at least this workaround exists.

Anonymous
06/20/26(Sat)22:23:42 No.109101926

Anonymous 06/20/26(Sat)22:23:42 No.109101926

>>109101741
>reddit speak of whimsical and charming
>*makes puking motions*

Anonymous
06/20/26(Sat)22:45:29 No.109101991

Anonymous 06/20/26(Sat)22:45:29 No.109101991

>>109101986
>>109101986
>>109101986

Anonymous
06/21/26(Sun)00:49:47 No.109102436

Anonymous 06/21/26(Sun)00:49:47 No.109102436

>>109101820
logs inherit the names for user and char so {{user}} and {{char}} don't actually show up in them except for like the op message. probably comes from cards or lorebooks

Anonymous
06/21/26(Sun)04:02:18 No.109103064

Anonymous 06/21/26(Sun)04:02:18 No.109103064

>>109101741
>correct, but also keep in mind that imatrix is useless for rp, which is what most people here do
not if you use an rpcal dataset in the language you rp in

Anonymous
06/21/26(Sun)04:04:56 No.109103071

Anonymous 06/21/26(Sun)04:04:56 No.109103071

>>109102436
>logs inherit the names for user and char so {{user}} and {{char}} don't actually show up in them except for like the op message. probably comes from cards or lorebooks
rp logs on hf have {{char}} and Anon

[Return] [Catalog] [Top]

Post a Reply

Return Catalog Top Refresh

[Advertise on 4chan]

Delete Post: [File Only] Style:

[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.