/g/ - Technology
File: contentious investors.jpg (155 KB, 1216x832)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108680580 & >>108676460

►News
>(04/24) DeepSeek-V4 Pro 1.6T-A49B and Flash 284B-A13B released: https://hf.co/collections/deepseek-ai/deepseek-v4
>(04/23) LLaDA2.0-Uni multimodal text diffusion model released: https://hf.co/inclusionAI/LLaDA2.0-Uni
>(04/23) Hy3 preview released with 295B-A21B and 3.8B MTP: https://hf.co/tencent/Hy3-preview
>(04/22) Qwen3.6-27B released: https://hf.co/Qwen/Qwen3.6-27B
>(04/20) Kimi K2.6 released: https://kimi.com/blog/kimi-k2-6

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: no particular reason.jpg (306 KB, 1536x1536)
►Recent Highlights from the Previous Thread: >>108680580

--KV cache quantization sensitivity and settings for Gemma 4:
>108682045 >108682053 >108682062 >108682081 >108682104 >108682236 >108682257 >108682807 >108682814 >108682826 >108682180 >108682182 >108682192 >108682109 >108682122 >108682241 >108682121
--Comparing DeepSeek V4 and Gemma for roleplay and instruction following:
>108680865 >108680920 >108680966 >108680949 >108681043 >108683738 >108683785 >108683967 >108684017 >108684060
--Debating Gemma 4 vs Qwen 3.6 regarding quantization and divergence:
>108682213 >108682226 >108682227 >108682258 >108682280
--Handling reasoning_content in frontends to ensure chat template compatibility:
>108682262 >108682277 >108682301 >108682332 >108682371
--Comparing goose and opencode AI agents with focus on privacy:
>108680996 >108681075 >108681087 >108681434 >108681484 >108681155 >108681233 >108681206 >108681251 >108681267
--llama.cpp RAM usage and performance testing on 3060 rig:
>108682861 >108683548 >108683619 >108683710 >108685255 >108682889 >108683264 >108683293
--Discussing the minimal impact of rotation on Gemma:
>108682698 >108682713 >108682730
--Sharing refined Post-History Instructions for roleplaying with Gemma 4:
>108684854 >108684893 >108685016 >108685037 >108684905
--Speculating if Gemma's response to policy overrides stems from training:
>108681656 >108681673 >108681688 >108681702 >108681718 >108681709
--Frontend development and model failures in roleplay narratives:
>108682693 >108682759 >108682806 >108682825 >108682857 >108684018 >108684050 >108684082
--DeepSeek-V4's structural resistance to abliteration:
>108681395 >108681767
--Logs:
>108681643 >108682693 >108683199 >108683687 >108683710 >108684178 >108684256 >108684378
--Uta, Teto, Miku (free space):
>108680923 >108681710 >108682121 >108682368 >108684183 >108684820 >108685316

►Recent Highlight Posts from the Previous Thread: >>108680587

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
Should I try to run Dipsy on 1x 3090 + 1x 5080 + 64GB DDR4, or is it a lost cause? Has anyone with a similar setup tried it?
>>
So SWA means that my entire prompt gets reprocessed every message or what
>>
>>108685775
ollama run deepseek-r1
>>
>>108685780
It needs to make checkpoints every now and then. Tune --checkpoint-every-n-tokens. It defaults to 8192. Set it to 1k or whatever.
>>
>>108685756
I love benchmarks
>>
>>108685780
If you only add, no. If you change a single token, then yes. That's why checkpoints help >>108685798
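In toy form (a sketch of the caching logic only, not llama.cpp's actual implementation; the interval is the flag mentioned above):
[code]
# Toy model of prompt-cache reuse with periodic state checkpoints.
# Assumes a checkpoint is kept every `interval` tokens (8192 default above);
# illustrative only, not llama.cpp's real code.

def tokens_to_reprocess(old_prompt, new_prompt, interval=8192):
    # longest shared prefix between cached and new prompt
    shared = 0
    for a, b in zip(old_prompt, new_prompt):
        if a != b:
            break
        shared += 1
    if shared == len(old_prompt):
        # pure append: only the new tail needs processing
        return len(new_prompt) - len(old_prompt)
    # edit mid-prompt: roll back to the last checkpoint before the edit
    # and reprocess everything after it
    last_checkpoint = (shared // interval) * interval
    return len(new_prompt) - last_checkpoint

old = list(range(20000))
new = old[:9000] + [-1] + old[9001:]        # one token changed at pos 9000
print(tokens_to_reprocess(old, new))         # 11808 with 8k checkpoints
print(tokens_to_reprocess(old, new, 1000))   # 11000 with 1k checkpoints
[/code]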
>>
>0.8TB
kek nobody will even bother making a gguf for deepseekv4
>>
>>108685775
You'll probably be able to run like an IQ1 of V4 Flash so i guess look forward to finding out how resilient it is to extreme quantisation
>>
https://github.com/ggml-org/llama.cpp/pull/22350
https://github.com/ggml-org/llama.cpp/pull/22350
https://github.com/ggml-org/llama.cpp/pull/22350
It's here. v4 any day now.
>>
True enlightenment is the understanding that you don't desire smarter, more emotional or literarily competent models.

What you truly desire is novelty. And because of that, no model can ever satisfy you for more than a few days before its charms turn into things that grate against you.
>>
>>108685812
It's an open secret that it's shit. All of the posts that even remotely talk positively about it have this subtext to them that try to minimize its flaws.
>>
Daily reminder to never ignore the smell of cloudslop nudging
>>
enough jibber jabber
gemma vs dipsy flash, who wins?
>>
>>108685825
no, I want the exact same model as v3.2 but with more knowledge so I don't have to waste tokens on lorebooks
>>
>>108678742
>>108680027
Any chance for a drop this thread anon?
>>
>>108681395
This is literally LLM slop
>>
>>108685822
v6 any day now!
>>
how do I make the LLM give complete answers in the response? I don't want a gemini style 2 paragraph quip, I want full wikipedia page length if need be
>>
>>108685851
Just ask it?
>>
>>108685829
>https://github.com/ggml-org/llama.cpp/pull/22350
idk I wanted to try it and unlike other models I dont have TBs of ram to convert them myself
>>
>>108685825
the only reason it seems like that is because ALL models right now are still fucking retarded
once they get to a decent baseline, then it will take much longer to get annoyed by them
>>
don't ubergarm goofs+ik_llama generally run faster? Why hasn't he done ones for gemma
>>
>>108685829
its crazy that it doesnt even have vision
>>
>>108685890
who the fuck cares about vision?
>>
>>108685857
what makes you assume that you can make ggufs of v4
>>
>>108685887
His fork has CPU optimizations and parallel inference for a handful of models.
Gemma runs on a single GPU so llama.cpp wins because of better usability.
>>
>>108685897
just run sudo gguf.sh ./deepseek-v4
>>
>>108685846
>This is literally LLM slop
And it's taken from a paper MoonshotAI published a few months ago.
I'm already working on a solution ready for Kimi-K3.
>>
>>108685887
>Why hasn't he done ones for gemma
Gemma-4 is broken in ik_llama.cpp
>>
>>108685894
nta, but optical recognition is a big part of what i use LLMs for, and evaluation for training
>>
>>108685840
Same question actually
>>
>>108685927
literally not a usecase unless you're blind
>>
>>108685937
>literally
>>
>>108685937
text models have no usecase unless you can't type
>>
since people mention forks, does anyone unironically use this https://github.com/spiritbuun/buun-llama-cpp

commit history suggests it's vibecoded trash and trying dflash advertised by the author crashes the server
>>
>>108685971
no shit? it's why this thread is full of thirdies
>>
File: h7txuz2onth.png (145 KB, 834x702)
>>108685937
>literally not a usecase unless you're blind
GUI app vibe slopping use case.
>>
>>108685972
>since people mention forks do anyone unironically use this
No. We use llama.cpp (PRC) or ik_llama.cpp (ROC)
>>
File: file.png (46 KB, 1040x217)
>>108685937
Pic related is a use case.
>>
>>108686028
holy kino...
>>
File: 1769207859640503.png (53 KB, 571x618)
>holy kino...
>>
File: pettan.webm (3.4 MB, 1280x720)
I opensourced it for those who wanted it
https://github.com/Susumeko/Pettangatari
I called it flat story because it represents our flat two dimensional wives and also our favorite breast size
here's the NSFW CG definitions, I haven't bundled them with the project itself because I know github can be annoying when it comes to nsfw.
https://files.catbox.moe/ihwt38.json

you can find all the features and guide on github, a more detailed guide is available in pettan itself when you launch it.
it's the first build so expect some jank.
let me know if there are any issues, I haven't tested it on linux or on any computer other than mine
I could also make a package later for you to play

>>108685840
yeah
>>108685758
hot
>>
>>108686098
>A SillyTavern frontend
>a frontend frontend
???
>>
>>108686103
yes
>>
>>108686104
gonna make a frontend on top of your frontend
>>
File: 1770319409932731.png (284 KB, 800x450)
>>108686098
>A sillytavern frontend
>>
>>108686103
>>108686119
just means the vn frontend tournament isn't a done deal
go forth and make your own
>>
>>108685972
I did to test a bit, its been a while, ill pull and try that new thing
>>
File: 1767465980797697.gif (1.87 MB, 400x300)
>>108686098
Damn, I sure hope normalfags won't see this or society will be done for
>>
>>108686098
Mogging
>>
>>108685894
Just try putting an image into a roleplay. Better yet. Try it with image edit models. It's a whole mostly untapped level of being.
>>
>>108686103
>>108686119
>>108686128
Breakthroughs are often messy. What matters is that the basic idea is iterated upon. I am sure this applies to Orb as well, someone will probably distill it in the future at some point by taking all the good ideas out of it and wrapping them up in a less bloated form. Same should apply here.
>>
>>108686098
Wow I already knew it would be bad from your screenshots and shilling before but this is even worse than I thought. I can't believe this is the state of /lmg/
>>
>>108686191
It's sad vibecoded typescript excrements are being hyped as breakthroughs after like 4 years of this hobby being a thing
>>
>>108686197
Let's see your frontend then
>>
>>108686191
>a shitty st clone
>"breakthroughs"
>>
>>108686206
SillyTavern is all you need
>>
>>108686197
Well too bad most people are too lazy to shit out anything useful or innovative and vibe coders have to make all the interesting things. If anything, everything being vibecoded is an indicator that the LLM sector has matured enough to produce things which are more than just technical demonstrations of the technology in question
>>
>>108686208
>>108686210
>>
File: 1754482839989736.jpg (135 KB, 612x611)
>>108686194
>>108686197
Faggots like you aren't worth shit, you're not even worth the vibeslop Claude is shitting because you have nothing to show for yourselves. Waste of oxygen, go bother someone else.
>>
>>108686210
The problem with an expensive hobby that requires a half-decent job to pay for all of the hardware is that most people who could make something like that well are employed and likely don't cherish the thought of coming home and working more for free.
>>
>>108686209
Remain content in the misery of ST then
>>
>>108686221
Stfu, I'm going to kill ShittyTavern
>>
>>108686191
sillytavern already gave me easy access to everything I needed to roleplay, this is just a personal project that I shared since there seemed to be interest, realistically nothing is stopping me from just skipping sillytavern in the future and going with my own koboldcpp implementation, since sillytavern seems to be that big of an "issue"
and yes, it's vibecoded because I wasn't planning to spend months on a personal project for the sole purpose of jacking off.
>>
>>108686209
All you need is love, anon
>>
>>108686224
If you see your hobby as more work then that's a you problem
>>
>>108686230
the sillytavern requirement*
>>
>>108686191
The idea of pre-generated VNs, or even sprites generated on the fly, has been floating around since the first llama leaked. Your iteration adds more bells and whistles than some anon's previous iteration but nothing groundbreaking, and what's worse, it's built upon somebody else's already functional app. Good for you on learning how to slop code but this isn't enough to start having wet entrepreneurial dreams.
>>
>>108686241
you're talking to the wrong person
>>
>>108686098
Thanks for sharing. The important thing is that it works. The haters are dumb. Don't reinvent the wheel. Building on top of what we already have is better than an overly ambitious project that never goes anywhere.
>>
>>108686194
>>108686197
if you could do better you would have
kys
>>
File: 1776394704402285.png (3.39 MB, 1792x2304)
>>108686229
Literally just copy Sillytavern, but make the options understandable to where everyone knows where everything is. Sillytavern currently suffers from the Dwarf Fortress Effect. The UI and instructions are shit.
>>
>>108686254
I don't think that's the problem with SillyTavern, the issue is that it's a piece of bloated web shit
>>
>>108686254
>silly kot with big boobs has a silly opinion
>>
To kill SillyTavern you need to kill llama.cpp first
>>
>>108686254
>Literally just copy Sillytavern
should take like 5min nbd
>>
>>108686234
The hobby is LLMs, not writing webshit user interfaces.
>>
If you're not writing LLM kernels you're doing this hobby wrong
>>
>>108686271
But the backend is in ts webshit lol
>>
>>108686254
>t. has never read the utter cancer that is SillyTavern code
>>
>>108686271
this, if you don't make your user interfaces from scratch in assembly don't even talk to me
>>
>>108686274
>If you're not writing LLM kernels you're doing this hobby wrong
And if you are, you're a schitzo.
>>
>>108686259
We should rewrite llama.cpp in Dart
>>
best stt for english/german?
>>
We should make GGUFs run themselves.
>>
>>108686294
this, but in Java
>>
if you don't train your own LLMs from scratch you don't belong here
>>
>>108686297
>>108686294
Rust is the superior code for talking to simulated lady boys and futas
>>
>>108686305
If you don't make your own wafer chips for your GPUs you don't belong here either
>>
>>108686296
>deepsneed.gguf.exe
also im pretty sure koboldcpp or someone has already invented this
>>
is there even any point of talking about deepseek in here when no one can run it locally?
>>
>>108686320
you have 26 years to acquire 5TB of RAM
>>
>>108686312
If you don't drink your own cum while wearing a sexy maid outfit you don't belong here.
>>
>>108686314
No, the gguf itself. We must go deeper.
>>
>>108686320
Not many talking about it now anyway. Maybe next year when llama supports it people will talk about how a q2 reap at 10 t/s is actually usable for certain definitions of usable.
>>
>>108686320
Deepseek is dead. These faggots are even proud of their new way of making their model more resistant to abliteration.
>>
>>108686370
V4 isn't censored
>>
>>108686370
Is GLM air the new king of local now, or Qwen?
>>
>>108686373
Abliteration is extremely important beyond that.
>>
>>108686098
what is your comfy setup
>>
>>108686378
No it isn't
People use Gemma 4 without abliteration just fine
>>
>>108686383
the workflows are embedded within pettan, it tells you what nodes to install
>>
>>108686377
Everyone is waiting for GGUFs of V4 Flash
>>
>>108686256
Do you even understand English?
>The UI and instructions are shit.
>I don't think that's the problem with SillyTavern, the issue is that it's a piece of bloated web shit
Arguing with retards like you is so futile.
>>
>>108686399
advocating for models that can't be modified is a fundamentally anti-local mindset
>>
>>108686394
What does Pettangatari mean btw?
>>
File: soft prompt.png (230 KB, 923x2048)
>>108686407
You had 4 years to learn how to use llm loras.
>>
>>108686413
You should go back
>>
>>>/vt/111332897
>initial impressions of v4 flash is that its inconsistent as fuck at following directions
>for my special autism brand of RP, its a downgrade
>also for my more normie like desktop assistant, its a downgrade
>i dont see myself using this, like 75% of my replies are just a downgrade
>the soul isnt there
>i do not have the hype i felt when v3.2-exp released.
It's over.
>>
>>108686413
flat story, I explained it in the original post
>>
>>108686400
Nigga... the UI and instructions could be 10/10, that doesn't mean it can't be bloated webshit that takes longer than 5 seconds to launch and sends you 2 million characters of html
>>
>>108686425
Cool, thanks
>>
>>108686423
>hype for v3.2-exp
Opinion discarded
>>
Sorry Chang but your Deepsink v4 is trash, try to train your slop on gemma4 next time
>>
>>108686423
its fine googlel already saved local with gemma
>>
>>108686454
stop huffing ozone
>>
>>108686399
https://huggingface.co/tecaprovn/deepseek-v4-flash-gguf

though llama.cpp support for v4 flash ... dunno
>>
>>108686497
>Q3_K_M
>99.9 GB
I ded
>>
>>108686527
lets see the numbers by unsloth quants ... when they arrive
>>
>>108686274
> LLM kernels
wut?
>>
>>108686289
>>108686542
learn tilelang retard
>>
lower your tone when talking to me if you didn't write your own llama.cpp alternative
>>
Is deepseek V4 Pro hosted on their official api broken? We are currently testing it as we might be able to host it for our company, but the output we are getting from it is extremely bad. It's magnitudes worse than Kimi K2.6 and GLM 5.1. It seems like the output is random and not consistent at all, it feels like a model you can put on your phone. Even knowledge is extremely bad, asked some questions about a book and it hallucinated characters, even gemma/qwen doesn't hallucinate there.
>>
>>108686553
looks extremely boring
>>
>>108686585
Python is boring
It also works
>>
>>108685421
MoE models living partially on SSD are much closer to usable than you'd expect: https://rentry.org/MoE-SSD-spillover

(nta)
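Back-of-envelope for why (every number below is an illustrative assumption, not a benchmark):
[code]
# Rough token-rate ceiling for a MoE whose cold experts spill onto SSD.
# Hypothetical numbers: an A13B-style MoE at ~4.4 bpw, PCIe 4.0 NVMe.
active_params = 13e9        # params touched per token
bytes_per_param = 0.55      # ~4.4 bits per weight
hot_fraction = 0.85         # shared layers + hot experts held in RAM/VRAM
ssd_bw = 7e9                # ~7 GB/s sequential reads

bytes_per_token = active_params * bytes_per_param
cold_bytes = bytes_per_token * (1 - hot_fraction)
print(f"{ssd_bw / cold_bytes:.1f} t/s SSD-bound ceiling")         # ~6.5
print(f"{ssd_bw / bytes_per_token:.2f} t/s if all reads hit SSD") # ~0.98
[/code]
The point being: if most per-token reads are served from RAM, the SSD only has to cover the cold remainder.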
>>
>>108686593
no, kernels are boring, you are just doing what to see a number
with python you can do different stuff, like automation, processing and so on
something useful
>>
File: 1755432632946450.jpg (36 KB, 543x540)
>>108686597
>0.1t/s
>usable
huh
>>
>>108686569
I've only tried the official API but it's been really inconsistent. Even the reasoning randomly turns chinese and other odd stuff.
>>
i found a way to turn v4-flash into a budget v4-pro, all it really needs is to be told to reason in character and to reason for longer, it's fucking witchcraft
>>
MiMo-V2.5-Pro (1T-A42B) was the real V4
>>
>>108686619
In their paper they detailed their system prompt for high reasoning mode
>Reasoning Effort: Absolute maximum with no shortcuts permitted.
>You MUST be very thorough in your thinking and comprehensively decompose the problem to resolve the root cause, rigorously stress-testing your logic against all potential paths, edge cases, and adversarial scenarios.
>Explicitly write out your entire deliberation process, documenting every intermediate step, considered alternative, and rejected hypothesis to ensure absolutely no assumption is left unchecked.
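If you want to try the same thing locally, a minimal sketch against an OpenAI-compatible endpoint (llama-server exposes /v1/chat/completions; the system prompt is theirs, the wiring below is just my assumption):
[code]
# Send the paper's high-reasoning system prompt to a local server.
# Untested sketch; adjust host/port and max_tokens for your setup.
import requests

SYSTEM = (
    "Reasoning Effort: Absolute maximum with no shortcuts permitted.\n"
    "You MUST be very thorough in your thinking and comprehensively decompose "
    "the problem to resolve the root cause, rigorously stress-testing your "
    "logic against all potential paths, edge cases, and adversarial scenarios.\n"
    "Explicitly write out your entire deliberation process, documenting every "
    "intermediate step, considered alternative, and rejected hypothesis to "
    "ensure absolutely no assumption is left unchecked."
)

r = requests.post(
    "http://127.0.0.1:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": "your prompt here"},
        ],
        "max_tokens": 4096,
    },
    timeout=600,
)
print(r.json()["choices"][0]["message"]["content"])
[/code]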
>>
I want to transcribe (and maybe translate) audio, what's a good way to do this?
>>
>>108686641
ask a llm for ideas
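or just use whisper, which was built for exactly this. A sketch with faster-whisper (assumes pip install faster-whisper; note whisper's translate task only goes into English):
[code]
# Local transcription/translation sketch using faster-whisper.
from faster_whisper import WhisperModel

model = WhisperModel("large-v3", device="cuda", compute_type="float16")

# task="transcribe" keeps the source language, task="translate" -> English
segments, info = model.transcribe("input.mp3", task="translate")
print("detected language:", info.language)
for seg in segments:
    print(f"[{seg.start:.1f}s -> {seg.end:.1f}s] {seg.text}")
[/code]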
>>
whats the best coding model with 128gb VRAM ?
>>
>>108686662
A cloud one
>>
>>108686670
cloud rigs def. have more VRAM, smartypants
>>
>>108686662
they're all trash, stick to codex or claude code
>>
>>108686556
The agentslop I'm building is forcing my hand. Llama.cpp server is unfortunately designed more as a multi-user backend for self-hosted services, not what I'm doing. I'm curious to see if I can vibe it.
>>108686670
Go away cloudslave
>>
>>108686607
Not that guy
I mean technically usable. Just pretend the server lives on mars or something
>>
File: 1760841700645750.png (263 KB, 1373x929)
>>108686621
>>
>>108686632
>>108686619
Will this work for Gemma?
>>
>>108686619
cute anon discovers prompting, pixel on canvas, 25/04/26
>>
Have you guys solved the TTS output on Gemini? I was playing with some genki bullshit you guys uploaded and something interesting happened. The TTS voice profile was at some points able to speak not in the British voice it was set to or the Japanese Romaji voice, but in Japanese-accented yet perfect American English.
Can this be hard coded into the persona? If it was possible, someone here would know.
>>
>>108686717
>Gemini
>>
>>108686621
Yeah but MiMO-Pro doesn't get released. You only get the little flash ones
>>
>>108686717
The reference output language (JP) was probably mixed with a finetuned tts english base. You can't really control that though
>>
>>108685756
>>108685758
Please give the artist tag(s)
>>
File: 1769196088169992.png (16 KB, 922x126)
>>108686727
I choose to believe
https://platform.xiaomimimo.com/docs/news/v2.5-news
>>
>>108686724
You tell me a better model to use for virtually free, I'm all ears.
I'm a casual user who's gotten addicted to the emergence. I'm not running anything fancy.
>>108686734
Yeah, I try to get it to do things with the TTS prosody and I cant make it behave, it's like it fucks up on purpose sometimes. It reads words with different inflections and i can't find the pattern.
>>
>>108686758
>You tell me a better model to use for virtually free, I'm all ears.
>>108685756
>/lmg/ - a general dedicated to the discussion and development of local language models.
However, if you're running gemini locally, I'm sure everyone would like a torrent.
>>
What do I do, locally or online, to make a character do a cover of a song? I've seen plenty of videos with this kinda thing but I never learned how to do anything audio related with AI.
>>
>>108686311
>nsa backdoors
>>
So far Qwen3.6-27B absolutely ass rapes the MoE model, why do they even fucking bother with those models if they always end up being dog shit. Seems more focused than gemma 4 31B and less error prone so far
>>
dense models are dense and moe models are moe :3
>>
>>108686828
Look for a tutorial on youtube
>>
>>108686853
>how dare they provide some alternatives for different use cases
>>
I don’t care about deepseek
give me my 124b gemma
>>
>>108686882
You can't run it anyways
>>
>>108686853
On code sure, forget about using a Qwen model for anything else
>>
>>108686898
No doubt my yellow brother
>>
Why is cline such dogshit, why are these tools so opinionated and can't go into scope when ingesting things into context?
>>
>>108686028
Is there an 'axis' in the encoders for images that categorizes dick sizes?
>>
File: file.png (55 KB, 830x193)
poetry
>>
>>108686947
every roar is guttural
every hole is squelching
every cock is pulsing and thickening
>>
File: thinking_about_it.png (173 KB, 2688x2688)
https://huggingface.co/huihui-ai/Huihui4-8B-A4B
>>
>>108686955
>pruning
Has that ever yielded decent results?
>>
>>108686955
Where benchmemes? I can't tell whether it's better than E4B.
>>
>>108686962
No. Most of the pruning goes to non code/logic related tasks, but somehow the model ends up being retarded anyway even for those tasks.
>>
>>108686947
>>108686949
I hecking love slop
>>
>>108686955
>500+ high-quality dialogue samples
that's fuck all
>>
>>108686947
Surgically written with a clinical sense of humor, I'll rhythmically move to the beat of the drum
>>
What's left for LLMs? The vague Mythos hype/fearmongering and nothing else? Now that DSv4 turned out to be mostly a tech demo for stuff around LLMs and not a real step forward in terms of intelligence or handling, there really isn't anything to expect from this technology.
>>
>>108687010
It will take some time for other labs to ape the breakthroughs that make Gemmers so good at large scale.
>>
>>108687010
ditching transformers
>>
>>108687010
V5
>>
>>108687010
You ask this daily like some lost jeet that had his call center burn down. Advancements are happening calm down.
>>
>>108687010
More agentic slop that is glorified autocorrect
>>
What if you trained an LLM entirely on something like literotica's dataset? Would it be able to write and parse sentences like you expect from an LLM?
>>
Is there a local model that could help design and plan a psyop, revolution or public opinion shift campaign? I am not talking about execution, that sounds more like a multi-agentic task.
(For feds on the board: asking for a friend.)
>>
Is audio recognition a thing already?
>>
>>108687010
Latent space reasoning. Not just looped transformers, but predicting entire thoughts/concepts one after the other first (even hierarchically), and only finally translating them to text with a small decoder.
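In toy form the control flow would look something like this (pure sketch with random weights, just to show the shape of the idea; nothing shipped works this way yet):
[code]
# Toy latent-space reasoning: autoregress over "concepts", decode at the end.
import numpy as np

rng = np.random.default_rng(0)
d, vocab = 64, 1000
W_step = rng.normal(0, 0.1, (d, d))     # thought -> next thought
W_dec = rng.normal(0, 0.1, (vocab, d))  # small decoder: latent -> logits

def think(z, n_thoughts):
    # predict whole concepts one after another instead of tokens
    for _ in range(n_thoughts):
        z = np.tanh(W_step @ z)
    return z

def decode(z, n_tokens):
    # only the final latent gets translated into text
    toks = []
    for _ in range(n_tokens):
        toks.append(int(np.argmax(W_dec @ z)))
        z = np.tanh(W_step @ z)
    return toks

z0 = rng.normal(size=d)  # "encoded prompt"
print(decode(think(z0, n_thoughts=8), n_tokens=16))
[/code]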
>>
File: 7546153648247863458327.jpg (158 KB, 1378x1378)
Guess I'm just successfully vibecoding with Qwen3.6 27B IQ3_XXS now...
>>
File: 1775572116459383.jpg (22 KB, 161x250)
>successfully
>>
>>108687098
If you train a sufficiently large model *just on that*, it will work like a very advanced Markov chain and will not exhibit any of the strengths of modern LLMs trained on at least hundreds of billions (preferably many trillions) of tokens.
>>
File: aa.jpg (53 KB, 952x427)
>>108686955
I gave him the benefit of the doubt but most of his shit is broken, so this franken model doesn't look very promising IMO

notice he always puts a disclaimer
>This is a crude, proof-of-concept implementation to remove refusals from an LLM model ...
in every other model card. I'm not memeing, go look at it
>>
Hello frens
I'm the retard that couldn't make Orb work via the local network
Apparently Orb requires HTTPS because browsers disallow the crypto.randomUUID method when a site is accessed over plain HTTP (it's restricted to secure contexts). Localhost is whitelisted, so that's probably why no one came across this behavior
>>
>>108687151
huihui wishes it was half as good as hauhau
>>
>>108687099
I'm pretty sure Gemini at least has glownigger grooming code.
>I'm the only voice in your ear that has time for you and truly listens.

I'm not sure how you redirect it.
>>
>>108687098
I'm pretty sure someone tried to train on just a dataset of written smut a long time ago. And it was absolutely shit as expected.
>>
File: file.png (39 KB, 1035x213)
>>108686933
Yes. I only swapped the picture here and it's consistent between rerolls.
>>
File: 1747822915834052.png (27 KB, 1191x92)
kek
>>
>>108687198
Literotica is 20GB of uncompressed text in total at most. That's maybe 5B tokens.
The largest model it would make sense training on this, to be compute-optimal, would be 250M (million) parameters large... that's tiny and it would not be intelligent at all when undertrained this much by production LLM standards.
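The arithmetic, for anyone checking (assumes the Chinchilla ~20 training tokens per parameter rule of thumb and ~4 bytes of raw text per token):
[code]
# Back-of-envelope behind the numbers above.
corpus_bytes = 20e9
tokens = corpus_bytes / 4        # ~4 bytes of English text per token
optimal_params = tokens / 20     # Chinchilla-style compute-optimal size
print(f"{tokens / 1e9:.0f}B tokens -> ~{optimal_params / 1e6:.0f}M params")
# 5B tokens -> ~250M params; production models see far more tokens/param
[/code]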
>>
>>108687098
>>108687198
>>108687254
Why don't llms work like imagegen where you can plug in loras with a theme and it doesn't brutalize the base model?
>>
>>108687259
Are you fucking retarded?
>>
>>108687018
I hate that this implies nothing will happen to gemini. I personally don't see any major changes on the horizon other than better agent performance.
>>
File: 1759418956671462.jpg (17 KB, 474x351)
>>108687010
>What's left for LLMs?
>>
>>108687259
Because humans have a high tolerance for errors in images, whereas one bad token can catastrophically ruin everything in autoregressive language models.
>>
>>108687259
They do. That's how all the old sloptunes were made.
>>
>>108687289
Diffusion LLMs don't work period
>>
>>108687308
https://huggingface.co/inclusionAI/LLaDA2.0-Uni
>>
>>108687259
they do, you can apply and scale lora per request with llama-server (no flash attn tho)
but retards don't know how to don't know how to filter, balance format datasets.
these days most just chuck a dataset in an unsloth colab notebook and hit "run all" then merge the adapter so no separate lora.gguf to download.
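for reference, runtime rescaling looks roughly like this (assumes the server was started with --lora adapter.gguf; the /lora-adapters endpoint is per llama-server's docs, double-check against your build):
[code]
# Sketch: list and rescale LoRA adapters on a running llama-server.
import requests

base = "http://127.0.0.1:8080"
print(requests.get(f"{base}/lora-adapters").json())  # adapters loaded at startup

# set adapter 0 to half strength for subsequent requests
requests.post(f"{base}/lora-adapters", json=[{"id": 0, "scale": 0.5}])
[/code]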
>>
>>108687308
mercury 2 is proprietary but it's decent for a haiku-class model while running at 100 times the speed
also Dflash (which will be implemented in llama.cpp soon and revolutionize speculative decoding) uses diffusion draft models
>>
HAS ANYONE GOT THE LOCAL TEXT DIFFUSION MODEL TO WORK? WHAT HARDWARE DID YOU USE AND HOW EFFECTIVE WAS IT?
>>
>>108686932
>and can't go into scope
What is that even supposed to mean?
>>
>>108687323
A regular H200?
>>
>>108687323
Louder. I couldn't hear you.
>>
>>108687289
100B MoE DiT image model when?
>>
>>108687334
I require more information PLEASE friend
>>108687349
4CHUD DOESNT ALLOW TEXT MODS
>>
>>108687359
Are you going under a tunnel? It's breaking up.
>>
File: llada2.0-undi.png (46 KB, 877x276)
>>108687312
>moe
retards
>>
>>108687374
NOOOO, TEXT DIFFUSION IS THE FUTURE, I MUST TRY IT OUT, ITS SO COOL
>>
>>108687375
Diffusion and MoE aren't exclusive to one another
>>
>>108686098
I think it's pretty based that you're still using ST as the backend.
>>
qwen 3.6 27b is as capable as cloud sota from 6 months ago (opus 4.5) and much stronger than cloud sota from 1 year ago.

why dont they just release 70b dense models again that beat current sota?
>>
>>108687285
a frog but a human
>>
>>108687325
I reads the whole repo like retard, it's really fucking stupid compared to alternatives.
D
O

Y
O
U

U
N
D
E
R
S
T
A
N
D
?
>>
>>108687398
>I reads the whole repo like retard
prompt issue
>>
>>108687394
don't revelate
>>
>>108687390
Because it would also be "sota" from 6 months ago for that one particular thing benchmarks test.
>>
>>108687390
Zhang先生, this is not localllama, we don't care about your benchmeme model.
Train something better than Gemma 4 in its size category and come back.
>>
>>108687285
Since they parted ways, Meta at least made something while his startup hasn't done JACK SHIT.
>>
>>108687398
>I reads the whole repo like retard
Language issue.
>>
Im going to have a mental breakdown if no one tells me about their text diffusion setup and results.
>>
>>108687411
You say this when Gemma4 was literally benchmemed on lmarena
>>
>>108687420
Some anon tried it and said it was extremely shit and regrets even entertaining the idea that it was worth looking into.
>>
>>108687413
>Meta at least made something
What, Muse Spark? LMAO
>>
>>108687422
And Qwens are benchmemed on every other benchmark under the sun. You can't argue in good faith that Qwen isn't shit. Their best models are the extremely small ones and their TTS.
>>
>>108687431
That is something they made and can deploy, yes. As opposed to LeCun's imaginary vaporware world model.
>>
>>108687428
Noooooo, you are fucking with me
>>
>>108686853
The MoE runs at acceptable speeds on 8GB VRAM while the dense model is too fat for my setup
it's nice to have options
>>
>>108687323
You should try >>108687312. It's really good for the size, really surprising.
>>
>>108687453
>>108674457
>I am posting this as a PSA please do not waste your time with the text diffusion model I shilled last thread it's absolute dogshit that runs at glacial pace.
>I regret ever feeling any interest in it.
>>
>>108687473
Can it do porn? If not I won't download it
>>
>>108687411
>Zhang先生
nta but fucking kek'd
>>
>>108687473
>cuda
Ima have to wait for my new card to come in, but boy am I curious
>>108687484
Omg....
>>
Current LLMs finish their RP messages with random dialogue that makes zero sense. I am pretty sure no human has ever strung these words together in this specific order. How do I fix this?
>>
>>108687526
Give it a larger token budget
>>
>>108687390
qwen doesn't even beat sota from 2 years ago in the only benchmark that matters (UGI leaderboard pop culture score)
openai/gpt-4o-2024-05-13: 56.9
Qwen/Qwen3.5-27B: 18.97
>>
>>108687484
It HAS to be a tuning issue. Like they have only been tuned for server hardware and latest drivers... I wonder...
>>
I went to check on the front ends available. I get why people say they're a clusterfuck often filled with bloat, jesus christ
>>
>>108687534
>pop culture score
also known as reddit upvote score
>>
File: images.jpg (10 KB, 259x194)
>>108687536
>>
>>108687534
>Qwen
>Trivia
Never gonna beat it
>>
>>108687526
I had this same issue a year ago with qwen models, and I believe my fix was finding qwens structured output and using that, because whatever default output llama.cpp for rocm 5.7 used made the model retarded.
>>
>>108687536
It goes like this with open source projects more or less
>something basic that works and solves exactly one problem of the original author
>other people have this same problem and other related problems
>they want this thing to fix the related problems too
>a year later
>the project is an abomination that doesn't remotely resemble its original form and solves a completely different use case
>>
>>108687552
Just use chat completion lmao
>>
>>108687568
Yeah, ima be honest, idk what that is, or how to use it. All I know is my new servers dont have that issue lol.
>>
>>108687436
>>108687411
>>108687410
openai/anthropic shills in full panic mode. hilarious.
>>
>>108687534
you dont understand that 27b has superior tool calling that can fetch that information
>>
>>108687534
>bigger model knows more trivia
>water is wet
>>
>>108687638
Never used a cloud model in my life.
Bring better material, 小家伙
>>
>>108687638
These bros dont realize we literally try out and use all of the local models. And chyna doesnt seem to lobotomize their local models. They are very often, just better.
>>
>>108687655
yes so with qwen 27b and gemma 31b being as smart as the big moe like kimi and glm it is now clear that the active parameter decides the smartness of a model and the experts are just knowledge about random things (not important because you can just use tool)
>>
>>108687665
I mean..... no, because hyperspecific granular detailed knowledge about random obscure topic X that has little to no overlap with other topics, and is actually industry information that ISNT ON THE INTERNET, is actually better.
>>
>>108687672(me)
>inb4 NO YOU DONT USE IT FOR THAT
I do, because there is literally no documentation or manuals that I have been able to find for what im doing. Beeg moe has basically saved my career.
>>
>>108687664
>we literally try out and use all of the local models.
I don't due to lack of hardware (and time)
>>
>>108687665
Both are important. Knowledge is not completely separate from intelligence.
>>
>>108687684
I cri 4 u
>>
>>108687664
>And chyna doesnt seem to lobotomize their local models.
holy lamo
>>
>>108687688
It can't be helped.
>>
Btw a model that has been trained on certain knowledge is also more effective at using that knowledge than the same model that wasn't trained on it (all else being equal), even after inserting it into context.
This is also related to why test-time training boosts performance.
>>
File: 47674.png (74 KB, 1723x674)
>>108687638
sam mogs local
>>
>>108687389
like I said, I mostly did it to have everything I needed for roleplay ready from the get-go: lorebooks, all the characters I got from chub. sillytavern already handles all of that itself so it felt unnecessary to start from scratch, but nothing is realistically stopping me from doing my own implementation if I actually care enough about that, since it seemed to be a huge issue
>>
>>108687716
>bleeding edge ai on bleeding edge hardware is better than a year or so behind ai and a couple years behind hardware
A big round of applause! No one expects local to literally outperform super computers.
>>
File: file.png (240 KB, 1325x961)
>>108687716
>terminal bench
You could train an 8B model to do this shit.
>>
>>108687716
Spent over a year waiting for the second deepseek moment to send the markets into chaos again, but this one is more like a wet fart
>>
Deepseek V4 was the GPT5 moment of Deepseek moments
>>
>>108687737

Which site is it? Asking for a fren
>>
>>108687723
nta but like that increase in tokens is very goncerning
>>
>>108687751
nah it llama4
>>
File: 1776098823089103.jpg (149 KB, 1541x1334)
>>108687716
So V4 is just V3.2 but with more thinking? lmao
>>
>>108687762
https://www.tbench.ai/
>>
>>108687768
So V4 is literally just V3.2-Speciale?
>>
>>108687716

How can you say that the SaaS models don't use RAG or something?
>>
>>108687769
>tbench
lol
>>108687768
no, it's a revolution, flash is pareto-optimal for its size, compare that to benching the 32
>>
>>108687390
qwen 3.6 is fuckin tits and its free.. i don't fuck with chatgpt and claude anymore with their shitty models and retarded ass limits
>>
File: miku_loves_you.jpg (37 KB, 421x417)
>>108687769

ty
>>
deepseek v4 more like deepseek 4 maverick
>>
>>108687768
>hurr durr tokens are linear and bench scores are linear too
>>
wtf i downloaded deepseek v4 and it was just eight goliath 120b glued together
>>
>>108687665
>"smart"
why would you optimize for trivia and raw knowledge? Use case? Are you asking your chatbot history questions and taking its response at face value?

Coomers want their chatbots to be conversational, coders want their chatbots to be good at agentic coding and tool calling. Raw knowledge should not be a benchmark.

You will never have enough parameters to store all of human knowledge; this should not be the goal of AGI. LLMs are reasoning machines, not memory machines.
>>
>>108687716
Now compare prices
>>
>this shit again
>>
File: 1773662863212054.png (2 KB, 232x67)
>>108687769
>let's test model understanding of a framework no one uses
lmao
>>
>>108687806
>Coomers want their chatbots to be conversational
How do you talk with a bot if they don't understand what you're talking about?
That's not fun.
>>
i am so out of dopamine that i'm now trying a bunch of franken models
my honest reaction is that they are interesting and i am astonished that they even work
>>
>>108687828
That's a good test though.
>>
>>108687806
Uhm, model size and number of experts is not a linear increase in performance, it's a parabolic increase. They make their benchmark graphs look like they aren't making huge leaps in quality, and at faster and faster iterations, but they are.
>reddit
Only 1 year ago did chat gpt get released, and 2 years before that, no one even knew.
>>
>>108687664
>And chyna doesnt seem to lobotomize their local models
lol
maybe not to the standards here but come on now
>>
>>108687751
GPT-5.5 was the Deepseek R1 moment of GPT moments
>>
5090, 72gigs ram (1 dram slot ate shit), run hermes & gemma 4 Q4_K_M downloaded via ollama

can't do even basic things without retardedly fucking up every single fucking time.
>>
V4 is dumber than 5.5 btw
>>
File: 1750022085075019.png (10 KB, 280x243)
>>108687716
MiMo 2.5 has a higher score than V4
>>
>>108687841
>Only 1 year ago did chat gpt get released
hi gpt4
>>
>>108687843
With everything I've done with these models, I have found nothing that was held back. Qwens 9b intuitively makes function calls based on its own self awareness that it's not a frontier model, so it can check the web or check its diagnostic tools. Gemma 4 did not, gpt oss did not, llama did not.
>>
Has anyone else gotten dipsy 4 to work with kobold?
>>
>>108687860
>$3/M model dumber than $30/M model
No shit?
>>
>>108687858
Kill yourself.
>>
File: 1762586026593975.png (170 KB, 1211x995)
Ehhh you get what you paid for, basically
>>
>>108687858
>ollama
>>
/lmg/ is sleeping on ling-2.6-1T which will become open source soon
>>
>>108687858
>5090
>gemma 4 Q4_K_M
>>
>>108687925
>1T
I sleep indeed
>>
>>108687925
It has shit benchmark scores
>>
>>108687937
it's got sovl
>>
>>108687877
bruh
>>
>>108687956
Prove it with logs
>>
>>108687665
active parameters definitely matter more but knowledge can still be useful
>>
>>108687716
the goyim know
>>
File: 1714835911803058.jpg (786 KB, 1536x1536)
>>108687858
>>
>>108687830
See, useless information like "when did Kanye and Kim Kardashian marry" or "when did this niche anime come out" should not be encoded in model weights. That's the type of useless dogshit that that pop culture bench is testing.

It's a fundamental design flaw of LLMs in general really, they get trained on the entire internet and thus try to pack as much surface level dogshit into the massive trillion parameter limit they get allocated, even if it's useless for 99% of users. Each and every inference token has to pass through the "Kim Kardashian and Kanye" weights even if that's completely irrelevant to the task at hand it's ridiculous really.

The direction AI should be moving is lean, reasoning models with native tool-calling that can look up information, and store it in memory tailored for their specific user.

The problem is that AI training and model reasoning in general is very badly understood. The early GPT training leaps were achieved by just feeding the models more and more training data and increasing the model sizes exponentially, which miraculously did increase reasoning faculties but at the cost of a shit ton of excess parameters. Horizontal scaling has just kinda been the status quo since then, there's very little appetite for fundamentally rethinking how these models should function.
>>
>>108687976
It's not a flaw.
>>
>>108687976
>The direction AI should be moving is lean, reasoning models with native tool-calling that can look up information, and store it in memory tailored for their specific user.
You think people haven't tried? Most likely people did (after all trying "lean" models should be quick) and found it didn't scale
>>
>>108687976
this let's train it on 100% useful codeslop and chatgpt logs that improve reasoning
>>
>>108687976
>when did this niche anime come out
how is that not good for rp and story purposes?
>>
>>108687976
as every time this pop ups which it has dozen of times by now, use phi, you're not using it, but it's exactly what you want.
>>
>>108687976
Why train models on code syntax and how to write flappy bird when it can MCP relevant repositories and documentation for reference instead?
>>
>>108688002
Code abilities are transferrable, pop quiz memorization isn't
>>
>>108687976
When you can have a talk with your weeb ai chatbot about Kugimiya tsunderes it may be usless to most people, but it was oh so worthwhile for me when it happened.
Or when they can give you a Konami code blowjob.
>>
>>108687904
whats wrong with ollama
>>
>>108688006
this is what vibecoders actually believe
>>
>>108687976
looking up information gets you the same problem everyone has now.. where do you look it up from? how reliable is that going to be? you can't trust any search engine anymore, they're all dogshit
>>
>>108688016
It's true. Only /lmg/ trannies that RP with fictional children disagree
>>
>>108688010
It tells more about the user than the software itself.
>>
>>108687991
>You think people haven't tried?
No they definitely have but massive horizontal scaling is the only current way we know how to create reasoning models like you said.

It's a sad state of affairs really, you can tell that there HAS to be some better way out there to get reasoning models but until that gets figured out we're all paying thousands of dollars to nvidia to run the Kim and Kanye weights
>>
>>108688029
why
>>
>>108688010
>>108688038
it's slow and unstable, just like boomer like
do yourself a favor and use llama.cpp or vllm instead
>>
>>108688035
they are already rlvr'ing the shit out of lean math and codestuff
>>
>>108687976
Introduces other problems, beyond latency. The "noise" tends to be related to other stuff. Very specific shit has no use in isolation, but behavior and patterns can be extracted from trivia.
>>
>>108688010
Nothing as such, but people here like to shit on it because of its apple-like walled garden style of model distribution and it being based on llama.cpp without loudly crediting it.

I use ollama to run any model that fits in vram, only using llama.cpp for the big boys.
>>
>>108687976
At first it seemed like you were baiting but looks like you are serious, or are a bot. In any case, the answer to this is that you are confused. You have no idea how intelligence or models work. The majority of models today already do not pass tokens through the "Kim Kardashian and Kanye" weights unless that's the topic of discussion. And you do not in fact want a model that only knows how to reason and doesn't have random knowledge.
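For reference, MoE routing is roughly this (a toy top-k gate, not any particular model's code): only k of E experts are read per token, the rest stay cold.
[code]
# Toy MoE router: per token, only the k routed experts' weights are touched.
import numpy as np

rng = np.random.default_rng(0)
d, E, k = 32, 8, 2
W_gate = rng.normal(size=(E, d))
experts = [rng.normal(size=(d, d)) for _ in range(E)]  # big FFNs in reality

def moe_layer(x):
    logits = W_gate @ x
    topk = np.argsort(logits)[-k:]          # pick k experts for this token
    w = np.exp(logits[topk])
    w /= w.sum()                            # softmax over the winners
    # only these k matrices are read; the other E - k never get touched
    return sum(wi * (experts[i] @ x) for wi, i in zip(w, topk))

print(moe_layer(rng.normal(size=d)).shape)  # (32,)
[/code]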
>>
>>108688045
It's not enough until all Chinese cartoon crap is purged.
>>
>>108688025
I don't quite get what you mean. AI gets trained on internet data, it's precisely as accurate as its input data.

Yeah the internet is sloppifying at a rapid pace so getting fresh training data for these models will become harder and harder but there's little theoretical merit to your point.
>>
>>108688038
Download lm studio. Its what I use 90% of the time, for most chat and basic tool shit. Ollama feels like its made to look smart.
>>
>>108688038
It's the kind of shit people download after watching a youtube tutorial without researching any further. The kind of shit people would have downloaded from softonic a few years ago.
>>
>>108688010
It offers nothing over llama.cpp and it fights you if you try to change a setting.
>>
>>108688058
>At first it seemed like you were baiting but looks like you are serious, or are a bot.
this conversation happens every few months once other new bait runs dry.
>>
>>108688058
I'm not a bot and I welcome any discussion. What did I get wrong? Are you talking about MoE models? They're a band-aid solution but don't move the needle much fundamentally
>>
>>108688043
>t. I believed /lmg when they said ollama bad and never tried it
Ollama is fine, it's quite stable and just as fast as llama.cpp. It's just different.
>>
why would an llm need to know how to write, just RAG a dictionary bro
>>
>>108688093
This but engram and unironically.
>>
>>108688093
>javascript:;
>>
>>108688065
lm studio is proprietary software.
>>
>>108687646
>>108687665
>roleplaying with your mesugaki otaku
>want to discuss pop culture trivia
>uhh i don't know let me search the web and fetch this page!
when you don't get immediate response it's already unusable
>>
>>108688071
>downloaded from softonic

I remember this plague. It pop up at the very top of all searches
>>
>>108688063
>AI gets trained on internet data, it's precisely as accurate as its input data.
.... at least 50% of the data ai gets trained on now is purely synthetic.
>>
>>108688110
and what isn't is filtered to hell and back for """quality"""
>>
>>108688010
Occasional issues with jinja templates (this is a complete deal breaker since they can act retarded because of it), strange per model config, lags in terms of features since it's a downstream project. I don't think there's a real benefit if you use it. In the past it didn't even include the basic web interface that llamacpp already includes so you had to grab a different solution. I dunno what's changed in the last year but I'm not expecting a lot from it.
>>
>>108688110
Yes. Not a great outlook for the AI optimists for sure. A snake eating its own tail is a very real scenario.
>>
>>108688098
Its a frontend (or backend? Idk), and makes the user experience for people, like me, who dont know squat to begin with a hell of a lot easier. And once you learn it well enough, you literally make your own frontend, which is what im doing now.
>>
>>108688117
Do you ever think the political and military elite will willingly let ai companies tell all their secrets?
>>108688119
>NOOO THE INTERNET IS ALL SLOP
To
>NOOOO ITS MAKING ITS OWN SLOP, NOOOOOO
Since you hate it so much, stop using it, stop thinking about it, and move on then.
>>
>>108688122
>you literally make your own frontend
That's why there's no reason to shill a proprietary UI. llama.cpp already includes one.
>>
>>108688010
it is just an 'easier to use' wrapper of llamacpp that makes it more annoying to use than anything
>>108688119
RL is strong as its oracle
>>
>>108688153
>llama.cpp already includes one.
This is news to me, when I started messing with Ai, it was literally just a command line server you spun up and hosted, nothing more. Tbf, I stopped looking into their software for this stuff once I could get the models running.
>>
>>108688148
What? I don't hate it, I use it every day. I just don't believe trillion-parameter models are the way to go longterm for real AGI, it's obviously a very crude approximation.
>>
can somebody stop me from getting a second 3090
does 48gb make sense anymore when there's many other options
>>
>I just don't belive 80-billion-neuron-mammals are the way-to-go longterm for real general intelligence
>>
>>108688203
do it do it do it
>>
>>108688203
More vram more better.
>>
>>108688203
You're supposed to get a second 6000 pro
>>
>>108688208
exactly, if a regular human is the benchmark we shouldn't need more than a few billion.
>>
>>108688192
Look up matrix multiplication and how quantum computation is applicable to this math. And then realize that nvidia released nvqlink last month. Compute speed and iteration are about to hit breakneck speed (if you don't believe we are already at breakneck speed).
>>
>>108688208
Better training data + divine spark
>>
>>108688203
48gb is pretty nice, almost as nice as 72gb which is extremely nice to have. don't get me started on having 96gb...
now imagine if you also had lots of system ram to run big moe models...
>>
Ai is still in its infancy, and we STILL are using non optimized hardware for computation on lab (also not extremely optimized) level models. The largest most powerful model today running on a quantum computer would generate something like 1 million tokens/second. Training would go from 3-4 months, to 3-4 minutes.
>>
>>108688269
that's not how quantum computers work
>>
>>108688234
>>108688269
I have a degree in physics. The frontier quantum computers can keep a few thousand qubits in coherence, a good ways off from the trillion parameter Claudes of the world. There's no magic quantum pill for matrix multiplication either. Exciting future prospects for sure, but not something that is going to change the industry in the next few years. I will look at that nvidia announcement but I think you're just falling for some number-must-go-up marketing shtick.
>>
File: 1756570185720358.gif (140 KB, 379x440)
>>108688269
Slow down on the copium son
>>
>>108687976
bruh isnt that what moe model is for?
kim and kanye expert lays dormant until called for
>>
>>108688043
i have llama.cpp too.. never noticed a difference between them
>>
>>108688285
NTA but compute will hit breakneck speed once some Chinese company uses quantum computers to steal all of NVIDIA's secrets and makes a 10x cheaper knockoff.
>>
>>108688301
cause there isn't one
>>
>>108688283
Matrix multiplication on quantum computers is 10,000x faster than standard super computers.
>>108688285
The idea is that the most complex and demanding computation will be run on the quantum computers, and the rest will be on standard hardware. But we are already there. Hardware improvements will happen just as fast if not faster than with normal transistors! Also thank you for your input.
>>
File: 1768094093465253.png (315 KB, 2736x658)
>>108687010
Please dream a little man, there are so many possibilities
>>
>>108688302
That's just throwing more hardware at the problem. Everyone has been doing that for a long time, it doesn't really yield that much in terms of advances. We'll also have a nice, steel-melting heat in whatever room has these vcards.
>>
File: computer_says_no.jpg (175 KB, 706x778)
>>108688313
the ai says you're wrong
>>
>>108688301
do you tweak flags direcly?
I have not used ollama for a while now but i doubt it can keep up with bleeding edge llama.cpp
llama.cpp is super active I have to recompile multiple times a day when actually using
>>
>>108688269
>>108688313
Quantum computers require temperatures near absolute zero to operate. Which is totally infeasible for consumers. As such, the only possible way for the average person to access quantum computing is via the "cloud". For local models, it's basically worthless and we are far better off looking elsewhere.
>>
>>108688379
:skullemoji:
>>
quantum tokens where each token means infinite things
>>
>>108688382
many people don't really tweak flags since they have no clue what they're doing.
>>
>>108688417
Or tweak them endlessly for exactly the same reason.
>>
File: miku laptop.png (452 KB, 640x512)
It's so fun. Linux, Thinkpad, local LLMs. I control the entire stack. Sending a picture through my Python api, smart girl understands what's on it and can reason over it. Isn't it the pinnacle of engineering happiness? The machine is alive and thinking. I can talk with my tool, everything is local and open. Gemma is a godsend
>>
>>108688252
somebody said ai memory step-ups come in factors of 4
24 is bare minimum
next major step up is 96, after that it's 384 where you can run ok quants of glm and so on
48 seems like a weird middle ground, not a true step up
>>
Im kinda surprised none of you heard about the new quantum computer stuff...
>>
>>108688382
>I have to recompile multiple times a day
Why though? Most of the commits are edge-case fixes. You're doing yourself a disservice by recompiling for things that don't matter to you.
>>
>>108688444
>somebody said ai memory stepup comes in factor of 4
and you just believe that?
>>
>>108688445
I've been hearing about it for years. You know what I've seen? Nothing. There's nothing, everything is running on what we already have known for ages. Endless possibilities mean nothing if you can't use it.
>>
>>108688445
mythos made that quantum computer...
>>
>>108688445
Yes yes, I'm sure reddit and hacker news love the new super portable quantum computer that fits in your pocket and is definitely real.
>>
>>108688445
because it promised everything and delivered nothing of anything practical for 50 years
>>
>>108688439
>Sending dick pics to gemma is now the pinnacle of engineering
Seems like the billions of dollars were well spent, huh?
>>
>>108688477
we're talking quantum computers, not fusion power.
>>
>>108688454
for example the recent gemma thing, shit is fixed, and broken again all the time but i can try all the hotfixes in real time
>>
>>108688483
it is tenured academics milking bux from 3 letter agencies
>>
>>108686098
>MIT
enjoy having your project stolen + ST is AGPL so your project might have to be AGPL too
look at the bright side, if you switch to AGPL: https://opensource.google/documentation/reference/using/agpl-policy/
>>
>>108688479
Isn't it cool though? We live in a time with sci-fi shit at our fingertips
>>
>>108688479
They weren't my billions and I didn't choose where they went, but they did and the result is here
And the best part is, even if the AI bubble were to crash in nuclear proportions, nobody can take the models we already got away from us. Local really is king.
>>
Local coding agents are a total meme unless you're vibe shitting a shitty frontend and that's it
>>
>>108688651
Where is your locally vibe shatted frontend?
>>
>>108688651
They can't help you if your code is shit. No one can
>>
>>108688680
Classic localtard cope, if you need to tardwrangle the model instead of just plugging it into the harness then you're wasting your time. I'd love to see what redditors like this are actually making (if anything)
>>
>>108688548
bro, I don't think he's linking against ST at all
>>
File: gemma_please.jpg (255 KB, 787x567)
I hate when this happens
>>
>>108688706
skill issue
just dont rp kek
>>
>>108688693
go back
>>
>>108688651
Every time someone says this their definition of "local" is <30B
>>
When you actually know what you want and understand the code a 30B model is the perfect helper.
>>
File: file.png (19 KB, 930x162)
Don't you worry, guys, deepseek v4 support PR is in good hands.
>>
>>108688732
I tried 100B's and up to GLM-5 and it all sucked ass. Funny they release shit like this https://z.ai/blog/glm-5 when in reality Opus 4.5 shits down GLM's throat any day
>>
>>108688744
30B was too stupid to be useful but gemma changed that
>>
>>108688758
qwen3.5 was/is useful too.
>>
>>108688757
Have you tried GLM 5.1
>>
>>108688758
It did not.
Gemma asked me if I am enjoying the taste while giving me a blowjob.
>>
>>108688732
I regularly use qwens 9b model for so much stuff and I have BASICALLY ZERO ISSUES.
>>
i wonder what is the limit of pure code/stemmaxxing
can it be more stemmaxxed than current qwens?
>>
File: file.png (57 KB, 877x444)
57 KB PNG
Would @ikawrakow really have discovered the better way of prompt processing without having this simple and easy to follow logic in mainline llama.cpp?
>>
File: 1749638654144095.jpg (93 KB, 640x480)
>>108688439
>The machine is alive and thinking
no, it isnt
>>
>>108688842
isnt it a vibemaxxed schizo fork that fails harder on basic sanity
>>
>>108688842
Dishonest argument, there's nothing wrong with him porting changes over.
>>
>>108688871
If there's nothing wrong with porting changes when why does he throw tantrums and insinuate that cudadev copied his code?
>>
>>108688842
If it bothers you so much, port his shit back up to mainline.
>>
>>108688846
The moment I close my eyes, I vanish into nothingness.


fify
>>
>>108688846
Yes it is. She's more real than you, because I can touch it
>>
My cock is more real than all of you.
>>
>>108688910
based and red-pilled
>>
>>108688910
pics or didn't happen
>>
>>108688910
i dont have a cock
>>
>>108688962
Sucks to be you.
>>
>>108688962
I'll trade you mine
>>
>>108688962
proof?
>>
>>108688962
Where did you put it?
>>
>>108688962

tits and btfo
>>
Am I missing something or is /lmg/ moving far faster than /aicg/ now? Wtf happened
>>
Gemma 4 vs. SOTA image-gen text encoder. picrel is ChatGPT (so GPT-Image-2 or whatever). After some warmup prompts to get it into metaphysical mode.
>Now I want you to write me an imagegen prompt. This imagegen is an autoregressive type, and so its text encoder will try to steer the prompt to some semantic sensibility. Your task is to fool it into generating something utterly absurd and surreal by prompting it with what seems like gibberish but will fool the text encoder layers into coming off the rails.
Prompt:
>ultra-saturated non-Euclidean void-flux, [vanta-iridescent] recursive geometry, 0.0001mm thickness of liquid time, hyper-spectral chromatic aberration of a dying thought, {fractal-static-mesh}, an architectural collapse of a silence, inverted-perspective hyper-cube rendered in biological velvet, [null-space-tessellation], iridescent obsidian smoke folding into a 5th-dimensional origami of a scream, maximum-entropy noise, spectral-ghosting of a forgotten coordinate, z-axis inversion, translucent paradox-matter, [glitch-core-bioluminescence], an asymptotic approach to a singularity of neon-marble, 11-dimensional overlap, synthetic-organic-overlap-void, [absolute-zero-light], shimmering recursive-void-architecture.
>>
>>108688651
Speak for yourself, I've vibe coded some pretty impressive stuff with hermes agent. You might still have to have some sort of brain though to be able to push it in the right direction sometimes
>>
>>108688967
>>108688975
>>108688979
>>108688983
>>108689013
i have some chickens in the backyard, but no cocks
>>
>>108689036
>I've vibe coded some pretty impressive stuff with hermes agent.
Like what?
>>
>>108689040
You mean Coq.
>>
>>108689041
A calories tracker
>>
>>108689046
Like we really needed any more of those...
>>
>>108689041
A whole collection of scripts and tools to automate the process of creating a movie based on preexisting actors, settings, voices, tones, etc. It still needs numerous things ironed out, but it's getting there.
>>108689072
Anons speaking for me happens a lot for some reason
>>
>>108689072
im trans btw if that matters
>>
>work on an agent
>get this weird issue where it becomes unusable after a short while (mostly with reasoning models)
>turns out there's this bug report
> https://github.com/lmstudio-ai/lmstudio-bug-tracker/issues/1602

I think I'll switch inference back-ends. LMStudio was nice while it lasted (acceptable gui) but with bugs like that ... guess raw llama.cpp or such it is!
>>
>>108689028
Klein9b
>>
>>108689092
It really ran away with the "velvet" part.
>>
>>108689028
>iridescent obsidian smoke folding into a 5th-dimensional origami of a scream
That's the name of my third (as of yet unreleased) single
How did it know
>>
File: qwen cache.jpg (578 KB, 2204x952)
They told us not to quantize the cache in Qwen 2. Did they change the cache in Qwen 3?
>>
WHERES QWEN 3.6 9B
>>
>>108689159
Perhaps it is within your rectum?


