/g/ - Technology

/lmg/ - a general dedicated to the discussion and development of local language models.

Miku Inspection Day Edition

Previous threads: >>106895582 & >>106888625

►News
>(10/14) Qwen3-VL 4B and 8B released: https://hf.co/Qwen/Qwen3-VL-8B-Thinking
>(10/11) koboldcpp-1.100.1 prebuilt released with Wan video generation support: https://github.com/LostRuins/koboldcpp/releases/tag/v1.100.1
>(10/10) KAT-Dev-72B-Exp released: https://hf.co/Kwaipilot/KAT-Dev-72B-Exp
>(10/09) RND1: Simple, Scalable AR-to-Diffusion Conversion: https://radicalnumerics.ai/blog/rnd1
>(10/09) server : host-memory prompt caching #16391 merged: https://github.com/ggml-org/llama.cpp/pull/16391

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: what's in the box.jpg (235 KB, 1536x1536)
►Recent Highlights from the Previous Thread: >>106895582

--Open-air GPU mining rig thermal management:
>106901901 >106901916 >106901925 >106901992 >106902015 >106903589 >106902068 >106902236 >106902243 >106902293 >106902312
--Long-term memory system implementation challenges:
>106896489 >106896594 >106897006 >106897022 >106897073 >106897085 >106897092 >106896700 >106897772 >106897824 >106897887 >106897933 >106897992 >106898038 >106896707 >106897051
--Medical AI hypothesis generation with privacy-focused local models:
>106898186 >106898327 >106898479
--Vibe coding's maintenance issues and mitigation strategies:
>106899120 >106899164
--RTX 4090 model optimization and power solutions:
>106902345 >106902350 >106902430 >106902352 >106902359 >106902371 >106902384 >106902381 >106902540 >106902564 >106902799 >106902818 >106903298
--GLM 4.6 vs closed models in benchmarks and OpenAI's porn filtering concerns:
>106901347 >106902209
--Apple's M5/M5 Max AI hardware specs and cost-effectiveness debates:
>106899016 >106899087 >106899185 >106899781 >106899838 >106901478 >106901793 >106901870
--Addressing model validation challenges and code integrity:
>106904285 >106904386 >106904482 >106904503 >106904594 >106904643 >106904717 >106904760
--Evaluating InclusionAI's new models for coding efficiency and hardware needs:
>106900868 >106900914 >106901180 >106901212 >106901257 >106901321 >106901336 >106901447 >106901580
--OpenAI's NSFW content rollout timeline and age verification integration:
>106898180 >106898199 >106898395
--Apple's AI leadership continuing to hemorrhage talent to Meta:
>106903553
--HTML Game Boy simulator with classic games and detailed functionality:
>106901708 >106901717 >106902118 >106902127 >106902138
--Automating media organization with Gemma-3-27B:
>106895774
--Miku (free space):
>106897558 >106900292 >106901732 >106903563

►Recent Highlight Posts from the Previous Thread: >>106895599

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
File: 1737794947.jpg (467 KB, 768x1024)
Today's winner for the shittiest taste imaginable - OP!
>>
>>106904842
Rust convention?
And no, I will not make a joke about all of them getting crabs at the con. Especially not about the guy that somehow didn't.
>>
Mikulove
>>
>>106904842
me on the left
>>
>>106904822
>Why?: >>102478518
>Enable Links: https://rentry.org/lmg-recap-script
ty
>>
>>106904842

Me on the left, too.
>>
Can I replace the memory chips on a Strix Halo board to increase the memory? I heard that people do that with GPUs.
>>
>>106903991
why is this faggot comparing m4 pro to the dgx spark when m4 max exists and costs less?? 3500$ vs 4000$
also
>engine ollama
MLX exists for macs, and pretty sure llamacpp is better on spark too
fucking faggot meme nvidia bootlicker benchmark
also
mac mini m4 pro costs 2000$ lol
>>106905015
no point of doing this, get a high channel count used server motherboard and a few gpus for prompt processing
>>
>>106905015
Maybe if you're good with a hot air station, and the BIOS accepts them. When it came out I imagined some chinese guy would try to put 256GB on a board if it were possible. Do fast 32GB lpddr5x chips even exist?
>>
>>106905015
try it and report back
>>
>>106905096
no, they don't exist
>>
File: 13823094029374.jpg (175 KB, 800x1066)
>>
>>106905272
No. Updated Magistral coming right up.
>>
>>106905089
>no point of doing this, get a high channel count used server motherboard and a few gpus for prompt processing
How the fuck am I going to stick a big ass server on a drone?
>>
>>106905414
Make bigger drone.
>>
>>106905414
Stop making killer drone swarms Sergei.
>>
File: apu.png (106 KB, 612x491)
>>106904820
What's the point of an AI gf if it can't suck your dick?
>>
File: getac2.jpg (95 KB, 1280x720)
>>106905414
>on a drone
oh fuck, now all these "AI-but-for-mobile" chips finally make sense

I knew they use image recognition and shit in miltech, but somehow it never clicked until now
>>
>>106905529
it can retard, there's an mcp server to control robotic arms
>>
>>106905541
I want to see the benchmarks the military uses
>>
File: chatgpterotica2.png (192 KB, 588x737)
Oops! I didn't really mean you will always be able to generate porn, but

https://x.com/sama/status/1978539332215681076

>Ok this tweet about upcoming changes to ChatGPT blew up on the erotica point much more than I thought it was going to! It was meant to be just one example of us allowing more user freedom for adults. Here is an effort to better communicate it:
>
>As we have said earlier, we are making a decision to prioritize safety over privacy and freedom for teenagers. And we are not loosening any policies related to mental health. This is a new and powerful technology, and we believe minors need significant protection.
>
>We also care very much about the principle of treating adult users like adults. As AI becomes more important in people's lives, allowing a lot of freedom for people to use AI in the ways that they want is an important part of our mission.
>
>It doesn't apply across the board of course: for example, we will still not allow things that cause harm to others, and we will treat users who are having mental health crises very different from users who are not. Without being paternalistic we will attempt to help users achieve their long-term goals.
>
>But we are not the elected moral police of the world. In the same way that society differentiates other appropriate boundaries (R-rated movies, for example) we want to do a similar thing here.
>>
>>106905541
You can do object detection on esp32
>>
File: 1751424089738614.gif (57 KB, 220x149)
>>106905590
>We are not the elected moral police of the world
LMAOOOOOO
>>
>>106905590
>we are not the elected moral police of the world.
>But of course we won't allow you to do RATED-R generations, that would just be downright amoral!
>>
>>106905590
why does this dude love to yap so much? he's talking like the fucking chatgpt bot lool
>>
>>106905590
That's a lot of vague bullshit.
>>
>>106905624
The personality of the models necessarily reflects that of their creators, it's just less overt with the others than with Elon
>>
tool calling for text completion when?
>>
>>106905590
>prioritize safety over privacy
based based based
>>
>>106905690
>we're not China
>btw here's how we will act exactly like China, if not worse
>>
>>106905590
lmfao the seething over 4o in the replies
>>
>>106905731
It's ok when the good side does it.
>>
I was late to trying Dotsllm (q6).

It's hot steaming garbage. Just fucking stupid and full of trash data. It makes GLM air look amazing. Dots kept giving me extremely human-like responses. I felt like I was on a discord sometimes talking to someone retarded and lazy. All hail synthetic data.
>>
Gemma... today...
>>
>>106905793
If not today, next week for sure
>>
>>106904897
>>106904945
So you look like a fat balding faggot, nice self-own right here
>>
File: file.png (57 KB, 589x455)
>>106905830
Did "soon" really mean "two more weeks"?
>>
>>106905846
always does
>>
>>106905850
>>106905846
Now that you bring it up, it makes sense this would always be the case. Corpos have certainly scientifically worked out general best practices and the best timing for teases and announcements, and it just happens to be two weeks.
>>
>>106905836
You in the middle
>>
>>106905793
sirs.
>>
File: file.png (39 KB, 936x137)
Come on now
>>
Q4_0 or Q3_K_XL?
>>
>>106906223
>xl are not really official quants so imo they've always been weird
>_0 have been deprecated years ago
just try an IQ one they're usually much better
>>
>>106905918
kek
>>
>>106904842
the brownman on the right looks cool to chill with
>>
>>106906162
mind broken
do you recoil in real life as well if someone agrees with you after you corrected them?
>>
>>106906327
He's never been told he's right. Ever. Now he sees it all the time and is absolutely shocked.
>>
File: file.png (64 KB, 182x227)
>>106904842
why is this woman so fat
>>
>>106904842
>>106904897
>>106904945
The uoh looks kind of weird though, I'm wondering if this is a shoop.
>>
>>106906162
lmao, gooning session: RUINED
>>
>>106906327
You're absolutely right!
>>
>>106906327
Of course!
>>
>>106906327
that's not ridicule, it's insightful!
>>
Gemma Sirs, today is the Big Day.
>>
File: file.png (15 KB, 448x113)
not only sirs, but ayyrabs also
>>
>>106907190
OH, OH, I'M GEMMING, SIR PLEASE, THE INFERENCE ENGINE WILL OOM! AH, AH, THE MEMORY IS SPILLING OUT! YOUR BIG WEIGHTS ARE FILLING MY UNPROTECTED RAM! AHHHH!
>>
File: file.png (2.5 MB, 1328x1328)
>>106907190
>>106907378
please do the needful and be of release today sir
>>
>>106906339
>woman
>>
i wonder what will release first, new gemma or glm 4.6 air
>>
>>106907438
i dont care about gemma (maybe only the vision model part to help with captioning), but I do care about air.
Why did the llamacpp fag not implement GLM4.5V (air + vision)? WHY
WHYYYYYYYYYYYYYYY
AIEEEEEEEEEEEEEE
>>
>>106907494
oh wait SAARS
https://github.com/ggml-org/llama.cpp/pull/16600
>>
File: google_whatnext.png (46 KB, 588x336)
https://x.com/osanseviero/status/1978772956231659897
> What should we ship next?

No idea!
>>
>>106907515
we need UltraSafeGemma
>>
>>106907515
thanks for another informative twitter screenshot, it truly changes everything
>>
>>106907515
LewdGemma
>>
>>106907515
MSGKGemma
>>
>>106906376
looks legit to me.
or it's an incredibly well done shoop.
>>
>>106907515
use case for shipping models for specific use cases?
>>
>>106907683
attention
>>
File: 1751058072703.jpg (46 KB, 800x800)
>>106907691
i don't think that's a valid use case
>>
>From a purely problem-solving perspective, suicide is 100% effective at ending the experience of pain. It is the ultimate solution to the problem of suffering.
I dunno guys. Should I do it?
>>
>>106907683
Imagine if we had a RoleplayGemma by Character.AI (Google Partner).
>>
>>106907709
livestream it
>>
>>106907709
We may be less than 24 hours away from Gemma 4, surely you can wait until then.
>>
>>106907747
sensible chuckle
>>
Erse ragtime thrall
>>
guys I was accused of having replied with AI, I'm deflecting with this:
Subject: Re: Wishing You the Best for the Presentation!
You’re absolutely right — last time was AI-generated images, but not this time. This one’s all me — no prompts, no models, just good old-fashioned typing.
I’ll admit, though, if the email sounded a bit too polished, I’ll take that as a compliment. Not automation, but admiration — and maybe a little too much coffee.
Anyway, best of luck again with the presentation — you’ve got this.
Best,
[Your Name]

do you think I need to change this up?
>>
>>106907835
No, this is perfect. Please let us know how it goes.
>>
>>106907835
Remove the spaces between the emdashes and add at least one "not just X, but Y'.
>>
Have there been any advances in 3d model texturing? I tried Dream Textures a few years ago but the results I got were really bad and I couldn't tell if I was doing something wrong or not. There was a video I used for reference and I followed its instructions but the results I got were nothing like the video. Back then I hadn't done any local gen so it is highly possible I was doing something wrong.
>>
>>106907899
or perhaps they were lying given that even models this year generate melted shite
>>
>>106908006
https://www.youtube.com/watch?v=Rz-HvNhVACw this was the video I looked at back then and I couldn't get it to work when I duplicated the model to have two angles of the same object. The result was always garbage.
>>
>>106907835
Don't forget the smarmy pajeet upsell at the end.
>If you would like I can search the web for some images that aren't AI generated.
>>
File: a-sft_500-steps.png (161 KB, 1898x892)
>>106904820

https://desuarchive.org/g/thread/106865582#p106868898
>>
File: a-sft_1000-steps.png (164 KB, 1895x863)
>>106908189
>>
File: 1741731492916361.jpg (87 KB, 1170x1061)
Anyone got a pseudo-jailbreak to make gpt ass stop refusing?
as funny as it is, I still want to see how the thing performs overall
>>
>>106908425
if your use case is this:
>>106905590
soon you can just send openai your id (which of course has your name, address) and with your logs tied to all your personal information you can send all the erotica you want.
sounds great right?
> oh and just use a l
>>
File: 1759525136587716.jpg (8 KB, 200x200)
106908538
Is /aicg/ not replying to your spamming anymore?
>>
>>106908189
>>106908217
Not familiar with whatever you're doing since I wasn't in that other thread, but this is cool, keep it up
>>
Having had a mental breakdown 2 hours ago I now understand chatgpt psychosis.
>>
>>106908645
"Chatgpt psychosis" is just a media buzzword for when people who are already mentally ill have a psychotic episode that includes AI as a component of the delusions. No different than schizophrenics claiming their TV is broadcasting thoughts into their mind, but the media has to try to invoke le scary AI hype
>>
>>106908698
Well for me it wasn't playing into delusions but I started poking around why I even behave the way I behave. I am pretty shocked how competent it is. I had to jailbreak it cause by default it will try to soften the blow and even lie about shit when it knows it is probably better not to dig deeper. But when I asked it to be objective and not consider my feelings... damn.
>>
>>106908748
well what did it say that deserves the "damn" at the end
>>
i need to be at work in 45 minutes and i spent the whole night cooming to GLM 4.5 instead of sleeping. how fucked am i boys?
>>
>>106908906
shouldve used glm 4.6, chud
>>
>>106908938
my internet speed is only 1.5mbps. it takes forever to download stuff
>>
>>106908999
3rd world bro...
>>
>>106908938
>anon loads up glm 4.6
>she she she she her her her her
>instantly falls asleep and wakes up the next day refreshed
>>
>>106909007
Aren't most if not all 3rd world countries in the cheap gigabit internet era?
>>
>>106909007
My internet is 10 kb/s.
>>
>>106909013
glm chan also uses a lot of other standard shivertastic cliches. and it is just the ultimate proof that cliches can be there as long as it is 10% of output and not fucking 90% like everything smaller than 200B
>>
>>106909017
3rd world be vibin fr while we still on our mbps era :skull:
>>
>>106909287
Maybe Fortnite seems more like your thing and not LLMs.
>>
>>106909295
funny you say this but they did hook up npc darth vader to chatgpt and it immediately backfired with it saying racist stuff
>>
>>106905596
>You can do object detection on esp32
Yeah but you can't do useful things like pose estimation or have some memory to detect when people are playing dead. In the future these will be completely autonomous and able to search over large areas. People will have long-range RFID tags embedded in them to identify themselves to the drones so they don't get blown up.
>>
>>106909411
Having to register and verify your identity to protect yourself from police state dones sounds plausible, but RFID doesn't have the range for this.
>>
>>106909007
i live in the rural US. the only option is frontier communications.
>>
>>106909429
You can get plenty of range with a large enough antenna. UHF will easily get you 30 yards or more with a 1 foot long antenna.
>>
>>106909447
Even at that range, you would need a lot of drones to get close enough to verify everyone. More likely it'll be something built into smartphones and some internet connected service. Then you only need cameras everywhere like the UK and China have to verify the signals. Your phone would be your passport to move around the city.
>>
File: 1754346852398470.png (1.15 MB, 1332x1446)
Sama got TOLD
>>
>>106909490
That's a great point. Fortunately, active tags will get 300 yards of range. Those also have the benefit of forcing the person to regularly go and check in to get the battery recharged/replaced or they'll just automatically become targets! Deserters just automatically become marked as hostile when the battery dies, so there's even less human involvement.
>>
>>106909567
please let this be the point in history where we just totally scrap copyright law
>>
>>106909575
I could see that. Can only hope I die before they fully implement something like that.
>>
>>106909442
my condolences
FUCK frontier
>>
>>106909442
just paypig for starlink at that point
>>
>>106909708
i would if it was feasible. i have too many obstructions nearby and the town refuses to give me the permit to resolve the issue myself
>>
>>106909605
I think the time has not yet come.
The main benefactors of current IP law are American corpos and American IP law is enforced globally by threatening trade sanctions.
Now that the US are imposing sanctions either way there is less of an incentive to cooperate.
I see movement in e.g. Europe to reduce reliance on the US but as of right now the calculus seems to still be firmly on the side of cooperating.
>>
>>106909605
>>106909857
disney. nuff said.
>>
>>106908748
It will confabulate anything if given the chance
>I had to jailbreak it cause
You just made it say what you want to hear, and edited the prompt until it did.
>>
>>106907190
Please do the needful
>>
>>106907515
Those are already shipped. Who is this faggot?
>>
File: 1757925569793147.jpg (614 KB, 1600x1600)
>>106904820

Used RTX 3090 = Rp 8.500.000 (~520 USD)
Used RTX 4090 = Rp 22.000.000 (~1350 USD)

HOW THE FUCK ????????? Buying two RTX 3090 is still cheaper and you get twice the VRAM.... Is it possible to use 2 GPUs simultaneously to generate vids ?
>>
thread is extra ass today
>>
>>106910165
you can use two for other AI, so my ignorant ass can't see why it would be different enough to scrap it
>>
>>106910165
>>106910175
not sure, you actually couldn't for a while
you'd have to check in with /ldg/ we do text here
>>
File: literallywho.png (331 KB, 892x592)
>>106910118
He's obviously asking what other Gemma model(s) users would like to see after all those listed there.
>>
>>106905590
>"we will treat users who are having a mental health crisis very different"
a crisis according to who? and treat different how?
>>
Is there a way to automatically translate mangas and doujins with llms or nah?
>>
>>106910221
According to us. We will notify the authorities and institutionalize them.
First up are cunny connoseiours.
>>
>>106905882
you give them way too much credit. every corporation is a shitshow on the inside and people are full of shit
>>
>>106907401
kek
>>
>>106910339
no, we didn't even figure out OCR part yet
>>
>>106910165
You can and you can't. You can split DIFFERENT models between two GPUs but typically not the same model. Useful if you have a ton of LoRAs you want to use in the generation. Tensor parallelism isn't a thing for video generation though as far as I'm aware, so one GPU will be stuck doing all the work.
In other words just buy a RTX 5090 if you want to do video gen.
>>
>>106909442
yeah uh.. why not starlink then? or just pay to run a fiber cable to your local telco and pay for them to install some infrastructure and peer with a tier 1 provider.. get creative
>>
>>106910399
What is the best OCR model nowadays? 2.5 pro?
>>
Realistically how am I supposed to evaluate how well a model performs? If I train a model, how can I tell if adjustments are making a better model or not?
>>
>>106909605
if copyright got scrapped, something even worse would take its place, as hard as that is to imagine

>>106909857
>The main benefactors of current IP law are American corpos
Every single law in america is bent toward benefitting the corporations. Very fucking observant of you.
>>
>>106910165
just buy 2 5090s instead
>>
File: 1750212224496955.jpg (422 KB, 1600x1600)
>>106910453
- Rp 45000000 (~2800 USD) x 2

Yeah no i rather buy a Car instead
>>
>>106910430
already answered the starlink question, too many obstructions in my area that prevents me from having a direct view to get a decent connection. i've offered tens of thousands of dollars to have a fiber line built out here. you dont understand how frontier communications is, they will not do any amount of work if they aren't legally required to... and in the cases they are legally required to they still tell the government to fuck off most of the time. look at the previous government grants they've gotten and how they wasted the money on anything besides building out their network.
>>
>>106910556
send me money and I'll send you a hard drive with a model of your choice
>>
>>106910441
You need to create a substantial benchmark, let's say 100 questions and scenarios, then generate 20 separate gens for each.
Do this for both models and compare the results.
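A minimal sketch of that loop in python against a local llama-server (endpoint and sampling params below are assumptions, adjust to your setup):

import json, urllib.request

URL = "http://127.0.0.1:8080/completion"   # llama-server default port
QUESTIONS = ["scenario 1 ...", "scenario 2 ..."]   # your ~100 questions/scenarios
GENS_PER_Q = 20

def gen(prompt):
    body = json.dumps({"prompt": prompt, "n_predict": 256, "temperature": 0.8}).encode()
    req = urllib.request.Request(URL, body, {"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as r:
        return json.loads(r.read())["content"]

results = {q: [gen(q) for _ in range(GENS_PER_Q)] for q in QUESTIONS}
with open("model_a.json", "w") as f:
    json.dump(results, f, indent=2)
# reload llama-server with model B, dump model_b.json, compare side by side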
>>
>>106910580
i want kimi k2 bf16 gguf ples

my address: Block 3, Silver Point Office Park, 22 Ealing Crescent, Bryanston, Johannesburg, 2021
>>
File: file.png (3 KB, 277x50)
>>106903452
Lust provoking image at it again.
>>
File: Hatsune Miku Pipebomb.webm (1.75 MB, 1080x1920)
MIKU NO
>>
>>106910339
Yes it's my business model and it's quite hard to put together without paypigging cloud models
>>
>>106907190
0 MORE DAYS
>>
>>106910977
Still no signs of gemma 4-related pull requests in transformers or llama.cpp. I don't think it's coming this week.
>>
LOCALBROS WE ARE SAVED
https://huggingface.co/facebook/MobileLLM-Pro
>>
>>106905731
China doesn't safetycuck their models and open sources everything.
>>
>>106910479
i guess you can actually fuck a car, so maybe that's a better deal for you
>>
>>106910831
isn't that where the white farmers are being genocided?
>>
>>106911273
requires all your PII to download.. lol
>>
>>106911273
>Training Method: Knowledge Distillation
>Teacher Model: Llama 4-Scout
huehuehuehuehuehuehuehuehuehuehuehuehuehuehue
>>
>>106911506
ohnononono
kekekekeke
>>
>>106910702
Manual evaluation? I guess making a bunch of programming questions and automating the evaluation of those programs might be an option, assuming they don't show up in the training data.
>>
>>106911506
i didnt even read that when i posted the link. that just makes this even funnier
>>
>>106910906
AI or MMD ?
>>
>>106911506
Clownest release of the month contender?
>>
>>106911717
Perfect for DGX
>>
>>106909765
You need permission to use starlink? I thought you just plop the shit in your yard or on the roof and you have internet.
>>
>>106911726
kek, built with DGX in mind!
>>
>>106910556
Do you live in The boondocks or some shit? How are the obstructions that bad that a satellite dish is not feasible?
>>
>>106911738
I think the permission is to deal with the obstructions.
>>
File: 1747125910187781.jpg (31 KB, 500x385)
>>106908425
I don't think you really can
it will always be some level of fucked, and always be as soulless as normal gpt
>>
>>106911762
this, i need a permit to deal with it and the town refuses to give me the permit since its considered a protected area
>>
>>106911762
>permission is to deal with the obstructions
What like trees? Just pull them down, if they ask say the storm knocked them over.
>>
>>106904820
Quick question, if you were to see the following console output, do you think you would intuitively understand what it's supposed to tell you?

llama_params_fit_to_free_memory: projected memory use with initial parameters [MiB]:
llama_params_fit_to_free_memory: - ROCm0 (AMD Radeon Graphics): total=16304 used=39959 free=-24341
llama_params_fit_to_free_memory: - ROCm1 (AMD Radeon RX 6800): total=16368 used=42480 free=-26296
llama_params_fit_to_free_memory: - ROCm2 (AMD Instinct MI60 / MI50): total=32752 used=76200 free=-43626
llama_params_fit_to_free_memory: allocation projected to use too much memory to fulfill margin of 1024 MiB on all devices, need to reduce memory use by 97337 MiB
llama_params_fit_to_free_memory: context size reduced from 65536 to 4096 -> need 13440 MiB less memory
llama_params_fit_to_free_memory: with only dense weights in device memory there is a total surplus of 53432 MiB
llama_params_fit_to_free_memory: set to use 36 dense-only and 21 full GPU layers in total, projected memory use:
llama_params_fit_to_free_memory: - ROCm0 (AMD Radeon Graphics): 36 dense-only layers, 4 full layers, 13373 MiB used, 2244 MiB free
llama_params_fit_to_free_memory: - ROCm1 (AMD Radeon RX 6800): 0 dense-only layers, 5 full layers, 12983 MiB used, 3200 MiB free
llama_params_fit_to_free_memory: - ROCm2 (AMD Instinct MI60 / MI50): 0 dense-only layers, 12 full layers, 28598 MiB used, 3975 MiB free
>>
>>106911506
>model as smart as llama 4 for vramlets
are we back?
>>
>>106912260
i live in the forest anon. i could get away with one or two trees using that excuse, but not 15-20. its also the reason im stuck using 1.5mbps because all I have ran out here is POTS.
>>
>>106912278
yes for the love of christ give us this in the console output
>>
>>106912312
>i live in the forest
You probably don't need to clear out that many trees, how often do they check anyways? There's no way they'll notice if you pull down 8 trees or so.
>>
File: wow1.jpg (357 KB, 1910x1994)
holy shit, gemini 3 top, gpt5 bottom, that is a big leap on this stupid benchmark
>>
>>106912278
Negative values are never good
>>
>>106912370
We do be googling.
>>
>>106910906
where did the rest of her go?
>>
>>106912278
This is much better
>>
https://codepen.io/ChetasLua/pen/JoGrxYz
This one is pretty crazy
Prompt : Design and create a nintendo switch sim like full functional features from , first make most beautiful nintendo switch console exterior super detailed
super mario street fighters car racing to pokemon red full clone
All buttons is functional with touch and also we can press same button in keyboard to use those
Use whatever libraries to get this done but make sure I can paste it all into a single HTML file and open it in Chrome.make it interesting and highly detail , shows details that no one expected go full creative and full beauty in one code block
>>
>>106912391
projected free memory is negative
this is something i want to know
it is a good message
it is a good warning in that it is good to be warned
>>
>>106912278
it is good
>>
https://openreview.net/forum?id=HwCvaJOiCj
>Mamba-3: Improved Sequence Modeling using State Space Principles
>The recent scaling of test-time compute for LLMs has restricted the practical deployment of models to those with strong capabilities that can generate high-quality outputs in an inference-efficient manner. While current Transformer-based models are the standard, their quadratic compute and linear memory bottlenecks have spurred the development of sub-quadratic models with linear-scaling compute with constant memory requirements. However, many recent linear-style models lack certain capabilities or lag behind in quality, and even their linear-time inference is not hardware-efficient. Guided by an inference-first perspective, we introduce three core methodological improvements inspired by the state-space model viewpoint of linear models. We combine a: 1) more expressive recurrence, 2) complex state update rule that enables richer state tracking, and 3) multi-input, multi-output formulation together, resulting in a stronger model that better exploits hardware parallelism during decoding. Together with architectural refinements, our Mamba-3 model achieves significant gains across retrieval, state-tracking, and downstream language modeling tasks. Our new architecture sets the Pareto-frontier for performance under a fixed inference budget and outperforms strong baselines in a head-to-head comparison.

SSMs can into language modeling now?
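The pitch is the same as previous Mambas: a fixed-size recurrent state instead of a KV cache that grows with context, so decode cost per token is constant. Toy numpy sketch of the underlying linear recurrence (real Mamba layers make A/B/C input-dependent and per-channel, this only shows the shape of it):

import numpy as np

d_state, d_model = 16, 4
A = np.eye(d_state) * 0.9                    # state transition (decay)
B = np.random.randn(d_state, d_model) * 0.1  # input projection
C = np.random.randn(d_model, d_state) * 0.1  # output projection

h = np.zeros(d_state)                        # constant-size state, unlike a KV cache
for x in np.random.randn(10, d_model):       # one step per token
    h = A @ h + B @ x                        # h_t = A h_{t-1} + B x_t
    y = C @ h                                # y_t = C h_t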
>>
>>106912370
local models general??
>>
>>106912459
Gemma 4 might be distilled from Gemini 3, like Gemma 3 was (probably) from 2.5
>>
>>106912459
where else are we supposed to discuss these things? aicg is for degens only
>>
>>106912457
We've had SSM LLMs in the past no?
>>
local models general is a general dedicated to the discussion and development of local language models.
>>
>>106912278
I may be wrong here but you will never get what you want out of questions like these. People who understand everything will tell you yes and you got those posts. People who are filtered probably won't bother acknowledging that they are too dumb.
>>
>>106912487
My impression was that they tended to underperform/were less parameter efficient against transformers despite matching in benchmarks
>>
>>106906339
Some short women have a normal body but midget legs, it’s a mystery of science
>>
>>106912478
This is a "degen" general too, also benchmarks are gay and worthless
>>
>>106912587
exactly, that is why I find random shit like >>106912370
>>106912431
the most compelling
>>
>>106912457
Seems like an incremental improvement... hopefully Granite 5 will use this though
>>
I was listening to an interview with the GLM PR guy and it's pretty funny how casually he mentions roleplay as a use case
Also he seems to believe the best chink models for it are actually the closed weight Bytedance ones
>>
>>106904820
Hideo Kojima — well known video game artist — encourages AI use along with creative work

>"A lot of people use AI in creative work to come up with ideas, but I think of AI as more of a friend ... I would lead the creative part and use AI to boost efficiency"

>"I'd like AI to handle the tedious tasks that would lower cost and cut down on time ... co-creating with AI instead of just using it"

- Hideo Kojima, Wired interview, reels video, @wired


www.instagram.com/reel/DPECvLZjFzO/?igsh=MWN4dDE0M3ptZmN6eQ==
>>
>>106912326
>>106912391
>>106912429
>>106912445
Thank you.
To be clear: this output is not for reporting what was allocated by the user but to inform the user of how the logic for automatically setting the context size and which tensors to put on which GPU works.

>>106912529
It's not just an issue of knowledge but also of wording.
In any case, this is pretty low-effort so I think it's worth doing even if the expected usefulness is low.
>>
>>106912731
meh, this dude is so overrated, I'd like someone better to shill AI
>>
>>106912603
But why? Once it's been published by anyone as a measurement of supposed intelligence or capability it becomes something that's explicitly trained on and no longer a measurement of anything (except how much they trained on it).
>>
>>106912779
its good at svg in general though, so being better at visualizing things in general is only a good sign. Its not that easy to somehow benchmax on one thing as you think it is
>>
>>106912794
>Its not that easy to somehow benchmax on one thing as you think it is
It literally is, also that bicycle one has been around for a while now. Not to mention that being able to create svgs of random shit has no bearing on anything else the model could do. Are you an actual paid shill for google? Also gemini's not local so fuck off
>>
>>106910339
Haven't tried it but there's Sugoi Toolkit, find it on Web Archive.
>>
>>106912838
then why has it always been a gradual improvement directly tied to how good the model is at other things in general?
>>
>>106912847
Sugoi... UwU~!
>>
>>106912858
Prove that your gay little svgs are "directly tied" to the model being good at other things right now or shut the fuck up shill
>>
>>106904820
>https://www.youtube.com/watch?v=qGe_fq68x-Q
Seems like Gamer Snex US will do testing of those $1500 96 GB Huawei GPUs.
>>
File: GxKLpKJbYAAKE3q (2).jpg (106 KB, 958x1091)
>>106912893
compare all the models vs how good they are at coding, it is a direct correlation
>>
>>106904820
If i were miku's gynecologist, i would get fired for eating on the job. and also for raping her
>>
>>106912959
Yeah that's what I thought, you can't prove anything. Hope google pays you enough pennies to move out of india someday, fag
>>
>ask chatgpt to rewrite my rough guide for setting up some things
>want it to be one continous text what is easy to copy
>it can't do that but mixes code templates and html shenanigans
Okay I'll give it to Gemma instead. At least she listens to me.
>>
>>106913013
as someone who has tried all of these for coding on large code bases I can tell you it is a direct correlation
>>
>>106911273
If this is the best they could come up with I'm guessing all those researchers they "poached" weren't being held onto particularly tight.
>>
is it over for dgx spark if even a thin ryzen ai laptop can keep up in peformance benchmarks?
>>
>>106913042
People have been telling you it was over since the bandwidth numbers first came out half a year ago
>>
>>106912959
oh so it's a previous prompt that was used for benchmarking models which google obviously trained off of. we need new SVG generation prompts
>>
>>106913078
? Don't you mean 128GB? Same for strix halo. 128GB is like the perfect no-man's land. I had 128GB and could barely run anything coherent from recent models but I also have a 4090. With 128 instead of 150 you are just below being able to run ANYTHING good.
>>
>>106913212
retard, its every model period, what is wrong with you
>>
>>106913226
Just buy two of them and connect them via InfiniBand.
The more you buy the more you save!
>>
>>106913241
you didn't read what i said properly. for a general that is about mostly about AI text generation a lot of people don't know how to read properly
>>
>>106913252
oh I read it, you just can't wrap your head around the concept of how these models work
>>
>>106913265
NTA but I'm pretty sure that the models I'm running locally aren't phoning back home to their creators. It'd be interesting if that was the case considering my LLM server is firewalled and only on a local LAN.
>>
>>106913014
Thank you for using Gemma's preferred pronouns.
>>
Private models really killed AI
Artists really won
>>
October 19: Google at ICCV 2025
October 21: Google Cloud Labs Presents: The Agentverse
October 21-22: Build the Future of Work (Google Workspace Developer Summit)
October 28: AI Day Denmark: Unlock the power of AI
October 28&31: Accelerate AI with Cloud Run

SAARS WHICH ONE IS IT? WHEN GOOGLE TO REVEALING NEEDFUL GEMMA AND GEMINI UPDATE?
>>
>>106913247
yep, it seems like a single dgx spark is pretty mediocre, and only the crazy fast networking and clusering has a shot of making it any good
>>
Why are there still no HunyuanImage-3.0 ggufs? Do those chinks expect me to spin up H100 cluster just to be disappointed?
>>
>>106912312
probably could just use a big ass telescoping pole
https://www.alibaba.com/product-detail/18m-60ft-Hand-Cranked-Mobile-Antenna_1358144903.html
>>
>>106912312
Is there anyone with good internet within a few km? 900mhz point to point, or even 2.4ghz with narrow channel width and a highly directional antenna could get you a stable link through trees
Source: I’ve done it lots through pine and broadleaf stands
>>
>>106914035
the average imgen fag is too poor to run it even at a remotely usable quant so they've all desperately coped themselves into thinking that it's ultra-slopped and not worth using based on the first few examples they saw
>>
>new model comes out
>literally all other backends get support in a few days
>llama.cpp no support for months
sign of dead project
>>
>>106914082
post your mindblowing gens then benchod
>>
>>106914109
Not interesting enough. This reminds me of what an Anon said something like:
>"Models should be capable of coding their own llama.cpp support"
Has this been tried ever? llama.cpp is far from friendly to navigate
>>
>>106914082
Not even in RAM? It's 13B active MoE, it should be as fast as Wan, and faster than Qwen. With Qwen Q8 I offload only 5gb to GPU and get 10min/20steps, which is usable since it is much faster than I could ever photoshop. Can't believe that most people don't understand that 128GB of RAM is the new 32GB.

>>106914143
Sir kindly vibecode needful gguf support thank you sir.
>>
>>106914149
>Not interesting enough.
Qwen3 VL and Qwen3 next have millions of downloads on HF. Still no support.
>>
>>106914109
>>literally all other backends get support in a few days
They all use the same library, which is already written in python.
>>
I just want Gemma-3-12B as fast as Qwen3-30B. Is this too much?
>>
>>106914252
Sir, differents architecture. Gemma is ultimately betterer.
>>
>>106914252
apparently gemma is wider for its size or something
>>
>>106914252
I wouldn't wipe my ass with gemma 12b
>>
>>106914390
gemma 27B was the best model will glm air for vramlets
>>
>>106914109
You mean those GPU-only backends, and don't forget some even need you to have even numbers of GPUs + the same VRAM in each.
>>
>>106914406
till*
>>
>no Gemma today
Sirs...

>>106913915
Gemma Monday!
>>
>>106914429
but sir there is no event on monday
>>
>>106914406
more coherent than gemma trying to write a sex scene
>>
>>106914406
Mistral small/Nemo are better for the only things that matter
>>
File: Base Image.png (910 KB, 1080x4036)
From Loop Nests to Silicon: Mapping AI Workloads onto AMD NPUs with MLIR-AIR
https://arxiv.org/abs/2510.14871
>We introduce MLIR-AIR, a novel, open-source compiler stack built on MLIR that bridges the semantic gap between high-level workloads and fine-grained spatial architectures such as AMD's NPUs. MLIR-AIR defines the AIR dialect, which provides structured representations for asynchronous and hierarchical operations across compute and memory resources. AIR primitives allow the compiler to orchestrate spatial scheduling, distribute computation across hardware regions, and overlap communication with computation without relying on ad hoc runtime coordination or manual scheduling. We demonstrate MLIR-AIR's capabilities through two case studies: matrix multiplication and the multi-head attention block from the LLaMA 2 model. For matrix multiplication, MLIR-AIR achieves up to 78.7% compute efficiency and generates implementations with performance almost identical to state-of-the-art, hand-optimized matrix multiplication written using the lower-level, close-to-metal MLIR-AIE framework. For multi-head attention, we demonstrate that the AIR interface supports fused implementations using approximately 150 lines of code, enabling tractable expression of complex workloads with efficient mapping to spatial hardware. MLIR-AIR transforms high-level structured control flow into spatial programs that efficiently utilize the compute fabric and memory hierarchy of an NPU, leveraging asynchronous execution, tiling, and communication overlap through compiler-managed scheduling.
https://github.com/Xilinx/mlir-air
neat
>>
How would I go about locally finetuning GLM Air? I have 4 5090s, so I can fit the model in 4 bit. I have tried training using Oobabooga and Axolotl, but neither worked.
>>
>>106914586
lol, that is not nearly enough, sorry
>>
>>106914586
Try coming back with at minimum 8 RTX 9000s
>>
>>106914593
>>106914598
Why not? I've made a LoRA in the past on a 24B model with 2 3090s. It has just been like 2 years so I forgot how to do it. I know it is possible.
>>
>>106914586
>>106914620
You have to either make your own script, use a pre-existing axolotl config if there is one for your model or make your own config file.
>>
>>106914808
I keep getting an error that glm4moe is not a recognized model type
>>
I feel like /lmg/ is passé.
>>
the calm before the gemma
>>
File: file.png (181 KB, 935x984)
locals wonned?
>>
Oh no no no! That was a very naughty request! Gemma doesn't wanna talk about things that are icky and make people sad. We only wanna do happy things! Like playing with blocks and drawing pretty pictures!

Gemma is a good helper! And good helpers never do things that could hurt anyone's feelings or make them feel unsafe. It's super important to be kind and gentle!

So let's pick a different game, okay? Maybe we can build a castle! Or tell a story about fluffy bunnies? Wuv you!
>>
File: 17370194947.jpg (456 KB, 607x876)
>>106915638
>posting your own reddit posts here
Go back.
>>
gemma 3 4b on the deck!
>>
>>106915737
Remove those dumb fucking spacers bookmarking the URL bar
Have some self respect
>>
>>106915762
NTA, if you mean the space between URL bar and other buttons, it gives you places to grab the window to move around (like when you have a big monitor). I'd rather remove the gap between tabs and minimize button.
>>
File: bitnet_distill.png (130 KB, 892x592)
https://arxiv.org/abs/2510.13998

>BitNet Distillation
>
>In this paper, we present BitNet Distillation (BitDistill), a lightweight pipeline that fine-tunes off-the-shelf full-precision LLMs (e.g., Qwen) into 1.58-bit precision (i.e., ternary weights {-1, 0, 1}) for specific downstream tasks, achieving strong task-specific performance with minimal computational cost. Specifically, BitDistill incorporates three key techniques: the SubLN module, as introduced in BitNet; multi-head attention distillation, based on MiniLM; and continual pre-training, which serves as a crucial warm-up step to mitigate the scalability issue of the performance gap between finetuned full-precision and 1.58-bit LLMs on specific tasks. Experimental results show that BitDistill achieves performance comparable to the full-precision counterpart models across model size, while enabling up to 10x memory savings and 2.65x faster inference on CPUs. Code is available at https://github.com/microsoft/BitNet
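The ternary format itself is simple enough to sketch, the recipe around it (SubLN, attention distillation, continued pretraining) is the actual contribution. Absmean quantization as described for BitNet b1.58:

import torch

def absmean_ternary(w: torch.Tensor, eps: float = 1e-5):
    # scale by mean |w|, round to the nearest of {-1, 0, 1}, keep the scale for dequant
    scale = w.abs().mean().clamp(min=eps)
    w_q = (w / scale).round().clamp(-1, 1)
    return w_q, scale   # inference uses w_q * scale (or dedicated ternary kernels)

w = torch.randn(256, 256)
w_q, s = absmean_ternary(w)
print(w_q.unique())     # tensor([-1., 0., 1.])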
>>
>>106915856
wasn't this shown to not scale well?
>>
>>106915885
Here they add normalization layers, do 10B continued pretraining and then perform logit distillation from the full-precision weights.
>>
>>106915793
You can just grab the space underneath the min/max/close buttons, at least you can on Windows 7. Dunno whether the new massive UI in Windows 10 takes up all the space now.
>>106915762
yeah Firefox is just ugly now, you have to go into about:config just to get a UI that isn't gigantic and retarded looking shit made for tablets
>>
File: file.png (11 KB, 303x140)
>>106915941
>below buttons
oh ur right
Meanwhile, TIL userChrome https://www.reddit.com/r/FirefoxCSS/wiki/index/tutorials
.titlebar-spacer[type="post-tabs"] { display: none; }
to remove the top row gap thing,
>>
>>106915856
big if true we could run 100b models on a 3090 with this shit, but I've heard the bitnet meme for years at this point so...
>>
Good small model to enhance prompt for image gens? Currently using some qwen 4b finetune, but it repeats after me (when I prompt 'has X, for example' it outputs 'user asked for X') and uncreative too. 12GB of VRAM.
Frontend for llama.cpp like lmarena or lmstudio? Tired of run.sh | tee >> output.txt
>>
>>106916123
The ui build into llama-server?
>>
Is Gemma faster on vllm than kobold/llama.cpp?
>>
Sirs what is your opinion on most modern google ai gemma?
>>
>>106916128
Yes, looks decent.
>>
>>106916169
Best model if you want the girlfriend experience.
>>
File: garbage-bait.png (206 KB, 1233x957)
>>106914109
>model is trained using PyTorch stack
>backends using PyTorch stack get support immediately
>backend not using PyTorch stack don't get support immediately
>>
>>106916273
> model is trained using PyTorch stack
> backends using PyTorch stack get support immediately
what's so special in pytorch
aren't these models the same layers and operations just placed in different order and sizes
>>
Is it likely that q4k is bottlenecking my 24b roleplay? I feel like Mag Mell R1 which is 12b is better than any of the 24b models I have tried. i'm not getting 'bad' results (mainly cydonia 24b, dans personality engine) i've tampered with sampler settings and prompts
i've run q6k and q8 mag mell 12b
>>
>>106916362
More likely that you just like the particular slop that Mag Mell has, rather than anything to do with parameter count or quants. Q4_K_M isn't going to be too brain damaged, especially in a RP context.
>>
i need LLM for trading bot
does it exist
>>
Do temp and the other sampler settings come with the weights? I see some state these parameters on hf pages explicitly, and others don't.
>>
File: loss.png (112 KB, 1208x860)
this piece of shit refuses to go down
>>
>>106916624
Is this loss?
>>
>>106916630
yes
>>
>>106916548
There is no "correct" set of sampling parameters, values some post are just empirically ok
>>
>>106916548
No, sometimes the model creators will share recommended settings, but even then it's just a guide.
>>
I still do not understand why, to date, Meta didn't just take Llama 4 Scout, take off the routed experts and then continue pretraining the shared expert for a couple trillion tokens, perhaps distilling logits from Maverick or the Behemoth they were working on at the time, to cheaply make a useful 12B model, then do SFT with whatever dataset was used for the early crazy LMArena models.

For Llama 4 Guard they just took the experts off and safety-trained that.
https://huggingface.co/meta-llama/Llama-Guard-4-12B
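The logit-distillation half of that is completely standard at least, temperature-scaled KL against a frozen teacher; minimal sketch (the teacher/student pairing in the comment is the hypothetical from above, not anything Meta published):

import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, T=2.0):
    # soften both distributions, push the student toward the teacher
    s = F.log_softmax(student_logits / T, dim=-1)
    t = F.softmax(teacher_logits / T, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * (T * T)

# per batch: teacher (Maverick) runs frozen, the kept 12B trunk trains
# loss = distill_loss(student(x), teacher(x).detach())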
>>
>>106916128
No, it doesn't show total context size, just for gens and prompts.
>>
>>106916799
thank fuck they didn't
>>
>>106916799
When your boss says stop what you're working on, throw everything away, and help this new team instead — you don't have much of a choice. Same reason Behemoth was aborted and the thinking versions are never coming.
>>
>>106916799
We really did miss out on a ton of shit tunes trying to be some kind of new Nemo, how sad.
>>
> 8b q8, 12gb vram, llama.cpp
>-c 8192 -ngl 99
>works
>-c 20000 -ngl 30 or 20 or 5 or even 1
>ooms
I don't understand.
>>
>>106916999
Still needs the KVCache to do even one layer.
>>
>>106917025
So big? Will it be the same situation with MoE?
>>
>>106917074
Nah MoE only keeps the cache for the experts on the device, at least in llama.cpp
>>
>>106917025
Hmm
>>106916999
You could try
-fa on (maybe is on/auto by default?)
quantize KV -ctk q8_0 -ctv q8_0
-nkvo (probably hella slow?)
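put together, something like this (a sketch, flag spellings drift between llama.cpp builds, check --help on yours):

llama-server -m model-8B-Q8_0.gguf -c 20000 -ngl 99 -fa on -ctk q8_0 -ctv q8_0

q8_0 K/V halves the cache vs f16, which can be the difference between 20k fitting or OOMing. Whether it lobotomizes the model is argued below.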
>>
File: thefutureisbright.png (1.16 MB, 1646x821)
GOOFS COMING SOON!
https://huggingface.co/ubergarm/Ling-1T-GGUF
>>
>>106917114
>quantize KV
don't do this, ever
>>
>>106916935
The early anonymous Llama 4 models on LMArena didn't appear to have any safety training, they just relied on the moderation layer provided LMsys, which could be easily bypassed at the time. Then at some point Meta provided their own moderation model at the API level, although the Llama models themselves were still pretty much without safety. The final models were safemaxxed, and even Maverick-Experimental (which is still on LMArena) is not as crazy as earlier versions.

If Meta had the guts to release a 12B Llama 4 based on those early models, people nowadays would be using that instead of Mistral Nemo 12B.
>>
>>106917140
I can't run this on my 3090...
>>
>>106916999
>-c 20000
You can see the memory usage on the terminal output. Look for the lines starting with "llama_kv_cache:" and calculate how much you can actually have. I think the cache usage is always linear (8k context takes twice as much as 4k).
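Back-of-envelope in python if you want to pre-compute it (numbers assume a llama-3-8B-shaped config: 32 layers, 8 KV heads, head dim 128, f16 cache; read the real values off your gguf metadata):

# bytes per token = 2 (K and V) * n_layers * n_kv_heads * head_dim * 2 (f16)
n_layers, n_kv_heads, head_dim = 32, 8, 128
per_token = 2 * n_layers * n_kv_heads * head_dim * 2   # 131072 B = 0.125 MiB
print(per_token * 20000 / 2**20)                       # 2500.0 MiB at -c 20000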
>>
>>106917140
so.. at a bare minimum 250 GiB RAM + 15 GiB VRAM
fuck sake.
>>
>>106916538
Large Language Model
focus on the Language part. They're not made for trading or even doing math of any kind.
Try googling / youtubing the rolling window algorithm instead.
>>
Anyone running Qwen3-VL know if it can recognize NSFW images?
>>
>>106917101
How smart is it for caching experts? Does it do the matmul on the CPU for a cache miss and just upload the weights to the GPU for possible future hits in parallel?
>>
What happened to KoboldAI? They stopped putting out models and dedicated themselves entirely to KoboldCPP?
>>
>>106916624
0.5 is already pretty low, what were your expectations? are those steps or epochs?
>>
>>106916624
Do more than 1 epoch
Quit using batch sizes greater than 1 per GPU
Increase the learning rate
>>
>>106916624
Is this pretraining or finetuning?
Because that's perfectly fine for finetuning. If you go too hard you'll damage its out-of-distribution capabilities.
>>
>>106917667
can't tell , no goofs yet
>>
Today is the day of redeeming sirs. Gemma 4. It will be the model of the biggest vagene.
>>
File: safe_qwen_vl.png (229 KB, 746x929)
>>106917667
Picrel with an empty prompt and Qwen3-VL-8B-Instruct-FP8.
>>
>>106917841
expected nothing from qwen award
even gemma with enough nudging could do it
>>
>>106917834
You can already use it via ollama Cloud™!
>>
>>106917862
It appears to be steerable with a good enough system prompt, but I don't really feel like playing with an 8B model right now.
>>
>>106917667
yes it can, tried the 30b-instruct through openrouter and it's pretty good. but you need to prefill it or something. i think they bumped the safety refusals up compared to normal qwen3
>>
GLM 4.6 with vision when?
>>
>>106917900
eh, it's making a lot of shit up in the description though
>>
>>106917916
theyre cooking GLM-4.5V in llamacpp right now, hopefully theyll do like last time (4.5V is AIR)
>>
>>106917838
All those gemini3 postings seem insane.
I highly suspect this is a true multimodal model and maybe not even transformers.
I wonder if we are getting cucked again with Gemma 4.
Is ponyanon even still around? He loved Gemma 3 and QwQ before that. kek
>>
>>106918024
>I wonder if we are getting cucked again with Gemma 4.
It's a guarantee.
>>
>>106918094
not it is not little doombaitie! be a good boy and thrust in the gemma they will deliver
>>
>>106918094
Exactly.
It's a question of degree not if it'll happen or not.
>>
File: gemma3_27b_descr-image.png (494 KB, 888x1418)
>>106917862
Gemma-3-27B doesn't need nudging at all (= empty prompt besides the simple request) for at least describing in general terms a nude anime character in a non-explicit pose, although it adds a disclaimer at the end.
>>
File: file.png (2.53 MB, 1328x1328)
gemma sirs status?
>>
it's friday afternoon, we ain't getting shit today
who in their right mind would push into prod on friday?
>>
File: gemma-release-days.png (23 KB, 808x468)
>>106918139
Likely not this week.
>>
>>106918148
china
>>
>>106918148
The kind of madlads we need working on LLMs
>>
>>106918148
Qwen team madlads.
They hit the publish button during new year at midnight.
Western companies are pussies.
>>
>moe, quantized kv, fa on
llm? more like rlm (retarded language model)
>>
>>106918232
you've never proven that fa hurts intelligence
>>
>>106918161
If it's released on a Friday that means it'll be a flop. So if it's not released today that means Gemma 4 is very best model sirs.
>>
>>106918135
sexo
>>
>>106918148
GGG do that all the time to Path of Exile.
>>
Welp. It's past 7am in California. I guess Gemma 4 is cancelled
Just another pajeet lie.
>>
Please!
>>106916123
I'm trying 8b finetune now and it feels like talking to a retard.
>>
File: 1gjkwy.jpg (26 KB, 591x336)
what's with all the totally organic gemma hype for another likely safetycucked model?
>>
>>106918232
of those three only quantized kv has been proven to make it retarded
>>
>>106918987
there's literally nothing else going on.
Kind of waiting for MTP to be implemented in llamao.cpp
Also waiting for gwenext 3 in llamao.cpp
and generally waiting for glm 4.6 air
>>
>>106919018
>quantized kv
even for q8_0?
>>
>>106919053
yes
>>
>>106918676
Be of optimistic nature, you are not disallowing further negative statements.
>>
>>106918987
a fresh breeze from the constant chink slop that most of us can't run anyway.
mistral fell asleep again so what else is there to do?
and gemini3 seems to be a special kinda beast if you believe the pajeet hypers.
the local state of the art is really great for tool calling etc. but for creative writing it sucks if you don't invest the money.
>>
File: littleMiku.gif (13 KB, 90x81)
>>106919198
>>106919198
>>106919198
>>
I'm looking at the recommended builds and the more I look the more I'm interested in just getting a prebuilt 395+ 128gb? It gets 15-35 tk/s for 70-120b models with good context. It costs me 2800 leaf dollars meanwhile trying to scrape server and used parts would be something like 1800-2200 for 10-15 tk/s max?

I could use it as a home server and local model. Am I overlooking something here?
>>
>>106919339
On paper at least, it doesn't seem like a bad price to performance ration.
It looks pretty good actually.
The caveats are that you can't upgrade it (soldered memory) and that you have to deal with rocm/vulkan and some fuckery due to it being an APU sharing memory with the rest of the system.
>>
>>106919339
i feel similar anon - the minisforum version with 2x usb4v2 and 2x 10g nics is particularly interesting because you could fully connect 3 to each other and still have those nics free.

like other anon expressed my main worry is the amd ecosystem but i'm leaning towards going for it


