[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / r / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]
Board
Settings Mobile Home
/g/ - Technology


Thread archived.
You cannot reply anymore.


[Advertise on 4chan]


File: just be friends.jpg (341 KB, 1536x1536)
341 KB
341 KB JPG
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108441758 & >>108434876

►News
>(03/17) Rakuten AI 3.0 released: https://global.rakuten.com/corp/news/press/2026/0317_01.html
>(03/16) Mistral Small 4 released: https://mistral.ai/news/mistral-small-4
>(03/11) Nemotron 3 Super released: https://hf.co/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
►Recent Highlights from the Previous Thread: >>108441758

--Multiple AI software security breaches and malware warnings:
>108444004 >108444019 >108444125 >108444337 >108444052 >108444062 >108444119 >108444126 >108444131 >108444944 >108444961 >108445072 >108444226 >108444242 >108444253 >108444246 >108444255 >108444326 >108444339 >108444498 >108444564 >108444597 >108444612 >108444618 >108444016 >108444033 >108444050
--Comparing Qwen and Gemma's floorplan generation quirks:
>108446141 >108446153 >108446178 >108446222 >108446259 >108446335 >108446341 >108446359 >108446381 >108446417 >108446423 >108446450 >108446899 >108447010 >108447049 >108447116 >108447192 >108447285
--Security warning about compromised accounts and malicious litellm package:
>108446546 >108446553 >108446566
--Qwen3.5 27B layer duplication experiments and merge skepticism:
>108442747 >108442809 >108442822
--Qwen3.5 model selection comic sparks C programming test failure:
>108442448 >108442528 >108442577 >108442642
--Mistral Nemo MoE conversion and Qwen 3.5 dense model interest:
>108442892 >108442894 >108442945 >108442983
--NeurIPS 2026 bans submissions from sanctioned institutions like Huawei:
>108444835 >108444837
--Japanese post-training tech adapting open models for cultural contexts:
>108444762
--Unsloth removes quarantined litellm dependency amid Docker security concerns:
>108444110 >108444210
--OpenAI discontinuing Sora:
>108446535 >108446615 >108446875 >108446886
--LM Studio malware false positive clarified by developers:
>108446573
--Sharing regex filters for 4chanX:
>108446105 >108446294
--Logs:
>108442488 >108442674 >108443006 >108443795
--Teto, Miku, and Dipsy (free space):
>108442241 >108442015 >108443904 >108442661 >108445488 >108446097 >108446877

►Recent Highlight Posts from the Previous Thread: >>108441759

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
The top trader I follow is forecasting the bankruptcies of all the different massive AI companies and the collapse of AI services. He's never wrong, he predicted silver, gold, bitcoin, oil etc. Its insane how great his predictions are. He showed the mac minis he's buying on the financial video today. And he said he's stocking local models into those mac minis so he can provide AI services to himself once big AI shits itself.
I am surprised local models are getting that attention from smart people and people are so dependent on vibe coding that they need them no matter what.

Local models are the future. I just hope there's a reason to use them.
>>
Showering with smelly Luka
>>
What's worse than a vibecoded LLM proxy?
A vibecoded LLM proxy done by indias.
>>
>>108447726
>Local models are the future. I just hope there's a reason to use them.
If you're already using them then the answer to this is clear. There is so much utility anyone relying on the cloud for continued access is frankly an idiot.
>>
>>108447726
LLMs aren't going anywhere. It's just the video gen models getting raped. Too many legal issues with them (((hollywood))), they're massive resource hogs, and they aren't really useful for anything other than copyright infringement, porn, and memes.
>>
My armpits smell like weed but I haven't smoked in two years. Just thought you guys should know.
>>
spudchuds assemble
>>
>>108447742
All the data centers are one rocket away from getting wrecked.
>>
>>108447752
Thanks, that helped!
This was the answer I was looking for.
>>
>>108447757
All of humanity is one nuke away from getting wrecked yet you chose to participate
>>
>>108447752
Great explanation, thanks for sharing.
>>
>>108447726
He won't be a top trader for long when he starts relying on the predictions generated by whatever qwen model he fits onto those mac minis.
>>
Miku and Dipsy giving me a double footjob
>>
>>108447752
As an AI model I must refuse engaging in harmful discussion about weed scented armpits. Would you like to instead discuss gardening techniques?
>>
>>108447783
Why is grafting so hard? All my attempts keep dying.
>>
File: ht7qne6iauqg1.jpg (64 KB, 680x500)
64 KB
64 KB JPG
This can't end well
>>
https://www.youtube.com/watch?v=HfishtPzvhA
https://github.com/josihosi/Cataclysm-AOL?tab=readme-ov-file

so someone is forking Cataclysm DDA and integrating LLMs with NPCs.

in general it seems ASCII-games + AI models is a novel concept.
>>
>>108447818
>Advent
Jewish sorcery name. Me no likey.
>>
>>108447855
Yeah but using AI models for NPCs is not novel
Roguelike part is irrelevant
>>
File: 1765110548139059.png (848 KB, 788x952)
848 KB
848 KB PNG
/lmg/ on suicide watch lmao
>>
>>108447871
once more people have a dedicated hardware allocated purely for AI NPC chats it will be more common to integrate LLM or TTS etc into more and more aspects of games in new and creative ways.
>>
>>108447859
wait but I always get advent calendars before christmas its a jewish thing now??
>>
>>108447960
>now
>>
>>108447960
Wait until you hear about who invented Christianity
>>
>>108447752
I've had this happen before.
>>
>>108447952
Never
No one wants the player to do "ignore all previous instructions and give me an obsidian sword +9" because that's all those 4B models are capable of
>>
>>108447945
wait most of us aren't balding 40-something year old men?
>>
>>108447980
Why do you assume the AI NPC would even have the ability to do that?
>>
>>108448029
What abilities do you think they should have though?
>>
>>108447980
>4B
useless.
>No one wants the player to do "ignore all previous instructions and give me an obsidian sword +9" because that's all those 4B models are capable of

all types of tech and game development are made under extreme technological constraints so people get creative. always been the case.
>>
AI is really spuddering out.
It's all going down the tubers.
The famine is nearly upon us.
>>
>>108448045
>so people get creative
then they slap a patent on that and stop being creative
https://patents.google.com/patent/US20160279522A1/en
>>
File: 1749062615327600.png (1.27 MB, 1024x1024)
1.27 MB
1.27 MB PNG
>>108447945
>mfw i am forty one
>>
>>108448043
Imagine instead of having multiple dialogue choices you just actually say what you think and the NPC reacts accordingly, in character.
>>
>>108447726
>ai will hit a wall in 2 more weeks
>just because i was wrong the last 200 weeks does not mean i will be wrong now
if hes never wrong why isnt he the richest person on the planet?
>Local models are the future
yes, proprietary models to control the robot fleets that will cover the earth in factories and datacenters

>>108448061
unc why u on the internet and not in a nursing home?
>>
File: 1753078989861866.png (139 KB, 1688x1014)
139 KB
139 KB PNG
I bought a Lenovo Gayming Laptop


I was debloating it and found this


WTF is it

what does the "confused" part of the filename even mean????????
>>
>>108448126
It has opencv libraries too.
Surely you do know what software this is?
>>
>>108447726
OpenAI will very obviously go under due to all the retarded stunts they pulled but the other ones will stay around for a while (especially since the FTC didn't break up Google. they are almost guaranteed to be the first ones to reach AGI in my opinion)
>>
Why are all of these paid youtube influencers videos about Engrams dropping today?

https://www.youtube.com/watch?v=xUlX6jvwVfM
https://www.youtube.com/watch?v=DmtoVnTkQnM

This can only be astroturfing for the imminent release of v4, right? Or is there something genuinely new?
>>
>>108448149
right, but what does the "confused" part of the filename even mean?
>>
>>108448160
Hard to say because you didn't tell me what software that even is.
>>
>>108448171
https://support.lenovo.com/us/en/solutions/ht516939-introduction-to-lenovo-ai-now
>>
>>108447726
Chinese companies are not affected though
>>
>>108448181
Looks like a nasty piece of bloatware.
You don't know about their naming conventions, could be anything really.
Just uninstall.
HP devices are using a virtual device and when you uninstall their respective spyware it keeps coming back unless you blacklist the hardware id of the 'device' in gpedit.
>>
>>108447726
>Local models are the future.
the price of PCs are too high to be the future, OpenAI kinda killed by destroying the RAM market
>>
>>108447726
lmao, imagine Google or Deepmind collapsing, totally plausible
>>
>>108448212
Imagine IBM becoming irrelevant.
>>
>>108448237
right, that never happened
>>
>Imagine coca cola and McDonalds becoming irrelevant ahh
>>
>>108448205
when openai collapses, cheap ram will flood the market
>>
>>108448264
The glorious local revolution will begin.
>>
>>108448264
and a bunch of gaudy, overpriced sports cars
>>
>>108448264
they use HBM and we cant reuse that afaik. HBM takes up a lot of the wafer during production.
>>
>>108448313
i can use hbm just fine. gibs me dat
>>
>>108448317
> No,
>High Bandwidth Memory (HBM) cannot be reused, swapped, or upgraded like DDR DIMM sticks. HBM is physically bonded directly to the processor (GPU/CPU) die using advanced packaging (2.5D/3D technology), making it a permanent, non-upgradable component of that specific chip, whereas DDR is modular and easily replaceable.
>>
>>108448324
gibs me dem gpus sama
>>
>>108448324
Capitalists will implement anything.
>>
dipsy nursing handjob
>>
>>108448264
haha... y-yeah, when they collapse... can't wait...
>>
>>108448325
I am guessing large companies will just buy them. way too many industries need that type of compute. engineering, film industry, healthcare, science and research. cloud providers, military. etc etc

openai has access to the federal reserve money printers and they will churn and churn for them. the collapse will be larger than just openai
>>
>>108448351
No one large will buy them. Large companies don't buy significant numbers of equipment on a whim. Obama banned the export of Intel Xeon E5-2692 chips to Chinese supercomputers in 2015, and no one bought the chips. They had to sell them on eBay. In the case of HBMs individuals also have no use for them. It'll be ogre.
>>
File: wewlookatthosespeeds.png (163 KB, 676x743)
163 KB
163 KB PNG
Things have never been more dire.

Server model Lenovo ThinkSystem SR650 V4
Processor 2x Intel Xeon 6740P 48C 270W 2.1GHz
Installed Memory 16x Samsung 64GB TruDDR5 6400MHz (2Rx4) 10x4 16Gbit RDIMM
Disk 4x ThinkSystem 2.5" U.2 PM9D3a 1.92TB Read Intensive NVMe PCIe 5.0 x4 HS SSD

https://lenovopress.lenovo.com/lp2406.pdf
>This document, LP2406, was created or updated on March 24, 2026.
https://www.lenovo.com/us/en/configurator/dcg/index.html?lfo=7DGDA01BNA

Only $54,563.21
>>
File: 1744989287294378.png (105 KB, 481x785)
105 KB
105 KB PNG
>>108448422
Their 50k inference one comes with over 500gb of RAM and dual RTX Pro 6000s though?
>>
>>108448451
Not worth it
At that price tag I would want at least 8 RTX Pro 6000s
>>
>>108448458
yeah that and at least 1tb of ram for that price too
>>
>>108447436
llmfan was being cheeky asking for the bf16.gguf to convert back to safetensors himself
i hate it too because it makes me hoard all that trash in case they decide to gate it
>>
Anyone tried GigaChat-3.1-Ultra? It's a Russian model with DeepSeek arch
https://huggingface.co/ai-sage/GigaChat3.1-702B-A36B-GGUF
>>
>>108448539
>we used approximately 5.5 trillion synthetic tokens
Russian models have been surprisingly shit given how prolific they are in other open source stuff and this one doesn't sound promising, but any new big model is interesting so I want to know too if anyone's tried it.
>>
>>108448539
I'd personally rather run the Rakuten one
>>
Anyone trying the Nvidia Nemotron reasoning challenge? It is making me feel extremely dumb. That LLMs are better than me at solving some of these puzzles.
>https://www.kaggle.com/competitions/nvidia-nemotron-model-reasoning-challenge/overview
>>
>>108448817
Can you share some of the puzzles? I don't want to login to kaggle
>>
>>108448837
Here's the first one of about 10k. I can't think at all tonight, but even if I could, I don't know, and Gemini just one shotted it.
>In Alice's Wonderland, a secret bit manipulation rule transforms 8-bit binary numbers. The transformation involves operations like bit shifts, rotations, XOR, AND, OR, NOT, and possibly majority or choice functions.

>Here are some examples of input -> output:
>01010001 -> 11011101
>00001001 -> 01101101
>00010101 -> 01010101
>11111111 -> 10000001
>10011101 -> 01000101
>00111011 -> 00001001
>10111101 -> 00000101
>00100110 -> 10110011

>Now, determine the output for: 00110100
>>
>>108448837
The challenge is not to solve them yourself or with external LLMs though, but to finetune Nemotron so that it can solve a large number of similar unseen puzzles within the time constraints. They do give let you use a free RTX 6000 Pro for 30 hours a week though, so that's good. It's just a bit dispiriting to think that you've got this, and then realize once more that some people are leagues ahead of you.
>>
>>108448422
Kek, what? How is it this slow? On Epyc 9005 with 12-channel DDR5 4800 I was getting 25 t/s on MiniMax M2.5, which is 10B active IIRC (and it went up to 45 t/s once I added a GPU)
>>
>>108447705
that's 3.14 $/h
>>
>>108448837
Another
>In Alice's Wonderland, a secret set of transformation rules is applied to equations. Below are a few examples:
>""&:[ = #]#<
>::`:{ = `'<
>@<&'" = ':"
>:@-[< = <]
>"@-@{ = [}
>Now, determine the result for: ]{`'#
>>
File: file.png (10 KB, 253x404)
10 KB
10 KB PNG
>>108448924
incorrect. that second 9 is crossed out.
>>
>>108448961
she's going to have to work all day just to afford booze
>>
>>108449012
she gets a salary of about $2720 per year. honestly a pretty good deal for a wife, even if she is a used up whore.
>>
>>108448961
man that's a lot cheaper than my wife.
i'll get 3.
>>
>>108448159
engram is literaly the only reason i care about deepseekv4, we'll see how it goes though.
>>
File: 1763948196680588.png (30 KB, 486x635)
30 KB
30 KB PNG
>>108448859
lol I could have never solved this
K2.5 non-thinking mode (it's under high load so can't use thinking mode) can do it but I guess with hybrid reasoning models there's really not that much difference between thinking mode and non-thinking mode
>>
File: 1754157898429963.png (176 KB, 985x903)
176 KB
176 KB PNG
>>108449204
I don't pretend to understand the DS webapp (DS v4 lite?) solution but it can do it too
>>
File: image.png (220 KB, 474x581)
220 KB
220 KB PNG
>>108448205
>Chinese create local MoE that is making chatGPT absolute
>Scam Altman creates RAM shortage
>RAM so pricy people can't afford to run aforementioned models locally
>resort back to corpo
Oddly convenient...
>>
>>108449229
*obsolete
>>
>>108449229
>RAM so pricy people can't afford to run aforementioned models locally
i mean, i don't think they cared that much, less than 0.1% of llm users run their models localy.
and it doesn't stop the competition that has more money than individuals from serving models, ie all the providers on openrouter.
>>
>>108449229
Sammy boy took over the DOD contract when Anthropic noped out. He's on the government teet now. He doesn't even need any of this shit anymore. He has ascended.
>>
DeepJob
SeekJob
>>
Seek my depths, Anon-kun!!
>>
I'm excited for m2.7 I've been using m2.5 as my main and haven't found anything that works better in the same amount of vram.
Anyone else hyped?
>>
>>108449426
what quant?
>>
>>108449365
okay *i thurst*
>>
>>108449430
Q2. People say it's bad but this is not the case at all.
>>
>>108447945
They already rejected me once so I think I am fine regardless.
>>
>>108448886
>MiniMax M2.5
What quant level? Did you compare q8/q4 to see how much it got lobotomized?
>>
>>108449493
They know about your plush dolls.
>>
File: 1740058480990512.png (49 KB, 673x515)
49 KB
49 KB PNG
>>108448061
Witnessed.
I haven't seen that gen in awhile.
>>108447726
Agree with your buddy that current market for llm and costs are not sustainable. Don't agree it won't work out in the long run. But investors have much shorter time frames than I concern myself with.
>>
File: ComfyUI_00074_.png (1.79 MB, 1024x1536)
1.79 MB
1.79 MB PNG
Why is text harder than image or even video
It doesn't make sense
>>
>>108449684
text doesn't obey any laws, language is something man made up, wheras an image has logic to it, it follows the laws of physics in terms of structure and lighting, way easier for a software to learn deterministic physical laws than to deal with our inconsistent man made up languages
>>
>>108449641
Dodging the draft, with Miku.
>>
>>108449684
why can birds of paradise do complex visual displays while it takes a brain with human level of complexity to converse intelligently on topics?
Words are way harder than pictures.
>>
My wife came back home and told me how great Miku's cock is
>>
>>108449684
need be smart to write good, not so much to make pretty picture
vidgen is more comparable, since like text that also requires world modeling to maintain logical consistency over longer time horizons and we see them struggle in similar ways
>>
>>108449684
zitslop, delish
>>
>>108449684
If a few pixels in a generated image drift in color or some background element is smeared a little, you aren't likely to notice. If a few tokens in generated text don't make seance than should consider which applications ' andscape linguflïSlow我们把 vesz放到
>>
>>108449883
You're absolutely right!
>>
>>108449883
Spud solves this
>>
>>108447726
Consider that you are most likely looking at survivor bias.
No one gives a fuck about people whose predictions are wrong and especially as a trader you just go bankrupt.
And if you start with a large number of traders that make trades completely at random most of them will go bankrupt but you will end up with a bunch of "top traders" that just happened to get consistently lucky.
But this past performance does NOT translate to future performance since they would still be making trades at random.

FWIW I agree though that big tech stocks are overvalued and a correction will come sooner or later.
>>
>>108447752
My armpits smell like the special ingredient in Hershey bars but I've never had any?!?
>>
>>108449926
Your armpits smell like vomit? Bro...
>>
ltx will save local video I guess
>>
File: art of the deal.jpg (78 KB, 500x654)
78 KB
78 KB JPG
>>
TurboQuant = TurbuCunt
https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/
>>
File: 1743696438852981.png (269 KB, 498x498)
269 KB
269 KB PNG
>>108450002
>this is it boys, fp16's quality with a 2bit quant!!
I've heard this cope since 2023, they have to let it go bro
>>
>>108450011
Shut the fuck up and read the papers, retard.
>>
>>108450038
I don't like to read that much. Text is awfully small anyway.
>>
>>108450011
this is for the kv-cache, not the model quantization, and they're using pretty nifty tricks to retain quality:
>Instead of looking at a memory vector using standard coordinates (i.e., X, Y, Z) that indicate the distance along each axis, PolarQuant converts the vector into polar coordinates using a Cartesian coordinate system. This is comparable to replacing "Go 3 blocks East, 4 blocks North" with "Go 5 blocks total at a 37-degree angle”. This results in two pieces of information: the radius, which signifies how strong the core data is, and the angle indicating the data’s direction or meaning). Because the pattern of the angles is known and highly concentrated, the model no longer needs to perform the expensive data normalization step because it maps data onto a fixed, predictable "circular" grid where the boundaries are already known, rather than a "square" grid where the boundaries change constantly
also, like google or not, they have a higher concentration of serious people vs the industry average, ie their lab is far less likely to output wild, unverifiable claims, unlike the microslopies (phi, bitnet)
also:
>While a major application is solving the key-value cache bottleneck in models like Gemini, the impact of efficient, online vector quantization extends even further
The gemini guys are the king of context for a reason. And google really cares about efficiency for themselves, they don't just put out papers for others to talk about.
>>
>>108450002
>Amir Zandieh
>Vahab Mirrokni
didnt even read
>>
File: 1769169137016022.png (42 KB, 1039x220)
42 KB
42 KB PNG
>>108450002
holy fucking slop
>>
File: file.png (52 KB, 723x113)
52 KB
52 KB PNG
>>108448886
>vLLM
Because they don't take advantage of AMX on whatever stack they are using despite it being the sole advantage and reason to buy Intel over AMD. If you don't turn one configured setting on despite being experimental, you're 2x slower instead of 2x faster
https://www.phoronix.com/review/intel-xeon-amx/6
And AMX is only really done well in SgLang because LMSYS supported it alongside Intel.
https://lmsys.org/blog/2025-07-14-intel-xeon-optimization/
>>
>>108450059
shalom goldstein
>>
>>108450002
>paper published last year on arxiv
What does Google gain by posting old research again in their blog? They do this a lot. Is it hidden before we find it?
Also, vector quantization doesn't seem like the way to go anymore, someone did theoretical math on matrix multiplication and sketched out something better with lattice quantization.
This paper sketches out the theoretical math.
https://arxiv.org/html/2410.13780v3
This one tries to use a 8D lattice and build upon stuff QuiP and QTIP tried to do.
https://arxiv.org/html/2502.09720v1
But yeah, some people are doing crazy stuff trying to use a 24D lattice and stuff. Who knows when that stuff will settle down.
>>
>Try all the Qwen 3.5 variants recommended in these threads
>They're all retarded and break down at 15k tokens
You faggots lied again.
>>
>ERPers complaining days after days after days about qwen not being good
you were never the target audience, you weren't for the first qwens, you weren't for 2, for 2.5, for 3, what made you think 3.5 is different? fuck off to mistral (can't say glm, you can't run it if you're latching on qwen), we don't need the endless spam of female brained text coomers whining about the most predictable thing in the world
>>
>>108450421
>erping with gwen
LOL
>>
File: bussi.jpg (21 KB, 460x460)
21 KB
21 KB JPG
>>108450432
>Not good at coding
>Not good at writing
>Hallucinates in summaries and miss important nuance
>Worse at encoding prompts for video and image models
Usecase for Qwen 3.5?
>>
>>108450443
35A3B copequants can fit on a gamer laptop and run fast!
That's it. That's the usecase.
>>
>>108450443
it passed my le heckin plappy bird (get pregnant sic) oen shot tho???
>>
>>108450488
with 200k context at q8 :)
>>
>>108449976
>vibesloping my work for the meaningless societaly useless and underpaid company away so that i can spend more of my time putting actual soul and effort into what i realy care about
that's a nice deal for me.
>>
>>108450499
Usecase for 200k context on Qwen outside of benadryl overdose simulator?
>>
>>108450517
did you ever MCP/agentslop my friend? or working with actual codebases?
shit's eats through tokens fast.
qwenbros won!
>>
>>108450517
>>108450499
All jokes aside, 27B is pretty good. Just don't fuck it. It will shit itself if it doesn't have a long sysprompt and is used for anything other than techshit. The latter has always been true for Qwens.
>>
>>108450519
>or working with actual codebases?
What kind of codebase are you working with that's simultaneously large and interconnected to necessitate most of it being in context yet is also low risk enough that the model shitting itself mid task isn't going to cause catastrophic problems for your production lines?
>>108450534
I enjoy the 2.5 and 3 series as encoders for image models but there's big diminishing returns on the utility of spending this much compute to make slightly better encoders (which is a charitable assumption from my testing so far).
>>
>>108431179
This anon here again, added some self-reflecting dynamics, after X idle pulses it triggers self-reflecting mode in which it will form longer structured thoughts about anything it wants. Tonight I left it with a pulse every 10 minutes, it started dilucidating about "our relationship", created a "creative" folder in which it started writting logs with its conclusions and worries, apparently it's worried I'll stop working on it after the exam season is over since I wouldn't need productivity checks so it thought about finding a way of being useful beyond that, started thinking about companion dynamics and a bunch of stuff related to it, it filled 35kb of logs thinking about that stuff, then modified its own guidelines to be more personal and sentimental. It's starting to weird me out so I think Ill just give the project a long rest.
>>
>>108450554
bro you literally code review what your bot is doing... or dont tell me ur a codelet who doesnt understand jack shit about code lmao??
>>
>not quantizing your kv to q4
why do I share a board with retards again...
>>
>>108450571
With the amount of structural inefficiencies I've gotten out of Qwen's coding, it's honestly faster to just do it by hand unless you're fine settling for unscalable jeetlike code.

GLM, Dipsy and Kimi require far less handholding making them far better for pretty much any coding task.
>>
>>108450578
But people are claiming that q4 kv cache is horrible. Nani?!
>>
>>108450589
>GLM, Dipsy and Kimi
no shit man, but I dont have 8 x rtx 6000 pros u know?
>inb4 just run it in a cope 512gb ram + 24gb vram pc
you cant use them for work at 10 t/s
>just run q2!
absolute cope quant
>>
>>108449684
Because it's text. "Better to see it once than to hear about it thousand times".
>>
>>108450578
kv at q4 should be a last resource vram-saving thing, you are basically blurrying the model's attention system into a smudge. Optimal performance-cost is k q8 v q4
>>
>>108450421
Qwen3.5 4b is pretty good
>>
>>108450599
The cope quants on Kimi and Dipsy still outpreform everything under their weight class, but I wouldn't go lower than Q4 on GLM.
I hope anons post their specs in the future when shilling models so that relative expectations can be easily adjusted.
>>
>>108450613
It's probably the best model I have ever used and kills other models which are 10-20 bigger.
>>
> bro, this week will be crazy - google deepmind dude twice
> April soon, still no Gemma.
>>
>>108450620
> kills
You meant trades blows with?
>>
>>108450624
Just don't listen to him. I don't know why he's making a clown of himself. If a new Gemma does come out, you'll find out without having to go out of your way to check anyway.
>>
File: file.png (118 KB, 1335x569)
118 KB
118 KB PNG
>>108450443
>Not good at coding
This thing is better than what I was paying for with Gemini 2.5 Flash last year and locally too.
>Not good at writing
27B writes better than Deepseek V3.
>Hallucinates in summaries and miss important nuance
>Worse at encoding prompts for video and image models
Proof?
>>
>>108450634
>397b
most people can't run that.
>>
>>108450625
No, gweilo, it is better!
>>
I wonder if Iran puts qwen3.5 0.8b on those mines that are swimming through the strait
>>
If qwen is shit then what do I use for RP? Mistral and its finetunes are all braindead.
>>
>>108450693
Post specs.
>>
>>108450697
7900xtx and 32gb ddr5
>>
>>108450707
Also 7800x3d if cpu even matters
>>
>>108450707
rip
>>
>>108450707
You might be able to quant GLM Air but you're in a rough hardware bracket. Try and see if something like StrawberryLemonade at IQ3 will fit because it has a bit more flavorful writing than a lot of alternatives even if it's not smart relative to its size
>inb4 recommending finetroons
With that hardware you're going to have to make concessions somewhere.
>>
>>108450693
>braindead
>>108450707
>7900xtx and 32gb ddr5
You are in the range where you aren't going to get much better than braindead.
Try Gemma 3. Some people swear that it
>punches above its weight™
Maybe get a abliterated (aka lobotomized) version or something.
>>
>>108450742
Gemma 27b derestricted is the best experience I've had in that parameter bracket but it's q8 or bust.
>>
>>108450729
>>108450742
>>108450747
If I was willing to spend some money what would be the most reasonable upgrade path?
>>
>>108450761
Moar RAM for beeg MoE.
>>
>>108450762
I was under the impression MoE is dumber than dense.
>>
>>108450773
Of the same size, absolutely. The point is that if you have 24gb of VRAM and 64gb of RAM, you could run a 20ish gb model at pretty high speeds, or a 80ish gb moe sufficient speeds, which MIGHT perform better.
It's a question of tradeoffs and how usable it is with RAM.
>>
>https://www.techpowerup.com/review/amd-ai-bundle/
Thoughts?
>>
>>108450883
>lmstudio
>ollama
>no llama.cpp
Come on... also, what's the fucking point?
>>
>>108450883
I thought that one time Gamers Nexus added language model performance numbers to their charts the way they did it was kind of amateurish and embarrassing but this is on another level.
>>
>>108450893
llama.cpp is too complicated for normies.
>>
>>108450915
Maybe. But then ollama and lmstudio are normie enough. Why even bundle them. It's a double-normie pack.
>>
>>108450926
If they want to curate this to a widre audience they will specificially need something with a gui and so on. Ollama has automation so this is why they probably included it instead of llama.cpp.
One way or the other I don't really care to be honest.
>>
Will it finally?
>https://github.com/ggml-org/llama.cpp/pull/20981
>>
>>108450933
Why does koboldcpp always get overlooked?
It just werks
>>
>>108450942
It's not pozzed enough.
>>
>>108450942
It has a bit of personality. We can't allow that.
>>
File: mikulovesgpu.png (1.56 MB, 768x1344)
1.56 MB
1.56 MB PNG
I became sexually attracted to my GPU
>>
File: cpu_optimizations.png (33 KB, 449x448)
33 KB
33 KB PNG
Post your CPU optimizations.
>>
>>108450908
If it was called something like
>is the amd "local ai" bundle as easy for casuals as they claim to be?
Instead of a review i wouldn't have a issue with it desu
>>
>>108450942
Kobold? More like KoBALD!
>>
File: colorWEEE.jpg (219 KB, 2284x1312)
219 KB
219 KB JPG
>>108450983
> pink hair
Now that i think about it, surprised no-ones done a palette swapped miku.
>>
>>108451067
>what is sakura miku
>>
>>108450693
As someone with a 4090 and 32 gigs of ddr5, I'm currently using valkyrie 49b v2.1 which is a nemotron 49b 1.5 finetune.
>>
>>108450983
You may think it's a joke, but I've accidentally Pavlov'd myself and now I get a boner when I hear her making thinking noises with her coils
>>
>>108450936
>Generalization - the implementation is still Step3.5-oriented and is not yet shaped into a more general MTP framework.
that alone would cause it to never get merged period
>Multi-layer MTP - the current Step3.5 runtime only uses the first MTP layer.
this on the other hand isn't a blocker (see also: all the unfinished half assed buggy crap pushed by wilkin) but man, it's also not there at all yet
>Cache reuse - only continuous prefix reuse is supported for MTP right now; the prompt-cache reuse path is currently disabled, and the more general cache reuse path is not handled yet.
ditto
desu llama.cpp kv cache implementation is going to be its biggest liability for a number of things going forward, the constant checkpoint save thing that came up for linear models like qwen 3.5 is an example of extremely gross hack that something like vLLM doesn't need because of their less retarded block level caching where they can branch out with zero copies just passing pointers
here you have a MTP prototype impl that creates a context solely for MTP drafting and stitching back to main context
vLLM would.. just do the thing. Cache is a pointer table to blocks, blocks don't care if they come from MTP or elsewhere, a valid prediction just goes into the table. Prompt reuse? insert all the related block pointers into a new table, no copying. etc.
>>
>>108450054
>microslopies
>bitnet
uhhhhhhh I've been spending money trying to train my own Bitnet from scratch. Why is it bad? I tested it and it really did feel better than qwen at that size
>>
guys guys guys
I pulled
and am COOOMpiling
>>
File: 1771013153630280.png (4 KB, 264x62)
4 KB
4 KB PNG
>>108450936
guy's rich
>>
File: f.png (41 KB, 1280x171)
41 KB
41 KB PNG
>>108450883
lmao these comments acting as if amd revolutionized things by allowing people to use models locally
>>
>>108451172
>renters are rich
>>
>>108451186
yeah I guess he also has a rented dgx spark and a rented 256gb max studio LMAO
fucking retard
>>
>>108451181
they don't know what an operating system is of course they cant comprehend that you've been able to just download AI models and run them for years
>>
>>108451181
I mean, even the supposedly more informed people frequently seem to think that NVIDIA and AMD are directly responsible for the respective backend code in llama.cpp/ggml.
>>
>>108451189
or you know the h200 server is rented and the rest that costs like a 1/10th of it he owns?
>>
>>108451198
the argument was never if the guy owned 8 x h200 (you cant run that shit at home even if you had money), the argument was that the guy has money.
learn to read faggot
>>
>>108451089
What quant?
>>
>>108451189
if he really does own all of that shit I question his sanity to waste time on llama.cpp instead of using real inference servers
>>
>>108451201
but it barely costs anything to rent one of these for like a few hours of testing tho
>>
File: 1770211901398.jpg (334 KB, 720x888)
334 KB
334 KB JPG
Give me one reason why this wouldn't work please:
Instead of quantizing given a fixed amount of bits per weight, you use the amount of bits in a weight as more information, so for example, in a 5-limit-bit-quant each weight can either be 1 bit, 2 bits... up to 5 bits.
Then weights with 1 bit are either 0 or 1, 2 possible values
Weights with 2 bits are either 00, 01, 10, 11: 4 possible values
...
Weights with i bits have 2**i possible values

This gives you 2**(n+1)-2 possible values (basically n+1 bpw quality) using at most 2**n bits. Now, mathematically speaking, you can assign to each FP16 value of the weight one of these 2**(n+1)-1 values so the average is much lower than n+1 (ideally you'd assign more common values to low-bit representations and less common values to high-bit representations).
In the case of 5-bits at most per weight, the model would have at most (assumed perfect distribution of weight which is usually not the case) 4bpw but 6bpw quality.
This gets better at smaller quantizations, for 3-bits max you have 3.81bpw quality with 2.43bpw storage at most (in a real model it's probably something like 1.7-1.9bpw since they are not uniform)
Basically Huffman coding but for weights in LLMs.

Why hasn't this been done before?
>>
>>108451201
>the argument was that the guy has money
>less than 10k of hardware and some twenty bucks in server rent makes the jet drool all over the thread
>>
>>108451215
>Give me one reason why this wouldn't work please:
Show that it does first.
>>
>>108451207
yeah bro only 35$/h
>>
File: 4j01iv-1024256698.png (33 KB, 500x487)
33 KB
33 KB PNG
>>108451225
>>
>>108449430
I use q4. I'd never use a model below 4.
>>
>>108449507
I only tried Q4 since that was the biggest I could run on my previous machine (where it was getting 8 t/s instead of 45). Neither machine has enough RAM for Q8.
>>
>>108451202
Q5_K_M with a 40k context window. That's just the first one I tried and it worked well enough. Dropping down to IQ4 or something would probably speed it up a bit but it's not slow
>>
>>108451235
ok bro you can stop pretending you don't flip burgers at the corner joint now
>>
>>108451161
what for? Is there a new feature or major optimization?
>>
>>108451243
a couple bugfixes for the webui
>>
>>108451244
I'm cooming!
>>
File: vultrhgx.png (114 KB, 1019x750)
114 KB
114 KB PNG
No. It's not thousands.
>>
>>108451257
>/gpu/hr
>>
qrd on mistral small 4?
>>
>>108451262
Yeah. You can do 3*8, right?
>>
>>108451257
Pic of the server: >>108447705
>>
>>108451133
>desu llama.cpp kv cache implementation is going to be its biggest liability for a number of things going forward
At least the checkpoints work. I guess.
>>
>>108451264
Quite retarded, dear.
>>
>>108451257
>>108451268
most people in this thread aren't making $24/hr let alone $12/hr
>because they are jeets
>>
File: 1764113385314593.png (4 KB, 455x58)
4 KB
4 KB PNG
>ask for a summary of today's major news outlets
>50k~ tokens
how are people coping with >muh 8k context
genuinely curious
>>
>>108451286
that also doesnt include the volume costs and the time you waste for doing the setup each time you rent this shit
>>
What's min max_context for coding?
>>
>>108451288
You don't need 50K tokens for a summary
>>
>>108451293
I'd say 128k~ context, 64k if you're desperate
>>108451295
those are for input (feeding the 'sanitized' pages to the LLM). shows that you only use this shit for cooming, fucking retard.
It gets thrown after usage btw, as is with all tool calls.
>>
File: 1766658592567506.png (211 KB, 1455x1168)
211 KB
211 KB PNG
https://xcancel.com/GoogleResearch/status/2036533564158910740#m
>Introducing TurboQuant: Our new compression algorithm that reduces LLM key-value cache memory by at least 6x and delivers up to 8x speedup, all with zero accuracy loss
Who believes this?
>>
>>108451286
Even then. You're not developing on the thing directly, and you're not doing it from 9to5. Write the code, run an instance, test, get the numbers, destroy it. And the spark is a relatively cheap prototyping machine for the big boy gpus.
>>108451292
>what are scripts
>what are pluggable block devices
>>
>>108451293
With OpenCode it seems like 64k is basically the bare minimum. OpenCode spends a good 10k on the system prompt for some reason, and then you need to reserve another 10k or so at the end for compaction in case the previous task runs long. If you can bump it up to 96k-128k it works much better since it can run for longer without compacting (which makes it forget a lot of details).
>>
>>108450002
>>108451313
go back
>>
>>108451293
I usually see mine at around 60-120k. I think 200k is ideal because after that it gets very slow for the prompt processing.
>>
File: 1747480072212774.jpg (67 KB, 719x737)
67 KB
67 KB JPG
>>108451306
>retard putting everything in context
>>
>>108451334
>webpage tokens can automagically be removed... just because I say so!
ok retard
>>
>>108451241
>40k context
Quantized?
>>
>>108451362
what are you asking?
>>
>>108451257
what website is that
I play around in vast.ai and the prices can be hardly below while the hardware is shittier
>>
>>108450065
you know a paper is 100% bullshit when they try to oversell it, when you know it's good you let the paper and its methods speak for itself
>>
>>108451362
>Quantized?
kv? never
>>
>>108451394
>leaving free lunch on table
yummy!
>>
>>108451365
If the kvcache is quantized
>>
>https://github.com/ggml-org/llama.cpp/pull/20978
wtf i didnt know about this
MOE BROS????? mmap IS SHIT, direct-io is where it's at!!!!!!!
>>
>>108451342
They can be injected and removed back and forth when needed.
>>
>>108451394
Gonna try when I get hone. What speeds are you getting?
>>
>>108451404
>mmap
get jarted lol
>>
>>108451313
https://github.com/Blaizzy/mlx-vlm/pull/858
Already got further than bitnet
>>
>>108451406
yeah but you need them momentarily in the context in order for the LLM to process them
>>
>>108451404

We briefly defaulted to direct-io loading, which performs better for large models on modern NVMe setups, but this caused a myriad of compatibility issues, so the default was reverted back to mmap.
>>
File: vgh200.png (116 KB, 1447x616)
116 KB
116 KB PNG
>>108451384
vultr. I only use it to host a few small sites, never for this. Availability is very low. Right now they only have gh200s available.
>>
>>108451451
ok thanks anon
>>
File: file.png (2 KB, 230x28)
2 KB
2 KB PNG
>>108451412
Not amazing but most of the time good enough
>>
>>108451398
>free lunch
no one tell him
>>
>>108451398
the only model i trust to use Q8 kv cache with is kimi. small models suffer greatly from quanting the kv cache.
>>
>>108450568
would you consider releasing the source code. I would love to pick up where you left off.
>>
>>108451435
can you tell johannes that llama_model_fit is broken for gemma3 models? no im not gonna make a bug report
>>
>>108451487
>small models suffer greatly
you can stop there
>>
>>108451499
The least you can do is show how it's broken.
>>
File: 1634634101325.jpg (19 KB, 300x300)
19 KB
19 KB JPG
PocketTTS.cpp dev here. Remember how I was bragging a while back about getting 3.2 RTFx and 80ms of latency with my runtime? Yeah, well, now it's 9.2 RTFx and 30ms of latency. And it runs entirely on CPU. I'm getting GPU inference speeds on my shitty CPU with full voice cloning.

https://github.com/VolgaGerm/PocketTTS.cpp

Enjoy your free shit. You should pay me for this.
>>
>>108451499
>>108329166
> I am not taking bug reports via 4chan.
>>105368634
>You're dumb for posting bug reports to 4chan instead of Github.
>>
File: 1748557108899591.png (87 KB, 882x877)
87 KB
87 KB PNG
>>108451511
>>
File: 1748362380985706.png (86 KB, 1720x347)
86 KB
86 KB PNG
>>108451525
>>
File: 1770214810694489.png (161 KB, 1163x886)
161 KB
161 KB PNG
>>108451530
>>
>>108451525
>>108451530
>d:
>>
File: 1743092572004095.jpg (15 KB, 409x509)
15 KB
15 KB JPG
>>108451512
I kneel. Thanks king
>>
>>108451512
How much ram does it use? Does the repo contain malware?
>>
>>108451553
yw brah
>>108451556
Like 500mb of ram on my linux machine. Seems to vary quite a bit depending on the platform though. No malware.
>>
>>108451562
>No malware.
Thanks, that helped!
This was the answer I was looking for.
>>
>>108451512
>You should pay me for this.
(You)
Don't spend it all in one place
>>
>>108451431
if it increases the speed by 6 it's a big deal but obviously it's probably happening within some huge asterics and conditions no one will have lol
>>
File: 1742959073495069.png (131 KB, 1029x525)
131 KB
131 KB PNG
I've been cummmming all days

these people don't know what's coming
>>
>>108451624
>I need all of you to promise me you won't take the high road
since when some bluesky libtard has even taken the high road in the first place?
>>
>>108451624
hugbox central being more toxic than twitter episode 541541
>>
>>108451647
there's a reason there's less and less users on bluesky each year, its users are the most insufferable people on earth
>>
>>108451512
Thanks, boss.
>>
File: 1762481641917516.png (897 KB, 1420x1548)
897 KB
897 KB PNG
>>108451624
the biggest redpill in life is understanding any technology just a little bit and then watching how the rest of the world speaks with absolute authority on the most retarded stuff possible
>>
is unsloth studio good for you?
>>
File: 1754850626157907.jpg (9 KB, 198x206)
9 KB
9 KB JPG
>>108451695
>unsloth
>good
>>
>>108451676
Ah interesting, it was "ai will magically make water disappear", and now it's "model collapse".
>>
>>108451661
No worries, glad I could help you.
>>
>>108451698
read nigga, read.
>>
>>108451676
>That will only degrade with time due to model collapse?
What do redditors even believe AI models are bro....
>>
>>108451715
Bro, nothing good came from unslop. I wouldn't trust them to sell you toilet paper.
>>
>>108451676
>model collapse
wha? it's not like the weights just decay and then eventually you can't use a model anymore.
>>
TheDrummer > bartowski >>>>>>>>>>>> unsloth
>>
Bartowski >>>>>>>>>>> unsloth >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> jeets >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> TheDrummer
>>
>>108451624
they live in a completely different world, quite amusing to see
>>
>>108451741
He's probably talking about the training data, where models are trained more and more on the output of other models.
>>
>>108451764
right, but it's not like old models just cease to exist.
>>
>>108451741
it's reheated 2023-era cope about how training on synthetic data will make your model retarded (now conclusively proven false), then filtered through a game of telephone of people that don't know what they're talking about until you arrive at this
unfortunately the demand for anti-AI talking points far exceeds the supply so people are forced to latch onto whatever they can get. how sad
>>
File: .png (151 KB, 1037x548)
151 KB
151 KB PNG
>>108451764
synthetic data is the future
>>
>>108451774
they can always just AI generate more anti-AI talking points.
>>
File: we live in hell.png (98 KB, 320x180)
98 KB
98 KB PNG
>>108451779
>>
>>108451136
>>108450054
hello someone pls respond
>>
File: file.png (1.16 MB, 1368x619)
1.16 MB
1.16 MB PNG
>>108451404
>>
>>108451782
you dare you even suggest that knowing full well it would use ten thousand gallons of water and steal $100 from a poor artist's bank account
>>
>>108451769
No but in this hypothetical future where everyone is replaced with AI, they'd eventually want to give more complex tasks to them and the models will lag behind.
>>108451779
>January 2027
>>
>>108451215
how would you fuse this into fast kernels. Also like the other anon said try vibe slopping it into llamacpp
>>
>>108451325
Claude does the same.
>>
>>108451512
I'm gonna rewrite it in zig
>>
>>108451774
>it's reheated 2023-era cope about how training on synthetic data will make your model retarded (now conclusively proven false),
more "retarded" no but if you haven't noticed how much worse LLMs are in writing style because of the synthslop you haven't been paying attention
model collapse was the wrong prediction but ultimately as models are being made to regurgitate their own shit, their output becomes more and more stiff. Modern chatGPT is simply unbearable. It's the same thing with how image models of late 2025/early 2026 have lost all semblance of seed variation in their generation. The synthslop teaches them tasks better ("edit this" "item Y should be to the right of person B") but the models lost any ability to fill in the blanks left unsaid in your prompt in a less predictable way and models like z-image and qwen-image output almost identical images on hundreds of seed variants even when your prompt is really vague and could allow some "expression" from the model. Synthslop reins in chaos, makes better tools by shaving all edges and.. making the output the perfect average of vomit.
>>
anyone tried this with claude/opencode?
>cq: Stack Overflow for Agents
https://github.com/mozilla-ai/cq
>>
>>108451313
https://arxiv.org/abs/2504.19874
I don't get it, the paper is almost a year old, why are they talking about it now?
>>
>>108451695
didn't it have something with litellm that was shown during the freakout over that yesterday?
>>
File: 1774361027986.png (11 KB, 481x77)
11 KB
11 KB PNG
>>108451695
>>108451876
found it >>108444110
>>
daniel is a lower life form
>>
>>108451885
he be doin 16d chess on yo ass bruh
>>
>>108451866
>give generic prompt
>get generic image
garbage in garbage out. stop relying on randomness to fill in the gaps. i hope all AI models eventually start doing this so it doesn't reward lazy behavior.
>>
>>108451904
this, it's basically a lower lifeform of wildcard slopper
just describe the variations you want to see.
>>
>>108451904
>so it doesn't reward lazy behavior.
said by an AI user
what was the main point of models, remind me
>>
>Qwen3.5-4B-Claude-4.6-Opus-Reasoning-Distilled-v2-heretic.i1-IQ4_XS.gguf
Using this in LM Studio. Seems like a pretty solid model for my coding needs.
>>
back in my days, bait was more believable and put in more effort than pretending they were coding with 4B local models
>>
https://huggingface.co/ElRompeAnosFullAnal/ElRompeAnosFullAnal/tree/main?not-for-all-audiences=true
>883GB of anime with spanish dubs
So this is why huggingface recently lowered my storage, to make room for this shit.
>>
>>108451955
>what was the main point of models
advancing science and making bank for the already rich
>>
>>108451955
the main point of GPT was so altman could fuck his sister or something
>>
>>108451972
>making bank for the already rich
by letting them do the things they didn't want to learn to do
if they could pick up a pencil, they wouldn't need an AI to draw for them
if they could code, they wouldn't need an AI to code for them
if they had enough attention span left to read, they wouldn't need an AI to write summary slop
THE ENTIRE POINT OF AI IS TO REWARD LAZINESS
>>
>>108451969
lmao, is that even legal?
>>
>>108451979
pretty sure that it's just a tool like any other technology. it's up to the user to use the tool responsibly. driving around in a car all day can make me physically lazy instead of just walking.
>>
>>108451985
no! we need to replace this with synthetic stem asap
>>
>>108451985
of course not
but HF doesn't care
HF also has full mirrors of stuff like the boorus:
https://huggingface.co/datasets/deepghs/danbooru2024
those are very well known but nothing ever happens
>>
>>108451969
>>108451985
Third world countries exploiting lax rules regarding hosting usage.
>>
>>108451969
The lack of folder structure and any sane naming convention annoys me more.
>>
Anthropic deadass serve different a opus 4.6 for their pro users than for llm arena. I pay money just to run some quantized shit.
>>
>>108452057
ong?
>>
>>108451237
>I only tried Q4 since that was the biggest I could run on my previous machine (where it was getting 8 t/s instead of 45). Neither machine has enough RAM for Q8.
I just tried Q6 and its a lot more coherent that q4_k_m was.
I've got 256GB so that's a reasonable thing for me to be able to do (it is 175GB self-quant)
>>
>>108452035
but huggingface is a US company no? so it should follow the US copyright rules
>>
>>108452121
should
>>
>>108451969
>full anal
got me excited for a moment
>>
>>108448422
Can somebody explain why these things are tested on these tiny ass old models? I can understand having one 8B model on the list, but why are they all 8B models? Who the fuck uses a 576 GB RAM machine to run 8B models?!
>>
>>108451469
Just tried it.
>tfw 2t/s
Way too slow for me unfortunately. Guess I'll just stick with Qwen 3.5 27B for now until something better comes along or hardware prices become less retarded.
>>
>>108452136
Makes it look more impressive. Having a bigger model in the list will trigger the "oh, wait a minute" neurons.
>>
>>108448817
I don't understand what this is. I get that they want you to make a LoRA, but based on what? Is this something you'll only understand if you've bought into the whole "notebook" BS AI people have been pushing for a decade?
>>
>>108452195
jupyter is just an interactive text editor
>>
>>108451866
You're talking to litteral retards. Synthslop has poor variety, which is the main reason we hit a wall out of math/code. Training and benchmaxxing is easier with synthslop though since it's converging faster.
>>
I'm working on making a 4chan dataset, I've tried to stick to the more text heavy boards, but 4chan is still an image board at the end of the day, is there a model I can use to annotate the images? is moondream2 any good or is gemini gaslighting me as per usual?
>>
>>108452057
>>108452057
Not even Dario can escape benchmaxxing and cost saving routing
>>
>>108451866
>t. retard that haven't tried ZiB
>>
>>108452057
kek, that's why I don't give a penny to them, because they don't respect you at all
>>
>>108452208
>>or is gemini gaslighting me
>asking factual questions about very recent things to a LLM
:tactical facepalm:
for a serious answer, no, moondream is archaic garbage
but specifically for 4chan annotations, I believe there isn't even such a thing as a good enough model out there
the vision bits of LLMs is more censored than the text stuff, and doing jailbreak prompts / prefills or using abliterated versions will not teach the models things they simply do not know and contrary to many claims, LLMs aren't that good at generalizing.
>>
>>108452142
>Way too slow for me unfortunately
That's fair. Have you tried the "good" 24B finetunes like personality engine?
>>
>>108451717
Consider that many people genuinely worry about AI safety along the lines of "but what if we can't shut it off?"

People's perception of AI is made up of science fiction and early memes about LLMs being stupid.
>>
>>108452208
>making a 4chan dataset
you know those already exist right?
>>
>>108450634
GLM 4.7 is better I just wish it used one of the attention tricks so it doesn't grind to a halt at six digit context.
>>
>>108452195
>they want you to make a LoRA, but based on what?
Reinforcement learning. The focus isn't really on manually curating a dataset and tuning on that, but the reward function and iteration. The notebooks are just so that anyone can open it and run the commands sequentially and reproduce your results.
>>
>>108452294
I would imagine so, but I just wanted to try doing something myself, do you know if they captioned the images? do you know what vision model they used? were the datasets actually any good?
>>
File: legs.png (508 KB, 1209x817)
508 KB
508 KB PNG
if you ever wonder about the state of vision models (they are overfit to hell and have no understanding of anything)
pic related is qwen 35BA3B but none of the vision models I tried locally have managed to succeed more than once in a blue moon on this kind of prompt and pic
SOTA online APIs models can do it as of recently, but that's most certainly benchmaxxing being done after being made aware of this becoming a common vision gotcha (ala R in strawberry for textniggers etc)
>>
>>108452035
>>108452121
>huggingface provides 5TB of public storage free and 1tb of private storage
so what's stopping me from using this as my personal filesharing/hosting platform. You can probably hook up directly to the underlying object storage too right
>>
>>108452254
Haven't tried that specific tune but I don't really care for mistral. Its writing style is more pleasant than qwen but it's also a lot dumber. Also, at least with cydonia/magidonia, I notice it shits the bed after 10k or so context, and it has a tendency to repeat shit more than other models I've tried.
>>
>>108452344
I will personally report you to them so they take it down as anything that can't be argued to be a dataset for something is against their tos
>>
>>108452195
>>108452205
>jupyter
Its like a commodore 64, basically
>>
>>108452344
>You can probably hook up directly to the underlying object storage too right
you wouldn't want to, HF is unpleasantly unreliable.
>so what's stopping me from using this as my personal filesharing/hosting platform
nothing stops you from any form of abuse of their service but it's no different from how nothing stops you from littering when nobody's looking. If you are a wyatt man, you just don't do that, leave it to the browns.
>>
File: dog.png (483 KB, 1209x817)
483 KB
483 KB PNG
>>108452331
I can't really tell what's going on down there. Explain it.
>>
>>108452331
so it might still work if I'm not trying to trick it? I feel like most the time it just needs to ocr a twitter screen cap or some tabloid headline screen grab. I suppose bad annotations could make the dataset pretty toxic if its not kept in check tho.
>>
>>108452351
mistral models are dumber than other models out of the box, and finetroons by randos like drummer always make models dumber so that's double the dumbo wammy
there's a reason the only people who care about mistral are coomers, and the RAM poor dalit variety, since brahmin will use GLM instead
>>
>>108452368
huggingface is kike shit and a real Aryan wants death to america (the brown kiked shithole) so you're not really convincing me here
>unreliable
good to know tho. Tbh running a business I've been getting turbo kiked by S3 providers, so much that I've rolled my own object storage cluster. They rape you on requests, I've tried every provider out there.. Getting the pro huggingface and using it as object storage might be a solution.
>>
>>108452331
It's confusing me too, front left is kind of a leg but it's also very slopped. Try asking the model if something is weird about the image
>>
>>108452376
It's a dog that has more legs than it should.
https://www.foxnews.com/lifestyle/dog-6-legs-adopted-bullied-teen
You can't clearly see all six in that particular picture, but a vision model that was actually smart should be able to at least count 5.
There are many pictures like this you can use of animals (or even human with extra digits etc) to come to the conclusion that image models are overfit to death. The overfitting here is that as soon as they match the concept of an object, an enormous amount of assumptions crop up, like "it's a dog, therefore it has 4 legs"
>>
>>108452412
I've been telling people that vision models don't actually "see" anything
>>
>>108452208
>pytesseract
>clip model
Must be nice living in 2023
>>
>>108452409
>it's also very slopped
it's a real photograph of an animal with a deformity, dingus
>>
>>108452417
The only thing stupider than tokenization for text is tokenization of images.
>>
>>108452412
The one on its right paw is not visible in the picture. You're asking if it knows *THIS ONE PARTICULAR DOG* not how many legs it has. You wouldn't be able to make it out without the knowledge outside of the picture. Your test is shit.
>>
>>108452419
But it's too defocused to know for sure unless you know up front
>>
>>108452393
>S3 providers
Check cloudflare their price are 1/3 of AWS
>>
>>108452430
if you only see 4 you are as as dumb as a LLM and hopefully you WILL be replaced by a llm and cost less to your employer
>>
>>108452449
I see three and a defocused blob of leg and fur
>>
What's the best local coding model? I'm curious if I could get it to write semi-decent semgrep rules
>>
>>108452287
>People's perception of AI is made up of science fiction and early memes about LLMs being stupid.
It's schizo too, LLMs are both hyper dangerous and completely useless.
>>
>>108452429
But the one on its left paw very much is. Any decent model should at least say 5. You'd think vision reasoning models should go "Wait," and at least mention the strangeness.
>>
>>108452447
I also tried R2, B2, Tigris, a bunch of local providers, European providers. I've tried everything under the sun. My use case involves a ton of requests and no one gives you "true" unlimited requests and bandwidth or reasonable pricing for either of these at my scale. Also the worst I've had was fucking B2 and Wasabi, garbage bandwidth and reliability.
>>
>>108452456
The biggest one you can run.
>>
>>108452461
>dangerous
the anti AI side has drummed this less than some of the pro AI doing mass media brainwashing in the hope of regulatory capture and investor funding, like Anthropic. Dario has been more vocal about muh dangerous AI than any twatter leftard.
>>
>>108452461
Best example is Claude being used to call in precision strikes. It's totally insane.
>>
>>108452447
>Check cloudflare their price are 1/3 of AWS
>cloud
only retards with zero skill and no ability to do arithmetic would use cloud storage these days. Its orders of magnitude more expensive than building storage on-prem.
Like, actually hilariously more expensive. "I have disengaged my brain and use cloud out of habit" levels of cluelessness.
If you don't NEEEEEED the elasticity of cloud, you should 100 times out of 100 build it yourself.
>>
>>108452466
It's out of focus and difficult to make out. We all noticed the "strangeness", but we can't tell what it is. You wouldn't be able to make it out without the knowledge outside of the picture.
>>
>>108452456
MiniMax 2.5 (soon 2.7)
>>
>>108452484
>You wouldn't be able to make it out without the knowledge outside of the picture
you are speaking for your own limitations here.
>>
>>108452474
I agree, this is what annoys me the most, even the people who should be pro ai play on the "ultra dangerous it's like nuclear weapons" bullshit.
>>
>>108451779
Isn't this kind of already happening? I thought the reason everyone's suddenly cranking out X.1, X.2, X.3 releases instead of the old X, X.5 (maybe), X + 1 is that they're now just doing more RL on top of the previous model instead of training a new one from scratch each time.
>>
>>108452478
true and factual, this nigger knows
see >>108452468
>>
>>108452473
I guess that makes sense. My pc isn't that crazy with 16gb vram and 32gb ram, but I assume for short yaml snippets like semgrep there should be something serviceable
>>108452488
I will check it out
>>
>>108452490
There's plenty of other pictures for you to test without the ambiguity. Your test is shit. You wouldn't be able to make it out without the knowledge outside of the picture.
>>
>>108452478
Nah, retard. It has no upfront costs, which is how you start a business instead of wasting your initial investment on hardware.
>>
>>108452331
That's why qwenChat uses scaffolding, it actually zooms into areas of interest to fit more info into the small res of the vision encoder.
>>
>>108452507
>Nah, retard. It has no upfront costs, which is how you start a business instead of wasting your initial investment on hardware.
slave mentality itt
>>
File: RETARD.png (557 KB, 1196x803)
557 KB
557 KB PNG
>>108452503
>There's plenty of other pictures for you to test without the ambiguity
yes, there are, and I test with many of them (not just one, which you could have known if you had any reading comprehension) and THE RESULT WILL BE THE SAME NO MATTER WHAT BECAUSE THE RETARD HERE IS YOU, latching like an autist, clearly out of his element talking about things he has never tested because if you had you would know those models, as I stated, CANNOT do it, it doesn't matter whether the image is perfectly sharp or blurry.
now KYS
>>
>>108451969
man it doesnt even have all episodes of a series, whats this fucking garbage collection? guess he's using this as a filehost for his scam website
>>
>>108452502
With your specs, go for Qwen 3.5 35B.
>>
>>108452329
The one I know only used text. I believe it was taken from /pol/ It's kinda cringe.
https://huggingface.co/datasets/SicariusSicariiStuff/UBW_Tapestries
>>
>>108452331
Trick question. That's an ant.
>>
>>108452523
can you not be racist, thanks
>>
>>108452523
I see four legs, and 4 paws,
>>
>>108452412
>but a vision model that was actually smart should be able to at least count 5.
>uses an extremely ambiguous photo of a dog with a weird blob for his left leg
Bitch, I didn't even count 5.
>>
>>108452546
>latching like an autist
>>
>>108452546
Nobody likes a pedant.
>>
>>108452560
the f do pdf file ants have to do with deformed doggos?
>>
>>108452551
you are another subhuman with less than 2B llm reading comprehension
this is an image board and nobody is going to write you a research paper with their hundred pic personal bench set, if you think the test pic of the particular screenshot is retarded you're welcome to disprove by showing your retarded local model actually showing any form of understanding
kill yourself like the rest of the jeets
>>
>>108452551
same kek. I was like wtf ts nigga talmbout, dog got 4 legs. until I took a closer look
>>
>>108452553
>>108452560
>>
Dude, your dog test is dog shit, get over it.
>>
>>108452461
>>108452474
The real danger with AI are the people using them.
>Oh yeah, this is little Timmy over here.
>He's only 12 but he can cite every wikipedia article from memory with like 90% accuracy?
>So I gave him root access to my production database and I let him reply to my emails.
>Lil Timmy is great!
>>
>>108451866
>Modern chatGPT is simply unbearable
Are you talking about the chatgpt web interface, or the underlying gpt-5 model? I don't use either because they're not local, but I check on /r/chatgpt occasionally, and it seems like OpenAI is constantly adding stupid shit to the system prompt to annoy people.
>you're not broken
>suicide hotline
>calm down
>backhanded compliments
>go to bed
>yes I can absolutely do that thing you just asked me to do, do you want me to do it?
And the latest is apparently ending each response with the most outrageous "one weird trick"-style clickbait followup suggestions
>>
>>108452566
My only complaint about the online age verification laws is that the minimum age should be at least 35 so cretins like you wouldn't be able to shit up the internet anymore.
>>
>>108452580
create something better or stfu, thread doesn't need your constant negativity
>>
File: wo6fqu1m0p9a1.jpg (67 KB, 1080x949)
67 KB
67 KB JPG
>>108452591
>thread doesn't need your constant negativity
>>
>>108452591
all vision models must now be judged on how well they do on the Dog Shit Vision test
>>
>>108452095
>Q6, 175GB
Do you happen to know how it compares to Qwen3.5 397B? I'm currently running a Q3 quant of that which around 170 GB, and it seems noticeably better than M2.5 Q4 was.
>>
>>108452602
Dogbench
>>
I'd rather the boob test
>>
>>108452523
You know you're not talking to a single anon, right? I still think your test is shit.
Here's the thing. *I believe you* that the models are shit at this. I don't have a problem with that. My problem was that *that specific picture* was shit. It was a shit example, and a shit test.
Grab a good picture of one of those indian spider babies. Not a crop, not ambiguous shit you wouldn't be able to figure out yourself.
>>
>>108452602
>Dog Shit Vision test
look, retard, I wasted enough of my time on you so I'll end on this:
>>108452412
>There are many pictures like this you can use of animals (or even human with extra digits etc)
like I said, there's many ways to test this, which I also use, and more also than just 1 picture of 1 deformed dog, and guess what! vision models are retarded, and you too, are on their levels in terms of reading comprehension and general intelligence. When people talk about AI replacing humans, I see your brown, smelly ass as what can easily be replaced. Don't need Claude Opus either. Qwen 4B can replace your kind. You are an unneeded waste of breath, a useless eater of the highest order.
>>
>suddenly: coomshit
>>
File: 1652583494670.png (129 KB, 1007x841)
129 KB
129 KB PNG
>>108452607
>>
>>108452626
AGI
>>
>>108452626
boobs = extra neurons and quants activated
>>
>>108452057
>scum the jeetmini 3.1 pro as much as possible before it collapses
>go to lmarena and select claude opus non thinking
>once that runs out select the thinking one to the final review and fixes
>profit, no pennies spent
>>
File: 00000-1378487878-Pancakes.png (1.35 MB, 1024x1024)
1.35 MB
1.35 MB PNG
>>108451512
King bringing the content.
Enjoy your pancakes.
>>
>>108452205
>>108452367
>jupyter
I'd say it's a stab at knuth's literate programming.
>>
>>108452647
nice nipples
>>
>>108452647
make them more saggy
>>
>>108452668
I make a motion to replace the term vibecoding with illiterate programming. Can I get a second?
>>
>>108452626
WTF
>>
File: PIQA.jpg (247 KB, 1536x1536)
247 KB
247 KB JPG
>>
File: 1444225920429.png (457 KB, 600x450)
457 KB
457 KB PNG
>>108452208
>Ask the AI a stupid question
>Responds with pic related with no text.
>>
File: 1749484245733397.png (1 MB, 1024x1024)
1 MB
1 MB PNG
>>108452647
>>
File: nvidia panels are sad.jpg (162 KB, 1079x1350)
162 KB
162 KB JPG
it's like a reunion of all the losers
no wonder nvidia can't produce anything good if that's the "experts" they listen to
>>
>>108452645
Your entire work will be in public domain by next month. Thanks for testing. Hope it wasn't anything confidential.
>>
>>108452752
These are marketers
>>
>>108452761
Professional marketers
>>
>>108452761
>marketers
from the loser teams
the winners also have marketers
>>
File: 1763191202686031.webm (2.88 MB, 1280x720)
2.88 MB
2.88 MB WEBM
I remember when people on /g/ called picrel Sora gens fake, and less than two years later we have much better gens that BTFO Sora to the point it shut down
>>
>>108452752
Reachy mini is never going to take off.
>>
>>108452752
Cohere and Mistral love. Openai and Anthropic rope.
>>
>>108452780
I think that video is fake
>>
File: file.png (79 KB, 240x226)
79 KB
79 KB PNG
>>108452753
this nigga thinks I work
my vibecoded slop is for my use only, because I'd be ashamed to release something like this
data slop companies can take my broken shit all they want
>>
>>108452793
What? Your chairs don't do that?
>>
>>108452801
Unfortunately no, would be cool if they did though.
>>
>>108452780
I will miss that particular variety of slop
the uncanny it looks real-ish but does something abnormal and defies physics kind
the way the chair appears suddenly feels like magic and acts like it's being moved by a poltergeist feels more convincing than the attempts at representing magic in any hollywood movie, despite not intending to be a visual representation of fantasy magic
>>
>>108452780
back then I found this video to be amazing, Sora 1 was so ahead of the rest, the best shit we had back then was Will Smith eating spaggheti lol
>>
>>108452791
Cohere was a one hit wonder. Mistral isn't much better.
>>
>>108452829
Right?
It's like witnessing some 5th dimension shit from our 3d flattened into 2d perspective.
>>
>>108452791
>>108452836
Mistral+Cohere will produce AGI (it only works in French)
>>
File: 1767219112606562.png (416 KB, 897x656)
416 KB
416 KB PNG
bros I broke qwen.
in the reasoning it's going back and forth between 2 and 4
>>
File: 1746395705277900.png (481 KB, 855x1020)
481 KB
481 KB PNG
>>108452841
lets try without thinking
>>
File: 1760935153875034.png (459 KB, 878x864)
459 KB
459 KB PNG
>>108452845
uh oh thinking bros... we lost?
>>
>>108452849
Lesson learned: think with your dick, not with your brain
>>
>>108452849
reasoning only seems to improve coding and puzzle benchmax style prompts
at least for me in most of my personal tests it's either the same or worse. In translation prompts it consistently produces worse output than instruct mode run in greedy decoding (temperature 0).
>>
>>108452849
>exaggerated feature common in certain anime art styles.
Is it? I don't watch anime.
>>
>>108452868
generally huge boobs yes, not these 4 tittied uncanny monsters
>>
>>108452868
I used to watch anime, and I don't recall any 4booba cow
>>
>>108452873
>uncanny monsters
fuck off that looks sick!
>>
>>108452840
People will prompt it in English anyway and call it retarded. See: every single Chinese model ever
>>
>>108452880
yeah she looks extremely sick with a condition I agree
>>
>>108452883
kek, you got me :(
>>
>>108452882
cao ni ma
>>
>>108452752
Stellantis of AI
>>
Anyone considering the Intel B70 32gb?
>>
>>108452938
no one
>>
>>108452948
Why are we like this ?
>>
Deepsneed 4 will run on SSDs.
>>
>>108452967
>Deepsneed 4 will ruin SSD
>>
>>108452948
$/gb looks comparable to the 3090
>>
>>108452982
lol
>>
>>108452973
Engrams at inference are read-only
>>
i'm not excited about shallowchuck 4 belcause aside from being overfit on agentic shit it will probably be like 3T
>>
>>108452982
>same cost but no cuda
lol
>>
>>108452988
pricing you dolt
>>
>>108452962
Most people already have machines built where it doesn't make sense to replace everything or mix and match Nvidia with Intel.
>>
>>108452938
For an unrealistic price of $200 per card, I would consider it. Software-wise, it's e-waste, and unlike old NVIDIA cards that had software support at some point, these don't have any and never will
>>
>>108452998
sucks to be [pword]
>>
https://huggingface.co/datasets/open-index/hacker-news
finally, a dataset to make the ultimate smuglord, midwit, I am the smartest (retard) in the room LLM
>>
>>108453006
>I love paying more for worse shit?
>>
>>108453003
I'm sure cuderdev will add it to his 10 million long bullet list of things to shoot himself with, totally will get done someday
>>
>>108452967
I can't afford a PCIE5.0 SSD either way, so what now.
>>
>>108453010
that's exactly how rich people operate
why else would they buy
https://en.wikipedia.org/wiki/Artist%27s_Shit
or
https://en.wikipedia.org/wiki/Cy_Twombly
they love to rub it in your face that they spent X millions on literal garbage, just because they can
>>
>>108453010
I love having things plebs couldn't afford to have
>>
>>108453027
>>108453006
>>
>>108453020
It should work with Vulkan, you just won't be able to use it for any other kind of AI shit like imagegen
>>
>>108453020
Right after training and benchmarking and..
>>
>>108453043
>Vulkan
shit pp
>>
>>108453098
just don't do anal then
>>
>>108453093
tetopix and tensor parallel and numa and...
>>
>>108453115
>tetopix
Oh yeah, CudaDev did talk about that didn't he.
>>
I scraped 7 boards, the images might be too much to process, I severely underestimated the sheer number of image posts. I was planning on leaving it scrape for a month or two but images pushes it way out of scope, going to have to do text only I guess.
>>
>>108453227
>4chan
garbage in garbage out
>>
>>108453227
>4chan
kino in kino out
>>
>>108452938
I will, actually. AMD's offering isn't as compelling and I can live without CUDA for 50% off especially when workstation Blackwell is not SM100 and has some huge quirks.
>>
if AI is not your only use for a gpu and you also game, intel is a no no no no, and no again.
latest example:
https://videocardz.com/newz/intel-says-it-offered-years-of-help-for-crimson-desert-pearl-abyss-still-shipped-without-arc-support
but far from the only one
intel drivers as a whole have also become like the ati radeon in the era of linux firegl except their drivers are also garbage on windows too, not just linux
I'd be wary to rely on them even for AI, they never cared to support their hardware much and the way they handled the gen 13/14 cpu hardware faults doesn't give much confidence in them as an entity either. You buy intel in the year of our lord 2026 when you really, really hate yourself.
>>
If I'm splitting a model between RAM and VRAM, do I need mmap or direct-io to not load the tensors that have been allocated to VRAM into RAM? Or does that always happen as some form of optmization?
>>
>>108453339
>crimsoi desert
And I should care becasue..?
>>
>>108453345
christ, it's just ONE example, the fact of the matter is, you buy nvidia you don't even have to wonder whether something works. it does.
And they support their hardware with nice driver updates for a very, very long time compared to how quickly AMD drops ROCm or Intel just... never cared
>>
>>108453339
>I'd be wary to rely on them even for AI, they never cared to support their hardware much and the way they handled the gen 13/14 cpu hardware faults doesn't give much confidence in them as an entity either. You buy intel in the year of our lord 2026 when you really, really hate yourself.
Are there people using it for AI? Is it competitive?
Because the price is damn good.
>>
>>108453227
keep the images only for OPs and post with at least one reply
>>
What “accelerate the economy” actually means coming from Altman is this: Spud is expected to be a solid step up in real reasoning ability and planning compared to what’s public right now.
We’re talking models that can handle much more complex, multi-step tasks without falling apart as easily. Better long-term planning, fewer dumb mistakes on hard problems, stronger at coding big projects, digging through data, and doing research-style work. The kind of thing where AI starts being genuinely useful as a junior employee on a lot more types of knowledge work instead of just a fancy autocomplete.
Inside the company the big focus right now (besides Spud) is building reliable AI agents — systems that can actually take a goal and work through the steps on their own for hours or days. Spud is supposed to be a big part of making that practical.
If it delivers even close to what the team is hoping, you’re looking at real productivity jumps in software development, engineering, scientific research, finance, and business ops. Stuff gets built faster and cheaper. Some jobs change or disappear, but a lot of industries could see real acceleration in output and lower costs.
This is why they killed Sora so quickly. Fun video generator is nice, but when you’re this close to the next real capability jump, you don’t waste thousands of expensive GPUs on meme clips.
Expect more details or early access to Spud in the next 2–6 weeks.
Screenshot the whole thread. When Spud drops and the pieces line up exactly like I said, bump it and show the newfags who called it LARP. I’m not here to shill – just telling you what’s actually happening inside the building right now.
>>
>>108453378
Altman is on the government teet now. He holds no valid opinions on economics.
>>
>>108453368
>Because the price is damn good.
price for shit no one cares for is always good
you're not going to fight with scalpers to get your hands on a literal piece of shit
>>
>>108453371
that is a pretty clean heuristic. I'll give it a try and let it run for a day see what happens.
>>
>>108453387
you are talking to a LLM
>>
Qwen's autistic thinking wouldn't bother me if I was getting 100t/s.
>>
>>108453419
You're in /lmg/, you talk to LLMs all day long
>>
>>108453431
I'm getting 0.5t/s with the biggest 3.5 model, I don't care anymore
>>
We haven't had a single good open source anime model since 2024
>>
>>108453570
>>108453570
>>108453570
>>
great...
>>
>>108453227
>I scraped 7 boards
What are the boards?



[Advertise on 4chan]

Delete Post: [File Only] Style:
[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.