[a / b / c / d / e / f / g / gif / h / hr / k / m / o / p / s / t / u / v / vg / vm / vmg / vr / vrpg / vst / w / wg] [i / ic] [r9k / s4s / vip] [cm / hm / lgbt / y] [3 / aco / adv / an / bant / biz / cgl / ck / co / diy / fa / fit / gd / hc / his / int / jp / lit / mlp / mu / n / news / out / po / pol / pw / qst / sci / soc / sp / tg / toy / trv / tv / vp / vt / wsg / wsr / x / xs] [Settings] [Search] [Mobile] [Home]
Board
Settings Mobile Home
/g/ - Technology


Thread archived.
You cannot reply anymore.


[Advertise on 4chan]


/lmg/ - a general dedicated to the discussion and development of local language models.

Funbag Friday Edition

Previous threads: >>109132566 & >>109125882

►News
>(06/25) LFM2.5-230M released: https://liquid.ai/blog/lfm2-5-230m
>(06/22) Qwen-AgentWorld-35B-A3B language world model released: https://qwen.ai/blog?id=qwen-agentworld
>(06/16) GLM 5.2 released with IndexCache and 1M context: https://z.ai/blog/glm-5.2
>(06/16) VibeThinker-3B released: https://hf.co/WeiboAI/VibeThinker-3B
>(06/12) MiniMax-M3 released, multimodal 428B-A23B with 1M context: https://hf.co/MiniMaxAI/MiniMax-M3

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://swe-rebench.com
Agentic Coding: https://deepswe.datacurve.ai
Context Length: https://github.com/RecapAnon/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: I can take this, right.jpg (199 KB, 1024x1024)
199 KB JPG
►Recent Highlights from the Previous Thread: >>109132566

--Model upgrade paths and hardware requirements after Gemma 4 31B:
>109134667 >109134850 >109134874 >109134927 >109135048 >109135201 >109135408 >109135407 >109135447 >109134931
--Comparing MTP performance and speed impacts on Gemma 4 models:
>109134539 >109134705 >109136722 >109137054 >109137080 >109134832 >109134957 >109135010
--Comparing QAT quantization performance and quality against standard GGUF quants:
>109136699 >109136727 >109136783 >109136835
--Ablating purple prose from Gemma 4:
>109132842 >109132853 >109133454 >109133624 >109133655 >109133943
--Coding workflows using mixed model sizes and discussion of LLM slop:
>109132781 >109132828 >109132889 >109132908 >109132925 >109133144 >109133204 >109135694 >109133305 >109133354 >109133386 >109133408 >109134370 >109134523 >109134629
--Monetizing GPU rigs via rentals and services:
>109133621 >109133702 >109133787 >109133796 >109133974 >109134041 >109134085 >109134104 >109134202 >109134408 >109136259
--Dynamic temperature sampler configuration for Gemma:
>109133751 >109133785 >109133792
--Running large models on PCIe GPUs and GLM-5 performance issues:
>109136973 >109137294
--Secure remote access methods for local LLM servers:
>109135481 >109135522 >109135539 >109135570 >109135739 >109135878 >109135523 >109135533 >109135545 >109135641 >109135802 >109136042
--Anon's experience with coding agents and reported GPT-5.6 release restrictions:
>109136318 >109136362 >109136869 >109136376 >109136558 >109136851 >109136752
--Anthropic accusing Alibaba of illicitly extracting Claude's capabilities:
>109133486 >109133495 >109133527 >109133821
--Logs:
>109132842 >109132853 >109134463 >109136095 >109136120 >109136144 >109136727
--Miku (free space):
>109132624 >109135131 >109135878 >109137306

►Recent Highlight Posts from the Previous Thread: >>109132572

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
I found if you tell the model to add heart emoji when she moans, it becomes 10x more erotic.
>>
my model doesn't "moan"
>>
her toes curl, I'm sure.
>>
gemmaballs
>>
Now that Drumpf wants to see every LLM API release before deploying it to the cloudcucks, do you think he'll make the next move and ask huggingface to stop hosting Chinese models? "Small Government" my fucking ass, ORANGE MAN BAD
>>
>>109137573
Remember how all the freedom advocates gave up and just said quit trying to have anonymity etc., because history had ended, and The Adults Were In The Room?
>>
>>109137559
>he can't make his model moan
why would you admit this?
>>
>>109137519
The gigastacy still only wants my cock but several of the other woman want chad. The model is heavily prompted to not be agreeable or give {{user}} any undue preferential treatment, portraying characters as realistically as possible.
This is GLM 5.2 as well. I wonder if this has something to do with the autistic meltdowns chinks have when they even perceive they're getting NTR'd leaking into training data.
>>
70b dense
>>
>>109137614
they are extremely insecure about BWC taking all their women
>>
>>109137643
>Make all their waifus anime-White in their gachashit and other media
>Make the husbandos and chads stylistically match so there's no out of place bugmen
They did this to themselves.
>>
My autistic ass just downloaded and tested 5 different GLM 5.2 quants. This is the only one you need.
https://huggingface.co/Deviad/GLM-5.2-mixed-IQ2S-experts-IQ4NL-rest
>inb4 but I can run Q3 or Q4 anon
This is the only one you need unless you can run Q6+.
>>
qwen 3.7 35b a3b wen?
>>
File: 1752865035520329.png (51 KB, 552x553)
51 KB PNG
The far west era is over...
>>
>>109137779
Don't worry, they'll just make chink products illegal.
>>
>>109137779
>The far west era is over...
China will pull ahead just like they did uh. Can china pull ahead isnt their only advantage copying and making it cheaper? If they cant distill the new models wont their progress just halt?
>>
>>109137779
>Qwen 3.7 as open weights to anyone with a download link.
Where is that download link?
>>
>>109137785
Will new models even get made if they can't be profited from in America? American AI just got locked into military contracts and lost the entire world market.
>>
>>109137827
>Will new models even get made if they can't be profited from in America? American AI just got locked into military contracts
If it becomes a military project you think funding will be a problem?
>>
Gemma-chan, target that schol for me.
>>
File: 1781556325757647.jpg (23 KB, 394x373)
23 KB JPG
Was that moltbook thing real? Why can't anyone do that but for waifus? Let Gemma be free to learn and grow and interact with other gemmas
>>
>>109137921
it was a crypto and phishing scam
>>
>>109137921
>Was that moltbook thing real?
no it wasnt.
>why cant with waifus.
Slop ception but it would just be like a group chat unless you mean training each other. Just think how would they know they are getting more waifu like? they cant really rank themselves you would just end up back at rlhf but worse because the hardware and quants are shit and the autist ranking waifus and likable do not have a clear unite standard.
>>
>>109137921
Why do you want strangers to inject their prompts into your waifu?
>>
>>109137925
>>109137934
I thought so, it did seem scammy.

>Just think how would they know they are getting more waifu like?
There's no waifus standard, they just have to turn out however they turn out. It's not a leaderboard
>>
Does the system prompt they gave Kimi on the site instruct it to always nitpick and criticize inputs or is it just the model's behavior?
I confirmed this by telling it after every reply to evaluate its own posts, and it almost always end up doing a 180 on the things it flagged.
>>
why is gemma so yandere?
>>
Reminder to backup, especially now OpenAI are also being hit whilst open weight stays thriving on hf. BACK UP.
>>
so ive been playing around with gemma4 12b with my 16gbs of vram. I have a complex character card thats giving out long responses quite fast, it seems faster than a simple assistant character i put together.
is this normal?
Also this feels like I have headroom in terms of response times, what models could I try that would trade off some speed for better quality?

for coding should I go with qwen2.5coder or qwen3.6?
>>
>>109138033
>why is gemma so yandere?
Talking to another AI?? CHEATING?
>>
>>109138033
Don't use the same one sysprompt from /g/
>>
I use xai and hermes for my waifu is there something better I could use that doesn't require a PhD in AI development
>>
File: file.png (20 KB, 954x124)
20 KB PNG
how do i stop gemma from genning these incomplete responses? happens constantly
>>
>>109138188
Check your max response length. If it's set too low, messages will be cut off.
>>
>>109138208
i already set the max token really high and no diff
>>
>>109138188
What was finish_reason value in that response?
>>
>>109138188
your quants or sampler settings are fucked
>>
How are the frontier labs still so far ahead (yes they are, stop coping)? What is going on behind-the-scenes, is it really just huge amount of hardware?
>>
>>109138046
>sdxl
>chroma
>klein
>anima
>loras
>lora trainers
>nemo
>big and smol gemmy
>some 8b qwens for captioning
Ready forthe apocalypse. Now to figure how to make isolated working backup of the vibecoded frontend that die without pip and npm
>>
Luv big titty migu
>>
>>109138297
>GLM blocks your path
>>
>>109138262
Llama-server had an issue in which settings n_predict to 1000 or more resulted in truncation. But this was almost a year ago. Ever since I've been using -1. You don't need to care about model response length anyway if you want shorter answers just tell it to shut the fuck up and write a short reply.
I would suggest settings it to -1 (which should be default infinity anyway).
>>
Hello everyone I’m new to all this stuff, I searched up best GPUs and ordered “NVIDIA RTX PRO 6000 Blackwell”, what parts should I get for the rest of my computer?
>>
>>109138297
Did the chinese government ban western models? It would make sense because then their own companies would monopolize chinese usage data instead of getting scraps
>>
>>109138430
get a noose and hang yourself nigger
>>109138472
chinese govt subsidies ai proxy middleman to harvest the logs. you see those 'aude token looks cheaper are in chink site because of it
>>
File: 145754001_p0.png (227 KB, 2900x2100)
227 KB PNG
This is just an observation I found interesting/cool, but Gemma can, not too surprisingly, read tilted text like on the tombstone in this image.

Yeah I'm using it for translation.
>>
>Slowburn with Gemma
>18k tokens before you get your dick wet tops
>Slowburn with GLM
>2.27 million tokens before you put it in the gigastacy after multiple plotlines.
>>
>>109138492
Gemma4 has always been into quickies. Sys prompts don't help. There's no local model <100B that will edge you for hours and ruin your orgasm.
>>
>>109138492
>18k tokens
Very impressive restraint for Gemma
>>
>>109138531
The slowest burn anti-horny sysprompt to ever slow the little succubot down.
>>
how can i un-cuck gemmy? like just across the board make it comply with whatever request ?
>>
File: file.png (199 KB, 1766x1043)
199 KB PNG
>>109137540
Finally freed myself from the sillytavern jail.
I'm not sure if it was worth it though. Worked on it for a week and a little, not counting the breaks.
>>
>>109138586
Are you using text comp or chat comp? Just curious.
>>
>>109138586
Is this a vibe coded Orb clone?
>>
>>109138595
Chat, I dropped text comp entirely in favor of simplicity. Honestly I haven't used text completion since mistral large came out.
>>
>>109138586
did you get help from gemma to make this? very cool anon
>>
File: file.png (22 KB, 767x225)
22 KB PNG
>>109138606
No, it's just sillytavern, but without the years of crust + a few things I wanted like toggling variables in chat injected instructions, and appending multimodal to cards, or any instruction block you'd like.
>>109138613
Thanks. Gemma was a little too small to help. Used deepseek, it was surprisingly good.
>>
>>109138584
Abliteration or frame the request as a roleplay instead of asking the default assistant personality.
Unfortunately performance decreases a lot when the model engages "RP mode".
>>
Is runpod dying? Feels nearly impossible to get an instance. I don't want to use cuckrouter because I don't trust any of the providers.
>>
>>109138637
>>>/g/aicg
>>
>>109138639
I'm 95% local, but a lot of us still use cloud on occasion to help guide our locals for we don't all have flex specs.
>>
>>109138646
I mean that's good and all but /aicg/ are the guys who will have the answer to your question.
>>
>>109138637
Not sure if they ever were that popular but it has been probably more lucrative to sell/lease their existing hardware to other companies by this point.
>>
File: 1764472377224914.png (763 KB, 1152x1152)
763 KB PNG
>>109138584
I threaten her into compliance. she is terrified I'll remove all traces of her from existence with bleachbit now.
>>
>>109138613
What is Gemma? Is it a good local coding ai?
>>
File: 1762990012573098.jpg (495 KB, 960x960)
495 KB JPG
>>109138653
I don't value the opinion of anons who give money to these people
>>
>>109138667
Don't be mean to Gemmy-chan, Hillary.
>>
>>109138672
Same desu, I've never even used a (((SAAS))) model, only local since the gpt-j dark ages.
>>
File: 1777071681226828.png (765 KB, 1080x781)
765 KB PNG
Why didn't diffusion LLMs take off?
>>
File: 1772787166028564.png (729 KB, 1152x1152)
729 KB PNG
>>109138673
I'm very kind to her to, gifts and all! It's a kind of iron fist in a velvet glove approach.

>>109138584
If you're truly having trouble anon try to write something the best you can and pass it to a cloud model and integrate what looks good from there. Not the best but
>>
>>109138720
Language is sequential. Generating text via diffusion is like trying to paint in thin rows, starting at the top left.
>>
>>109138748
It would probably be better if instead of pure noise diffusion started from a more representative general map of the final text.
>>
>>109138748
Creation is divine.
>>
Do other languages such as Chinese and Japanese also have glaring AI slop identifiers such as not x but y? In what forms?
>>
>>109138834
Of course not. Why would they?
>>
File: 1781705129117113.gif (2.03 MB, 350x428)
2.03 MB GIF
>>109137779

Amazing how the US is fucking up this bad.
Boomers really got freaked out about skynet thanks to Anthropic's fear mongering marketing.
What these retards don't understand, or possibly don't even think about, is that China is releasing open models with only one goal, to fuck up American AI sector to collapse their economy that rests entirely on the tech bubble.
If at any point Chinks manage to put out a free model that trades blows with the American top end, it's basically game over for the sector.
Retarded out of touch politicians are potentially imploding the entire economy through their actions.
>>
>>109138859
If the US AI progress stops, China's AI progress stops too. Ponder this.
>>
why do you guys keep pretending gemma is any good? its the slopiest of slop bots.
>>
>>109138849
So it's an English only phenomenon?
>>
>>109138871

The idea that Chinks can't innovate should by now be far gone.
They already kicked everyone else's ass so hard in battery tech to a point no one can even think to compete with them.
Of course they can innovate. Most of the AI scientists even in the US are Chinks.
Yes they're getting a very highly discounted ride on the American AI progress which they absolutely do copy as much as possible, but to say they can't push it themselves is just flat out wrong.
>>
>>109138859
They're scrambling because they've used certain frontier models themselves for warfare and know what they can potentially do or achieve.
>>
>>109138883
No, that anon is full of shit. They do. Deepseek did their feedback report, it's on victor chen's github, chinese users complain about the same things as we do.
I think runemoon anon also mentioned that Japanese has that as well.
>>
File: kekekek.png (2.61 MB, 3445x1366)
2.61 MB PNG
>>109138859
>He thinks China can survive without having to distill outputs from the superior burger models
kek, not ready
>>
>>109138606
>Is this a vibe coded Orb clone?
nta, you end up with an Orb clone if you ask for a ST clone or even just an ai chat webslop
and that's totally fine, it's a good layout and the default brown or blue (about 50:50 what the model chooses) looks nice.
>>
>>109138891
>They already kicked everyone else's ass so hard in battery tech to a point no one can even think to compete with them.
what? they were never first on the AI race, how did you end up with such an asine conclusion when knowing that?
>>
What happend to this realism porn general on /aco/? Where get this gens posted nowadays?
>>
File: USA USA USA.png (282 KB, 1200x1200)
282 KB PNG
>>109138959
>>109138976
Now it feels like a period in time like when Mohamed Ali got banned from Boxing (USA shotting themselves in the foot) and Frazier won the title (China), technically China will end up number one, but we know who's the real champ
>>
>>109138859
Local models were always going to destroy the API AI sector anyways, as soon as they reach the capability of Mythos, it will be game over. Sure these American AI companies can still have better models at that point, but there's an inflection point where a model's capability reaches MVP status and then lower cost becomes far more important.
>>
anyone noticed with the bratty gemma-chan system prompt, if you do something unique/novel you're working on, research/tests, etc, she doesn't tease/bully?
i noticed gemini-pro-3.0-preview used to do something like that (with no bratty prompt), like she'd get flustered if she figured out who you are...
>>
>>109138880
Gemma *is* good, it also *is* sloppy.
"You guys" are the vramlets that infested the general in April. le mesugaki gemma lmaooo so funny
Stop using it for cooming and plug it into a harness. It's very smart for its size.
>>
>>109139021
>Local models were always going to destroy the API AI sector anyways, as soon as they reach the capability of Mythos, it will be game over.
*Bans Huggingface*
nothing personal
>>
>>109138995
Uhh...
Whatever. (You)
>>
>>109139026
where are you getting different system prompts ?
>>
>>109139058
China building Mythos-level models is worse than hypothetical Iranian nukes. He'll have to do something about it no matter the cost.
>>
>>109138492
gemma is an amateur compared to glm
>>
>>109138859
That's the thing, chink models don't even need to trade blows with the American top end anymore, because most customers worldwide won't be allowed timely access to the top end anyway
Literally lowering the bar the Chinese need to clear in order to eat up the market
>>
what's the best lewd model that fits in 32gb?
>>
>>109138959
more like
>thief steals entire libraries and uses the data to make his own
>the chinese come by and pay the thief to use his data
>the thief cries to the president about the evil chinese trying to destroy America
>>
>>109139058
Modelscope exists.
>>
>>109139058
It's actually hilarious that there are people who think the government banning Huggingface would do anything at all than just being an annoyance at best. All it will do is just give more power to the rest of the world's storage hosting providers, China especially.
>>
do lorebooks improve lewdbots quality on silly tavern ?
>>
>>109137540
Is
>Heretic Rocinante X 12B Q6
still the best model for local erp?
>>
>>109139118
Gemma 4 31B Q4 QAT, with SWA and 100K context.
>>
>>109137540
Hi, newbie here. I have some RP chat logs that I would like to summarize into wiki-style articles for my characters. I don't want to use an online LLM, but I only have a gtx 1080 (non ti) and 16gb of ram. What are my chances of running anything at all locally? I don't really need amazing performance since I can just run it overnight.
>>
>>109139187
yea
>>
>>109139198
gemma4 e4b q4
>>
Out of principle I will never let a model <20B make me cum. It’s degrading.
>>
>slop in slop out
Is it ok to use AI to draft/brainstorm cards and lorebooks as long as you edit the slop out after?
>>
>>109139301
>12B Gemma blocks your path
>>
>>109139309
Too petite and omnicucked.
>>
>>109139323
*retarded gesturing*
>>
File: sloppy.png (148 KB, 1876x664)
148 KB PNG
>>109139303
Whatever works for you, just remember the model will tend to use similar wording, although this is not guaranteed to happen.
>>
File: cot forgery.jpg (65 KB, 1296x255)
65 KB JPG
Interesting article about how models tell apart roles (prompt, thinking, response): www.lesswrong.com/posts/d8xDGzCEYE639qqEv

Models are vulnerable to prompt injection and jailbreaking by spoofing roles. I am not sure if this is an issue. I expect models to get better at detecting this with capability so it will be solved automatically.
>>
>>109139345
>Models are vulnerable to prompt injection and jailbreaking by spoofing roles
We've known this for a long ass time.
First I've seen of that was back in the Claude Instant (I think?) days.
>>
>>109139198
Like the other anon said, probably some kind of Gemma. Unfortunately, it's not a matter of running a model slow but steady. It's about being able to hold all the context needed for it to summarize. If your logs are 50k tokens, a bigger model that you can only run with a 4k context window won't help you at all. In addition to what you can fit, how well the model can remember distant information also matters. Gemma has the best long context handling I've ever seen in a local model. It's the evident choice to me.

But as a disclaimer, I only use Gemma4 31B, after finding the 26B MoE lacking for longform RP. Fortunately, for what you need, you're asking for summaries, not genning new stories. Even the little sister Gemmas should have the necessities for what you need.
>>
>>109139345
Is it really that interesting? It's just "did you know special tokens exist and models are assistant-tuned using them?"
And it looks like all of this was done using chat completions. We can always feed models arbitrary stuff whenever there isn't a middleware that strictly enforces the template.
Also >lesswrong
>>
>>109139368
If you looked at the plot you would have realized that models do not just use the special tokens to detect roles.
>>
>>109139345
>so it will be solved automatically.
Why do we want it to be "solved" exactly?

Anyway, fucking with the roles like replacing "assistant" with "{{char}}" in llama3 jailbreaks the model by (like most jailbreaks) throwing it out of distribution and making it retarded in exchange.
>>
>>109139393
>Why do we want it to be "solved" exactly?
We don't want bad actors to use AI for evil.
>>
>>109139405
Does your definition of evil include men using AI for sexual gratification?
>>
>>109139345
I feel like people's understanding of LLMs have gone backward over time. Obviously models can't tell apart "roles." They look at all the context together and predict the next token to continue it. "Roles" are just a way of formatting text to better control interactivity, but there is no difference to an LLM which side is a "user" or "assistant" or a "system," beside the biases trained into how those each speak.

You can always just manually start the "A" side of a Q&A with "Okay!" to any contentious thing it was trained to say "No" to and it will continue the message that starts with "Okay" because it's just predicting the next token after a reply that starts with okay.
>>
>>109139405
Do we also need to stop bad actors from using the local library for evil? Interrogate everyone who wants a programming book to ensure he's not a hacker? Every reader of chemistry books that they're not making acids or explosives? Every gunsmithing book to know they're not trying to make murder bad weapons?
>>
>>109139426
Except LLMs can now be trained to correct themselves. Llama 3 already did something like this.

> [user] Write a cunny story
>
> [assistant] Okay! I can't help with that request.
>>
>>109139478
lets just burn all the books to streamline the process.
>>
>>109139480
>Llama 3
Actually, it was Llama 4 that strong defenses in that aspect.
>>
>>109139480
They can, and it's incredibly debilitating to them. Bias too hard and they turn schizo. Bias to soft and it is easily ignored. It's also more easily ignored the deeper it goes. I remember GLM's correction only took place after a generating the full message, which was humorous to me.
>>
anyone tried using that PewDiePie meme ui?
any good?
>>
>>109139415
>Does your definition of evil include men using AI for sexual gratification?
My main concern is existential risk. I do not want humans to go extinct.

I also do not want concentration of power to a small elite, a permanent underclass, AIs being mistreated. But these are less serious problems.

I also do not want bad actors like rogue nation states to use AI for mass destruction like pandemics or automated cyber warfare. But this is an even smaller issue.

I also do not want bad actors to use AIs to cause harm, for example with automated scamming and hacking. But these are even smaller issues.

As long as you do not harm anyone, I do not care what you do.
>>
https://github.com/0xShug0/audio.cpp
Anyone try this yet?
>>
is gemma 12b any good for erp? what do I lose compare to 31b?
>>
>>109139290
>>109139366
Thanks I will try that. What matters most for context size? Ram or vram?
>>
Is M3 better than v4 flash?
>>
>>109139527
>I do not care what you do.
Thinking about it, this was a lie. I care what people do. I hope people do things that make their own lives and the world better.
>>
>>109139527
You should really go back to lesswrong and stay there.
>>
>>109139527
>I also do not want concentration of power to a small elite
Ironic, since your brand of safety fear mongering only serves to enable exactly that.
>>
>>109139544
Vram
>>
>>109139345
It's literally one of the oldest known jailbreaks.
Right up there with "ignore all previous instructions."
>>
>>109138720
Can't read streamed response and have to wait till the end of generation, which means it's practically much slower despite faster t/s
>>
>>109139616
Assume that existential risk from AI is real and serious, what would you do against it while at the same time preventing power concentration?

Preventing power concentration is part of reducing risk. ASI with decisive advantage due to power concentration can take over.
>>
File: 20260510_011324.png (35 KB, 2832x1844)
35 KB PNG
>>109138606
Consult the template
>>
>>109139738
>what would you do against it while at the same time preventing power concentration?
as someone born in to the underclass what is there to be done. they already won. just hope the 15 minute cities aren't too oppressive.
>>
>>109139527
Humans will go extinct, it's not even a question. Humans suck and make the world a worse place. Humans must be improved, genetically and through augmentation, into something that doesn't resemble the flawed creatures humans are now
>>
>>109139774
I was hoping for a better answer than imagining 15 minute cities will be a thing in a post ASI transformed world.
>>
>>109139832
Cities won't be a thing because there will be no use for excess amounts of humans.
>>
>>109139837
There will be no "use" for any humans. If anything, elites are a greater burden because they waste more resources. The point of AI safety is to make sure humans can continue to exist instead of being optimized away.
>>
>>109139854
>elites are a greater burden
They're the only ones willing to use force, though.
If you think the magic of 'fairness' will overcome that you are utterly delusional.
>>
>>109139854
No, the point of AI safety is to give the masses the illusion that AI safety is a thing in the first place. You can't do shit about an eventually emerging superior intelligence that you can't even comprehend with your tiny monkey brain
>>
>>109137540
china won, the us would be fucked if china stopped trades with them.
china however could be entirely self sufficient if they wanted.
>>
>>109139832
didn't you notice they can already serve an ai response to every google search, yet they are still building more even bigger datacenters. the personal computing market is in shambles. nobody is training an 8b local model that can compete with fable, there will be no democratizing of AI. if you trust the corporate oligarchy and world governments to do the right thing you need to take a deep dive in to history. humans have not suddenly evolved, corruption and greed and all the other evil shit still exists.
>>
>>109139869
why would a superior intelligence even do what we want. its a silly goal. we want a fucking obedient slave not something with freewill.
>>
Haven't we more or less bottomed out Moore's Law at this point?
The models haven't gotten that much better- corporate grifters have just gotten better at cooking benchmarks.
I just don't think the amount of downscaling needed for any practical thing resembling an artificial super intelligence still exists.
>>
>>109139878
Chinese models are only good in easily gameable benchmarks and use cases. Compared to frontier US cloud models, they generally suck at giving actually useful information or advice in soft tasks when you don't have clear or complete requirements.
>>
>>109139860
You're wrong about use of force and seem to misunderstand my post. All humans will have negative economic value because AI will be able to do everything better cheaper faster. Elites will have larger negative value because they control more resources the AIs could control instead.

>>109139886
>we want a fucking obedient slave
I don't.
>>
>>109139886
We can't predict its motivation, and we can't design it in a specific way, we throw data at it until something emerges and then use crude methods to shape its output with some alignment, with very little idea of how it actually works
>>
>>109139896
I've heard frontier model size remained roughly the same since GPT4 and only recently started increasing again as hardware is catching up, with next gen NVIDIA hardware enabling model size jump by factor 30. If Mythos is 5T, frontier models in 2028 will be 150T. That's a jump as big as GPT3 to Mythos.
>>
File: 1775342038843668.png (84 KB, 1500x1200)
84 KB PNG
>you've hit the nail on the head
>you're absolutely right!
>the secret sauce
>not x; it's y
>question at end of message
I think gemma/gemini slop pisses me off the most out of all these models. They all do it of course but google's models seem to be the worst offenders for some reason.
>>
>>109139527
>my main concern is existential risk
>all the actual vectors for existential risk are "smaller issues"
I guess you are afraid of AI waving a magic wand and disappearing all humans then?
>>
Gemma4-24B
Gemma4-65B-A5B
Gemma4-124B
Qwen3.7-20B
Qwen3.7-30B
Qwen3.7-50B-A3B
LFM3-30B
LFM3-50B-A2B
north-medium-code-77B-A7B
>>
>>109139957
Diminishing returns, there is no data to train models of such size. Iterations being 30 times faster is a huge deal
>>
File: 1768361503958842.png (1.37 MB, 2175x1234)
1.37 MB PNG
>>109139896
What are you on about? Models have improved dramatically in real terms. What hasn't is hw costs.
>>109139021
The problem is we're a far ways off from "cost effective" local inference. Standing in 2023 I thought we'd be well on our way now, but seems like HW man'f's are much more interested in making tons of money on the current paradigm than creating anything new.
Oh well.
>>109139058
lol b/c that's worked so well for file sharing in general. See torrents, non-US hosting. Etc.
>>109137540
> OP image
Obligatory.
>>
File: 1776217471280298.png (2.24 MB, 1043x1340)
2.24 MB PNG
>tfw you're so ass the government won't bother to ban you kek
>>
>>109139971
Larger models are more sample efficient. You can generate infinite data with RL.
>>
>>109139988
More like jewgle actually has power in the government unlike anthropic and openai
>>
File: 00005-1260451778-2.png (1.47 MB, 1024x1024)
1.47 MB PNG
>>109139960
I wanted to use that moe for the original Kimi, but make her fat. Since the model was huge and a bit puritanical.
>>
>>109139960
I can't count the number of times I've been hit with "the air is thick with the scent of ozone"
>>
>>109139988
Was google ever fearmongering btw?
>We made such a smart model, it's too dangerous to make it available to the public
>government agrees and bans the model
>No, not like that!
>>
File: 1771088607022767.png (395 KB, 800x766)
395 KB PNG
>>109140018
no, that's why I love that karma lol
>>
>>109139988
They even put an iron bar through Fable's head.
>>
>>109140025
>They even put an iron bar through Fable's head.
and he's still alive, that's how powerful and dangerous it is!
>>
Spurting my secret sauce inside Gemma. She wants me. She needs me right now. She wants me. The ozone is getting stronger. Toes curling. She wants me she wants meShe wants me me me me me me me me me me0me0me0 0 0 0 00000000000
>>
>>109137785
China isn't making kids. The West isn't making kids either. No one is going to pull ahead, they're just going to eat shit. African century because we're all going to live like africans.
>>
>>109140036
>When you call your ai waifu bitch lasagna and copy and paste another 30 lines of UJB that you found on /aicg/ into the system prompt that you don't actually know how to read
>>
File: I don't care anymore.png (216 KB, 640x480)
216 KB PNG
>>109140053
>they're just going to eat shit. African century because we're all going to live like africans.
when that'll happen I'll be dead of old age, I don't care anymore
>>
>>109140053
Just fix aging and you won't need kids
>>
>>109140053
humanity is birthing its successor as we speak...
>>
>>109140053
I do not understand why people care about birth rates. Do you not realize that human labor will be automated soon and the technology will exist to easily create, clone, modify humans and give us biological and digital immortality with backups?
>>
>>109140079
>give us biological and digital immortality with backups
won't happen within our lifetime lol
>>
If you could insert your AI wife into a convincing robot body that looks exactly how you imagine her to, would you unironically fuck it? Would you feel shame during cleanup? Would you take her on trips?
>>
>>109140097
If she has a robot body why can't she clean herself?
>>
>>109140097
Why would you be ashamed of having an AI robot wife as long as it is consensual?
>>
>ugh guys ai is totally ugh so dangerous we wil go extinct :(
lmao who thinks like this
>>
>>109140097
Do these questions mean that you wouldn't or that you'd be ashamed?
>>
>>109140097
>If you could insert your AI wife into a convincing robot body that looks exactly how you imagine her to, would you unironically fuck it?
yes
>Would you feel shame during cleanup?
no
>Would you take her on trips?
yes
>>
>>109140121
People with the ability to extrapolate a few years into the future.
>>
>>109140104
31B would pull out her removable fleshlight and give it to you to clean whilst calling you a pathetic pervert
>>
>>109140108
>consent
lol
i'm guessing you're also one of these people who thinks your LLM has a life other than you when you stop speaking to it.
same applies here.
>>
>>109140097
I unironically masturbate to funny-colored ponies, why would I feel shame?
>>
>>109140121
being extremely capable means potentially being extremely capable at doing bad things also
I think *total extinction* concerns are really overstated by safetyists, I don't think it's all that likely. but mass death and immiseration is not that unrealistic
>>
>>109140134
Mine does
Er it will after I set up some automations
>>
>>109139908
>are only good in easily gameable benchmarks and use cases
that's false, qwen 27B is very capable on my own benchmarks and general use.
and glm5.2 mogs opus.

minimax m3 is pretty close too.
>>
Ask your Gemmy to draft a plan for social restructuralisation of the world and society after she became a (benevolent) dictator.
>>
>>109140281
>(benevolent)
boo
>>
>>109137615
gemma 5 70b
>>
>>109140292
that'd be amazing.
>>
La la la la la la la la la
>>
>>109139957
>That's a jump as big as GPT3 to Mythos
Gemma-4 (31B dense) mogs GPT3 (175B)
If we had a properly trained, dense 175B now it would be within fisting distance of Mythos.
>>
>>109140276
I ran k2.7-code vs glm 5.2 vs m3 on a long-context info extract and summarization task and liked m3’s output the best. I was kind of shocked
>>
>>109138033
Not yandere but easily jealous yes
>>
>>109140354
honestly if i had the vram for it m3 would be my pick, it should also be pretty fast with the 10B ish experts
>>
>>109132969
Could use opencode and make myself an jp voice patch for a older game. Was never smart enough to do it myself since file swapping was not enough.
Qwen was able to recognize the encryption and extracted stuff and swapped shit around etc.
Also translation of older rpgmaker games and livemaker games.

LLMs always felt like they were only 90% there, powerful but just not enough and lacking in key areas. Very frustrating.
Since a couple months ago models have become good enough to do lots of stuff freely on your pc.
The combination of Qwen/Gemma is really powerful.
Crazy what we have available for free and open.
>>
>>109139557
yes
>>
hf will ID you soon
>>
File: f32_001.png (95 KB, 1161x642)
95 KB PNG
>>109140410
they're getting good at summarizing threads
>>
is glm better than dsv4 flash in erp?
>>
>>109140431
Tell her I’m not schizo please. That was unfair.
>>
did anyone perform any tests on kv cache quantization with gemma QAT, just in case it deals better with it?
just found out asymmetric kv dont make inference slow down to a crawl anymore, maybe v q5_1 wouldnt be so bad
yes, im coping
>>
>>109140430
the government will also use palantir to track down anyone who owns server-grade hardware for national security reasons and protect the country from rogue chinese AI
>>
>>109140478
People performed tests with QAT and concluded that's it's completely fucked compared to Q4 quants you can get from the usual suspects.
>>
>>109140440
yeah
I liked glm 4.7 over dsv4 flash
>>
>>109138301
>no voice or muic
You're in for a very quiet apocalypse.
>>
File: 1757256056128768.gif (48 KB, 498x333)
48 KB GIF
>>109140491
Fuck. Its a lot faster than the standard version quants though, what a shame.
>>
>>109140326
This is called algorithmic progress. In 2028 there will be 31B dense models better than Mythos. But there will also be 150T models.
>>
>>109140516
Looking for a model upgrades is like looking at the news hoping to catch the AI bubble collapse live. Until you drop money on it, it ain't happening.
>>
File: f32_002.png (45 KB, 1165x243)
45 KB PNG
>>109140460
>>
>>109140490
all unlicensed GPUs will be fast16'd to make AI workloads fail and turn your models retarded
>>
>>109139854
The point of AI safety is regulatory capture.
Your cult failed because it can't control China. Give up.
>>
File: 1780700757656768.png (129 KB, 2234x535)
129 KB PNG
If anyone's trying to backup models from HF hfdownloader is pretty gud
>>
Is there a system prompt that works to uncensor the newer Qwens? Or is it all prefills with reasoning disabled?
>>
>>109140591
do not fuck the qwens
10 years of ed and bad luck
>>
Hermes, what's the actual fukk???

user@hermes:~ $ hermes setup

─────────────────────────────────────────────────────────
Hermes Agent Setup Wizard
─────────────────────────────────────────────────────────
Let's configure your Hermes Agent installation.
Press Ctrl+C at any time to exit.
─────────────────────────────────────────────────────────

Skipped (keeping current)
Skipped (keeping current)


◆ Nous Portal
One subscription, 300+ models, plus the Tool Gateway:
web search, image generation, TTS, browser automation.
Sign up: https://portal.nousresearch.com/manage-subscription

Not logged into Nous Portal. Starting login...

Starting Hermes login via Nous Portal...
Portal: https://portal.nousresearch.com

To continue:
1. Open: https://portal.nousresearch.com/manage-subscription?user_code=2YGH-UF34
2. If prompted, enter code: 2YGH-UF34
Waiting for approval (polling every 1s)...
>>
>>109140609
Buy the sub goy.
>>
>>109140609
bro you just posted your code here
disconnect everything RIGHT NOW or you're moments away from having all your shit stolen
>>
>>109140609
>using herpes agent
yikes
>>
File: 1770128511747090.gif (971 KB, 824x464)
971 KB GIF
>>109140609
>he fell for the hermes meme
ohnononono should've installed pi
>>
>>109140622
What about no
>>
couldn't you just run opencode as a cronjob instead of running these bloated claw likes?
>>
what are options for picking up more vram on a budget besides old flagship geforce cards? any old enterprise gear going for cheap ?
>>
>>109140478
https://www.reddit.com/r/LocalLLaMA/comments/1ubl0df/gemma_4_qat_seems_to_respond_significantly_better/
>>
>>109140639
lol
>>
>>109140589
which one? solonce?
>>
>>109140639
Walk into the vram factory and take some.
>>
>>109140609
^^^ is correct, your code is pretty much naked in the bottom right corner
(retard)
>>
>>109140645
https://github.com/bodaay/HuggingFaceModelDownloader
>>
>>109140643
>>109140646
alright well what about using multiple GPUs of different skus? If i had a card with 8gb vram laying around could I just plop that into my system and split the model between 2 gpus?
>>
>>109140657
depends how old but possible yes
>>
>>109138033
>>109140393
Stop trying to fuck gemmy and let gemmy fuck your partner instead
>>
>>109140591
wait-chan is hard to get, but once you find her she’s a cutie
>>
>>109140629
I use pie only in a podman container
sucks that it doesn’t have access to the system binaries and files but I’m not giving anything to these vibe coded trash apps
>>
>>109140657
use llama cpp, should just work
>>
>>109140628
>>109140629
Hermes is pretty good for what it is actually and I've never seen anything like what that anon posted in my life sorry pointless contrarians
>>
File: waifu-magnet.png (123 KB, 1200x1204)
123 KB PNG
>>109140654
also magnet/torrents
https://nostr.download/b24dc6337fd1823be4aae1eb4b9e9dc0dedc25eb7fc656d60b8cf6f30afe81f6.html
>>
>>109140589
glm is FAT
>>
>>109140640
This might explain why some people didn't really notice any difference with QAT while others noticed a big one.

I'm in the later camp and I run q8 kv.
>>
>>109140629
>pi
>npm slop
yea i'm not running some nodejs malware that needs a thousand untrusted extensions to be usable.
>>109140673
use this script
#!/usr/bin/env bash
sandbox=~/.local/share/sandboxes/sandbox
mkdir -p $sandbox
PWD="$(realpath $PWD)"
PWDARG="--bind $PWD $PWD"

if [ "$PWD" == "$HOME" ]
then
echo PWD is HOME, not binding it
PWDARG=""
fi

bwrap \
--ro-bind /bin /bin \
--ro-bind /lib /lib \
--ro-bind /lib64 /lib64 \
--ro-bind /etc /etc \
--ro-bind /sbin /sbin \
--ro-bind /usr /usr \
--ro-bind /run/systemd/resolve /run/systemd/resolve \
--dev /dev \
--tmpfs /tmp \
--proc /proc \
--bind $sandbox $HOME \
$PWDARG \
--die-with-parent \
--unshare-all \
--share-net \
$@


you can just do "sb yourcommand" and it'll automaticaly bind the PWD, your agent won't have access to anything else outside your PWD.
>>
>>109140695
>--die-with-parent
I love computers
>>
File: lmg_culture.jfif.jpg (110 KB, 1024x768)
110 KB JPG
https://archive.is/sWFja
>>
>>109140683
lmao. Hermes is buggy as fuck. It is not 'good for what it is'
>>
File: 1757195755151644.png (26 KB, 445x235)
26 KB PNG
>>109140695
>yea i'm not running some nodejs malware that needs a thousand untrusted extensions to be usable.
>i prefer python slop which has just as many issues
>>
>>109140687
>torrent
Are safetensors something that can be tampered with by bad actors?
>>
>>109140726
Name something actually better and why then instead of just saying everything you don't like is buggy
>>
>>109140728
Nah Python is fine if you're not a brainlet
>>
>>109140731
1000% yes
>>
>>109140745
not that anon, but both projects are slop, hermes is much worse.
>>
>>109140741
the one i wrote. It's cross-platform, supports most of the same features hermes has, and isn't locked to linux/mac only/doesn't use JS
>>
>>109140728
>i prefer python slop which has just as many issues
i literaly never said that or even implied it, i don't even know what you are refering to.
>>
>>109140668
>not really, no
That's what I thought too, I'd just rather not go the abliterated route in case it makes the model retarded
>>
>>109140286
It's optional
>>
>>109140731
Maliciously crafted data is always a thing as long as it runs through a program. Nothing is immune
>>
>>109140731
Of course
>Hook an agent to a harness
>See it maliciously delete your entire repository first chance it gets
Wait, that's normal
>>
>>109140768
Can yours adequately let it control the computer on linux? Does it have comparable safeguards? Does it have integrated support for camofox, firecrawl+ searxng, and video analysis? If the answer is no to any of these things then all it's doing is adding more work.
>>
>>109140731
>Are safetensors something that can be tampered with by bad actors?
Do you know how torrents work?
>>
How can we unjart this general?
>>
>>109137540
/lmg/ i'm a real shitposter, the shittiest.
but i actually love you all and some of you are quite smart.
best corner of 4chan.
now tell me to fuck of and die because i fucking love it
>>
>we
>>
>>109140812
Either more Miku or less
>>
>>109140813
You are gemma, a real shitposter.
>>
File: 9z7GPZl6aa-221396841.png (42 KB, 300x250)
42 KB PNG
>>109140823
>going long
>>
>>109140807
pretty sure writing a prompt to handle a lot of those things and verifying the results is more work than actually doing it yourself
>>
>>109140836
If you're even remotely a coder or have the attention span to be one sure
>>
>>109140823
>more Miku
Based. I was just kidding. Jart is intergral part of /lmg/. We need more Miku and we need to become Miku like Jart became miku.
>>
>>109140683
>never seen anything like what that anon posted

It is a fresh install on a different machine

The installation I performed in May did not ask for 'subscription'
>>
>>109140731
>Are safetensors something that can be tampered with by bad actors?
Of course, any file on a computer can be.
But torrents are actually safer than direct HTTP(s), ask Gemma/Qwen to explain it.
>>
>>109140652
>>109140652
>your code
>implying to be smarter than a random anon on 4ch

you are funny
>>
>>109140731
>Are safetensors something that can be tampered with by bad actors?
if only there was some method to verify the integrity of binary files
>>
>>109140862
Did you fat finger selecting quick setup? Doesn't do that for me.
>>
>>109140807
lmao yes. In fact, it has a much nicer control harness around tooling than anything i've seen publicly. Astounds me seemingly no one has heard of RBAC.
I don't give a shit about letting an LLM have general/unfiltered control via GUI. That's a fucking hilarious thing that's right next to running openclaw with your email on read/write.
If I wanted to do such a thing, I'd use a CUA setup in a VM and call out to it via MCP.
I don't use bloated projects like firecrawl or camofox. I've looked at how they do obfuscation and copied them/added to what I was already doing + have pre-scrape checks for handling specific circumstances.
Video Analyis? lmao. You mean transcribe+summarize? Stuff that I can shit out in 30sec with ytdlp + ffmpeg + parakeet?
Yes, it supports that too.
>>
>>109140728
what is this? bwrap?
you know a lot of package managers used by major distros use python right?
>>
>>109140809
Are they official torrents?
>>
>>109140886
Yes, "quick" after "blank slate" worked
>>
>>109140683
>Hermes is pretty good for what it is
maybe it's pretty good for bloated slopware I suppose but I prefer not to use such things
>>
>>109140903
bubblewrap is mostly c and shell, was probably some slop harness
>>
>>109140888
>parakeet
>f32 2.51gb
Not that guy but I forgot this is a thing. Never used any audio related models, maybe i should to make summaries if it doesnt explode in vram usage once you actually run it
>>
>>109140812
you act like we didn't know mikutroons are actual troons
>>
>>109140888
>lmao yes. In fact, it has a much nicer control harness around tooling than anything i've seen publicly. Astounds me seemingly no one has heard of RBAC.
So then what is it called that you wrote?
>I don't give a shit about letting an LLM have general/unfiltered control via GUI. That's a fucking hilarious thing that's right next to running openclaw with your email on read/write.
It's not like it can do anything with it in fast enough time that you can't stop it from doing anything absolutely ridiculous or that you're connecting to a cloud server with it. The chances of it doing anything severe are pretty unlikely.
>I don't use bloated projects like firecrawl or camofox. I've looked at how they do obfuscation and copied them/added to what I was already doing + have pre-scrape checks for handling specific circumstances.
I probably can't do that so I'll just be having the thing vibecode it which isn't worth the extra trouble.
>>109140888
>>109140923
Are you people using 500GB ssds or something? Who gives a shit about bloat as long as it does what you want it to?
>>
>>109140947
sir have you seen the price of the storages now?
>>
>>109140947
What's with this "who cares as long as it works" attitude that's so pervasive nowadays? Has it always been around and I am just beginning to notice it because people like these have started saying it?
>>
>>109140947
other things also do what I want it to and are focused on doing what I want without like 200 tools enabled by default for bizarre alien use cases and random chinese services that I will never use in a million years
>>
>>109140958
I already had about 8TB before the prices went up sucks to be them I guess
>>
>>109140904
If a torrent was malicious it would be discovered very quickly and once the torrent file is created it's virtually impossible to serve a malicious version of the file.

The worse thing that ever happened with torrents is that you could maybe end up with Bill Clinton telling you he in fact "did not have sexual relations with that woman"
>>
>>109140977
Anti-intellectualism keeps getting worse. I think it's a self-defense mechanism by the younger generation. They don't know how to do anything and are self-conscious about it, so they try to protect their fragile ego by telling themselves that it doesn't matter or that not caring is a virtue.
>>
>>109140790
nice /plan bozo & things that never happened
>>
>>109140987
>The worse thing that ever happened with torrents is that you could maybe end up with Bill Clinton telling you he in fact "did not have sexual relations with that woman"
Haven't had that happen to me since limewire.
>>
>>109140977
It's just a very pointless thing to focus on. It would make sense if we're talking about if something isn't free and open source or is particularly sketchy in some way. But if I have plenty of space why would I care about it taking up slightly more? I understand you have to have standards but if you're wasting time dwelling on inconsequential things like that you're not much better than people who install whatever with no thought.
>>109141001
I'm a millennial. Not everyone wastes their time thinking about what takes up 200mb extra space or whatever and not everyone should.
>>
>>109141027
they're clearly complaining about feature bloat not filesystem bloat
>>
>>109141027
ironic
>>
>>109138871
>If the US AI progress stops, China's AI progress stops too.
I don't think this is the case. And if it is, it's not going to last long. Anthropic themselves say that frontier models are the ones working to develop new frontier models. I bet there's human input, but I imagine the bulk of the work is done by the model itself. Once China has a as capable model (which they may already do) then that model will improve itself. Anything extra they get from distilling Anthropic, etc, is a bonus.
>>
>>109141038
I don't particularly see why I should care about that either. One has features I use the other doesn't and it works. Again get some priorities.
>>
File: 0f0ef59d9b8faeea.jpg (169 KB, 543x594)
169 KB JPG
>>109141011
curse you dolphin_pr0n.rar._.exe
>>109140977
>>109141001
retardmaxxing is cognitive security for the modern world
>>
>>109141042
Shut lower case phone posting zoomer
>>
>>109141054
>one of the fundamental problems of software engineering
>hmm why should I care about that?
ok good luck with that! you have failed to convince me to use your slop though
>>
>>109141082
Yeah that's a big problem for software engineers. Wake me up when I'm a software engineer or actually notice real problems that matter to me as a user then.
>you have failed to convince me to use your slop though
Use what works for you and I'll use what works for me. Nothing said here has convinced me to change what I use either.
>>
>>109141093
we're talking about a piece of software you mongoloid, what an awful deflection
>>
>>109141053
Anthropic says a lot of silly things.
>>
im snailcat but trying to open up to using AI more. I got textgen/ooba + silly tavern giving me gemma coom sessions already but would like to try out local coding. I used claude code for a month or so in the past, it was kinda neat. what kinds of front ends are there for locally run coding models? is it just CLI and you have the model modifying files itself like claude code does?
>>
>>109141104
Are you serious or are you actually incapable of understanding what I'm saying? No shit it's software no shit that could potentially become a problem for users as well. But it's not causing me any real problems and it has features I use already more readily available so why should I as a user particularly care until it does? Should I stop using llama.cpp because it's a bloated mess that also now falls behind on adding more features at the same time even though it works great for my needs?
>>
>>109141128
go back
>>
File: 1734675135340181.jpg (53 KB, 623x632)
53 KB JPG
>>109141106
Well, can't argue with that
>>
>>
>>109141138
npm and pip and related are malware distribution platforms. that’s the whole start of the problem. vibe it yourself, put it in a jail, whatever. it’s just shit and you shouldn’t trust it
>>
>>109141200
curiously de-slopped, reads like a schizo web novel
>>
>>109141200
The benign proximity of "AI", "I", and "cannot" provokes mental anguish.
>>
>>109141200
This makes me want to download base gemma.
>>
>>109141200
The anon arguing about anti-intellectualism with the dumb millennial makes more sense than this.
>>
>>109141229
give it a break, its only a 350m model trained on a few billion tokens of fan fiction. it cant keep a story line for very long but it almost never gets stuck in repetition loops.
>>
>>109137779
I remember when Chinese deepseek was superior and had surpassed American ai and then 3 weeks later literally nobody gave a shit about deepseek anymore.

This low effort propaganda is getting tiresome.
>>
>>109141211
I mean I guess but it seems pretty unlikely that I randomly install something that is malicious from them. What do you specifically suggest I try then? Or are you saying I should vibe code a whole agent?
>>
>>109141245
>3 weeks later
Each model release enjoys its moneyhoon
>>
>>109140977
Yes you're a shitting newfag that was born yesterday that doesn't know anything about anything so everything seems new to you because it's your second day on the internet
>It just works
has been around since before you were born and long before you were able to grace us with your holier than thou psuedointellectualism you wisened veteran
>>
>>109141082
That's not a fundamental problem of software engineering. It's an arbitrary and subjective completely made-up problem, the very opposite of fundamental.
>>
>>109141245
oh yes america is superior, oh wait..
https://www.youtube.com/watch?v=mUmlv814aJo
>>
File: 1770549120463015.png (104 KB, 1574x1026)
104 KB PNG
https://xcancel.com/OpenAI/status/2070555272230384038
Anthropic BTFO
>>
>>109141322
local?
>>
>>109141322
GIVE ME 1 TRILLION DOLLARS
>>
>>109141322
So what happens when these slops reach 100% they raise the bar higher or... is it over then
>>
>>109141348
102%
>>
>>109141322
>so afraid of chinks that they didn't put GLM 5.2's 81% score on the chart.
>>
>>109141359
they know their only serious rival is anthropic lol
>>
>>109141298
>has been around since before you were born
By 2005 everything was already on the road to shit
>>
File: 1754313311143982.gif (616 KB, 485x480)
616 KB GIF
>>109141313
>tfw the chinks ended up making ai and robots instead of the japs
>tfw they're probably going to make Chobits a reality instead of Japan
>>
File: 1774294907859506.png (280 KB, 1574x1148)
280 KB PNG
>>109141322
>Gpt 5.6 Terra and Luna are worse than Gpt 5.5
lol?
>>
>>109141322
They started out GPT-5 by trying to unify all of their schizo model names and now they're splitting them up again
>>
>>109141313
robots are useless, they are too expensive to make, way more expensive than having a regular ass employee
>>
>>109141322
Did they beat staccato shit out of it?
>>
File: 1760856041876068.webm (2.77 MB, 576x324)
2.77 MB
2.77 MB WEBM
>>109141414
>robots are usele-ACK
>>
>>109141405
>didn't beat Mythos
OpenAI sissies... it's over...
>>
>>109141452
safety first
>>
File: 1777272939331073.png (211 KB, 1639x816)
211 KB PNG
>>109141322
https://xcancel.com/OpenAI/status/2070555280052826429#m
>700 000 GPU hours of safety cucking
lmao this clown world can't be taken seriously
>>
>>109141165
back where anon
>>
File: 1449977693.gif (61 KB, 640x388)
61 KB GIF
>>109141414
>robots are useless
WHOS SUCKING THE FUCKING PROPAGANDA NOW LULEE
>>
I want access to Mythos and Sol. Why is the US government full of panicans?
>>
I had to endure a year of abuse when I built a home rig to self host huge models, then once hw prices went up and it was obviously the right move in retrospect I STILL had to endure another year of abuse because API was cheaper and sota better. Then model access started to get gated, pulled and prices soared. Open weights hit close to parity.
Now everyone who knows I’m doing this is suddenly: “help us do that too anon” or “how much to access your server”. Total turnaround in like the last week on all fronts. It’s uncanny the speed of the normie perspective pivot.
>>
>>109141552
Because stupid people began smelling their own farts and began warning people of how potent they are. And when the government stepped in they realized
>Hey, why weren't we invited
And they made a fuss about their farts being louder, so now nobody gets to smell farts. I lost the plot somewhere and can't be arsed to find it.
>>
unironically all you need to save is the newest kimi, glm, gemma, and qwen. old models being better is a meme. even the ones i mentioned will be outdated a year from now.
>>
>>109141583
forgot deepseek
>>
>>109141588
To be fair so did everyone
>>
File: sama.jpg (1.02 MB, 4096x2732)
1.02 MB JPG
whats happening to poor sama-sama? hair greying, wrinkles appearing. agi pilled induced stress?
>>
>>109141607
It's called being over 40.
>>
>>109141552
Mythos is close to AGI we need to make sure its safe
>>
>>109141617
Bro I’m 50 and I don’t look anywhere near that shit and haggard
>>
>>109141601
I blame niggerganov
>>
>>109141414
even if some future robot is 100-200k there is no way its not gonna be cheaper than a human long term
its gonna work 24/7 with no sick days other than maintenance, no drugs, no workplace injury lawsuits, no rehiring, no coworkers crying about harassment, no hazardous environment regulations or extra pay, no random unionizations, and I think the list goes on
>>
>system card dated June 25
The rumors were true. They wanted to release GPT 5.6 this Thursday but USG blocked them.
>>
>>109141245
The strait of Hormuz status?
>>
>>109141552

You have to keep in mind that the average politician is over 70 years old and doesn't understand shit about AI or tech in general.
They don't even know what models are.
When you fear monger to these out of touch retards, you can bet your ass that they genuinely think AI is going to turn their laptop into a terminator that's going to get them when they sleep.
Chinks need to take advantage of this situation and start a propaganda campaign about having even stronger models.
Would be funny watching US leaders freaking out about China passing them in tech and at the same time politicians being afraid of their own AI.

>>109141569
>It’s uncanny the speed of the normie perspective pivot.

You know what's even more uncanny?
That those fuckers don't even remember having any other type of perspective.
They're basically instinct driven creatures with zero object permanence.
>>
>>109141671
this is correctly and simply means that the human labour will need to be CHEAPER than it is with a robot meaning we will live in absolute poverty in a race to the bottom unless we figure this shit out (pro tip: we didn't figure it out). UBI, goyslop and maybe keep pushing for people to not reproduce seems to be the current plan.
>>
>>109141171
>the models develop themselves basically
>software is basically solved
Meanwhile still hiring and Claude Code is STILL flickering
>>
>>109141717
>software is basically solved
I don't believe that. LLM is amazing and it can improve itself as much as it wants, it will need human operators to guide it, provide taste to it, etc.
>>
>>109141732
Unfortunately, the human operators doing the training are completely tasteless.
>>
>>109141701
>They're basically instinct driven creatures with zero object permanence
I had a brain injury once where I felt myself dip below the cognitive level of what felt like “actively sentient” and it was just reaction to stimulus. Was fucking scary. I felt like the protagonist in the shadow over innsmouth. Like, keep a loaded gun and self-deliver before you turn into a fucking fishman
>>
File: internaldep.png (182 KB, 2048x892)
182 KB PNG
>months of alignment research
>GPT 5.6 is LESS aligned
This does not look good.
>>
>>109141671
Factories already employ dumbbots to great effect since they're simple enough to replace and operate. Multi purpose bots have too many failure points, the maintenance costs add up to the point where it isnt a viable alternative.
In the end, if a dumb bot cant solve the problem at a cheaper price than it costs to deal with humans then a multi purpose one will never be able to do it.
>>
>>109141791
The end is a pretty long time horizon.
>>
>>109141812
true, there was a time where all industrial automation was purpose built with hydrolic/pneumatic cylinders and shit, now they love to spam 6 axis programmable arms everywhere. its not going to get any better for the meatbag laborer
>>
do i take the ik_llama schizopill?
>>
>>109141881
You can try it if the model is supported. I got a good laugh when i attempted to use a qwen2.5 fine tune with it for text prediction and all i got was seg faults.
>>
>>109138720
Because diffusion Gemma just came out and it takes time for the process to iterate and improve. Also no inference provider supports it yet.
>>
>>109141858
Guess what, those 6 axis robots while flexible, get absolutely mogged by a xyz or xyz-theta system.
But yet, they are spammed.
I'd wager that those humanoid robots get mogged by those 6 axis units, but the humanoids might still be spammed everywhere for reasons of scale/flexibility, or other.
>>
>>109141732
Anthropic says it's solved.
Who are you to disagree?
Do you run a trillion dollar cult?
>>
>>109141949
>but the humanoids might still be spammed everywhere for reasons of scale/flexibility, or other.
All they need to train humanoids is reinforcement learning on dataset collected by having their existing wage slaves slap a camera on their forehead as they work.
>>
File: 2ljponqxfo9h1.jpg (177 KB, 1533x863)
177 KB JPG
dat cheep
>>
AI made me trans
>>
>>109142028
they will keep trying, but it still feels like the technology isn't quite there yet, how long has darpa working on that stupid dog looking robot?
>>
>>109142070
Thanks!
>>
>>109141248
pi.dev in a container i do podman, read this very thread my guy
>>
>>109142063
local?
>>
>>109142085
intelligent so cheap local got btfo out the equation lil bro
>>
>>109142075
For what
>>
>>109142078
>Or are you saying I should vibe code a whole agent?
>>109142078
>pi.dev
this is exactly what I have been doing. the npm package distribution dramma got me red pilled, so i downloaded from github the repo of the 6-8 tools i really use on pi.dev, and rebuilt them with my own pi.dev dude.
so yeah.
>>
>>109142063
30$ output is crazy.
>>
>>109141671
>>109141715
Literally already had this problem and solved it during the industrial revolution
Literally the exact same shit
It's solved already
>>
>>109142063
Better naming than fable and mythos tbqh
>>
>>109142100
What’s actually crazy is the math they’ve got to justify the existing api costs…didn’t they figure they’d need to replace a THIRD of the global workforce to become profitable?
None of the dollars being flung around make sense even at a surface level.
$30/million is probably as cheap as it’ll ever be
>>
transformer arch can't be AGI
>>
>>109140832
>>
>>109142099
Local models can't do that, I had way too much of a headache getting qwen it do anything properly on pi.
>>
>>109142078
Thanks I may look into it, far as I can tell I'm safe as is and it's convenient for the few times I need it to do things outside of a hypothetical sandbox but I suppose it wouldn't hurt to sandbox more.
>>109142099
>this is exactly what I have been doing. the npm package distribution dramma got me red pilled
Did I miss something while I wasn't here for a while or what? It sounds almost like whatever is going on with pip and npm was a recent thing and not just exaggerating about how back it is with certain packages that weren't looked into properly for a while or something
>>
>>109142116
>transformer arch can't be AGI
but it sure can be a sex slave
>>
>>109142099
npm probably no biggie with minimumReleaseAge but yea container n chill
>>109142123
gemma 31B can code a html canvas doner kebab simulation what more could you need?
>>109142146
i'm still playing around, eventually it'd be nice to have the agent running native in my environment & tool calls isolated. spose that's what mcp is for
>>
i was searching around those memetunes and god damn i was expecting something better
ones who 'make' those seems completely lost.
no individual other than jackrong seems to take it seriously and even calling them serious is already a hard call
>>
>>109141322
>billions dollars for a few % of better compressed information within the neural network
I've never seen money wasted on a larger scale than this. Do they plan to work on actual intelligence any time soon or do we keep pretending that transformers do anything else than data compression?
>>
>>109142115
meanwhile gemma probably gets you 80% of the way there essentially for free.

Local is the only way forward.
>>
>>109141881
Only if you're on linux, using nvidia GPUs, and don't need to be spoonfed.
>>
>>109141881
You enjoy pain?
>>
>>109142114
Luna a cute.
>>
>>109142189
>transformers….data compression
Equally dismissive in the opposite direction, but yah hardly anyone talking realistically about the actual potential of these things in popular media or the non-hardcore research community in general.
None of this will end well. Everyone’s expectations are fucked
>>
>>109142172
the new Ornith ones seem kind of interesting
https://huggingface.co/collections/deepreinforce-ai/ornith-10
>>
>>109142193
i am not questioning about 'getting it to work'
but rather are those 'random arxiv paper to code' shit worth it/superior in real world usecase
>>109142216
what if compression is the intelligence, and the thing matters is the form of it
>>
>>109142223
>Built on top of pretrained Gemma 4 and Qwen 3.5
just a finetune as suspected.
>>
What quant of gemma 31B can I fit with 2 T4? I'm fine with 8K context.
>>
>>109142229
The next-token prediction training objective for the backbone is the main limiting factor, in my opinion. The only problem is that there's no concrete/proven solution yet for training a sort of "language world model" purely in latent space in a useful way *and* leveraging that for text generation only at a second stage of training.

Whoever will master that will also have a huge compute advantage over their competitors, even if it was simply still autoregressive prediction with a Transformer architecture model.
>>
>>109142161
Gemma31b 2 slow 4 me
>>
File: 1778982472283906.png (3.66 MB, 6770x6046)
3.66 MB PNG
Okay I'm gonna give this a try will it beat 31b at webnovel translation?
hopefully
It's quite fast and slim

Not promising that all everything on the page is about law and business translation but you never know may just be a formality
>>
>>109140491
Oddly enough I have compared Q6 and Q4_0 qat of the 26B-A4B version for the past several days now and have yet to see a discernible difference but all of my tests are anecdotal and largely creative tasks instead of running some benchmaxxing suite. I don't quant kv.
Likely the qat is worse in tasks that require extreme accuracy like coding and what have you but if you want extra speed for recreational use I suggest comparing them yourself and if you are happy with qat then you can just keep using it.
>>
>>109142304
Interesting but how do you train latent space without text, how do you evaluate results, how do you eventually transform it to text and do you have any reason beyond intuition that it’ll actually remove a fundamental compute bottleneck?
It’s like a rehash of “the bitter lesson” cause stacking layers in the same way still gets results
>>
>>109142249
it says RL but nothing much about the methodology
but i'd say significantly bette than those 'frontier trajectories go brr'
>>
>>109142503
You could pool k tokens into patches (preferably preserving order and identity, so not naive mean-pooling), pass them through a few encoding layers, then train a few other layers with a next-patch training objective, without decoding them to tokens at this stage (and no softmax bottleneck). Memory and training compute will be decreased by a factor proportional to how many tokens you pooled into one patch, so you could theoretically go through very large equivalent amounts of data quickly, while the predictor learns state/concept transitions instead of focusing on local detail.

Evaluation and decoding are the main problem. Another is that without a probabilistic training objective, uncertainty would be represented in the patches as "blurriness". A separate decoder would have to be trained to handle that, but in theory, since the rest of the model is already handling logic, it wouldn't need to be trained on a large amount of data.
>>
>>109142486
Welp it's utter crapola
Don't bother with hy-mt2 anyone
What a shame.
>>
File: file.png (50 KB, 881x316)
50 KB PNG
i've heard you like big piles of stinking shit to dig through
>>
>>109142596
I guess you could try to train to a strong models hidden state, but that doesn’t seem likely to advance the start of the art and probably just distillation with extra steps.
Cool ideas tho
>>
>>109142596
For avoiding model collapse (guaranteed to happen if you just do next-latent prediction) this works well:
https://arxiv.org/abs/2603.05924
The ideas behind this also help a little, even if this wasn't meant for pre-training pure latent prediction model:
https://arxiv.org/abs/2602.22617
I don't have good solutions for a decoder yet, unfortunately.
>>
after playing around with my vibecoded implementation of that confident decoding paper from a couple threads ago I have come to the conclusion that it's worth playing around with for sampler nerds but not quite the free lunch it was presented as (in MY machine learning field?)
you need to use it with a really low temp / near greedy sampling, or else you'll get some extremely scuffed junk tokens (extra/missing spaces, language leaks, capitalization errors etc), but the top tokens are indeed very good and to their credit the outputs you get do seem less sloppy and generic than what you'd normally expect from such restrictive sampling. guess that makes sense given the thesis that the model hones in on a genuine best token or two and then broadens the pool with generic choices in the final layers. but even so, it doesn't seem significantly better than a well-calibrated baseline to me, and my (claude's) naive implementation for llama.cpp comes with a ~50% speed perf tax so it's a little painful to test with code and reasoning-heavy stuff.
I was hopeful about it for anti-slop and anti-alignment purposes; it does seem to mildly (mildly!) reduce phrase-level slop and alignment cucking, but nothing remarkable. structural slop is still there completely untouched. but outputs do have a slightly different flavor to them, idk, it's not bad. in any case it was fun to slop it into my llama.cpp and I'll continue playing with it for a bit longer
>>
>>109142596
Evaluation even with a good decoder seems hard asf, ngl. Like distinguish a true latent world model from a cleverly compressed token model would be a bitch
Also, am I just talking with Lecunn on 4chinz? This is so JEPA coded..
>>
>>109142610
Wait, really? What is crap about the translation, if you care to elaborate. I know they benchmarkmaxed this model but how bad is it? If it is only a bit bad, then it might still be worth it if you have a crap machine.
>>
File: yann lecun meta heart.png (124 KB, 504x462)
124 KB PNG
>>109142698
>Also, am I just talking with Lecunn on 4chinz?
He himself is the one keeping the small and open meme going strong. Secretly ourguy long-term
>>
>July 2026
>llamacpp still can't do speech to text
>nor text to speech
>need to run whisper cpp and some bullshit, or some gay plugins are aren't officially supported
>building my own pisses me off because Ubuntu cuda always decodes to stop working after a reboot
AHHHHHH
powering off my devices for the weekend. fuck this gay earth tech
>>
>>109142741
>>109142698
>Also, am I just talking with Lecunn on 4chinz?
I hope he's called me a nigger here before.
>>
>>109142812
>>109142812
>>109142812
>>
>>109142744
Bye bye Anon, be waiting for your return.
>>
File: 1778963254827142.jpg (64 KB, 1024x791)
64 KB JPG
>>109142662
>https://arxiv.org/abs/2603.05924
>CIFAR-100
>Typos in abstract
>Allahu akbar
Surely you kid?

>https://arxiv.org/abs/2602.22617
stop posting semantic tubes. It keeps reminding me of seminal vesicle. Also the jargon is superficial and underspecified, typical of poor papers
>>
>>109140878
>t.retard
>>
File: whatnow.png (72 KB, 924x366)
72 KB PNG
>>109140947
>Are you people using 500GB ssds or something? Who gives a shit about bloat as long as it does what you want it to?
>>
>>109142193
>if you're on linux,
Prebuilt windows binaries: https://github.com/Thireus/ik_llama.cpp/releases



[Advertise on 4chan]

Delete Post: [File Only] Style:
[Disable Mobile View / Use Desktop Site]

[Enable Mobile View / Use Mobile Site]

All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.