/lmg/ - a general dedicated to the discussion and development of local language models.Previous threads: >>108290857►News>(03/03) WizardLM publishes "Beyond Length Scaling" GRM paper: https://hf.co/papers/2603.01571>(03/02) Qwen 3.5 Small Models (2B, 4B) released: https://hf.co/Qwen/Qwen3.5-4B>(02/26) Qwen 3.5 35B-A3B released, excelling at agentic coding: https://hf.co/Qwen/Qwen3.5-35B-A3B>(02/24) Introducing the Qwen 3.5 Medium Model Series: https://xcancel.com/Alibaba_Qwen/status/2026339351530188939>(02/24) Liquid AI releases LFM2-24B-A2B: https://hf.co/LiquidAI/LFM2-24B-A2B►News Archive: https://rentry.org/lmg-news-archive►Glossary: https://rentry.org/lmg-glossary►Links: https://rentry.org/LocalModelsLinks►Official /lmg/ card: https://files.catbox.moe/cbclyf.png►Getting Startedhttps://rentry.org/lmg-lazy-getting-started-guidehttps://rentry.org/lmg-build-guideshttps://rentry.org/IsolatedLinuxWebServicehttps://rentry.org/recommended-modelshttps://rentry.org/samplershttps://rentry.org/MikupadIntroGuide►Further Learninghttps://rentry.org/machine-learning-roadmaphttps://rentry.org/llm-traininghttps://rentry.org/LocalModelsPapers►BenchmarksLiveBench: https://livebench.aiProgramming: https://livecodebench.github.io/gso.htmlContext Length: https://github.com/adobe-research/NoLiMaGPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference►ToolsAlpha Calculator: https://desmos.com/calculator/ffngla98ycGGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-CalculatorSampler Visualizer: https://artefact2.github.io/llm-samplingToken Speed Visualizer: https://shir-man.com/tokens-per-second►Text Gen. UI, Inference Engineshttps://github.com/lmg-anon/mikupadhttps://github.com/oobabooga/text-generation-webuihttps://github.com/LostRuins/koboldcpphttps://github.com/ggerganov/llama.cpphttps://github.com/theroyallab/tabbyAPIhttps://github.com/vllm-project/vllm
what about world models for llms?
>>108295989Natural language is in itself a world model
this might sound kinda crazy but do people pool together compute resource across multiple smaller machines e.g NUCs to run LLMsI bought a bunch of NUCs a few years back because I thought it would be fun but never thought about using them for thisI do have a chunky machine with 32GB ddr5 ram and I intend to buy a gpu for it at some point, probably second handI don't really want to spend anymore than I already have the total ram comes to 64GB and if I add in my laptops ddr4 ram it comes to 80gbcould I run a general LLM with this much and what do people use to pool it?Claude mentioned exo.
ASICs for AI when?
>>108295959
>>108296013https://github.com/ggml-org/llama.cpp/tree/master/tools/rpc
https://arxiv.org/abs/2512.01797>They solved AI hallucinations
>>108296054What a shame.
uh yea need a 30gb model that is as smart as opus 4.6 please
>>108296071uh yea no.
>>108296031she's literally me
>>108296054>>They solved AI hallucinations
>>108296107post hands Ranejh
>>108295979Please stay there and never come back retarded mikutroon.
>>108295986Yeah i am thinking the new baker is based. No more off-topic bakes.
>>108296054If you remove neurons which make the model too eager to respond that doesn't mean the model will say i don't know.
melt
>>108296203it is troons all the way down
>>108198958Any updates on this?
Just give me the Engrams already
>>108296211odds and ends
>>108296295this makes me wonder if there's a service that can dynamically generate images for your RP adventure based on character and maybe location templates/descriptions
>>108296295Ameila is a psyop agent working for the labour party that will stab you in the back once you earn her trust
>>108296304Never tried it but doesn't ST support something like that already? Just need to give it an API connection.
>>108296304SillyTavern already does this
>>108296286yeah but im planning to make it paid for now
>>108296295Doesn't look like her at all
>>108296313>>108296311Yes but it’s not awesome last I tried it. There’s also a visual novel mode as well as a few other ways to rig up your waifu aside from AI Art you care to set them up.
>>108296295the real amelia would have crooked teeth, look pasty white like a vampire and have the fashion sense of an eastern european male in love with sportswear
>>108296344show one girl like that
>>108296352here's your authentic british experience
>>108296443no wonder why Miku doesn't want to deal with themhttps://www.youtube.com/watch?v=IzDmMQ7SVPc
Imagine making such a home run and getting fired anyway.
>>108296467i hate this art style. now that i think about it, troons LOVE this hideous art style.
We are still at baker wars? I support the new baker. About time /lmg/ had relevant pictures in OP.
>>108296844At lest put a picture of released models, not some benchmaxxed scores.
>>108296844miku pissing into a baja blast cup is /lmg/ culture. this isn't.
>>108296932>/lmg/ cultureshould be abolished
>>108296932Blacked miku is /lmg/ culture
>>108296705dude I fucking came to the same conclusion the other dayits the nu-newgrounds artstyle that troons have now adapted to
>>108296705i'll dub it tranny memphisit's like the corporate memphis but retarded and gay
Anyone tried this shit yet?
>>108297061nope
>>108297061Sounds like a scam
>>108297064https://huggingface.co/spaces/pliny-the-prompter/obliteratus
>>108297061They all claim the same.
>>108297061>ascii box drawing charactersdead giveaway of vibe-coded slop
>>108297066>pliny
>108297038Special interest
>>108297089That is a TUI? They tend to look like that.
>>108297075And they aren't lying. Your model will not refuse and the cooming quality will remain the same cause you only made it stop refusing.
I love the name for this article"Something is afoot in the land of Qwen"https://simonwillison.net/2026/Mar/4/qwen/
>>108297114i like feet
>>108297061Is this another copy of heretic?
>>108297117yes
>>108297113>cooming quality will remain the sameOk. There's no change in cooming quality. Got it.>cause you only made it stop refusingThen it will not remain the same. Get you ad straight.Also>PLINY
I don't know why people are surprised about the Qwen drama. Alibaba has many more GPUs at their disposal compared to smaller Chinese teams yet they only train small to medium models and their models aren't noticeably better. This points to mismanagement of resources
qwen is just chinese meta
>>108297113>Your model will not refuse and the cooming quality will remainFirst, if it stops refusing, NO, it will NOT remain the same. Second, why do people need a to automate this? Isn't half of the fun trying to finetune the models yourself?
>>108297171lol Llama stopped releasing models after Llama 4Qwen never stopped releasing models
>>108297182they have not yet released qwen 4. give it time.
>>108297182just wait until Meta releases Llama 4 Behemoth, local will be saved by then
>>108297061>open sillytavern>text completion because im not a fucking retard>load qwen>prefill "I am {{char}}. I will now think in first person.">literally does anything noware promptlets really this retarded?
>>108297203Small model or shit qoont, especially the new qwens are even resistant to thinking block injection.
>>108297192Llama 5 in a month. Insiders confirm that the smaller model will be named llama-5-refuser and bigger will be llama-5-retard
>>108297208i'm using it right now with a Q5_K_M quant of 397B and it works fantastically. i'm confused... unless you meant that small models/shitty quants are not affected by it.
>>108297208So are they actually retarded as i assumed or are they finally super safe?
>mrdermacher qwen3.5 heretic v2 modelsis it even worth replacing the v1 model ive been using
Will he save open-source?
>>108297066of course it's stolen by a grifter
>>108297346jesus fuckign christ i haven't even had a chance to test out v1 yet
>>108296467>>108296977
>>108296286pic somehow gives me Portal 2 vibes
>>108297365i said fuck it and am trying it anyways, will post thoughts after
What's he doing wrong?
>>108297439He's not downloading sonnet 4.6 onto his own computer
>>108296662LLMs have been RLHF'ed on a bunch of normie conversation preference data and so they care a lot about managing the user's emotions.
>>108297447i diddly it
>>108297439not trolling hard enough. he could be asking his local model how to troll better but instead he's doing something else (no context)
silly goonboot keep getting stuck in loops fuck
>>108297439does he have a gpu? probably an applefag or something.
>>108297439>that bitch beta cuck boy avatar
>>108297567gpus are for nerds anyways
>>108297439
>>108297439He is mentally retarded (IQ below 85), as even a layman can diagnose from his beginning every sentence with an emoji and his reddit spacing while expressing that he has failed to perform the simplest of tasks.
>>108297659whio is that guy
>>108297762You don't know about Penn & Teller?
>>108297771pennor???
>>108297762you know qwen3.5 has vision, you could just ask
>>108297824>go outside>close my eyes>Claude, tell me what you see, please.
>>108297849man i can't wait to be able to fully turn off my brain and let some ai control 90% of my life, im not even joking, life is too hard
>>108297849you're not asking what grass is
lmao it's so fucking over
It's funny that Qwen really is just chinese Meta
>>108297902Do these labs just trade guys nowadays? Qwen hired Gemini dropout and Google hired ex-Qwen. Is this some elaborate industry scam?
>>108297928you are jealous because you are dumbo
>>108297928The market is small. Zucc spent a couple billion just buying out people from other companies after llama4 flopped.
>>108297902lmaoGemma 4 will be as dry as Hillary's cunt
>>108297968and you know what her cunt is like ...how?
>>108297902do these guys really get paid $500k to type ./train and play Dota2?
>>108297981>$500kTry $50M
>>108297902Hot(lines) and Dry? Let's go
https://www.reddit.com/r/LocalLLaMA/comments/1rl54v7/d_a_mathematical_proof_from_an_anonymous_korean/
is there a local program which can be used for voice cloning? 11 is gey and I don't want to give them money just so I can make princess peach recite BWC copy pastas and use bluetooth to play it on my neighbors car stereo the next time he slowly drives down the block blasting his rap music
>linking redditGit gone and stay gone
>>108298033https://github.com/jamiepine/voicebox
>>108298033QwenTTSbut it does male voices better in my opinion
We're safe (for now)
>>108298195>CEO meddling directlyThis is how LLaMA became a joke
>>108298195lol
>>108297439TRVTHNVKE that /lmg/ can't handle
I have an old M1 Pro 16GB VRAM mac, and holy shit, I'm impressed with the current state of local models, qwen 3.5 9b is feeling great, performs great and is even multimodal.
>>108298350sad
>>108298350It is pretty wild isn't it?
>>108298195Long term China is selling not just ai but a whole technology stack. They want nations to use Chinese chips, phones, ram, ai, social credit, etc etc.The US is also doing something similar basically advanced nation as a kit. Some guy in Africa or other country agrees to partner with one of the giants and they buy the whole kit from either state.Open source is part of the Chinese plan and a great way to get people to buy in to the Chinese platform.You see this in smaller scale when you have Intel and AMD vs Nvidia. The smaller players embrace open source while the big player goes closed.
►Recent Highlights from the Previous Thread: >>108290857--Paper: Speculative Speculative Decoding:>108292842 >108292890 >108293624 >108293853--Papers:>108295483 >108295969--Local LLM coding workflows and integration tools:>108295899 >108295909 >108295920 >108295978 >108295996 >108296037 >108296144 >108296160 >108296207 >108296410 >108296437 >108296739 >108296788 >108296800 >108297123 >108297193 >108296462 >108296536 >108296568 >108296541 >108296628 >108296644 >108296750 >108296694 >108296787--Qwen's inefficiency vs MiniMax's distillation strategies:>108294923 >108294960 >108295008 >108295021 >108295116 >108295156 >108295202 >108295230 >108295251 >108295312 >108295353--Qwen3.5-27B GGUF quantization performance evaluation:>108293551 >108293583 >108293897 >108294067 >108294093--Yuan 3.0 Ultra 1T parameter MoE model announced with skepticism:>108294663 >108294669 >108294682 >108294704--Yuan3.0-Ultra MoE model release and skepticism:>108293837 >108293904 >108294134 >108293917 >108293925--Nvidia Pascal GPU support ending in AI/ML libraries:>108293714 >108293994 >108294087 >108294443--Distributed model inference over slow interconnects deemed impractical:>108295999 >108296044 >108296072 >108296130 >108296214--Anthropic overtaking OpenAI in US business AI chat subscriptions:>108291455 >108291566 >108294506 >108294530 >108294871 >108294970 >108295456--Mistral Labs announced for experimental community models:>108293284 >108293312 >108293340 >108293343 >108293360--Alibaba Qwen team restructuring and resource allocation disputes:>108293036 >108293041--Verify-after-edit strategy boosts Qwen3.5 coding benchmark performance:>108297248 >108297281--Testing lcpp script with transformers 5 branch for gguf quantization:>108293341--Miku (free space):>108291091 >108291631 >108292815►Recent Highlight Posts from the Previous Thread: >>108291145Why?: >>102478518Enable Links: https://rentry.org/lmg-recap-script
Is this the thread?Any real Indian here?
>>108298582Feather not dot sorry.
is this how you roleplay or am i doing it wrong
>>108298768you do you bud. if you don't want to do it that way then you change it.
>>108298792the genie gave me magic powers but i haven't gotten to that part yet
just tricked an eldritch bodystealing entity that (she) would turn into my devoted lover if I cum inside her, and so I did, and she did turn into my eternally devoted wife that takes bodies of other girls to fuck me at my gesturewhew all in a days work
>>108298768Try that card with qwen 3.5 35b heretic she acts like a proper maniac.
she's not buying it>>108298838
Which of these is best for longer RPs?https://github.com/unkarelian/timeline-memoryhttps://github.com/aikohanasaki/SillyTavern-MemoryBookshttps://github.com/qvink/SillyTavern-MessageSummarize
So I got hired by a small startup to build a harness, somehow I was the best candidate I honestly applied just for fun thinking I was going to be rejected, by biggest accomplishments were some diffusion finetunes and comfyui nodes lol, anyways any tips?
>>108298974A leatherworking course?
>>108298987benchod
>>108298974Smart context management is very important. If you use thinking models and feed the whole thinking process of every previous request into the model, then you're going to hit high input token counts very quickly (expensive and answers get worse).
>>108298961Just set context 1 million
>>108298997How do you keep it from going schizo after 8k?
>>108298974>anyways any tips?Just put the cover on. Don't try to extinguish the fire with water, you'll just make it worse.
>>108299003Idk, is that still a thing? Maybe your max_ctx full so it was cut.
>>108299003see >>108298996
>>108297968that would be fantastic. make it so all coomers get pwned. Mistral should also hire some of the Qwen guys, at least one of the experts in safety.
>>108299027why do you hate coomers so much ;-;
>>108299027take that you heckin filthy coomerz!P.S. please updoot my comment :)
So I'm guessing llms will soon ask you to send them proof of id, I wonder how that will pan out
So i'm guessing blugh glug gaaaah splurge gluaaaaag...
>>108298564remember 3/9 is miku day
Why the FUCK are LLMs so obsessed with ozone?
Read eroticstory.txt limit=50Read eroticstory.txt limit=50Read eroticstory.txt limit=60Read eroticstory.txt limit=50Read eroticstory.txt limit=60the recommended rep prenalty doesn't work
>>108299093tarded
>>108297171Practically every Chinese LLM is some version of LLaMA.
>>108298768Why are you writing like an llm?
>>108299140when in rome
>>108299081kekI recently had two models describe a dragon landing from altitude having the smell of ozone
>>108299157>sulfur explosion>palpable smell of ozone
>me always wondering what fucking ozone anons are talking about>it's from dragon RPoh you perverts
>>108299135all smart animals are a version of a multi-celled organism
>>108299164*farts inside your mouth*
>>108299172*anon's mouth is now full of cum*
>>108299135show me the robots doing acrobatics and martial arts from your country anon. https://www.youtube.com/watch?v=mUmlv814aJoalso tell me how many FUCKING ARXIVS HAVE FUCKING CHINESE AUTHORSDUMB FUCK.
>>108299178ozone*
Should have used local lol
>>108299196lmaooo, this world is not serious man
>>108299196Why? So it can gangrape her better with uncensored models?
>>108299196Lole.
>>108299196Will?
>>108299196What a pussy I get turned on when I make ai cards fuck my partner
really wish it was easier to understand which text gen model to use
>>108299211fucking cuck be ashamed of yourself :(
>>108299213really wish it was easier to understand which books to read
>>108299213It's pretty fucking easy actually, image models are where there's a million different legitimate options.Start with your specs and use case
>>108299228yes tell me. which books do I read
>>108299236rape, incest, advice on rape
>>108299219Na it's fun
>>108299237The ones you like.
>>108299240I recommend Gemma 3 for the best hotlines.
>>108299244HOW WILL I KNOW?
>>108299248You read them.
>>108299252
>>108299237SICP
>>108299240>no specsNemo
>>10829928810gb-16gb vram
>>108299295Yep, Nemo is the best you'll get.
>2023+3>still stuck with sillytavern as the only half decent roleplaying frontendWhat went so wrong?
>>108299346You didn't use that time to make your own.
>>108299346make you own retarded monkey
>>108299341i should rape you up the ass with my models
>>108299362>>108299349two faggot open sores losers!
Uh. So edgy.
>>108299368calm down rajesh, I'm on a different continent. you'll have to settle for raping your family's cow like usual.
>>108299383im on the same continent as you. You should fear me because I'm actually white. When a white rapes you know its serious business.
>>108299371its from scratching my balls too much :(
>>108299368you don't have a gpu bruv
>>108299387Sure you are
>>108299346>sillytavernmost of the rube goldberg stuff in there was made to support models that could barely handle 2k contextjust use a normal chat frontend, you are not using a llama 2 or mistral finetroon anymore
>>10829939016gb of vram?
>>108299399and what would those frontends beother than mikupad
>>10829940132gb of coom?
>>108299419that is what I run my wan shit on.
>>108299368Heh *rapes you with my local models and then uses it to magically turn you into a fat ugly loser who will smell bad forever*
>>108299399I don't understand this logicYeah you might need not need every feature in ST, but what do other front ends have that ST doesn't? Unless you're far down the minimalism autistic retard rabbithole then what front end is better?
>>108299412llama.cpp's built in, open-webui or kobold lite ( it's what in koboldcpp but works with any other API backend )any of those will be a less cancer inducing experience than the tavern
>>108296013I'd imagine the latency would be so high this would only make sense if you're doing huge (tens-hundreds) of batches.
How do I stop my agents from doing this:Me: Agent make X, Y and ZAgent: I made themMe: Can you confirm you made Y?Agent: (Realizes it didn't make Y just X and Z) Makes Y, and then replies yes I didI would rather it answers the fucking questions instead of trying to save face.
>>108299213I just download qwen3.5 and it seems pretty impressive.
>>108299430>Unless you're far down the minimalismnot having a boeing 747 cockpit in front of you is an improvement in and of itself.
>>108299450So you admit there's nothing the other front ends offer? Great, at least that's settled.
>>108299444>trying to save facehe's teaching you an important lesson about Chinese Culture
>>108299450Just ask Claude to fly the plane!
>>108299455unironically would do a good job as long as the harness is good
>>108299453you sound like the KDE niggers. Ostensibly, KDE offers everything, and can even be turned into a tiler window manager. Realistically, only people with absolutely no taste would use that piece of shit DE.
Yall want an AI robot gf, I just want an AI robot friend to play vidya with me and talk, we are not the same.
>>108299464>Yall
>>108299444Have it write integration tests and start a new context to verify the integration tests pass.
>>108298228the alternative is being bought out by some wall street private equity as the sellouts were trying to do with qwen. no surprise the ceo is stepping in when they were trying to pull a fast one on him like that.
>>108299463You sound like the kind of retard that is kept up at night at the thought of his OS' package count being higher than that of another user
>>108299426i put your post in and raped you anon you fag
>>108299471I'm the CEO of an AI startup. I think CEOs being involved is crucial and a good thing. OpenAI would still be stuck with Davinci without Altman.
>>108299463I think sillytavern could use being more simplified by default but KDE is really useful (features like HDR or easyish yet advanced window management, etc) and caters to what most people like out of box. If you don't like it that's fine but for most people it's simple enough to use and has everything they want to use out of the box and that's not bad for a DE for most people so long as it doesn't go into the absolute stupid shit windows is doing lately.
you zoomers don't remember how bad it used to bethe days when a 30b was huge and anything above 2k context was amazing
>>108299476Oh yeah well I just put your post in and raped you back again!
>>108299493I was here for it but I'm probably confused for a zoomer sometimes. Chronos is still the best
>>108299493bohoo boomer, nobody cares
>>108299477Stop posting here and make the next Nemo.
>>108299510@grok make the next nemo
>>108299464>vidya with me and talkI could do these things with my robot gf
>>108299524sauce?
>>108299497im the raper not the rapee
>>108299524>>108299527There are programs and stuff that use multimodal llms to constantly scan something and output text constantly which can also be voiced instead of manually putting it in. I've seen it done with translation stuff (gamesentenceminer or luna translate) and stuff like skyrim mods so in theory something do this already exists more or less but I wouldn't know what it is.
>>108299546>robot gf>look inside>no physical bodylol
>>108299555I mean you could in theory hook the llm up to a robot somehow, the groundwork for everything else is already kind of there.
>>108299563if that was possible we would have seen it already
How can we make the local LLM community less gay? It's a growing issue.
>>108299582You could leave.
>>108299575again >>108299546 I'm pretty sure it's possible at least in the simplest sense, doesn't mean the robot is going to move accordingly or anything and doesn't mean anyone who knows how is currently investing their time to make it a reality though. Be the change you want to see I guess and learn and stop relying on busy extremely tech literate people to figure it out and mass produce it for you.
>>108299582Llms make you more likely to turn gay this is scientifically proven
Just in case anyone's curious about why the thread is abnormally terrible, some seamonkey got banned for shitting up /aicg/ a day or two ago, so he's now shitting on our floor until his ban expires and he can go home
>>108299604meds. MEDS!
>>108299615We're not your carer, anon.
>>108299620schizo moment
>>108299435>open-webuiThis one looks nice. Can you explain how it's better than silly?
>>108299601and that's a good thing
>>108299527>>108299546>>108299555>>108299563SOON
>>108299629Yes of course.
>>108299489>caters to what most people like out of boxif you're going to make an appeal to popularity as a form of argument.. you do know the popular distros do not default to KDE as their DE? I wonder why, eh
>>108299643Why indeed gnome sucks ass now days. But KDE is increasingly A default. Let me rephrase that then even though I thought it was obvious what I meant: KDE has things that most people can or do make use of readily available.
>>108299635the fuck is the point of humanoid robots if there will be 100 billion humanoid humans that must be occupied with something?
>>108299635Why the weird obsession with making robots look humanoid as if it is the most optimal form?
can't you guys just make a smart lora for nemo?
>>108299635this is whore will be someone gf some day
>>108299664ill make the logo
>>108299669no need
>>108297061Why do all these abliterator tools push merged models to HF? Pushing 100s of merged LORAs is insane, petabytes of HD space wasted. Soon exabytes.
>>108299678s3 space is almost free if you work in a big company, i use it to store training datasets and such and just bill it under company R&D
>>108299604Definitely posts like a seanigger, or underage. They both have the same intelligence
>>108299604>>108299833I have no idea who you're talking about, it all looks like about the same level of shitposting that happens sometimes.
>>108299848Guys I found him, its poopdickschizo!!
>>108299237Start with the 5 foot shelf of books, then unabridged gibbon. Return for further instructions in 10 years
mikusex?
>>108299660dwm is unironically all you need. Self compiled, of course
>>108299882qrd
>>108299882so he should use llama-cli directly instead of sillytavern?
>>108299897kobold
>>108299878Advanced Mikusex with Miku
>>108299867after I cut you into little pieces im going to stick you into a 4 foot shelf categorized "FAGGOT"
>>108300017Fucking asshole motherfucker
>>108299913I wouldn't recommend koboldcpp.
>>108300031of course you wouldn't api shill
>>108300034how new r u?>>101207663>I wouldn't recommend koboldcpp.
>>108300039troll
>WTF, how can a 4B model be better at coding than a 480B one? What do other 476B parameters do?wasting params on your stupid rp coom bs is leading to this and qwen's death, hope you're happy...
>>108300067cooking recipes, emotional guidance, etc
>>108300073none of this are proper use cases that bring money
>>108300077>local>bring money?
>>108300077why would they, you're using the product, you're the customer
>>108300080you don't sell your locally vibecoded slop apps? need to catch up
>>108300077use case for money?
>>108300089show 1
>>108300096>dox yourself to 4chin schitzosno thanks
>>108300100must be hard to run a business anonymouslyunless... actually, I don't wanna think about that
Is Qwen 3.5 27B a better Japanese -> English translator than Gemma 3 27B?
>>108300113>s Qwen 3.5 27B a betteryes
>>108300067anon its because of the benchmark tests
i hate to say it but but qwen3.5 isn't fantastic right now
>>108300117I'm specifically asking because Gemma 3 27B was literally SOTA in Japanese -> English translation. Better than even Claude 4 for some fucking reason.
>>108300129>t. minimax cuck
>>108300136im a consumer hardware nerd i dont know what the fuck minimax is
>>108300146the model/team that killed qwen by distilling better benchmark scores
>>108300151but qwen sucks so why would anyone care about that
>>108300067Benchmaxxed bullshit, Qwen 4B is NOT intelligent
anyone else having issues with llama.cpp+qwen it all worked great and i got up to 170t/s on the 0.8 then suddenly it dropped to 120/130 t/s and the output was just garbage after switching between 0.8B 9B 27B 80B they all started generating garbage is it like corrupting reading stale memory or something
>>108300202yeah sure thing shill
I’m a software engineer who hasn’t gone deep into AI yet :(That changes now.I don’t want surface-level knowledge. I want to become expert, strong fundamentals, deep LLM understanding, and the ability to build real AI products and businesses.If you had 12–16 months to become elite in AI, how would you structure it?Specifically looking for:>The right learning roadmap (what to learn first, what to ignore)>Great communities to join (where serious AI builders hang out)>Networking spaces (Discords, groups, masterminds, generals, etc.)>Must-follow YouTube channels / podcasts>Newsletters or sources to stay updated without drowning in noise>When to start building vs. focusing on fundamentals>I’m willing to put in serious work. Not chasing hype, aiming for depth, skill, and long-term mastery.Would appreciate advice from people already deep in this space
>>108300299too late
>>108300299ok i did this before and I know what's going to work. what you need to do is go here https://www.reddit.com/
>>108300299damn this LLM sucks, what model?
>>108300284hmm removing --slots and adding --no-slots seems to fix it... but idk
>>108299493For me it's Utopia
>>108300284I had this on ik_llama.cpp with 397b ubergarm q2. Restarting it fix the problem and it hasn't happened again so I don't know. Speed dropped to sub-1t/s, then after restart went back to about 10. I had plenty of spare GPU and system memory, and fallback was disabled.
>>108300299First saar you must do the needful and want to become expert at the English.
>>108300380its not just speed its just when i load a bigger model it just stops mid sentence and gives garbage randomly too
I was in an AI hate thread and a bunch of morons were fighting against an obvious AI, luddites are cringe as hell
how do i fix this?
lmao
>>108300299There are AI PhDs with 10+ published papers all with 1000+ citations that can't even get INTERNSHIPS. There is no upgrade path for a regular software engineer into this field.We have regulars on /lmg/ that train their own niche models, some even sota that aren't employed in the field.We get people like you every other week and the answer is always the same, the industry is impenetrable even for domain experts and people making top of the line models. What makes you so special that you believe you can get your foot in the door?
>>108300458>some even sota that aren't employed in the field>>108300442
>slot update_slots: id 0 | task 27 | forcing full prompt re-processing due to lack of cache data (likely due to SWA or hybrid/recurrent memory, see https://github.com/ggml-org/llama.cpp/pull/13194#issuecomment-2868343055)why does it always happen with Qwen3.5-35B-A3B? --swa-full doesn't do a thing. I'm on the latest version (8208).
>>108300472Not talking about the LLM finetuners. People like the reinforcement agent guy pushing the limits on AI that plays games on its own or that autist that pushed OCR to its absolute limits so that he could read every hentai doujinshi on the internet.
>>108300489both of them work at ai startups now
>>108300435update windows to 7
>>108300499No they don't. Fuck off bullshitter. Last posts were a couple of weeks ago where they affirmed they didn't work in the AI industry specifically.Why are you trying to gaslight that software engineer into wasting his life trying to get an AI job when not even PhD top contributors and sota model developers get jobs?
I used chatgpt once should I get a job in the AI industry?
>>108300514prove your claims, its very suspicious you know all that but can't provide any proof
>>108300481nevermind, it's a known issue with no fix yet.
>>108300521I lurk the thread like everyone else and actually put attention to those 2 because I use the manga translation tool and the game playing one is just cool because the guy is blogging his entire journey from 0 knowledge to where he is now pushing sota. If you read through every thread you know exactly what's going on, fuck off troll.
>>108300549>still no proof only big claimsyou fuck off
AAAAAAAAAAAAA im down to 11t/s when it was 17t/s before why did i reinstall drivers and upgrade
>>108300586lul
>>108300586just restore your btrfs snapshot from before you updated
>>108300593whats a btrfs
>>108300603A poor mans zfs
>>108300612YWNBIK
>>108300589>>108300593wtf is going on another model i went from 39t/s to 48 t/s
>>108300299learn from expertshttps://www.youtube.com/watch?v=1oS35oWWl28
>>108300625are you using llama-bench or just looking at tokens per second on first message? because it's going to fall off as it generates longer responses and fills more context or be higher if it just replies with a couple words
>>108300031you can use kobold's chat ui on llama.cpp:https://github.com/LostRuins/lite.koboldai.netit's the only thing of value from kobold anyway
>>108300631Bernie is such a good guy, I just wish he were more tech-literate. Instead of banning and slowing AI, why not nationalize it? But then again, after being sabotaged last time, there is 0% chance he'll ever get any additional power.
>>108300639I'm not using your open sores barely maintained toy
>>108300642>Bernie is such a good guy, I just wish he were more tech-literate. Instead of banning and slowing AI, why not nationalize it?He's suggested this before more or less. Not watching that video though
local sisters i don't feel so good
>>108300650dont care about jewish saas data harvesters
>>108300644>open soresthen go back to aicg retard
>>108300655open weights is different tranny
>>108300129I don't know if it's the fact that I'm using heretic but 27b sucks and repeats and spouts nonsense sooo much. The base model is too censored by default for my use case though and any prompt that worked before it rejects it.
>>108300661there's no such a thing as a local proprietary backendyou still need to go back, cloudtard
>>108300650There should be a global rule against twitter screenshots and bans should be permanent.
new bread>>108300682>>108300682>>108300682>>108300682>>108300682>>108300682
Reminder to ignore the early schizobake and stay here for the next few hours until the thread reaches page 10.
>>108300650fuck. still no GPT 5 local and Sam keeps releasing
>>108300650gpt-oss 2 wen
>>108299663>rebuild the whole world (built for humanoids) from the ground upor>build a robot that works in the worlda difficult choice indeed
>>108300670Sub 10B qwens are quite good for the size but I'm not impressed with the bigger ones.
>>108300698>built for humanoids
>>108300722and also: the human form isn't any more optimal than something like animal/alien hybrid. Something like boston dynamics's spot dog with an arm protruding from the back thing could operate most human things just fine, and four legged creatures are more stable than humans. Why would anyone think we are the ultimate form? humans are the weakest animal, like, ever. We can't kill/hunt anything with our bare hands/teeth. A fucking raccoon will ruin your day. We aren't even adapted to the bare minimum of survival to most ranges of temperature on earth: without clothes and fire, we freeze to death, or burn under the sun. Humans are not to be imitated.
>>108300722those cars are mostly designed to transport humanoids to where they need to be to be productive, and robot cars already have even more investment than robot humans so it doesn't support the original complaint
>>108300698Just say you want to put your dick in the robot.
>>108300737obviously humans are not the ultimate form, but in the short term they're by far the most useful. when we multiply our labor force by 100x with these things we can put them to work building the more efficient world and workers we'll need in the long term
>>108299662>Accept shitty pay and working conditions or we will replace you with a clanker!
>>108300684>>108292231
I'm back. Anything happen while I was gone?
>>108300778I dunno man it's going fine.
>>108300785Nope, still nemo.
>>108300790
What kinds of specs would you need to fine tune (LoRA) GPT OSS?To fine tune MoE models, do you need enough memory to hold the full model or just the activated params?
>>108300817https://docs.axolotl.ai/docs/models/gpt-oss.html
>>108300830>axolotlWell shit there you go.Thank you very much anon.
>>108300830>not using unslop colabscringe
>>108300994Explain.
>Jamba2 Mini is an open source small language model built for enterprise reliability. With 12B active parameters (52B total), I'm going to try and fuck this thing.>>108300994Explain.
>>108301055>>108301078dyor
>>108301141qrd?
>us>chinaWhere are the superior Nippon LLMs, folded 1000 times?
>>108301281Didn't they make a super scaled up GPT2 trained on an all CPU super computer or something like that?
>>108300722This image is AI, isn't it?
>>108296023>ASICs for AI when?alread exxists bro https://chatjimmy.ai/t. dixie flatline
>>108301318I grabbed an image from google but I don't think it is.
>>108301395yes it is, why would there ever be a play field in the center of an on ramp
>>108297103Zoomies don't know what a TUI is.
>>108301481https://www.shutterstock.com/image-photo/this-beautiful-roundabout-top-view-shot-1135833710>upload date: 2018
>>108301281>If your vision of a dystopian future included robot monks presiding over ancient rituals, Kyoto University has brought that vision one step closer to reality. A research team from the university, in collaboration with the tech ventures Teraverse and XNOVA, recently unveiled a new AI-integrated robot monk — the Buddharoid — at the Shoren-in temple in Kyoto. >The Buddharoid is designed to support the Buddhist clergy as Japan’s religious infrastructure faces a steady decline. It utilizes a system called BuddhaBot-Plus, a specialized generative AI derived from OpenAI’s ChatGPT that has been trained extensively on sacred Buddhist scriptures. This allows the robot to provide spiritual guidance on personal and social issues, like a real monk would. >Beyond its conversational capabilities, the Buddharoid uses hardware — developed by China’s Unitree Robotics — to mimic the specific movements of a monk, including a slow gait, bowing and the gassho gesture of placing palms together in prayer.
>>108301497hawk TUI spit on that thang!
>>108302087close enough champ let's goalso why is OP an ultrafag who needs reminding who the queen of this site is?