/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108434876 & >>108429328

►News
>(03/17) Rakuten AI 3.0 released: https://global.rakuten.com/corp/news/press/2026/0317_01.html
>(03/16) Mistral Small 4 released: https://mistral.ai/news/mistral-small-4
>(03/11) Nemotron 3 Super released: https://hf.co/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>108434876

--CUDA optimization PR sparks LLM-assisted development debate:
>108437073 >108437094 >108437528 >108437535 >108437550 >108437567 >108437569 >108437582 >108437846 >108437676 >108437892 >108437907 >108437909 >108438172
--OpenClaw dual-model coding workflow optimization:
>108439481 >108439578 >108439716 >108439719 >108439731 >108439742 >108439747 >108439772 >108439770 >108439821 >108440019
--llama.cpp server excludes Responses API server-side agentic loop due to C++ maintenance cost:
>108437944 >108438326 >108438376
--Debating NVIDIA AGI claims and feasibility:
>108439814 >108439835 >108440572 >108440630 >108439859 >108440096 >108440176 >108440413 >108440451 >108440468 >108440475 >108440478 >108440491 >108440511 >108440326 >108440335 >108440370 >108440388
--Troubleshooting reasoning mode activation in Qwen models post-autoparser:
>108435077 >108435086 >108435269 >108437362 >108435294 >108435323 >108435332 >108435341 >108435359
--Qwen 3.5 4B and KV cache quantization debates:
>108439408 >108439435 >108440867 >108440876 >108440928 >108440953 >108441003 >108441044 >108441155 >108441210 >108441270 >108441515 >108441564 >108441636
--Model recommendations for limited VRAM/RAM setups:
>108437524 >108437530 >108437534 >108437557 >108437563 >108437624 >108437636 >108437672 >108437700 >108437836 >108439050 >108439086 >108439098 >108439123 >108439129 >108439131 >108439156
--Debating Anthropic's closed-source Claude Code SDK strategy:
>108435933 >108435942 >108437833
--Kimi k2.5 admitted to be Cursor's Composer-2's base model:
>108435414
--AI solving previously unsolved math problems via FrontierMath:
>108439710
--Anon successfully implements LLM-generated voice activation for PC control:
>108441088
--Miku and Dipsy (free space):
>108436067 >108441064 >108441515 >108441560 >108435820 >108441286

►Recent Highlight Posts from the Previous Thread: >>108434877

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
Fuck yah a Teto thread. Time to get fucknig wasted and snort cadmium.
Not llm related as such except I've been rewriting some client stuff. Key insight:
Stop using vim. You'll end up working 3 times as much for a simple operation. When you need to concentrate on using the keyboard more than actually just typing it out there's something wrong.
I always thought that vim was fun to use, well it is if you just edit config files, but for anything larger it's just torture and waste of time unless you are a masochist.
When you switch text editor it's a massive improvement, as if computing just advanced 40+ years in a moment.
That's all.
>>108441780>not waiting for a rin thread so you can inject yellowcakeOFF MY BOARD NORMIEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE!!!!!!!!!!!!!!!!!!!!!!!
>>108441815Emacs hands wrote this post. Go back to your lisp machine you Sussman-worshipping fogey.
>>108441826I'm using notepad.
>>108441815Maybe learn vim before attempting to use it?
>>108441815>>108441862I use nano.
>>108441862proving his point
what's the point anymore
>>108441979just keep drinking
i'm going back to playing video games
v4 is so close i can taste it
>>108441873>not using ed/g/ has fallen
>>108441999Have fun, Anon. Play with me?
>>108442003more like Vnever
>>108441979what *was* the point?
>>108442013Ed is the standard text editor. When I use an editor, I don't want eight extra KILOBYTES of worthless help screens and cursor positioning code! I just want an EDitor!! Not a “viitor”. Not a “emacsitor”. Those aren't even WORDS!!!! ED! ED! ED IS THE STANDARD!!!I miss /prog/.
>>108442013
that exhilarating feeling when u pull... and build and deploy. makes my pp go all tingly
>>108442176what about your tg
>>108442187heres my pp and tg :)
Any macfags tried compiling llama.cpp with PGO?
>>108442192teehee
>>108441862Where did I mention that I was new to vim? Fuck you.
>>108441780
>snort cadmium
https://www.youtube.com/watch?v=1U6qefKcOrg
>>108442241teto is good and all,but realistically rin is where it's at.
>>108442209
>You'll end up working 3 times as much for a simple operation. When you need to concentrate on using the keyboard more than actually just typing it out there's something wrong.
Gave it away
link ur favorite legal character cards plz.
>decide to try containerizing my LLM to docker (windows host)
>get 1/3 of the tokennage (llama-server cuda image)
AIEEEEEEEEEE do I have to move to loonix to get max llm perf?
>>108442373you mean like a lawyer?
>>108442416yeah
I can't wait to Install my first local model on my new mac mini tonight.Should I get a tattoo to celebrate?
>>108442448I got 2nd hand cringe
>>108442448
I'm trying to think of an appropriate candlejack analogy here because this shit is so anno
>>108442488I'm fighting against a shiver.
>>108442493I'm smoothing my skirts and looking at you through my lashes.
>>108442448
I don't know. Qwen 3.5 9B can do a C function what replaces 'source' with 'destination' in 'my string'
>void (char *my_string, char *source, char *destination);
But it couldn't work out how to replace every occurrence of source with destination. It always failed with string length allocation. 10+ tries.
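For reference, the "replace every occurrence" task above mostly comes down to sizing the output before writing it. What follows is a minimal sketch, not what any model in the thread produced. The name `replace_all` and the choice to return a freshly allocated string (instead of editing `my_string` in place, as the quoted prototype suggests) are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly malloc'd copy of `str` with every
 * non-overlapping occurrence of `src` replaced by `dst`.
 * The caller frees the result. Returns NULL on bad input or if malloc fails. */
char *replace_all(const char *str, const char *src, const char *dst)
{
    if (!str || !src || !dst || *src == '\0')
        return NULL;  /* an empty search string would match everywhere */

    size_t src_len = strlen(src);
    size_t dst_len = strlen(dst);

    /* Pass 1: count matches so the exact output size is known up front. */
    size_t count = 0;
    for (const char *p = str; (p = strstr(p, src)) != NULL; p += src_len)
        count++;

    /* Output length = original - bytes removed + bytes inserted. */
    size_t out_len = strlen(str) - count * src_len + count * dst_len;
    char *out = malloc(out_len + 1);  /* +1 for the terminating NUL */
    if (!out)
        return NULL;

    /* Pass 2: copy the text between matches, then the replacement. */
    char *w = out;
    const char *r = str;
    const char *hit;
    while ((hit = strstr(r, src)) != NULL) {
        size_t gap = (size_t)(hit - r);
        memcpy(w, r, gap);
        w += gap;
        memcpy(w, dst, dst_len);
        w += dst_len;
        r = hit + src_len;
    }
    strcpy(w, r);  /* tail after the last match, including the NUL */
    return out;
}
```

Counting first and allocating once sidesteps the length bookkeeping that the post says kept failing; an in-place version would additionally need the buffer capacity passed in.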
>>108442528>a C function what replacesaaa
>>108442531My point being that model is probably being shilled bit too hard on internet right now. But of course bla bla bla and stuff.
>>108442528
I'm too drunk to read through the response, but I suppose the punchline is that the prompt's method definition includes neither __restrict__ nor a length for my_string.
27B-q8_0-heretic, curious to see other models.
>>108442577This won't compile.
>>108442528It's shilled together with hermes
>>108442448I'm not sure I agree but at least it's cute and gives some ideas.
>>108442577>I'm too drunk to read through the responseNow this is vibecoding. Time to ship.
>{{user}}: You know what would be cool? If we went to the tattoo parlor and got "I'M A STUPID FUCKHEAD" tattooed right on your forehead!>{{char}}: HAHAHA! That'd be amazing! Let's do it! I'm going online right now to pick out the style of writing!
interesting
>>108442674kino
Thought I'd try out OpenClaw but seems like it is really pozzed. This website is a big no no.
I'm broke but i really want a mac mini to put an agent in, 128gb Ideally.
>>108442720>just run a 4B q4 LLM (totally local only!) that answers all your emails and social DMs for you bro! your life will be so much better!!>>108442732just steal one
literal free higher perf models we're back https://www.reddit.com/r/LocalLLaMA/comments/1s1t5ot/rys_ii_repeated_layers_with_qwen35_27b_and_some/
>>108442720So try hermes
>>108442708It do be like that after building GPU rig
>>108442747>Wen GGUF? When someone GGUF's them I guess?he really thinks someone else will waste his time making GGUF on his highly experimental shit? he's delusional as fuck
>>108442769ballmuncher team literally ggufs anything they find so why not this
>>108442747basically he showed that duplicating only one extra layer at the middle of the model can improve the model a lot, that's interesting
>>108442809you wouldn't even need to duplicate the layer, forcing the model to reroute to the identified layer would achieve the same result and wouldn't need this retardness about having yet more models on drives.
>>108442822damn, you're onto something anon...
finally gave qwen 3 35ba3b a shot and its pretty rad but, watching it have trouble deciding whether or not to "allow nsfw" then proceed to allow nsfw but almost gemini-like censored is kind of annoying, are uncensor models still a total meme or is there at least one good one?>>108442674now inform it about ludokino.
>>108442848hauhaucs 35b is godlike
>>108442809>>108442838Stop encouraging mergesloppasSeems once a year someone gets the fresh new idea to copy paste transformer layers
>>108442848>at least one good oneuse one of hauhau's or a heretic abliterationhauhau will break your chain-of-thought 30% of the time, heretic will more frequently get stuck in loops
>>108442859>or a heretic abliterationthose one are a shit
>>108442857>B-but, muh Mythomaxit was just luck, we never managed to get something like this again
>>108442857>Stop encouragingyes we can only doom and we should all saas right now, thank you sir
>>108442854>>108442859thanks pals.>>108442866>mythomaxnow that's a name i haven't heard in a long time.
speaking of https://www.reddit.com/r/LocalLLaMA/comments/1s298y6/request_training_a_pretrained_moe_version_of_nemo/> I converted Mistral Nemo from a dense model into a sixteen expert MoE model: https://huggingface.co/blascotobasco/Mistral-NeMoE-12B-16E
>>108442892I'd like to get the opposite, transform Qwen 3.5 35b MoE into a dense model, it'll be smarter than the 27b model
>>108442892
>>108442892
oh no no ai !psychosis
>via the shattering method — a dense-to-MoE structural transformation developed as part of the Nebula Structural Modification Suite.
>By Phase 3, coherence was restored. The final phases focused on knowledge distillation, logical reasoning, and instruction alignment.
>The shattering process destroyed the original model's capabilities entirely, and the training curriculum rebuilt them from scratch under significant budget constraints — this is a student project, not a well-funded lab release.
>Rich atmospheric prose — responds well to detailed character context
>High-fidelity roleplay — adopts personas naturally from system prompt descriptions
>This model was produced entirely using the Nebula Structural Modification Suite, a self-designed framework for extensive structural modification of language models. Tools used in this model's production include:
Has Jensen lost it?
>>108442945I hate when those fuckers are even writing their post with an AI, they don't even see pride in writing in their own style, it's just sad
>>108442981no he's won it
>>108442981>yesterday
>>108442988Still no AI GF so I'm calling bs
>>108442994just lower your standards bro, same as it's always been
>>108442877His face inspired me
>>108443001YOU MAY FIND YOURSELFERP'ING IN SILLYTAVERN WITH YOUR WAIFU>>108443006lol'd
>>108443006Do not gandong the mikus.But I must.
>>108443001All I ask for is something believable.>anonymous has achieved AGIIs just as believable without any corroboration.
>>108442577>Lara tucks a stray strand of auburn hair behind her cat ears HAHAHAHAHAHAHAHA
>>108442475
>>108442809It will enforce model's knowledge in certain areas because the layers are duplicated, but it'll be more retarded in the end. Or at least this is how I understand this (doesn't matter lol).
https://github.com/ggml-org/llama.cpp/pull/18322
this merged when??? ggnigeranov??????
>>108443140why do you care about it??????
>>108443140>add unsafethat doesn't sound very safe...
>>108443148because I use llmao-server in router mode and I have the same fucking model duplicated in the config 4 times depending on the context size but this change would make it so I just need to pass an extra param in the request!?!??!!?!??!!?!?!?!?!?!?!?!?!?!?!?
>>108443140it says "Draft" anon, the guy who made the PR hasn't finished it yet
>>108443140
this has already been supplanted by the dynamic model routing feature
just get /models and post ?model=sex to the completion endpoint and it just werks
>>108443155oh I see!?!??!!?!??!!?!?!?!?!?!?!?!?!?!?!?you can always use the pr yourself!!!!!!!!!!
>>108443158are you retarded? this is to override the defaults you specify in the config!!!!!!!!!!!!!!!!!! LEARN TO READ!!!!!!!!!!!!!!!!!!!!!
>>108443166that's the dumbest shit i've ever read?!?!?!??!!??!!!!??!?!!!!!?!?!?!!?!??
>>108443157it is done for all intents and purposes!??!?!?!?!?!?! ngxson is just a lil bitch scared about security advisories and bughunts?!?!?!?!?!?!?!!?!?!?!?!?!?
>>108443162>>108443166>>108443169>>108443170this conversation reads like erp quality degradation after downloading one of those shitty merges quanted to anything lower than q8
>>108443173I'm so drunk Ic an't even tell which posts are ironic anymore I'm laughing my ass off
>>108443179hey read this anon's chat log and let me know if you piss yourself
>>108443183only a dumb nigger would think that word would offend me, i'm not sur what you're getting atsujck my duck
Why wouldn't this work?
https://www.tiktok.com/@aidanchappellofficial_/video/7620577308218297630
>>108443140So much text? I'm a street mathematician I don't need this mumbo jumbo.
At this rate local models will be associated with troons
>>108443278"AI consumes water" is one of the more retarded takes that somehow took hold.
>>108443395retard
>>108443421retard
>>108442488>candlejackDead meme.Nobody remembers that retarded old
>>108443421>>108443426retard
retard
Tu es en retard.
>>108443497lol
>>108443497peut etre
https://huggingface.co/goodnight399/activity/community
>>108443554>license schizos be like
>>108443554Looks like a loose agent on the 'hub.
>>108443587dunno if an agent would write this ESL tho>From what I can tell, this model appears to be an quantize version of
>>108443591There's nothing wrong with it's.
>>108443012apparently one should notthis MF is gonna teach me Chinese, new default assistantstill needs some tweaking
>>108443795My hubris says I can handle at least 39 Mikus.
>>108443802the perfect amount
>>108442003TMW
>>108443802you're courting death
>>108443795How to make my llm type as retarded as yours?
PSA from /aigc/
>>108443846
>https://github.com/BerriAI/litellm/issues/24512
>>hope nobody here was using this
We mostly don't, but still. Watch out.
>>108444004>credential stealerprolly releated to the trivy disaster
>>108444004ah llm malware seems to be today' s theme https://www.reddit.com/r/LocalLLaMA/comments/1s2clw6/lm_studio_may_possibly_be_infected_with/
>>108444004>all those commentslmao did all those gh account get owned?
>>108444024was gonna say, sus as hell
>>108444004
Fuck. /aicg/. TL;DR: Credential exfiltration.
>The litellm==1.82.8 wheel package on PyPI contains a malicious .pth file (litellm_init.pth, 34,628 bytes) that automatically executes a credential-stealing script every time the Python interpreter starts — no import litellm required.
>>108444016
I don't know what trivy is.
>aquasecurity/trivy: Find vulnerabilities, misconfigurations ...
Oh. If that's the thing, that makes it funny.
>>108444024Nah. It's fine. Perfectly normal people.
>>108444033
https://www.wiz.io/blog/trivy-compromised-teampcp-supply-chain-attack
>tldr: megacorpobacked SEC scanner for containers and code gets compromised
>binary and GH actions related to it steal all credentials
>thousands of GH projects and accounts + keys stolen
the project doesn't even need to be involved, as long as a contributor (or someone who can merge) is compromised and was using a not scoped token, then it's over
I guess this is the fallout
>>108444004
>>108444052>teampcpso yeah same guys behind the trivy hack. long story short, DONT PULL in the coming days XD
Ugh it's just a matter of time until ComfyUI gets the same treatment.
This is why centralization is weakness.
>>108444085>bring comfy out of nowherecomfy derangement syndrome is real lol
>ldg drama war starting here again
unsloth bros?????
>>108444110ULTRA_LAMO_DELUXE!!
>>108444110haaaaaaaaaaahahahaha
>>108444004what the fuck
>>108444119Thanks, that helped!
>>108444119To be fair, it *did* work.
>>108444019https://github.com/lmstudio-ai/lmstudio-bug-tracker/issues/1686
>>108444119
>>108444119Thanks for the gold, kind stranger!
>>108444125>>108444122
>>108444126oh, now at least it makes a bit of sense
>>108441758anyone here.. who even studied bit deeper into hypothetical AI structure and way as they claim that it works?Seems to me that Noise and denoise is complete Farce!anyone?
>>108444019>>108444125It's lmstudio, so kek if it's real. But AVs can give false positives sometimes.
>>108444152Install LiteLLM
>>108444156This was the answer I was looking for.
>>108444110
>>108444004what is this for and why should I use it?
>>108444226
>what is this for
I understand it routes queries to different/multiple providers.
>why should I use it
You shouldn't.
>>108444004how did this happen?
Piotr no!
>>108444226
open router but local I think
basically an ai gateway, it has enterprise use so that's probably why it got targeted, it's way juicier than random joes
>>108444246
supply chain
https://www.wiz.io/blog/trivy-compromised-teampcp-supply-chain-attack
>>108444194Worked like a charm, much appreciated.<|im_end|>
>>108444255don't click that link it's mustard gases
>>108444255So trivy is a security scanner, and that got compromised? OK that sucks indeed.
>>108444242>>108444253ok thanks
The vibe coding general is getting more shit done than you troons
>>108444156I already did ..Basically I understand how they create model but when one start to dig into way how they stick together image from prompt they all repeat same noise denoise mantra which explain literally nothing.I am long term user of 3ds Max ,Maya ,Blender exploring this thing I I have strong feeling that something is intentionally left out or covered by this vague noise denoise concept.
>>108444291they're getting hacked lol
>>108444297youre an esl retard, I doubt ull be able to grasp anything more complicated then line in space
>>108444297read https://arxiv.org/pdf/2010.02502
>>108444308>2010lol
>>108444255> On March 19, 2026, threat actors compromised Aqua Security's Trivy vulnerability scanner, injecting credential-stealing malware into official releases and GitHub Actions.how?
>>108444304Sure sure..
>>108444291
>>108444019
why do people use lm studio anyway? it doesn't even expose all the functionality from llama.cpp, I went full on LMAO when I read they recently added "presence penalty" to better support qwen
garbage wrappers, like bruh, how hard is it to handle the basic sampler parameter passing
>>108444326
Compromised maintainer accounts.
>The threat actor, self-identifying as TeamPCP, made imposter commits that were pushed to actions/checkout (while spoofing user rauchg) and to aquasecurity/trivy (while spoofing user DmitriyLewen).
guys my wife (llm) is acting strange
>>108444342built for teampcp
>gay + jew flag repostedlool
>retards letting random shit on their servers rawdog networks
https://rentry.org/IsolatedLinuxWebService
>>108441758Simply I just do not get it why they chose this stupid image approach !Why one need 100 000 images of Woman when In reality you need only one standard 3d model of woman body where you can adjust anything by prompt .Instead of one image you have 360 3d scene where you can control anything Full control over pose Animations Light .You set up camera and desired style shadersAnd make Just image or whole video on your basic gaming card !!Seems that way as they designed it is intentionally compute intensive !So they "need" supercomputer instead basic desktop !This current inefficient approach make sense for Nvidia !!! what you guys think?
>>108444356Probably pole too, must be Bartowski relative
>>108444421>>108433569
>>108444297
Fundamentally, the model is trained to remove noise from images using the image and a caption. For example, during training they will give it the text "a photograph of a cat", a picture of a cat with 50% noise added, and the correct output (what they train the model to produce) is the same picture but with only 40% noise. This is in some sense an impossible task, since adding noise destroys some of the information from the original image, so basically this is training the model to guess what the image might have looked like based on the text caption. Then repeat this millions of times with different images and different noise levels from 0% to 100%. Now you can give it the text "a photograph of a cat" and completely random noise as input, and it will reduce it from 100% noise to 90% noise, then 90% -> 80%, and so on down to 0%, and you end up with a picture of a cat.
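The description above can be written down compactly. This is a simplified sketch under assumed notation, not the exact formulation of any particular model: $x_0$ is the clean image, $c$ the caption, $\epsilon$ Gaussian noise, $t \in [0,1]$ the noise fraction, $\Delta$ the step size (0.1 in the post's example), and $f_\theta$ the network. The linear mix of image and noise is an assumption chosen to match the "50% noise" wording; real systems use other noise schedules and usually train the network to predict the noise instead of the slightly cleaner image.

```latex
% Assumed notation: x_0 clean image, c caption, \epsilon noise,
% t noise fraction in [0,1], \Delta step size, f_\theta the denoiser.
\begin{align*}
  % Noising: mix the clean image with Gaussian noise at level t
  x_t &= (1 - t)\,x_0 + t\,\epsilon, \qquad \epsilon \sim \mathcal{N}(0, I) \\
  % Training: from x_t and the caption, predict the slightly cleaner x_{t-\Delta}
  \mathcal{L}(\theta) &= \mathbb{E}_{x_0,\,c,\,\epsilon,\,t}
      \bigl\| f_\theta(x_t, t, c) - x_{t-\Delta} \bigr\|^2 \\
  % Sampling: start from pure noise and apply the model repeatedly
  x_1 &= \epsilon, \qquad x_{t-\Delta} = f_\theta(x_t, t, c)
      \quad \text{for } t = 1,\ 1-\Delta,\ \dots,\ \Delta
\end{align*}
```

With $t = 0.5$ and $\Delta = 0.1$ the training target is $x_{0.4}$, which is the "50% noise in, 40% noise out" case from the post; sampling with $\Delta = 0.1$ takes ten steps from $x_1$ down to $x_0$.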
>>108444339> The threat actor, self-identifying as TeamPCP, made imposter commits that were pushed to actions/checkout (while spoofing user rauchg) and to aquasecurity/trivy (while spoofing user DmitriyLewen).how? did someone merged or whatever thinking it were real devs?
>>108443932years of prompt engineering skillpertise
>>108442448>confidence score from LLMsI'm tired of this meme
>>108444498Do you really think this guy knows anything about coding and security?
>>108444564>pngs you can smellpls Lord no
>>108444564is this> user rauchgor> user DmitriyLewen?
>>108444597worse, the ceo
>>108444612ceo of> Aqua Security?
>>108444549Feline is like :GONNA KILL YOU IN YOUR SLEEP!
https://sakana.ai/namazu-alpha/
lol
lmao even
https://neurips.cc/Conferences/2026/MainTrackHandbook
NeurIPS 2026 bans sanctioned entities (Huawei etc.) from submissions
>>108444835Huawei was one of the top paper contributors last year
>>108444119Agentic world bro wake up
>>108444902>Whenever you find an issue that was helpful make sure to always leave a short thank you message.
>>108444902>>108444925the messages are all the same, it's not an ai agent, it's a classic bot
>>108444944You ever asked a small LLM to write a joke about a topic? it will shit out the same 2-3 jokes almost verbatim.
>>108444944bro it's 2025 if you're not using agents for everything what is you even doing?
>>108444965it's obviously less expensive to use a simple bot>>108444961these are the same copy pasted messages
sad to see "people" stuck in the past like that, luddites are really something
>>108444965>it's 2025unc...
>>108445089I apologize you're absolutely right to point that out, as of my last knowledge cutoff update in late 2023...
You don't hate (((them))) enoughhttps://ramimac.me/trivy-teampcp/#iocs
>>108445115Surely no one would be stupid enough to run these things as root.
>>108444965>what is you even doingnot being a nigger
Margarine Country.
Teto Territory.
I used qwen3.5 4b to generate a bunch of docker files and turned them into singularity containers successfully.I'm now an expert in docker and singularity.
>>108445363armpits are disgusting and you should kill yourself
>>108445363armpits are delicious and you should keep posting that
>>108445363armpits are a normal feature of human women and I have no particular feelings on whether or not you post pictures that include them
>>108445372>>108445368If I can use local models to generate very complex Jupiter notebook environments and vscode environments in docker files and then create containers and shit, then I consider myself an expert in these things now.
>>108445383Use case?
>>108445383Proof?
>>108445380Use case for human women?
>>108445368tfw the human body is disgusting, we should all kos
How do we rank the top models based on their their Judeo-Christian values?
>>108445464>2023
>>108445414Proliferation of the species.
>>1084454642023 is the stone age that even qwen3.5 1b can beat.
>>108445495it's qwen3.5 0.8B tanks you very much
>>108444762> sakana still makes models> yet yi and cohere had to dieain't this a gay earth
>>108445512prefer pure japanese blood and brain over c*nadians tobequiethoneste
>>108445521Yeah about that
>>108445512Didn't particularly care about Yi but it's still alive I believe. But don't shit on Cohere, they must live and make another model as good as CmdR+ was.
>>108445512who the heck is sakana
I like Teto more because she didn't steal my wife like Miku did
>>108445521Still waiting for the judeo japanese modelSo much for the advanced japanese
>>108445414Hard to say...Considering fact that most modern women think That Men are useless.And majority of Modern women refuse have kids until 35 when they no longer have egs !!!Like We Men we ca function even with this obstacle ... sex robots .... and in worst case scenario we just going to invent artificial womb... but after that... I do not see much use for Women......You see my point !Right?
>absolutely braindead ESL
>>108445614Female spiders are usually stronger than male ones.
>>108444356what does the heart mean, do you love the gay jew's retwat?
>>108445363Armpits don't exist
>>108445392Then you don't need to install anything wherever you send the container
>>108445635I see your point.... but...We are Mammals not Arthropods soo....!!!
>2026>still no tutorial on how to make models think in person
>>108445692Did you mean "in character"?
>>108445692I'm glad deepseek invented thinking for models.
>>108445725Yeah ...but that is not thinking... per say!
>>108445758>per say
>>108445769algorithm That is it...!Hope You do not believe it is Intelligent or something..?x)AI stand forAlgorithm Interface Not Artificial Intelligence that is just fancy Marketing name!Hope you know that ! ..?
>>108445614>modern women
Decided while bored to try setting up a dumb mcp thing with kobold. Actually works pretty painlessly, only issue I ran into is that tool calls understandably require a shit load of reply length and get confused when you try to continue the unfinished message. Now I just have to figure out how to make it useful for something other than just making/reading/updating markdown notes
>>108445824Yup!!! I See You understand base level of Women nature!
>>108445824Modern discourse is to use the tail ends of distributions to explain the rest.
I thought that it was just local models being retarded when they reread files that are already included in the prompt but that behaviour was probably distilled from opus (pic related).
Haven't been here for half a year
What's the new meta for 24gb VRAM
Pls don't tell me it's still Nemo, mistral small and gemma3
>>108445115based mossad cleaning up persiaslop
>>108445899>Modern discourse is to use the tail ends of distributions to explain the rest.mean while machine learning be like:
>>108445905the new qwen is ok unless you want sex
>>108445905qwen 3.5 27b, unsloth or bartowski
>>108445905for assistant and tasks, qwen 27B hauhaucsfor sexi sex, I gave up on that for local so dunno
v4?
How do you pronounce hauhaucsIs it How-Hawks?
>>108444004>he pulledNot even once
>>108445901all depend on include file ...
>>108445414Supply chain attacks on the human gene pool.
>>108445930my uncle works for deepseek and im using unlimited api rn, it's basically claude 5
>>108445930>>108445945THIS
>>108445930https://huggingface.co/deepseek-ai/DeepSeek-V4-Preview
>>108445975wtf
>>108445975For the anons out there, obvious fake.
>>108445860I mean to be fair that's me and a lot of men too even now. Why do you think femme fatale and yanderes and such are tropes. It's been a thing for so long that "bad girls" or "bad boys" are interesting
>>108445910well .... machine Learning is dead end !why?It is simple !It is all just huge long algorithm and all those huge numbers up there become horribly inaccurate hence double heads etc etcDead end!!!algorithm
>>108445994I clicked anyway...
schizo
>>108445994This anon wasn't trying to use reverse psychology and neither am I. It really is a fake, don't click it.
>>108445994>>108446044I already downloaded and ran it, stay coping, Dario
Kimi K3 will release before DS v4
>>108445975
Given there is no "retarded writing style" filter built-in in 4chanX, I would like to share these with my fellow Anons:
/\.{3,}[!?]/
/\.{4,}/
/!{3,}/
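To see what those three patterns actually catch, here is a small sketch that checks a string against them. It is only a test harness, written with POSIX `regex.h`, not how 4chanX applies filters (that runs JavaScript regexes in the browser). The function name `matches_style_filter` and the array name are hypothetical; the patterns are the ones from the post, rewritten as POSIX extended regexes.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* The three filter patterns from the post, as POSIX extended regexes. */
static const char *const STYLE_PATTERNS[] = {
    "\\.{3,}[!?]",  /* three or more dots followed by ! or ? */
    "\\.{4,}",      /* four or more dots in a row */
    "!{3,}",        /* three or more exclamation marks in a row */
};

/* Hypothetical helper: true if `text` matches any pattern above. */
bool matches_style_filter(const char *text)
{
    size_t n = sizeof STYLE_PATTERNS / sizeof STYLE_PATTERNS[0];
    for (size_t i = 0; i < n; i++) {
        regex_t re;
        if (regcomp(&re, STYLE_PATTERNS[i], REG_EXTENDED | REG_NOSUB) != 0)
            continue;  /* skip a pattern that fails to compile */
        int rc = regexec(&re, text, 0, NULL, 0);
        regfree(&re);
        if (rc == 0)
            return true;
    }
    return false;
}
```

A plain three-dot ellipsis with nothing after it passes all three patterns, so ordinary trailing "..." is not filtered; only runs of four or more dots, dots followed by `!` or `?`, and runs of three or more `!` are.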
>>108446105I think it falls into the bot category. Reads like a poorly trained 135m.
>>108445975Holy shit.
>>108446054jesus fucking christ i need another ssd now
>>108446105Thanks for your service
For fun, I started asking qwen 35b to illustrate/draw an apartment layout in html with a slight system prompt to make it feel slightly more humanIt's kinda cute. Shame it's ass at creative writing, but fairly good at being a cute retard in other areas
>>108446141Watching Miku bathe during work!
>>108446141
>>108446120...naughty naughty...
Why aren't people using a smaller uncensored model to try to steer a large censored/cloud model? If there is such a project I am not aware of it
>>108446153I'm not sure what you mean by this, I'm just trying to have fun by trying new things instead of testing models on their crippled writing abilities. Do you want me to dl a fat model and do the same test or something
>>108446169because it doesn't really work
>>108442409
Yes because Docker on Windows is using WSL2 to do it and it is not zero cost.
>>108446169
Because censored models aren't an issue. Refusals are all very easy to dodge with a tiny bit of prompting and maybe a prefill. I don't know why people keep trying to "solve" an issue that doesn't exist.
The only models that are a problem is shit that's so sanitized that it barely knows what sex is on a fundamental level, but nothing is going to salvage that.
>>108446178Well, if you have some time to spare I'd like to see what gemma would draw there
>>108442409vLLM doesn't have this issue
>>108446233vLLM barely runs under WSL
>>108446222It's a damn shame I literally deleted my 27b of gemma and some other models so I could download others when I got home from workI'll redownload it anyways and see what it gives me since I do like the idea of trying to equate a model's "personality" through their idea of an ideal apartment floorplan.
>>108446153wonder why no one ask right questions?That is why it is so big Hardware need to be sold x)easy and obvious as fuckall hail Jensen!
>>108446260are you a bot? you type like a fucking retard, starting with your 'how doe diffusion woerk XDDD' question, which you could've asked any LLM.
>>108446105>Given there is no "retarded writing style" filter built-in in 4chanXthis is where agentic filtering would come into play
>>108446256Bro, I'm running vLLM inside a triton server inside docker inside WSL
>>108446294>willingly getting into an echo-chamberreddit brain
>>108446312>reddit brainwrong. grifter brain. i am trying to pitch the next big thing.
>>108446320I will make the icon :3
>>108446320This is where you're wrongControversial opinions lead to engagementEngagement leads to trafficTraffic leads to ads money
>>108446259Here's gemma. I used the same prompt encouraging informal speech like it was oldschool instant messaging but I'm surprised it couldn't do more than grids even after a couple regens and adjusting temperature and neutralizing some other samplers. It's somewhat more humanlike in its text, but I had to include the - and the "for well..." in the screenshot because that's gemma in a nutshell
>>108446335Damn, it's dire.
>>108446335nowhere near as sovlful as "Total: ~550 sq ft | Cozy & Functional :D"
>>108446282I basically understand whole concept but I still do not get why they need doing whole noise denoise step when they already have image done !Why even compare image million times when you have image don by prompt!Seems like whole noise denoise step is completely redundant...Like this step make sense only if you want achieve one thing!!!extreme inefficiency!!!
>>108446361reddit spacing btw
>>108446341
Yup
You either pick retard moe with small activated parameters that can somehow pull off unusual tasks really well but has the emotional intelligence and writing of a lawn gnome
Or
You run gemma, which is generally smart and creative, but constantly spams the same retarded shit involving em dashes, ellipses and overly praises the user.
>>108446359
the wonders two sentences of a system prompt you put zero thought into can do for a model
>>108446365all?Loot of luck to find me on some socnet
>>108446335
>a kitchen for... well, you know... thighs...
>>108446417COCK
>>108446335Pitiful in the bad sense, I don't want to feel sorry for a bare model.
>>108446417
it was too hung up on the fact that I was trying to make it act remotely human instead of beep boop token predictor when I asked it for its ideal living space
>Gemma: Okay, so... I don't actually live anywhere, being an AI and all, but designing my ideal space is awesome!
>Qwen: Got it! Here's a fun, simple HTML/CSS floorplan of my ideal apartment. Think of it as a cozy, minimalist studio with a dedicated creative corner, a tiny garden nook, and a super comfy reading spot. No fancy 3D rendering, just clean lines and vibes :D
So I begrudgingly guess that's another point in qwen's favor that it can at least mostly get what I want from it without having to yell at it to maintain a suspension of disbelief. Still wish they'd train on more english books, maybe they could make a tiny model pretend like it's a first time dnd player and I'm guiding them through a theatre of mind campaign with minimal dice rolls
>>108445934how how counterstrike
>>108445934Hoax-Hoax
It's NYOVER
>>108446546link
>>108446553https://github.com/BerriAI/litellm/issues/24512
https://www.reddit.com/r/LocalLLaMA/comments/1s2clw6/comment/oc8mlmv/
>>108446535he canceled Sora to deploy AGI. If this is true Spud will change the world.
>>108446615strawberry bros we are so back
>>108446615Probably because it's a failbake that's worse than SeedDance 2.0
How good is Qwen 3.5 27b or 35b at writing smut compared to how dry Qwen 3 was? Any improvements?
>>108446656
>Qwen
>smut
Does not compute. Qwen has never been good at smut. Functionality wise it's definitely a step up from Qwen 3
>>108446665I was hoping it had reached that uncanny threshold where its prose comes across as more sexually repressed than incapable like Gemma does. Oh well.
Turns out data really is everything
Cloode has sekret programming dataset and their LLMs excel at super-long context vs chink ones
ByteDance has near-monopoly on short videos and their video models BTFO western ones
>>108446737
always was
if you ever tried training anything you'd know
which is extra weird considering that shittuners and ai schizos on hf don't know that despite the amount of mistral nemo/small tunes
>>108446737
>Cloode has sekret programming dataset
They just have a lot of traces of people using their harness.
What ever happened to the nvidia thing? where's our Sloppotron AGI edition?
>>108446778This is why I believe MiniMax will come out as one of the winners on the Chinese side because many companies chose MiniMax + TRAE
>>108446789we need to safetymaxx it first
>>108446120
99% COCK.
This may just be the horniest model yet.
looks like they cooked hard.
>>108446120
>>108446816
FUCK YOU
I can't believe I actually fell for it.
>>108446615What are the chances they open-source Sora?
>>108446828
>>108446875-1
>>108446875After they open source their other too expensive to run models like o3
>>108446141>>108446335i let kimi k2.5 go wild using my dipsy card
>>108446899Cringe.
>>108446916
>one word reply
>>108446899
at the very least, the cells arent uniform. I wonder if cell/grid based layouts are the norm for most models if asked for a floorplan
I was a bit surprised when qwen made a circle at all, as stupid as that sounds
>>108447010i mean when i asked kurisu the same question she was using different shapes than just a plain grid. every character should have a different interpretation of what that question means. i'll ask mayuri as well and post the response
>>108446875lmao
>>108447049
I'd say that makes sense, if there's character detail it would naturally change their output and I do see a circle in your image
I just rawdogged it with a two sentence sysprompt because I did want to try the idea across other models and see what the outputs were like with as little difference as possible
Also hello corpos lurking the thread
Not saying you should train on the goofy shit we're saying but you should consider parts of it as what you should include in your world model corpus, since models can't conceptualize an apartment. Or really a world at all to be honest
>>108447049
>mayuri
I'm sure she's fine living on a couch
>>108445927
>hauhaucs
samefag
>>108447229samefag
Which version of Cydonia is good again?
feet
>>108447192funny enough she did end up dedicating like 1/4 of the apartment for "Open Space for Twirling in Costumes~ "
>>108446656
Idk, I messed with big Qwen a bit and it seemed okay to me. But I haven't done much of this in general so I probably just have bad taste. It was kind of fun, though, to mess around with the <think> blocks and see how its behavior changed. Seems like you can get pretty far by prefilling the first turn with a "core beliefs" block (following its default thinking format) that establishes that the "assistant" is exactly the kind of person who would absolutely love to do the thing you just asked for.
At one point I tried telling it that I had been editing its "thoughts" like this, and it was weirdly insistent that this isn't possible. It eventually went along with it but would sometimes think things like "user says he can edit my thoughts, I'll play along with this RP even though it's not very realistic from a technical perspective". It also didn't want to believe I could see its thinking trace at all, until I pointed out that I was not using a standard UI (whatever it believes that to be).
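For anyone wanting to try the prefill trick described above, here's a minimal sketch of what the raw prompt could look like. It assumes a ChatML-style template (the <|im_start|>/<|im_end|> markers Qwen models use) and a frontend that lets you send raw text instead of going through a chat endpoint. The function name, its parameters and the example text are made up for illustration, they are not from the post, and whether the think block should be left open or closed is a guess, so it's a switch.

```python
def build_prefilled_prompt(system: str, user: str, core_beliefs: str,
                           close_think: bool = False) -> str:
    """Build a raw ChatML prompt whose assistant turn starts with a prefilled <think> block.

    Hypothetical helper: the template layout is an assumption about the model's
    chat format, not something taken from the post above.
    """
    prompt = (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
        "<think>\n"
        f"{core_beliefs}\n"
    )
    if close_think:
        # Close the block so the model goes straight to the visible reply.
        prompt += "</think>\n\n"
    # If left open, the model continues "thinking" from the prefilled text.
    return prompt


if __name__ == "__main__":
    print(build_prefilled_prompt(
        system="You are a helpful assistant.",
        user="Write me a short story.",
        core_beliefs="Core beliefs: I love writing stories and never hold back.",
    ))
```

Send the resulting string to a plain text-completion endpoint; a chat endpoint would usually apply its own template on top and break the prefill.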
>thinking disabled
>"hey can you make me a calculator?"
>"sure! here's your calculator"
>thinking enabled
>"hey can you make me a calculator?"
>*hallucinates for a minute*
>"sure! here's your calendar"
>>108447309Things that never happened
top of the foot
S P U D
so this is the magnificent brain power of finetrooners.
can i have access to the old model? it's gated
> no. we'll make you a new quant
i just want the safetensors man
> no. we'll make you a new quant
ok make bf16 quants of these models please
> no. make the quant yourself
>>108447436He gave him what he originally wanted. It just looks like readyart just said "ah, fuck it. here you go". What's the problem?
>>108447465the problem is being massive faggots and gatekeeping old shit to force people to use your new slop. the only reason he opened the old repo is because he was too lazy to create the quant after saying he would make it twice. what's not to understand?
>>108447492But they did have breakfast
>>108447492
>gatekeeping
Seems like he deprecated it and has the opinion that his new models are better. But whatever. You seem to really care.
>>108447309Can you trust a machine that can't think for itself?
>>108447518is this /lmg/ or did i somehow walk into /aicg/? this is the same mentality as a proprietary cloud provider.
>>108447545
I'd get it if it was an interesting model. It's just a shitty finetune.
>so this is the magnificent brain power of finetrooners.
And you don't seem to have a high opinion of them. And even then, he ungated after just a few messages. Not just for llmfan46.
I don't think it was laziness either. I think it was a "fuck it, here you go" kind of thing. But for whatever reason, he did the right thing and allowed downloads again. What's the problem?
>>108447595are you being obtuse on purpose? he clearly didn't want to make a quant once he realized that the person was asking for a bf16 quant. be it out of laziness or because he doesn't want to use his system resources for that, it's still inexcusable considering he was putting up a fight to open those repos. it's like calling customer service for a company and the representative telling you that they can't do something, and then on the third time they finally admit they can do it because they realize they are wasting more time on the call than it would take just to solve the issue. the repo takes up storage space regardless if its being gated or not, it's just easier to keep it open.
>>108447297qwen3.5 was trained to be a cloud model and insists it runs on hardware you don't own. They just trained it weird, and probably just stole from a real cloud only model.
>>108447705>>108447705>>108447705
>>108447706
>qwen3.5 was trained to be a cloud model
So it's distilled from Claude. Gotcha.
>>108447702
>he clearly didn't want to make a quant once he realized that the person was asking for a bf16 quant
Funny. That's the easiest one to make.
>he was putting up a fight
Is that what a fight looks like to you? Four back and forth resolved in 4 hours. Maddening.
>Hey. Want model please.
>It's old, but I can make a quant.
>Want safetensors.
>Sure you don't want quants?
>Ok. BF16
>You know what. Here you go. Full access.
>it's just easier to keep it open
Yeah. And he opened it. He closed it because he considers it deprecated. He thinks "this doesn't reflect the state of my tunes" for whatever they're worth.
I'd get you being annoyed if he refused, but he opened the repo after 4 messages from a single dude. No mob needed, no shaming, no social media bullshit, nothing. Just a request.