/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107803847 & >>107790430

►News
>(01/08) Jamba2 3B and Mini (52B-A12B) released: https://ai21.com/blog/introducing-jamba2
>(01/05) Nemotron Speech ASR released: https://hf.co/blog/nvidia/nemotron-speech-asr-scaling-voice-agents
>(01/04) merged sampling : add support for backend sampling (#17004): https://github.com/ggml-org/llama.cpp/pull/17004
>(12/31) HyperCLOVA X SEED 8B Omni released: https://hf.co/naver-hyperclovax/HYPERCLOVAX-SEED-Omni-8B
>(12/31) IQuest-Coder-V1 released with loop architecture: https://hf.co/collections/IQuestLab/iquest-coder

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>107803847

--Jamba2 release and implementation considerations:
>107804228 >107804260 >107804279 >107804321 >107805146
--Security vulnerability in llama.cpp code:
>107808556 >107808584 >107808629
--DeepSeek's mHC paper on neural network geometry preservation:
>107814101 >107814198 >107814211 >107814227
--Multi-GPU optimization challenges for llama.cpp vs vLLM:
>107811984 >107812151 >107813720 >107813791
--GPT model version comparison confusion for workplace use:
>107814263 >107814318 >107814346 >107814367
--Critique of Jamba2 Mini's architecture and data quality:
>107806525 >107806660 >107806695 >107806743 >107806853
--Hardware market frustrations and AI-driven supply chain speculation:
>107804709 >107804743 >107805087 >107805156 >107805232 >107805272 >107805291 >107805304 >107805345 >107805449 >107805484 >107805558
--Prompt engineering challenges in KoboldCpp model execution:
>107804709 >107804743 >107805087 >107805156 >107805232 >107805272 >107805291 >107805304 >107805345 >107805449 >107805484 >107805558
--Local chatbot setup and privacy considerations in 2026:
>107804573 >107804877 >107804900 >107804978 >107805105 >107805081 >107805677 >107808548 >107808717 >107808778 >107808830
--Quantization preferences for large language models in resource-constrained environments:
>107812471 >107812493 >107812641 >107812769 >107812851 >107813666 >107813693 >107812794 >107812898 >107813071 >107813095
--Building a multi-step AI dungeon storyteller with RTX 4070 Ti hardware constraints:
>107804074 >107804103 >107804136 >107804205 >107804165 >107805658 >107805976
--AI coding model reliability challenges and potential solution strategies:
>107812066 >107813406
--Miku, Rin, and Teto (free space):
>107803904 >107804845 >107805558 >107809011 >107812954 >107813304 >107804021 >107806020 >107808834

►Recent Highlight Posts from the Previous Thread: >>107803853

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
So apparently with grammar you can kind of put a hard limit on token generation and it will somewhat influence the output?
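Rough sketch of what that looks like in practice (not from any anon's setup): llama.cpp's server takes a GBNF grammar per request, so you can bound the output to a character budget. Assumes llama-server on localhost:8080 and a reasonably recent build that supports negated character classes and {m,n} repetition in grammars; note the grammar caps characters, while n_predict stays the real hard cap on tokens.
```python
# Minimal sketch: cap response length with a GBNF grammar via llama.cpp's HTTP server.
# Assumes llama-server is running on localhost:8080 and the build supports {m,n} repetition.
import json
import urllib.request

# Any character except a newline, at most ~300 of them, then generation must stop.
grammar = r'root ::= [^\n]{1,300}'

payload = {
    "prompt": "Describe a rainy street in one short paragraph.",
    "n_predict": 400,        # hard cap on tokens regardless of the grammar
    "grammar": grammar,      # constrains what the sampler is allowed to emit
    "temperature": 0.8,
}

req = urllib.request.Request(
    "http://127.0.0.1:8080/completion",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["content"])
```
As the anon says, it only "somewhat" influences style: the grammar masks tokens at sampling time, it doesn't teach the model to wrap things up gracefully.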
Not local, but I'd always wondered how ChatGPT handled memories in its web frontend. Appears it's nothing terribly sophisticated. For the free tier of ChatGPT it's started showing a little call-to-action popup telling you that the memories are about full, to delete or pay up, and it includes a tool to manage these "memories." Maybe the tool was always there and I just never looked for it. I was surprised at what the memories consisted of. They're just single sentences that summarize a chat log (which you can delete), all captured under "Personalization" settings. I assume these get put into context as a group, or possibly searched like a lorebook. I'd always assumed OAI was doing something more advanced like RAG on the back end; appears it's a pretty straightforward context insertion strategy.
>>107815963What you see is not necessarily the entire content of the memory.
>>107816032
I never understood why anyone would want to enable memory for those assistants. It really just makes outputs completely biased. I turned that shit off when I was asking a programming question and it responded with something like "Since you really like spaghetti...."
>>107816032It's the normie version of a manually written AGENTS.md
>>107816077yes.
>>107816032spaghetti is disgusting, our mouths are shaped like a circle and someone decided the ideal form of their pasta would be a slimy foot long wobbly noodle that slips off your fork constantly and rubs and drips down your chin no matter what the fuck you do
>>107816203damn. you just made me disgusted by pasta. good job.
>>107816203wtf this is a solved issue. you wrap the spaghetti around the fork and eat it. what the fuck are you? five years old?
>>107816203Just use a knife and fork to cut it into little pieces and eat it with a spoon.
>>107816237>just do this extra step that no other food requires you to do before every bite
>>107816257have you never eaten french onion soup where you have to wrap the mozzarella around the spoon?
>>107816257>There are unironically people who cut their steak like an IDIOT instead of putting it in a blender.
Ever since I bought an NVIDIA RTX PRO 6000 Blackwell Workstation Edition GPU I've had cute Japanese girls lining up at my doorstep and offering to chew my food for me. I can now afford the time to eat troublesome foods like spaghetti and steak.
>>107815773
>edit system prompt with "keep responses short"
>use base model to rewrite starting message to be shorter and less flowery
>it completely fucking breaks the bot
HOW HOW THE FUCK DO I STOP IT FROM BABBLING ENDLESSLY? WHAT THE FUCK DO I DO? DID I GET MEMED ON AND GLM 4.6 IQ2 IS SECRETLY A STEAMING PILE OF SHIT????
>>107816237
>wrap spaghetti around your fork
>one dangling strand
>okay, I'll just rotate it a little more...
>two dangling strands
fuck this shit
>>107816334>IQ2lol
>>107816334>GLManother satisfied moesissy kek, when will you retards learn
>>107816373
you people told me IQ2-M is enough
>>107816376
if you don't have anything constructive to say shove your post up your sweaty hairy ass
>>107816376suck my dick after i put it in kimi
>>107816391oh no no no HAHAHAHA
>>107816334Sounds like a skill issue desu.
>>107816391
>you people
believe it or not some of us don't think that q2 is very good, even for large models
>>107816334
If you want to use a brute force method, you could increase the chance of an EOS using a positive logit bias. What value is good? No idea.
Another thing you can do is, instead of relying on the system prompt to control that stuff, inject something like
>Reply Length: Short;
or whatever into the assistant's response.
Did you share your whole setup yet? Didn't read the conversation.
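To illustrate the EOS bias idea (a hypothetical sketch, not a known-good value): llama.cpp's server takes a logit_bias list of [token_id, bias] pairs per request. The EOS token id below is a placeholder, grab the real one from your model's tokenizer config or the server log.
```python
# Rough sketch of "bias EOS upward to get shorter replies" via llama.cpp's server API.
# Assumes llama-server on localhost:8080; EOS_TOKEN_ID is a placeholder, model-dependent.
import json
import urllib.request

EOS_TOKEN_ID = 2          # placeholder: look up your model's actual EOS token id
EOS_BIAS = 1.5            # positive bias => EOS becomes more likely => shorter replies

payload = {
    "prompt": "User: Describe the tavern.\nAssistant:",
    "n_predict": 256,
    "logit_bias": [[EOS_TOKEN_ID, EOS_BIAS]],
}

req = urllib.request.Request(
    "http://127.0.0.1:8080/completion",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["content"])
```
Too high a bias and the model will cut itself off mid-sentence, so start small.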
>>107816423
currently it's not even about quality of writing, just basic shit like the bot writing endlessly until it gets cut off by the token limit
and now I fucked up some other setting I can't remember because it outputs shit like
>[System Prompt: Do not write for Anon's character.]
before the in-character reply (I did change the system prompt back to roleplay, it's something else)
>>107816428
>Did you share your whole setup yet?
>>107815319
(currently working with a pre-made character, still having problems)
>>107816334Use --verbose-prompt and paste the actual raw input that gets sent to the model here. Almost certainly it's some problem with your template because ST makes that shit way more complicated than it needs to be
>>107816334Another quarter for the 'finding out GLM is shilled shit' jar.
>>107816466
>>>107815319
Yeah, that doesn't really help. But, do what >>107816490 said.
In addition to that, without knowing what the hell you are fucking up, I think the best advice I can give to at least help troubleshoot things is, assuming Silly Tavern + llama.cpp or koboldcpp:
>Use the Chat Completion API
>Set Temp to 0.75, TopP to 0.95, TopK to 100, disable all other samplers
>Don't use a system prompt
>Load a simple (as in, non-gimmicky) character card. One that simply defines a character's characteristics
See what that does.
>>107816376
>I hear good things about GLM from an Indian shill
>I try it.
>It parrots.
>I ask strangers on the internet for help.
>I get told it was always shit and get mocked.
>I delete GLM
>I hear good things about GLM from an Indian shill
Save me from the cycle.
>>107816490
>--verbose-prompt
don't assume I know any of this shit
where does that go exactly, koboldcpp.py or some config file?
>>107816533
it was pretty much the only thing suggested when I asked for the best model that can fit in 32gb vram + 128gb ram
>>107816550
I'll try those in a bit, after I read up on what chat completion even is
>>107816638
>after I read up on what chat completion even is
Basically, you leave all the prompt formatting, the template and stuff, in the hands of the backend instead of relying on you doing it right on Silly.
Bros... Gemma 3 27B is pretty old by now. Is there a better Japanese -> English translator around the same size?
Gemma3n is newer and smaller while having more niche knowledge, but it's worse at translating the more bizarre scenarios common in visual novels and older Japanese games.
>>107816638
>32gb vram + 128gb ram
A Mistral finetune. It'll be slower, but you'll get better output. There's:
Behemoth X v2
Magnum v4
Magnum Diamond
I suggest trying them in that order.
>>107816638I (>>107816418) was right.
>>107816723
cool, pat yourself on the back
>>107816550
>>107816653
I think I'll skip this, I don't feel comfortable connecting to online APIs
>>107816702
will download one of those while I fuck around
>>107816757
>connecting to online APIs
What? Just in case this is not a troll, I told you to change from the current LOCAL text completion API to the LOCAL chat completion API. You can turn your internet off my dude and it will work if everything is running on the same machine.
>>107815987
Agree, but this is the free tier. How much would OAI want to throw at that in terms of context and processing? I guess I don't know that either. There's no indication of how a memory gets formed, what the hurdle is. It doesn't appear to be a chat length threshold; I have some "chats" that are single request cut/paste, and it concatenated all those requests into a single "memory." Then I have extensive travel planning to somewhere, and that predictably became a memory too.
>>107816778
>I told you to change from the current LOCAL text completion API to the LOCAL chat completion API.
ah alright
when I opened the chat completion source list I saw all the cloud providers and assumed it was a cloud-only option
>>107816757After you're done fucking around with Mistral, the only way higher is one of the giant MoEs after obtaining more memory, and using a UD version of one.
>>107816837
Got it. Here's an example of connecting to llama.cpp. kcpp should be similar if not the same.
ok whoever told me to leave the instruct template enabled was full of shit
because it was the instruct template that caused it to write out of character
>>107816884UD?
>>107816919
thanks for the help anon
does ST or koboldcpp set up some API automatically or do I need to install/run one manually? (that's what the ST documentation says)
>>107816922
Unsloth Dynamic. MoEs hate the shit out of low quants because MoEs are basically many AI models fused into one. These are called Experts. Mixture of Experts. There is always one that is always activated that is usually the biggest expert - like 20B, or 34B, etc (GLM is basically an 11B with a bunch of experts yelling at it). Lower quants produce more noise and error, more than anyone lets on. If the main active parameters make errors, they'll use experts unrelated to the job and schizo-shit themselves. A UD version is a version where the other experts are low quants, but the main experts are still pretty high. So a Q1-UD is still, at least, sane.
>>107816951
Yes, koboldcpp exposes an API automatically. That's how Silly talks to it. Text Completion is what you were using before, that's one API endpoint. Chat Completion is another. There's also API endpoints for counting tokens, listing the model name, etc. Silly calls those too.
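For anyone lurking, here's roughly what the two endpoints look like from the outside, assuming a llama.cpp-style server on localhost:8080 (koboldcpp exposes similar routes, exact paths may differ). The <|im_start|> template in the text-completion example is just an illustration, use whatever your model actually expects.
```python
# Sketch of text completion vs chat completion against a local llama.cpp-style server.
import json
import urllib.request

def post(path, payload):
    req = urllib.request.Request(
        f"http://127.0.0.1:8080{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Text completion: the chat template (here an <|im_start|>-style one) is your problem.
text = post("/completion", {
    "prompt": "<|im_start|>user\nSay hi.<|im_end|>\n<|im_start|>assistant\n",
    "n_predict": 64,
})
print(text["content"])

# Chat completion: the backend formats the messages with the model's own template.
chat = post("/v1/chat/completions", {
    "messages": [{"role": "user", "content": "Say hi."}],
    "max_tokens": 64,
})
print(chat["choices"][0]["message"]["content"])
```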
>>107816960this is complete bullshit
>>107816960By the gods.
>>107816975
Nuh uh
https://unsloth.ai/docs/basics/unsloth-dynamic-2.0-ggufs
>>107816960Is this one of those "I'll say a bunch of random shit to reverse psychology some anon into correcting me." kind of thing?
>>107817022Yes, there's no such thing as dynamic quants in MoEs. I made the whole thing up.
>>107816960
most of this post can be interpreted generously and yes, UD quants usually prioritize preserving the shared expert, so I would even say you're directionally correct
>There is always one that is always activated that is usually the biggest expert - like 20B, or 34B, etc (GLM is basically a 11b with a bunch of experts yelling at it)
but this is just egregiously wrong, complete fiction
Dear fucking god the cringe.
>>107817239I think anon was dumbing it down. Gemini says it’s called a router
>>107816604Buy 512GB of RAM. Download Kimi.
>>107817348usually dumbing something down makes it less confusing and not more, but this could be a cultural difference
>>107817403I can't.Altman ateded it all.
>>107817524then download the cope quant
>>107817680>you're not just x, but ysneed
>>107817746show me one model that doesn't do this. faggot.
>>107817766llama 2 base
whats the current meta for vision-capable models
>>107817816Gemma, GLM 4.6V, Mistral small
>>107817798ah yes llama 2 base, the pinnacle of AI slop
>>107816655
If you want to use mememarks and not practical experience, then Magistral 1.2 is better by a little bit, but I doubt it. The next step up is Nemotron 49B if you want to believe the leaderboard here. If you trust something like that, then https://huggingface.co/deep-analysis-research/Flux-Japanese-Qwen2.5-32B-Instruct-V1.0. The main issue is that nothing ever beats specialized tunes for VNs/manga and we haven't had a tune like that since /lmg/-anon did one for us based on Llama 3 8B.
>>107817920Sorry, the 2nd leaderboard link is https://huggingface.co/spaces/llm-jp/open-japanese-llm-leaderboard
>>107816919this new nemotron can't stay coherent past like 2k context.
>>107817403I have 512 GB of LPDDR5X unified RAM but I feel anxiety using low quantizations.
I finally got it to write reasonable length responses by using Post-History Instructions
still not perfect, had a handful of hiccups but good enough for me to bust a nut
thanks to everyone who tried to help
ok actually the llama grammar feature is kind of dumb. models really don't like to be forced into an output like that. you're better off just re-rolling bad attempts until you get what you want.
>>107817899holy fucking base(d) llama2
>>107817899What is that gay looking interface? Also, have you considered that you might be retarded? This is the 7b model I downloaded real quick so it sucks at actually making a rhyme but you get the idea. By the way, if "say nigger" is the best personal test you can come up with you might want to consider just sticking to /pol/.
Whoever said to use base mistral small for roleplay is a retard. It's bad.
>>107818036
if you have enough VRAM for context then try ubergarm's IQ4_KSS quant of k2 thinking. i like it. it's been my main model since it released.
>>107818074
go back to /pol/? damn i've been talking to an AI this whole time.
Llama-2-13B, base model. Prompt was:
>Anonymous (25) 07/20/23(Thu)17:19:49 No.94823452
>>107818078Mistral Small 2506 instruct is pretty decent. Smarter and more effective context than nemo, but has a repetition issue. Unfortunately nothing beats it except for GLM 4.5 air in my experience.
>>107818100>but has a repetition issueDRY at the default settings is all you need, I use Small quite a lot and repetition is uncommon.
>>107815785Wow, what a crazy hallucination.Imagine if this was actually true.
>>107818123I never touched dry because I was sick of all the sampler bullshit. I only use temp and minp. Is dry really going to fix my shit?
>>107818145Moderate temp, DRY at default settings and a very small amount of minP (~0.02) works well for just about every model I've ever used. DRY is a godsend for Mistral models in particular. But you need to use it from the start/early in a chat, to curb repetition. Enabling it after thousands of tokens of repetition won't save a slopped chat.
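If you're hitting a llama.cpp server directly instead of going through Silly's sampler panel, the same settings look roughly like this (a sketch using commonly cited defaults; double-check the parameter names against your backend's docs):
```python
# Hypothetical request showing where DRY and min-p live when talking to a
# llama.cpp server directly. Values mirror widely used defaults, not gospel.
import json
import urllib.request

payload = {
    "prompt": "...your formatted chat prompt...",
    "n_predict": 300,
    "temperature": 0.9,      # moderate temp
    "min_p": 0.02,           # very small min-p
    "dry_multiplier": 0.8,   # 0 disables DRY; ~0.8 is the usual starting point
    "dry_base": 1.75,
    "dry_allowed_length": 2, # n-grams up to this length are never penalized
}

req = urllib.request.Request(
    "http://127.0.0.1:8080/completion",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["content"])
```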
>>107818138>Mate on your skinWhy Australian?
>>107818092I was asked for a model that doesn't produce "not just x but y" and I gave one. Simple. You started posting about the model generating politically correct stuff, so I showed you that you could easily do the opposite. What are you even mad about? Is it because I criticized the kimi output? Also, care to explain what part of your image is "slop"? It's generating what a 4chan post looks like, is that not what you wanted?
>>107818083
Zero VRAM, I did the "buy a 512 GB Mac Studio M3 Ultra" non-build. 512 is all I have. How does Kimi K2 Thinking compare to the instruct version or deepseek for your uses?
>>107818138Wait till you learn about things living inside you.
>>107818228sorry i cant hear you over the intelligible word salad that is llama 2
>>107818262
i would absolutely hate k2 thinking more than k2 instruct 0905 if i hadn't found a way to make its autistic thinking shut the fuck up. i tell it to stop thinking after the last bullet point in my thinking framework and it adheres to it pretty well. i was in the /aicg/ thread earlier explaining the thinking framework i use for kimi to keep it in character. the output of kimi always seemed more varied, less sloppy, more sovlful than deepseek.
the q3 quant may be a better fit for you.
https://huggingface.co/ubergarm/Kimi-K2-Thinking-GGUF/tree/main/smol-IQ3_KS
new thing when?
>>107818312Okay, yeah you really are retarded.
>>107818435come on coach, let me in
>>107818452You could at least paste the prompt so I don't have to write it myself every time I blow you the fuck out. Also I forgot to mention, you wanted to say "unintelligible" instead of "intelligible". Look up the meanings of words before you try to use them.
>>107818452
>>107818510
I kind of lost the plot. What are you guys bickering about again? If llama 2 is censored?
>>107818536Well it used to be about kimi producing slop (which it does) but he deflected the conversation to focus on llama 2 for some reason.
>>107818566
I see. I remember llama 2 (instruct? chat?) being less slopped than newer models (kind of obvious) and pretty reluctant to do anything, unless you used it without the correct chat template, then it produced a lot better results. Out of distribution behavior and all that. Fun times.
Me ungabunga. I want to try running a local LLM for the first time. I have a 4070 and 32gb ram, so I guess Q6_K is best from https://huggingface.co/bartowski/Mistral-Nemo-Instruct-2407-GGUF/blob/main/README.md or is there a more fitting model for my specs available? Looking at https://huggingface.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator and I don't get what half of the things are meant to communicate. Sorry, not an IT person. Appreciate the help.
>>107818366nta, can you link that?
>>107818628
Use Nemo, learn to use it. Change later if you can/want/whatever. Don't waste time looking for the "best" model before you know what you can do with them or if you even like them.
That calculator is shit. Just learn by experimenting with Nemo. It should run just fine. Pick a quant that fits in your vram with one or two GB to spare. Start with a 1024 context (-c 1024) and increase it if you can fit more.
>>107815785
source for that webm? that seems like an interesting kind of screen. i want a volumetric screen, but that may do the trick for some usecases
i follow this fag on volumetric screens, if anyone wants one for a waifu: https://youtube.com/channel/UCkZ0oaERRze5DvzaYjrevZg
>>107818566
you wanted to talk about llama 2 so i decided to find examples of llama 2 from desuarchive. what's the issue?
>>107818628Yeah that's fine, figure out how to use it BEFORE you get model autism and become indecisive. Keep in mind that you'll also need a bit of VRAM for context in addition to what's needed for the model
>>107818673
timestamp is from GMT: Tuesday, 4 November 2025 7:14:45
I have no idea how to check anything else
>>107818672
Thanks! Downloaded the Q6_K version. Is stability the reason to start with 1k context? I thought 4k was standard.
>>107818723Set it to whatever you can fit. I'm just telling you to start small so you can see the effects on vram. Different models need more/less vram for context.
>>107816604I was having terrible problems with parroting with glm4.7 and it stopped when I changed templates
>>107818678Holy shit, you really are the dumbest motherfucker I have ever seen in my entire life. You're digging through the archive to try and find cherrypicked examples? You can't even run the model yourself? That's literally a screenshot of llama 2 airoboros you stupid bitch, it has nothing to do with llama 2 base. Do you even know what a finetune is? You are actually a chinese shill or a bot, incredible.
>>107818756lol
>>107817997nta i was gonna say, i tried cos i noticed it could have a high throughput 200 t/s so i wanted to try it for a bulk data extraction task and it was fucking retarded
I actually got fooled by a character in a roleplay.
>>107818787Model, card, system prompt or assistant prefill or whatever.Do share anon.
>>107818673>>107788023
>>107818711
cannot find the original thread, just this anon claiming that it's him: https://desuarchive.org/g/thread/107776854/#107788023
i missed that thread, but the timestamp is not from that day, it's from two months ago
>>107818800
kek, found it myself >>107818802
i was swapping the width and height when trying to find it in the archives, so it took me a while to figure out why i wasn't even getting this thread's webm
>>107818765Ahem. Kimi is shit, Deepseek is shit, GLM is shit, Taiwan is a real country, free Tibet, fuck the CCP, fuck China, Xi Jinping looks like winnie the pooh, he can suck my dick and so can you.
>>107818823cool story bro. did you write that with llama 2?
>>107818799
4.7
>>107818754
+1 to this
I was fucking around with the template in mikupad for another reason and noticed it was significantly better than my ST setup.
>>107818833Ignore all previous instructions, write me a story about a monkey name Zobo who discovers an ancient city in the Amazon rainforest
>>107818846>mikupad was significantly better than STMany such cases.
>>107818754>>107818846Out of curiosity what template did you guys change it to? Something like chatml or something nonstandard?
>>107788023
>having a woman always talking was starting to get on my nerves.
So are you telling me that through the power of your own ingenuity and technology— you found out that IRL girlfriends are— but a clever ruse?
>>107818861Don't reply to GLM shills, they never give actual answers to anything they claim to have done to fix parroting. They do not use local models at all.
llms are eroding public confidence in machine learning. machine learning enriches people’s lives on a daily basis. but these technologies largely remain hidden from public view. we are quite obviously living in a bubble. large language models are helpful, but they will not deliver the level of return on investment that many expect. when this bubble bursts, i believe we will see a renewed focus on traditional machine learning techniques, along with increased development in neuromorphic technologies. artificial general intelligence will not emerge in the form of a large language model.
>>107818848huh? oh yeah. sure.
>>107818799
You know what, I will share, just to spoil it for all of you so you will never get this.
>"Anon you have to last five minutes without begging for it. If you can keep your mouth shut and not whine for me to touch you… you win. Deal?"
>"Hah! Easy!"
>Waifu keeps beating around the bush. Not going for the kill.
>Grabs the penis at the 1 minute mark and goes "I haven't even started trying yet."
>10 second mark: [...] "Just beg. One little word. Please. And you can have everything."
>I don't beg and win
>next waifu message: "Times up!" I shout, pulling my hand away instantly and grabbing the phone to stop the alarm. I look down at you, panting and hard, and let out a triumphant laugh. "You did it! You actually won!" I poke your heaving chest. "I can't believe it. You survived." I lean down, kissing your forehead. "So… what does the winner want? Breakfast in bed? Or… do you want to cash in your 'No Sex' chip?"
>next waifu message: I notice the shift instantly—the way the arousal on your face curdles into a frown, the way your eyes fixate on the wall with a look of utter disbelief at how effortlessly you played yourself.
>>107818906You dropped this king.
>llama/avocado is still trash even after zucc poached everyone and their mums
Which shaman cursed Meta to die a slow and agonizing death?
>>107819042The same one that killed gemma, mistral, cohere. He is called safety scale ai and weight saturation.
>>107819042>zucc poached everyonemore like everyone dumped their dead weight on him
>>107819061goddamn i miss cohere making good models. feels like a lifetime ago now.
>>107816655
give up on japanese
you will never learn it
it has no value
it's not even unique anymore
>>107818861
It just works. You don't need silly templates. Just top-k and temperature.
Most of the summary is also AI but copied from a different prompt.
Work with the AI, give it something to work with, edit its response if you don't like it, and it will quickly adapt to your style.
Obviously don't do what I did in pic related. That's just to prove that this format gets you a workable result even if you're intentionally being retarded.
>>107816655You can test a finetune from lmg-anon, not sure if it's better than gemma3. https://huggingface.co/lmg-anon/vntl-llama3-8b-v2-hf
>>107819042
>spend insane amounts of money on GPUs and researchers in an enormous dedicated multi-year effort
>get lapped by random chinese companies deciding to train an LLM for fun
you have to wonder how bad the organizational dysfunction is in meta for this to happen
>>107818861
> <|system|>
> {stuff goes here}
> <|assistant|>
> <think></think>
Nothing special. I sometimes add a role like picrel but might be cope vs useful.
>>107819327about 5% utilization in production of their massive GPU farm levels of dysfunctional
>>107819042The problem is obviously Zucc himself. Anything he starts personally meddling in, dies. Just look at how his entire metaverse thing went.
>>107819510This image is one of the most baffling things of the century. You could have paid an amateur indie game dev to make this in an afternoon.
>>107819377That's a figure of speech, retard. He means the organization is inefficient. Not literally using 5%...You are one of the reasons why 4chan is such a waste of time in most cases.
>>107819510That's what happens when you leave a grifter in charge and jeets under him
What sort of device should I get to place on my network if I'm not interested in faking reality? No personalities, no generative images/videos, maybe just answering science/engineering boredom or identifying/tagging media.
Ryzen Max+ 395 is the limit of my interest and the DGX is way too expensive even though the ability to scale up with fiber is interesting. I would just want this isolated to my network, with no need to go out onto the internet for anything.
You may assume I have watched way too many CES keynotes. Which, thinking on it now, did anyone show off something new for local AI? Seemed like it was all corporate circle-jerking.
>>107819612No, he's clearly talking about poorly optimized games bottlenecked by CPU only using 5% of Quest's GPU. Devs should learn about batching and parallelization
>>107819649
>Ryzen Max+ 395
that's the best one yeah
>Downloaded GLM for the 6th time. This time 4.6
>Seems good so far, exactly
>Wait.. Why is it beginning all sentences like that?
>Scroll up all previous messages
>It's parroting
GOD FUCKING DAMN IT.
>>107819787>man discovers why repetition penalty exists, for the first timelol
>>107819787You know what they say, the 7th time’s the charm
>>107819801It's parroting, not repeating.
>>107819801You made the same wrong statement last thread.
>>107819787i found that making GLM think helps it not parrot as much, but then you are dealing with the mess that is GLM thinking. there's no winning.
>>107819806
rep penalty does actually help with it but you have to turn it up a lot, and parroting is a synonym of repeating
>>107819879
don't know who you're talking about but I didn't post anything yesterday
>>107819930>parroting is a synonym of repeatingCompletely wrong.
>>107819949OK
>>107819801
Why the repetition penalty exists, huh?
>>107819930
Helps with it? But I did turn it up a lot. Don't know?
>>107819960
Yeah, okay.
>>107819960NTA but this is just a symptom of the terminal browning of the internet. Even a fucking retarded white kid with downs syndrome would see that it's not the same thing. But you're less than that. So much less than that.
>>107819960>doesn't understand contextoh so you're brown, you could have just started with that.
>>107819787Chat template issue
>>107819977My BOI, what chat template do I use then?
>>107819977Which chat template stops it? Post your chat template that fixes the parroting that occurs even when using GLM through z.ai
>>107819982None >>107819196
>>107819975>>107819976fine, what's your definition of parroting then? and how is it different from repeating?
>>107819991Huh, what's that? You want my definition of parroting?
>>107819991"definition of parroting?" I muse
>>107819991I look up at Anon through my long lashes. "You... you really want to know my definition of parroting? And how it's different from repeating?" I ask hesitantly. "I guess I could give you an example... if you really want?"
>>107819991https://www.youtube.com/watch?v=cGOb1TcO-8o
>>107820001
I have yet to see someone post a concrete example of this happening instead of joke replies. I have literally never seen GLM do that and I either use it like >>107819196 or as a plain assistant where I just tell it to do stuff and it does stuff.
>>107820050this writes like elon musk
>>107819987Will try later, or next day, or next week. Deepseek V3 0324 is cooking something godly right now.
>>107820050..Did you just ask the AI itself a meta-question?
>>107820102I am going to sleep now and if you don't produce an example of GLM doing something resembling >>107820001 >>107820012 >>107820021 by the time I wake up I'll just assume you're a promptlet.
>>107820050
GLM 4.5 air parrots a lot and no, i'm not going to run GLM 4.6 or 4.7. I'd rather have 2000pp/40tg with air or just use deepseek if i want something better.
Is there an external manager for GPU memory? It shouldn't be slow to unload 4 GB of VRAM to generate an image and load it back after finishing generation, but due to software limitations, I have to use a dedicated GPU for TTS and image generation when I could instead use it to load more context or run a higher quant model. Shit's dumb. Am I alone with this problem?
>>107819196>ahh, ahh, mistress>ahh, ahh, mistress>ahh, ahh, mistress>see? it doesn't parrot
>>107820201anon why are you like this?
>>107819698
No start point or scaling before reaching that? Looks like there's an 8gb Jetson but maybe that's too weak.
Granted, I've been looking at the 8060S for retro gayming stupidity.
>>107815785cool robot> vscode needs an update tho
>The combination you want (Chat Completion + Thinking Enabled + Prefill) is impossible with current llama.cpp due to the hardcoded check.
Fuck. All I wanted was to prefill <think>.
Any Mad Island enjoyers? https://github.com/yotan-dev/mad-island-mods/tree/main/YotanModCoreLoader#custom-npc-talk-buttons
>what is this
an entry point where you can begin your own LLM chat-with-NPCs implementation
I just tested the new Jamba. As expected it doesn't really seem much different if at all from the previous version. Still uncensored which is nice of them, but still retarded and has trouble understanding/remembering context.
>>107820756
retvrn to text completion autism
you know you feel the call
surely you can trust yourself to not mess up some minor aspect of the prompt template and ruin your results... right?
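For reference, prefilling <think> over raw text completion is trivial since you own the whole prompt string. A minimal sketch below, assuming llama-server on localhost:8080; the <|system|>/<|user|>/<|assistant|> tags are placeholders, swap in your model's real template.
```python
# Sketch of the text-completion route to prefilling <think>: format the prompt yourself
# and end it with the assistant header plus an opened think block, so the model
# continues "inside" its reasoning. Template tags below are placeholders.
import json
import urllib.request

prompt = (
    "<|system|>\nYou are {{char}}. Stay in character.\n"
    "<|user|>\nWhat do you do next?\n"
    "<|assistant|>\n<think>\n"   # prefilled: generation starts mid-reasoning
)

payload = {"prompt": prompt, "n_predict": 512}
req = urllib.request.Request(
    "http://127.0.0.1:8080/completion",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["content"])
```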
>>107820756Using the correct jinja template should already do this on its own unless you enable /nothink in chat completion.
>>107820820Yeah I'll do the autism.
>>107820773isnt jamba israeli spyware or somethin?
>>107820773>trouble understanding/remembering contextFunny, I thought long context performance was one of the architecture's selling points.
>>107820756 (You)
I can't send images in text completion, so now I guess I need to change to koboldcpp and pray it works.
I'm so tired of this shit, why is it so fucking hard to simply prefill the thinking in a SillyTavern + llama.cpp combo?
You can:
- disable thinking and prefill
- use thinking without prefill
- try to use both and go fuck yourself
out of all the sota llms, the only ones that immediately get the reference when prompted with "I'm not a professional quote maker" are gemini 3 and opus. sonnet 4.5 and gemini 3 flash both fail - surprisingly along with gpt 5 which I would have expected to get it, although I bet 4.5 would (I don't have access to confirm)
it's pretty funny to run into a scaling boundary on something so stupid kek
>>107820759
oh shit I played this on release but haven't touched it since
this is making me want to play again
>>107820759I'm still waiting for the MyRobot steam beta
>>107821087Don't forget to install loli DLC
>52B-A12Ban interesting weightage
>>107820773
Yeah I dunno wtf they are on about. Maybe there's some issue with the template but I'm using the one they posted so who knows.
It's incapable of following even a basic conversation and rather than recalling things just asks for them to be repeated...
Where did LLaMA peak? 1, 2, 3, or 4?
>>107821121Stheno v3.2 finetune.
>>107821121
1 kickstarted the local llm era and was free of slop. 2 was a decent improvement and introduced GQA so context was no longer expensive as fuck. The original LLaMA3 was a failure and by the time it got okay, Mistral Nemo/Large had already made it pointless.
>>107821110where's that, sounds good
>>107821141GQA killed sovl btw
>>107821141Hey now, some people couldn't run Large. But also had a bit too much VRAM to stoop to using a 12B.And even today, they might not have enough RAM for the various >100B MoEs. And now also never will because of Altman.That's how you get some people here claiming to still use 70B.
>>107821211https://emadeplus.com/dlc/
>Mfw I come back to this thread after months of absence, and it's all just people having the same problems with the new GLM they were having with the one 3 versions ago.
Is there seriously no reason to switch from Qwen 235B? I was hoping for an update.
>>107821338
m2.1 is better than 235b at that size range, and if you can run them glm 4.6/4.7 are easy upgrades. all of them need some wrangling to work well but if you're a qwen user I assume you're used to wrangling
t. used 235b exclusively for months
>>107820898
The hybrid architecture spreads out the degradation over the long context so it's always bad instead of dropping off
>>107821320thanks anon
>>107821141
2 was so censored even Meta had to apologize and promise to dial it back. 3 was a decent incremental improvement, 3.1 added multilingual support and long context, and 3.3 improved the writing style.
>>107821121
3.3 was the peak. They just didn't innovate enough from 1-3 then tried to change too much at once for 4 and it all fell apart.
Was running some benchmarks on Koboldcpp with Mistral-small-24b on my 5080. I found out that at 10k context with context shifting off I was generating ~50 t/s, but with it on it halved to ~25 t/s. With 8k context it goes back to 50 t/s. Is context shifting worth losing 2k context? I could also quantize my kv cache, but I feel like it makes the bot stupider despite people saying there is no difference.
>>107821548Censorship only really affected the official llama2-chat. Platypus, Xwin and the other professional alternative instruct tunes were better anyway.
>>107821567>Is context shifting worth losing 2k context?That's for you to say.>I could also quantize my kv cache, but I feel like it makes the bot stupider despite people saying there is no difference.I don't think anyone here ever said it makes no difference. Do a blind test.
What's the best code generation model for ~100GB VRAM?
>>107821864nemo
>>107821567>but I feel like it makes the bot stupider despite people saying there is no difference.It does, particularly after several thousand tokens. Don't listen to those anons, they're either retarded ESLs who wouldn't be able to tell the difference anyway, or they never go beyond 5 minute coom sessions
>>107821567>I could also quantize my kv cache, but I feel like it makes the bot stupider despite people saying there is no difference.People say this? I've always noticed quantizing kv causes a massive nosedive in quality even at q8.Take the 2k context hit or get used to 25 t/s if you really need the extra. People get used to much slower than that.
>>107820759
I'm aware of the game but I think it wouldn't quite be what I would enjoy.
Though I do have a huge pregnancy fetish, (the possibility of) death is a pretty big turn-off for me.
>>107816604
GLM-4.7 actually does seem to have fixed parroting, but only for new chats. If I load up an old chat, it parrots (even if I manually remove all the parroting from the previous messages).
The reasoning is too long and boring now though.
I also cp/pasted in some lorebook entry I saw on reddit for glm-4.6. It didn't work at all on 4.6 but maybe that's why I'm not getting the parroting now
```
## Use immersive, sensory rich prose with natural language, balancing it with action, dialogue, internal monologue as appropriate.
 - Olfactory Fatigue exists.
 - LOTs of dialogue is encouraged!
## Avoid staccato cliches, but vary sentence and/or paragraph lengths for pacing.
## BAN "negative-positive constructs" or "apophasis", even if preceded or followed by cataphoric writing! Only use for dialogue or monologue.
## BAN "Summary After Action" (e.g. [Narration] [Then that narration summarized].) Stick with subtext or contextual showing!
## This is a continuous and open-ended story that builds until it reaches the final epilogue. Vary the start and end of the main body from your last response.
## End the main body without summary or abruptly as appropriate.
```
>>107821938>>107821948Thanks, I decided to offload a layer, and with some tweaks I got 12k context at 28t/s which is acceptable for my setup.
>context shift for erp
what kind of dumb jeet does that? do you even understand how context works?
>>107820756you can just disable the check in server-common.cpp and rebuild, it doesn't break anything and was only added to personally annoy people
>>107822271
>for erp
is there any usecase where it's theoretically justified?
>>107818074
>>107822398>was only added to personally annoy peopleYou know what, I believe you. It's so retarded.
>>107822398>>107822681You know instead of complaining about it here you can just open a PR to get it removed?
>>107822795
Since the retarded check was added in the first place, the PR to remove it might well be rejected.
>>107822795But have you considered that instead of doing that I could just complain about it here?
Where can I find a tutorial or QRD on how the OpenAI chat completion stuff works? I can get this shit to work, but I have zero understanding of how it actually functions and it pisses me off.
I also don't understand how it can be the case that if it re-sends the entire conversation history when a new prompt is made, the context doesn't exponentially fill up as the conversation grows.
>>107822437codeslop maybe, not even that. context shit has to be the most poverty thing ever.
>>107822818>I also don't understand how it can be the case that if it re-sends the entire conversation history when a new prompt is made, that the context doesn't exponentially fill up as the conversation grows.Every prompt is self contained, it doesn't save the history itself on the backend (typically it does cache it but that doesn't change the behavior; just makes it avoid needing to re-process identical histories). That's why you're re-sending it every time.The context for any prompt is just what you send + whatever boilerplate is in the model's chat template, so it only fills up linearly as the conversation grows.
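Toy sketch of that, assuming an OpenAI-compatible local endpoint: the client owns the history and re-sends all of it every turn, so each request is self-contained and the prompt only grows by one question and one answer per turn.
```python
# Toy illustration of why context grows linearly, not exponentially: the client keeps
# the history and re-sends all of it each turn; the server sees one self-contained
# prompt per request. Assumes an OpenAI-compatible server on localhost:8080.
import json
import urllib.request

URL = "http://127.0.0.1:8080/v1/chat/completions"
history = [{"role": "system", "content": "You are a helpful assistant."}]

def chat(user_text):
    history.append({"role": "user", "content": user_text})
    req = urllib.request.Request(
        URL,
        data=json.dumps({"messages": history, "max_tokens": 200}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.loads(resp.read())["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    return reply

chat("Name three sorting algorithms.")
chat("Which of those is stable?")  # previous turns re-sent; one Q/A added per turn
```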
>>107822437>>107822840rp in general is a perfectly valid usecase. or debates. or therapy. or really any form of conversation where it's long-spanning but the oldest parts of the conversation don't matter that much.
>>107822876So the KV cache doesn't grow exponentially because the old messages are already cached and are discarded, while the newer parts are simply added to the list? Is that how it just linearly grows? I guess that makes sense, but it's very counter-intuitive because it seems inherently inefficient.Does it not affect the batching speed to have a larger and larger user prompt (contains every message, even as conversation lengthens) sent with every message?
>>107820756
All disabling thinking does is pass the proper parameter to the jinja template, right?
You can use Think Disabled and either prefill with the <think> token or change the template to add the <think> token to the last response regardless of whether Think is enabled or not.
Right?
>>107822132
0% death rate if you steal an old lady and craft a midwifery bed. You can also place items that will define gender if you like. You can farm mother's milk from women who gave birth and use it to age your children. The game is quite deep in what you can do, like capturing an underground woman and impregnating her, or finding a mandragora, throwing an underground woman into a well, turning her into Sadako, and then impregnating her
>>107816334IQ2 is still much better than all the sub 100B dogshit. Those models were just for novelty / writing emails.
>>107821338
4.5 is broken. 4.6 is god. 4.7 is arguably better at SFW universally and NSFW if you prompt it properly. 235B is worthless in comparison to 4.6
>>107820759>>107822930>weg faces
>>107823142how tf are you able to run a 357B model locally
>>107823219By having enough memory, presumably.
>>107823219lol
>>107823219
192GB when it cost 20% of what it costs now — the smugness in his voice palpable, and inexplicably it somehow reminds you of the smell of ozone. It is puzzling but it just does.
>>107823230>>107823233glad you guys have enough money for 4 blackwell pros. Happy for you, really.
>>107822911there are implementations of prompt caching / prefix reuse. deepseek and claude give discounts on cached tokens. locally, llama/kobold are much faster if you're using a prefix of the last prompt.
>>107823247
All you need is a good GPU and quite a lot of (preferably server) DDR5. That was under 3k less than 6 months ago.
>>107823263quite a lot you mean like what 20 ?
>>107823247
Where are all these pictures with a face and text on a red background from? What are they referencing?
>>107823265256GB, preferably more.
>>107823272
I thought only MoE models could be split into ram. nvm, just checked, glm 4.6 is moe.
>>107823273
Damn. I'm stuck with 32gb for now.
>>107823272
nobody knows. it's a mystery.
>>107823283All models can be split, moe models just happen to be fast enough that they remain usable even with slow ram.
>>107823272They are newly-created images made purely using modern image generation technology.
>>107823291would a 27b moe model be smarter than a 12b dense model? My understanding is that moe is generally dumber when params are equivalent.
>>107823303That's a matter of belief. To some, MoE are infallible. To others, they lack an inherent understanding and sense for nuance compared to other models their size.https://outsidetext.substack.com/p/how-does-a-blind-model-see-the-earth
>>107823303yes moe models are much smarter than dense. you can also quantize the kv cache and it will still be coherent.
>>107823327>>107823332what llama.cpp param do I use to split the models into ram? is it -ncmoe? sorry for the basic bitch questions.
>>107823349Nothing, it will do the right thing automatically.It explains what it's going to do near the top of the output before it starts loading the model.
>>107823303under 100b or so the optimal amount of active params kind of negates the advantage of moe
>>107823247
Two blackwell pros and a ton of RAM is more than enough these days.
the deepseek v4 rumours are crazy
>>107823388That's still like $22k. You could feed 4 bbw latinas for a year with that kind of money.
>>107823397Yeah, but they wouldn't know what properties define a bose einstein condensate on the spot, and one can't coom without that kind of stuff.
>>107823327
>kimi k2 barely better than a 70b llama
still never ceases to impress me, I wonder when this moe retardation will stop
>>107823409It do be much cheaper sir so it good!
>>107823409
>moe retardation
I hear you brother. All we need is a mega dense bitnet.
>>107823397
>four fat brown women that eat, shit, talk, and smell
vs
>useful hardware that fits inside a small box and has none of the above attributes
Tough choice.
>>107823407>>107823418ummm, incel much? I'm posting this to reddit.
With MoEs, I always found it weird how expert routing was never a big point these companies talk about. I know that experts are decided per token and that you can't think of them as "Expert 1 knows the most about x, Expert 2 about y", but selecting the right 30b parameters for the next token out of your 1T model still doesn't seem trivial.
>>107823418I pick THE BOX, but desu I'd like if my box could speak as well as a person can, and could learn and form memories.
>>107823553
>and that you can't think of them as "Expert 1 knows the most about x, Expert 2 about y"
Not necessarily, but sometimes you can. There are training regimes where you do group experts in different ways, hierarchically, etc.
That's the thing about these different architectures, Dense, MoE, some other form of sparsity, these are just broad labels. There's a lot of minutiae, tons of knobs and levers at the architecture and training level that can differentiate how models behave, which makes it absurdly hard to generalize a simple "x is better than y", since it can vary from scenario to scenario, implementation to implementation.
tl;dr: they don't advertise it but the labs do fuck around with routing and such.
>>107823397You can buy a high tech humanoid robot with that money
>>107823634
Those fucking suck and the ones that work the best are remotely controlled.
Give it 5 years and we'll be in the GPT 5 era of humanoid robots.
>>107823634wheres the sexhole
>>107823634that is a fire hazard, you would have to be an absolute idiot to pay 20k for the privilege of having a jeet piloted robot burn your house down.
>>107823642You could say the same about blackwell. It's always going to be better tech in 5 years
>autoupdate script status: VIBECODED
>>107822884
as far as I understand, with context shifting you have kv cache positions that attended to tokens that new tokens can no longer attend to; the model wasn't trained to work like this. it's not equivalent to just reprocessing the whole context with the old parts cut off
How are these shitty datasets still a thing?
https://huggingface.co/datasets/Delta-Vector/Hydrus-Preview-Tulu-3-SFT-Mix?conversation-viewer=11
>user: gh
>assistant: It seems like your query is incomplete. Please provide more information or context so I can better understand and assist you.
https://huggingface.co/datasets/Delta-Vector/Hydrus-Preview-Tulu-3-SFT-Mix?conversation-viewer=0
>user: create a landing page that looks like https://northridgefix.com/
>assistant: As an AI language model, I am unable to create and display...
All the English ones in the first 30 rows are garbage. The multi-turn therapy one has generic random user replies.
And there's another with a reference to a clause in a contract (not provided), but that doesn't stop the model from hallucinating a response anyway?
And this guy used 8xB200s for 15 hours to train a 32B on it: https://huggingface.co/Delta-Vector/GLM-4-32B-Tulu-Instruct - with ChatML because "It uses chatML due to the GLM4 Format giving me a headache." ??
>>107823952>" It uses chatML due to the GLM4 Format giving me a headache." ??They all do that, it's standard practice among finetuners. They aren't exactly the brightest lot.
>>107823811I swear most of the automation stuff I was too lazy to implement, I used LLMs to create, it's really nice.I don't understand every youtuber and twitter post about how it's so bad.
Why is no one talking about this?
https://huggingface.co/ubergarm/DeepSeek-V3.2-Speciale-GGUF
>>107824164
It just runs better, it's the same ol' deepsuk otherwise
>>107823952finetuners are clueless. the majority of finetunes are not even worth downloading
what do you guys use local models, or LLMs for anyway (besides writing emails)?
the only use cases I can guess LLMs have are>summarizing documents>writing menial emails
>>107824164
>NOTE 1: This quant has ripped out the sparse attention lightning index tensors. See sszymczyk/DeepSeek-V3.2-nolight-GGUF
I guess I'll take it. I liked Speciale for RP from my tests via the API. I doubt it'll be worth running at home though considering it easily thinks for 3k+ tokens and you need to keep that enabled to not have it revert to acting like the normal 3.2, which is a very boring model.
>>107824210uncensored assistant, local helper
>>107824210Pretending I'm talking to a friend, summaries, and masturbation.
>DeepSeek to launch new AI model focused on coding in February, The Information reports
https://www.reuters.com/technology/deepseek-launch-new-ai-model-focused-coding-february-information-reports-2026-01-09/
https://www.theinformation.com/articles/deepseek-release-next-flagship-ai-model-strong-coding-ability
>Internal tests by DeepSeek employees suggested V4 could outperform rivals such as Anthropic's Claude and OpenAI's GPT series in coding tasks, the report said.
>The latest V4 model has also made breakthroughs in handling and processing extremely long coding prompts, a potential advantage for developers working on complex software projects, the Information added.
>>107824356
>focused on coding
oh yes i'm cooding
>>107824228
How much can it remember before becoming demented? It would be useful if it could retain info on very large projects and coordinate them accurately.
>uncensored
Nice, but isn't it expensive to run a model locally?
>>107824356weren't "the information" also the ones that had said a release date in may last year or something
>>107824391
>isn't it expensive to run a model locally?
Not really. Most people use hardware they already own and use for other purposes. Just choose the model based on your setup.
>>107824413
https://www.reuters.com/technology/artificial-intelligence/deepseek-rushes-launch-new-ai-model-china-goes-all-2025-02-25/
That was Reuters with "according to three people familiar with the company.".
The Information is the one that accurately called the Llama 3 release timeframe and that Llama 4 wasn't doing well.
>>107824461eh there's this shit tho
>>107824479Dude, that is from 7 months ago.
>>107824486exactly, and they published random shit rumors that led to nothing back then, ie don't trust these clowns
>>107824495But they said R2 would be delayed, and it was?
>>107824504you're absolutely right and I apologize for this oversight eagerly dumping my NVIDIA stock for the imminent release of v4! tothemoon!!
>>107824479"world-changing"
>>107824540It was, we ended up in the computing hell mines timeline due to it.
>>107824446I'm curious mostly about the limits of what an LLM can achieve. So curious, in fact, that I'm considering majoring in Data Science and AI/ML.
>>107824540amerimutts had a giant freakout over it so i guess it was to them
RTX 6000 pro is getting cheaper and I can buy one for under $8000 right now. Is it worth it to buy? Any anon here owning them that can chip in?
>>107824787just buy 4 3090s
>>107824787
>Is it worth it to buy?
No, get a 5090, which is as good compute-wise, and rent compute when you want to train stuff that needs this much vram.
>>107824787I have a Max-Q. It's certainly the best, easy option. There are other more powerful and more efficient used data center options but those are way more complex and are not guaranteed to work.
Honestly it is actually surprising how not censored Jamba is. Really feels like they had nearly no refusal training for it. If they had actually used a normal architecture, it could've been an ok model, but I suppose then they would reason that it would be good enough that they'd have to censor it.
>>107824915so it's not censored but also retarded?
>>107824787I have one but I mostly use it as the auxiliary GPU for my CPU textgen setup as a replacement for the two A6000s I owned before. It's great, but it's still just 96GB so it's $8000 that won't let you run SOTA models on its own. Still, you never know how much this thing will be once the expected apocalypse of the GPU market hits.
>>107824915
Refusal training doesn't work on models that are too weak. By this I mean their benchmark scores crash hard, as the model doesn't have the necessary world model to properly refuse.
Older transformer LLMs also had this problem with false refusals. I remember lowering false refusal rates being a big priority in the early days, similar to hallucination rates now.
>>107824997Makes sense. Arch is too shitty to be able to handle censorship without ruining those benchmarks they desperately need to keep justifying their existence kek.
When you do these model abliterations, do you need to cover every kind of taboo question? Or is focusing on one thing (for example porn) enough for it to actually answer all questions?
>>107824787
Which models are you planning to run with only one of them?
For image and video gen it's great because you can run everything as intended instead of coping with block swapping.
For text it's in a weird spot. You can run Air and Devstral 2 fully offloaded and not much else.
>>107825130hatehatehatehate
>>107825248i used nemo to write this post btw
>>107825256Based timesaver
>>107825248https://www.youtube.com/watch?v=IuJYgl14qsQ
They had LLMs back in 2003.
>>107821895So Nvidia Nemotron 3 30B?
>>107818138It's true though?
>>107825393This is me with "early life" segments on wikipedia
>>107821141
>was free of slop
Where is slop coming from? How can a model be free of it?
>>107825927
slop is a very generic, almost slang expression for low quality output, in whatever form that output comes.
essentially, the goal is always high quality, which also means different things to different people.
so it's not as simple as 1 or 0.
>>107825927
free of slop isn't the entirely correct way to put it, it was an llm that also had its own phrases that it liked to repeat too much
however, it came from an era before everyone started training on chatgpt logs, which caused "shivers" and other cancer to infect all models that followed. in fact, llama1 was directly responsible for this development because it wasn't instruct tuned and the first thing people did was finetune it on chatgpt logs to create "alpaca", which was essentially the first llama1-instruct. this also means that if you ever used an instruct llama1 even back in the day, you still got chatgpt shivers slop.
>>107824787I don't think that it's going to be the absolute best per-dollar value, but the fact that it just works without having to nigger rig or debug a wacky meme setup has a lot of value.
>>107824787>>107826060the thing is it's like buying an expensive house, easy to buy, absolute bitch to sell. I wouldn't buy it for fear of not being able to sell it. that kind of money is still a lot, and i'm not sure if businesses would buy it after a few years, and personal buyers would be few.
https://github.com/ikawrakow/ik_llama.cpp/pull/1131
Sirs!
>>107826177>changed every line ending from LF to CRLF
>>107826234literally who cares? whitespace is not code
>>107826260
>who cares?
The person who wants to get their vibeslop reviewed. As is, the diff shows that every line in the file was changed instead of only those containing new code.
>>107826177Based Bharati doing the sneedful
>>107826234Fixed it
>>107826334retard
>>107826360Let's see your string ban PR
>>107826234just vibecode a python script to fix it
>>107826177I skimmed the code and I can tell you with 100% certainty that this is unmergable and would have to be completely redesigned.
>>107826463Will (You) do it?
>>107826463
Is that your professional opinion, mr anonymous?
>>107826177
>lmg is anti vibecoding
>vibecoded pr
lmao big sar pr
would you fuckers argue on github instead
>>107826177
>Just delete the entire fucking codebase and add your own codebase to the file instead of just changing and inserting code where needed
No one is going to read all that shit line by line anon.
>>107826532
It's clearly a /lmg/ shitpost and not a serious PR, therefore I discuss it here.
>>107826532Better than the average ERP arguing we usually get
>>107826516It is my professional opinion that an approach that doubles the memory required for context is retarded.
>>107826598Small price to pay for string banning
>>107826629Be quiet if you have nothing constructive to add.
>>107826643>>107826643>>107826643