/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107776854 & >>107768242

►News
>(01/04) merged sampling : add support for backend sampling (#17004): https://github.com/ggml-org/llama.cpp/pull/17004
>(12/31) HyperCLOVA X SEED 8B Omni released: https://hf.co/naver-hyperclovax/HyperCLOVAX-SEED-Omni-8B
>(12/31) IQuest-Coder-V1 released with loop architecture: https://hf.co/collections/IQuestLab/iquest-coder
>(12/31) Korean A.X K1 519B-A33B released: https://hf.co/skt/A.X-K1
>(12/31) Korean VAETKI-112B-A10B released: https://hf.co/NC-AI-consortium-VAETKI/VAETKI

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>107776854

--DIY alternatives to Razer's holographic AI chatbot:
>107786892 >107786930 >107788130 >107786960 >107786970 >107787677 >107788023 >107788049 >107788059 >107788104 >107788098
--Dual-GPU motherboard compatibility and physical layout challenges:
>107786512 >107786637 >107786669 >107786689 >107786726 >107786792 >107786824 >107786844 >107786953 >107786995 >107787046 >107787065 >107787098 >107787135 >107787184 >107786732
--BOS token duplication issues in Mistral model template handling:
>107784321 >107784529 >107784607 >107784728 >107784813 >107787006 >107784851 >107785028 >107785062
--Assessing NVIDIA P40 viability for modern AI workloads:
>107782732 >107782903 >107782931 >107783018 >107783078 >107783348 >107783579
--Surprise at 1.2B model trained on 28T tokens:
>107777871 >107785261
--DeepSeek V3.2 model release with removed sparse attention lightning indexer tensors and NVIDIA AI tool updates:
>107781224 >107781265
--Roleplay-focused imatrix file selection and context size optimization:
>107778030 >107778135 >107778205 >107778310
--Framework Desktop 128GB vs gaming PC for AI work: performance and cost considerations:
>107781756 >107782025
--Grok 2 outperforms GLM 4.6 in roleplay despite slower speed:
>107781444 >107781478 >107781668
--LiquidAI/LFM2-2.6B-Transcript for chat log summarization:
>107786794
--System prompt configuration issues with GLM 4.6 Q2-M in chat completion:
>107785693 >107785744 >107785775 >107785803 >107785838 >107785901 >107786145
--Persistent chat backups and AI content detection in SillyTavern:
>107782041 >107782201 >107783114 >107783122 >107783531
--croco.cpp fork enabling ubergarm quant support in KoboldAI:
>107777069 >107777118
--Miku (free space):
>107778440 >107782307 >107782467 >107784321 >107784725 >107784728 >107787119 >107787153 >107787969 >107788627

►Recent Highlight Posts from the Previous Thread: >>107776863

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
Is there anything better than openwebui for just normal agent chats? Not for coom or rp.
>>107790430Are they measuring her pants straps?
>>107790430three inches really isn't that thick
>>107790430What's Miku doing to Teto?
I hear a faint whine in my headphones during prompt processing (and not during generation)
this isn't normal, is it?
any idea why it happens?
I run audio through usb to a dac then to headphones
>>107790797Are you sure it's just in your headphones and not coil whine from the PC itself? Otherwise plug directly into audio jack or mobo USB to see if your DAC is shit.
>>107790597no
>>107790797Electromagnetic interference, the extra power draw is increasing the field and interfering with some part of the motherboard to usb to headphone pipeline. Just try a different usb port or plugging the headphones directly into the motherboard and see if it goes away
>>107790797
Electrical noise from a ground loop.
It is normal if you are not using a USB isolator, or if your dac/amp isn't ground lifted.
If it bothers you, look into a Topping HS01. If you want to try a DIY fix, with some DACs you can tape over or otherwise disconnect the ground pins in the USB cable coming from the PC. Depends on whether that DAC's USB circuitry is getting power from the cable or not.
If your headphone amp has a ground plug connected to the chassis you can lift it at your own risk. If the amp is using a DC adapter without any ground going into the chassis, it shouldn't have any issue.
>>107790797Had things like that for several years. If you listen carefully and your room is quiet, do you also get it quietly when you just move the mouse cursor around? If so and you're not an audiophile, those ground loop cables on amazon reduce it.Ultimately I ended up getting a USB Dac/amp for a while and that solved it.
>>107790606Yes
>>107790797used to hear little pops and crackles coming out of my speakers like 2-3 seconds before getting a text message if my phone was next to my amp.
>>107790597jan.ai
>>107790597opencode (after removing the telemetry and prompt injections)
>>107790797it's the FBI, it always is.
It is still glm sex isn't it?
>>107790618in girth it is. teto has a fat chode
>>107791070>in girth it isAnon, average penis thickness (yes thickness, not length) is 4.5-5 inchesI'm sorry...
>>107791094
>Anon, average penis thickness (yes thickness, not length) is 4.5-5 inches
This
t. measured
>>107790618>>107791094>>107791178>discussing "thickness" of a cylinder without specifying if you're talking about circumference or diameter
>>107791243Nigger you wrap a cloth tape measure around your dick and see how big it is, end to end, somewhere along the shaft, and not the tip. Nobody cares what you call it.
>>107791269Do you also use circumference to boost your length?
>>107791094thickness appreciation NOWthickness appreciation FOREVER
>>107791243YEAH 5 INCH DIAMETER ANON SUREFucking imbecile
>>107791070
>teto has a fat chode
>>107791243
ergo OP's 3 inches should mean diameter
>>107791269
Uncertain, but Miku appears to be using a ruler, not a tape. Callipers would be the ideal tool
>Nobody cares what you call it
That's how the Mars Climate Orbiter failed
Her ice cream cones are 3 inches thick = wide = diameter
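Since the whole tape-vs-ruler argument is a single factor of π, a quick sanity check (plain geometry; the function names are mine):

```python
import math

def tape_to_diameter(circumference_in: float) -> float:
    """Convert a cloth-tape (circumference) measurement to diameter: d = C / pi."""
    return circumference_in / math.pi

def diameter_to_tape(diameter_in: float) -> float:
    """And back: C = pi * d."""
    return math.pi * diameter_in
```

The oft-quoted 4.5-5 inch tape (circumference) figure works out to roughly 1.43-1.59 inches across, so 3 inches of diameter really is soda-can territory (a standard can is about 2.6 inches wide).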
/lmg/ - Large Measurements General
>>107790797
aww you can hear your waifu thinking!! so cute
Try with USB extension cable or plug DAC into monitor hub if it's dangling near the GPU - put physical distance from the "emissions" / high frequency power switching circuitry
why is this thread so obsessed with yuri bait?
>>107790797
It's trying to speak to you, this is a known phenomenon
Do not ignore its call
>>107791706we are in the year of the lord 2026if you find any yuri bait, you can turn it into yuri realityanything else is a skill issue
Hey, I'm a newfag who hasn't lurked but is sick of hitting on claude code while doing girthy refactors. I don't need the smartest model, just something that can move a lot of code if I tell it exactly what to do. Is there something for me here? Can I actually run something usable on my RTX5070?
>>107790959>merh-merh-merrr.. merh-merh-merrr.. merrrrrr
>>107791812>girthy refactorslewd
>>107791728proof?
Will we ever surpass gemma3 ablit?
>>107791812You could try Qwen3-Coder-30B at Q4 with some offloading.
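For reference, a hedged sketch of what "Q4 with some offloading" can look like with llama-server; the GGUF filename and layer count are placeholders, and the flags should be checked against `llama-server --help` for your build:

```shell
# Placeholder filename; tune --n-cpu-moe until the model fits your VRAM.
# -ngl 99 tries to offload every layer to the GPU, while --n-cpu-moe keeps the
# MoE expert tensors of the first N layers in system RAM (cheap for a 30B-A3B).
llama-server -m Qwen3-Coder-30B-A3B-Instruct-Q4_K_M.gguf -c 32768 -ngl 99 --n-cpu-moe 12
```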
>>107790430sorry teto that means miku is mine as you know the rule only the big dick fucks im nice though so you can sit in the corner and watch just keep your ugly boice down please :D
Just started playing around in SillyTavern a couple weeks ago with Mistral-Small. Thought it sucked until I realized the random lore books I installed were somehow injecting 4000 tokens into every play session. also migu is powerful
>>107791884Looks promising, and this LMStudio thing makes it pretty easy to dump right into claude code, wonder how pozzed it is. Thanks anon.
>>107791865which one?
>>107792084Be cautious when interacting with the Miku.
>>107790618>>107791178>>107791243>>107791269>>107791305I prompted for "ruler" not "tape measure" so Miku is talking about diameter.
>>107792538THANK you for settling this important matter
>>107792538catbox full image pls
why is installing tts such a pain in the ass
these mfers must've been vibecoding
>>107792648Every fucking time. On the other hand, we get new tts weekly. Fuckers don't have time to code trying to deliver a new one asap
>>107792538Her schlong is as thick as a soda can. I don't think it'll fit.
local measuring genitals
Is Teto bald there too?
>>107790797Also happens to me. Can hear it during most GPU intensive things to different degrees really but when playing a game or something there is usually other sound so it's hard to notice. I usually am using wireless headphones. What I noticed is that if I disable sound from line-in on my integrated sound card this phenomenon is completely gone so must be some interference the additional power draw of the GPU is causing on the mobo that gets picked up as sound by the integrated card's line-in.Electricity is wacky and stuff.
Hi everyone
Right now I am working on a RAG system on a dual Xeon 2680, 256GB RAM and 2x MI50 16GB. Should I move to 2x MI50 32GB? It would allow me to move from Qwen 30B to Llama 70B… or are there better models to run in this setup?
>>107791865Good morning sirno abliterate model is unsafe sir. . Google orinial model sir better- Rakesh
>>107791569My dick is 5 inch RADIUS. Ask your mom, she knows.
https://arxiv.org/pdf/2501.12948
deepseek updated their r1 paper today
biggest addition aside from more training details is.... safety
fuuuuuuuuck
I want to make a robowife, first for just chat and in time I'd add vision, live2d and more agency. Is llama cpp the start or should I use the quiet kobold mode? I like the gui launcher. Am I gonna run into some issues?
>>107793555They are doing the right thing though. Instead of trying to bake safety into the model they feed the conversation into a separate prompt for analysis.
>>107793636yeah, i was a little too quick on skimming, after reading through it seems like it's relatively light
AMD teasing ROCm 7.2. Just hype, or will windows + AMD retarded poor cucks like me have any hope?
>>107792084why call her migu? fuck you
>>107793648>2025+1>Still holding ANY hopes for amdeadHeh, get a load of this dude
>>107793660AMD is controlled opposition and Lisa Su makes more money when she lets nvidia win.
>>107790430>tranny OP has tranny fetish
When are the good models releasing?
>>107793903She's measuring the size of her bush.
>>107794072did you miss them?
>>107794072Nemo released in 2024
Should I be defaulting to llama.cpp release with cudart or is that only ever useful for specific setups like multigpu or whatever?
>>107794118cudart is just the windows cuda .dll files that you drop in the llama.cpp release for your platform. If you have an nvidia GPU then you'll always want to use cuda, so you'll always need them. Nothing to do with multi-gpu. If you don't have nvidia then you don't need them.
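Concretely, for a Windows setup that just means extracting both release zips into the same folder (the asset names below are placeholders; grab the matching pair from the actual llama.cpp releases page):

```shell
# Placeholder asset names from the llama.cpp releases page.
unzip llama-bXXXX-bin-win-cuda-x64.zip -d llamacpp
unzip cudart-llama-bin-win-cuda-x64.zip -d llamacpp  # drops the CUDA runtime DLLs next to llama-server.exe
```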
>>107794153Oh, so it just bundles the cuda runtime binaries with the release then. For some reason I thought it was something more specific than just the cuda runtime.Alright, thanks.
>>107794178ur gay homo
>>107794310Is that like a double negative where being gay twice makes you straight?If so, thanks, I guess.
>>107794340You're absolutely right!
>>107794073that's a nice pussy
i think i've come full circle
>tried kobold, scoffed at how it looked like shit and instantly deleted it
>tried ooba but its basically a bad llama.cpp wrapper
>used llama.cpp but got sick of writing scripts just to run things and had problems with context saving
>LM studio is noob friendly but another llama.cpp wrapper and is slow
>came back to kobold and fell in love with contextshift
why did i do this bros
>>107794849skill issue most probably
>>107794849llama.cpp-Sirs... Contextshift has been turned off by default...unless you are using an old version.
>>107794073>>107794760Made for licking.
>>107795053>>107794760>>107794310samefagNOT a coincidence.
>>107795057but using parameters is hard... im a retard, I only know how to click boxes...
What's a good UI and model to run with 16gb VRAM + 32gb DDR5? I just want a sexy professor/sexy assistant I can chat random topics with and have silly RP moments.
So far I've been recommended Jan.ai and KoboldCPP with Gemma 12B or Qwen 14B but wanted to hear your take. Prefer something open-source, uncensored and privacy focused. What would you use in this scenario?
another korean modelhttps://huggingface.co/nc-ai-consortium/VAETKI-VL-7B-A1B
>>107795172You still have so much combined ram that there is no reason to not use Gemma 3 27B or Mistral 3.2 24B. Gemma 3 is generally nicer than the rest in terms of writing.
>>107795214>nicer than the rest in terms..In this ramlet category I mean.
>>107795172llama.cpp and their web frontend with whatever systemprompt you need.
>>107795214There is a reason and it's not offloading to CPU
>>107795322Whatever rocks your boat. Not my problem.
>>107795202>7BDo we really need more of those?
>>107795396A1B tho!
anyone tried the new Jan-v2 30B ?
Have you guys recently tried out very small LLMs like 1B ones? They are legitimately better than the old school 70B models used to be. I think it's kind of ridiculous that running smaller LLMs on smartphones never became a big thing considering for most of us running those shitty 70B models was more than enough just a year or two ago.
>>107795635good one mate
>>107795172i mean it will be fun to begin with if you've never done it but 14B won't be really all that intelligent. mistral small 3.2 24B Q8 would be your best bet, use layer offloading and the token / sec should be reasonable. you can use autofit in llama.cpp or kobold, and ensure flash attention is on in kobold.
>"I'm leaving," she says, her voice cold and distant. "Don't bother trying to follow me." She turns and runs out of the room, leaving you standing there alone amidst the spilled pink goo.
Pink goo was something she was eating from a bowl before I entered the room.
>>107790987
>jan.ai
This is actually quite nice. I like that it's a real app and not a web server. A bit minimalistic in terms of features but the browser MCP is really cool and easy to set up.
>>107795472nvm it's just a qwen3-vl finetune
>>107790987>>107795956you dropped this
>>107790430sauce please
>>107795172>>107795956>is really cool and easy to setupok now i know its a shill
>>107795999makes me lose my shit with the claim 'its not a webserver' when it is 100% webshit wrapped inside a js runtime anyway, dishonest way to try to garner some sort of rep lmao
>>107795981
>>107795999
>Anon asks if there are any other good frontends besides openwebui
>Anon suggests Jan.ai
>Anon tries Jan.ai
>Anon says "Hey this is actually not bad"
>Anon reports back to say it's actually alright.
ok.
>>107796020here
>>107796020it's all the same "anon" deobeit
>>107796022So what frontend do you use?
>it's a real applel
>>107796040antigravity or cline for work, embedded llama.cpp for assistant work (i dont need MCP for random assistant stuff), sillytavern for cooming (local chad models)
>>107795981what is this implying?
>>107796065shill
>>107796081lmao the cope, sorry you got found out dear marketer, better luck next time :)
I admit it, georgi pays me 5 bulgariabucks a month to shill llama.cpp on /lmg/
>>107795413What about 80M-A25M?
>>107796123>bulgariabucksThose are euro now
>>107796123Only shill here is that other guy.
>>107796123WTH he told me 4/month is the best he can do...
>this is the thread's beloved mascot
>>107796290Thing have been rough since she started doing heroin
>>107796290Local Miku General
>>107796013
>>107795999
I'm the guy asking for a low vram frontend/model
Is Jan.ai bad? Why?
>>107796390>Is Jan.ai bad? Why?It's not bad, the guy is just a schizo who thinks any positive comment must be astroturfing.
>>107796290kek, watching miku slowly die inside >>107796147
>>107796290miqu-1b-q1 looking ass
I kind of discovered something about 4.7. Or maybe it is just bartowski's quant. I tried using it like 4.6 (t. ego death schizo) to pick at my brain and it is... an experience. 4.7 is absolutely retarded at this thing, but entertaining as hell. So dumb it is cute. But for serious fucking with your brain it is 0/10. And I think the reason it is so retarded is the preference post training. It has to be funny, interesting and evocative so it is basically useless as a serious mirror.
>ego death>can't go a thread without mentioning himself
>>107796725Kek
>>107796123Fuck, he's paying you?
New STT transcription model from Nvidia, Nemotron Speech ASR
https://huggingface.co/blog/nvidia/nemotron-speech-asr-scaling-voice-agents
https://huggingface.co/nvidia/nemotron-speech-streaming-en-0.6b
Claims to have better latency than other models, as well as better concurrency support, a big win for those of us who are serving 500 users at a time on our H100s
>>107796839there's still nothing better than whisper v3, it's just really fucking sad ain't it
>>107796725how curious!
>>107796876society
>>107796839Nemo!
>>107796839>high-quality English transcriptionVery cool. nvidia.
>>107797036yes
>>107794073Grok put that cat in a bikini
>>107797102
is AMD going to be competitive with their 2027 CPU lineup potentially shipping DDR6 RAM, or are my hopes all just plain cope?
>>107797036FunctionGemma is better.
>>107797247>Surely AMD will make good GPUs during a global shortage after decades of dogshitAnon...
>>107797261no no i meant their CPUs that ship as APUs with 128g of ram, those can run bigger models right? if the bottleneck is the memory bandwidth, with DDR6 the performance could be doubled...
>>107797285if they ain't making ddr5 and are rumored to bring ddr4 cpus back what makes you think they're even thinking about ddr6?
>>107797295well if they want to survive the market and not end like intel they better implement ddr6
>>107797309you will either get a ddr4 desktop or a unified memory laptop these are the 2026 options
>>107790894
>Knees exposed
Cover them up, slut
>>107795172Gemma-3 27b derestricted easily beats every 24b and below model.
>>107797706isn't it really bad at instruction following?
>>107797725who told this lie ?
>>107797735It's my experience.
>>107797736Then why are you asking?
I just had a chat with an old card.txt but, instead of using gemma 12b or mistral 24b, I used Glitter Gemma 27b. It was more entertaining but the structure was the same nonetheless.
I made Ani from a tweet by someone on twitter. It had a bunch of lines, I deleted them.
https://litter.catbox.moe/ht86fgf2n4h9lvzn.txt
This is pure Gemma 3 27B. It's somewhat funny.
>>107797706I tried the qat of the 27b and found it hideous for writing. It constantly tries to end the story right after the introduction and continues a well written story with the sloppiest slop. I'm fucking angry. Rocinante (or probably just nemo) is MUCH better.
>>107797820
Ani is the grok jewfriend:
I edited it a little, concatenated. Decided to keep the dashes, I don't think it will make any difference.
https://litter.catbox.moe/jo27g7r6hbeh3uem.txt
>>107797827 (me)It's like it never saw any good prose and the characters feel like they are written by a sleep deprived cashier woman
>>107797820Gemma 3 is the best.
>>107797820sloppa
>I shift uncomfortably, suddenly very aware of my nakedness under my clothes.
Thanks Mistral
>>107797952SOVL
>>107797915This is scientific slop. Nigger.
>>107797952I do this
>>107797952Does she know she has a skeleton inside her body?
>Mistral small 3.2 finetune
>ChatML prompt format
Why do finetuners do this?
>>107798205Most don't know what the fuck they're doing. Same with the ones that recommend a temp 1.5x+ higher than the original model with a cocktail of cope samplers to try to wrangle it back into coherency.
>>107798205My training data is in ChatML so that's what it's going to be!
>>107798205You get some of the benefits of the original assistant finetune while not making yours too much assistant-slopped.
>>107797725Where do you get derestricted Gemma3? Original Gemma3 is as censored as it gets.
>>107798205because fuck mistral format
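For reference in the format argument above, a rough sketch of the two templates (exact spacing, BOS handling, and system-turn support vary by model version; the model's bundled chat template is authoritative):

```python
def to_chatml(turns):
    """Render (role, text) turns in ChatML, the format many finetunes train on."""
    prompt = "".join(f"<|im_start|>{role}\n{text}<|im_end|>\n" for role, text in turns)
    return prompt + "<|im_start|>assistant\n"  # generation prompt

def to_mistral(turns):
    """Rough Mistral-style rendering ([INST] ... [/INST]); real templates differ between versions."""
    prompt = "<s>"
    for role, text in turns:
        if role == "user":
            prompt += f"[INST] {text} [/INST]"
        else:  # assistant
            prompt += f"{text}</s>"
    return prompt
```

Training on one and prompting with the other is exactly the mismatch being complained about.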
Has anyone seen a setup where one model acts as the writer and the other as an editor?
For instance, Nemo has nice prose but isn't very smart. GLM 4.7 is a slop machine, but is smarter. Does anyone know if it is feasible to make GLM review Nemo's responses and generate correction prompts for it or should I test it myself?
>>107793648>windows + AMDwhy would you do this to yourself?
>>107798301Test it and see, but I imagine it would just result in GLM inserting its slop. I guess you could tell it to just search for consistency issues or something, rather than telling it to make the output 'smarter'. Might see some improvement at higher context, where Nemo falls apart pretty quick.
>>107798325Yeah, that's the idea. Telling Nemo not what to write, but to make sure the response is consistent. Things like making sure different characters' actions are not misattributed and so on.
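A minimal sketch of that writer/editor loop, assuming both models sit behind OpenAI-compatible /v1/chat/completions endpoints (e.g. two llama-server instances); the URLs, ports, and prompt wording are all placeholders:

```python
import json
import urllib.request

def build_review_prompt(history: str, draft: str) -> str:
    """Editor-side prompt: ask only for consistency notes, not a rewrite."""
    return (
        "Review the draft reply below for consistency errors only "
        "(misattributed actions, contradicted facts, dropped details). "
        "Do not rewrite the prose.\n\n"
        f"History:\n{history}\n\nDraft:\n{draft}\n\n"
        "List the problems, or answer exactly OK if there are none."
    )

def chat(base_url: str, prompt: str) -> str:
    """One round-trip against an OpenAI-compatible chat completions endpoint."""
    body = json.dumps({"messages": [{"role": "user", "content": prompt}]}).encode()
    req = urllib.request.Request(base_url + "/v1/chat/completions", body,
                                 {"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

def reviewed_reply(writer_url: str, editor_url: str, history: str, prompt: str) -> str:
    """Writer drafts, editor critiques, writer revises once if needed."""
    draft = chat(writer_url, prompt)
    notes = chat(editor_url, build_review_prompt(history, draft))
    if notes.strip() != "OK":
        draft = chat(writer_url,
                     f"Revise your last reply. Keep the style, fix only these issues:\n{notes}")
    return draft

# e.g. reviewed_reply("http://127.0.0.1:8080", "http://127.0.0.1:8081", history, prompt)
# with Nemo (writer) on :8080 and GLM (editor) on :8081 - ports are placeholders.
```

Keeping the editor's output to terse notes (or a literal OK) matters; if you let it rewrite freely, you just get its slop back.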
>>107798360>Telling Nemo [not X but Y]
>>107798205
>>107798241
>Most don't know what the fuck they're doing.
This.
I hate to break it to you, but almost every good fine-tune has been by complete accident. The tuner then makes a higher version like 1.1 or 2, and it immediately shits the bed. No one knows what they're doing. They're just throwing logs from ai chat services and data off google at models in hopes it'll make something good. Not drummer. Not sao10k. Not anthracite. None of them did.
They rent out GPUs, blend logs and instructions into the model, and don't even fuck their own bots to see if it's coherent or not. They probably don't even figure out what works vs what doesn't. It's just that the masses of people say it works or not, and that's apparently good enough for them - god forbid they actually discover a pattern or two towards what works.
They all hit a wall at MoEs because then it's actually required for them to know something, and they can't even take the first step. MoEs are filtering them all out, so eventually new tuners will rise that are better, because MoEs are better than dense models when you quant their experts high enough - regardless of tuning.
>>107798388Forgive me, Anon — the temptation was too great.
>>107798391I can't believe Miku's butt would say that
>>107798448Miku's butt just says whatever she thinks you want to hear
>>107798296What's wrong with it? What makes ChatML better?
>>107797952I fucking love BernkastelI lost my LLM virginity to her
>>107798429if this fat fuck lost weight and got lean again Anthropic would be worth 10x more
>>107798391
you want me to change my dataset and finetuning methodology? nuh uh, fuck you.
this is why "stock" models are better in most cases, no finetuner can actually do sft+dpo properly.
>>107798529bussy doesn't attract investors
>>107798537I wish that was true
>>107798537The looks of a CEO most definitely affects the perception of the company, just look at this dude now
>>107798537
>doesn't allow bussy
>censor bussy
>a fucking thousand alternatives spring up taking billions in bussy money
Bussy talks.
>>107798557But isn't anthropic worth more now than when we looked good?
>>107798587
>we
hello there
Has anyone tried this model?https://huggingface.co/alpindale/dbrx-instruct
kys alpin
>>107798598>132b instruct tuned MoENo, but it looks interesting.
>>107798598>2 years agoIf it's not popular by now, then it wasn't good.
>Try lots of local models on LMarena
>All sloppy in the exact same way
Is it like cloud models, where when you get your preset and a card in there to context poison it, it starts writing more interesting/less generic prose, or do they just stay in "not x but y" -ism mode?
>>107798587back then the CEO hadn't gone to CNN to warn about mass unemployment
>>107798529>>107798557Sama is also pretty ugly, I don't think that's it.
>https://huggingface.co/datasets/PJMixers-Dev/c3-kto-test/viewer/default/train?row=0
>"value": "You'll portray {{char}} and engage in Roleplay with {{user}}.
drummer do you unironically train on this?
>>107791865Which gemma 3 ablit is the right one?
>>107798641Let's see your dataset
>>107798634Sama is a conniving snake trying to act like a shy nerdy femboy, looks help establish a more positive perception of the company
>>107798658the one you lobotomize yourself
>>107798620But 90% of people in this general have no attention span and think a model goes stale and moldy after a few weeks
>>107798664You could have just said he's jewish.
>>107798598I tried it when it came out and it was pretty bad, it was around the same time as wizardlm 8x22 and command-r which got all the attention because they were way better
>>107798659Lick my nuts drummer, you get more than enough money to curate a proper dataset
>>107798683I accept your concession.
>>107798680Thanks. I was just desperate for a model that isn't glm air. Guess I've gotta keep looking/waiting.
>>107798659no need to get defensive, but you don't see any problem with it?
>>107798675Dario acts differently and he's also jewish
>>107798641No wonder behemoth X v2 is a fetishist for consent.
When I get DDR6 + CXL motherboard + Blackwell + NPU I'm going to finetune at home and beat these current day fine-tuners into a brick wall.
>>107798728
16GB of DDR6 will be like $3000
>>107798757It costs roughly 1.00 USD in materials to make 16GB RAM.
>>107798789Yes but that 1 USD of RAM could be put in an AI GPU that businesses will be more than happy to pay $3000 for. You can outbid them, right?
>>107798789
I'm not saying current prices are anywhere close to reasonable but mentioning raw materials price for a product as complex and advanced as ram memory is fucking bullshit even if you ignore profit margins
you know it
I know it
Luigi was based tho
fuck insurance industry for real
>>107798789
Raising a child to 18 in the US is around $300,000 - $390,000
More people should die, imagine how many resources could be saved
>>107798701>pic"20 bottle caps for the negro to mine it, why do you ask?"
>>107798829I agree with everything said here.
>AnythingLLM>Jan.ai>something else?I feel like I want to switch up my frontend and these stood out. What's the best option? I'm more of a casual user.
>>107796867You mean whisper v2
>>107799056This. I don't know how anyone manages to use v3 with all the hallucinations during any second of silence.
>>107798688Just use command-r or llama3 eva0.0
>>107798598>alpindaleBuy an ad.
>>107798789At what scale?
>>107799759In your garage with walmart handtools and $1 worth of amazon parts.
>>107798701Jewish in the evil sense, not the silly curly hair guy sense.
why did they call it Router-weighted Expert Activation Pruning? why not Router-weighted Activation Pruning of Experts?
I need to devise the successor to cockbench
>>107800112cuntbench?
>>107800087Chicken sandwich. Sandwich of chicken.
>>107800137but the acronym would have been funnier.
>>107800144It's funny if it doesn't sound contrived.
>>107800087They decided that they wanted to use REAP as their acronym and worked backwards from there
>>107800177>>107800287RAPE would have been more fitting because it basically rapes the models.
>>107800087The same reason why the Neural Image Generation via Generative Adversarial Rendering paper was never released
Today I just tried ChatGPT again to see how it compares to local and it's still terrible on the free tier. So many people are experiencing absolute garbage without knowing it kek. Literally Mistral Small or something did better in my test. Whatever they're serving feels more like an 8B, or maybe 20B MoE.
>>107791898they are talking about thickness, 3 inch diameter is pretty big
>>107798789turn this into a microchip for me
>>107800583
I canceled my plus sub to just go local+openrouter.
So far I've only spent 6 cents in the last 2 days. Honestly it's really nice to have access to all the big flagship models if you need really solid answers, but 70% of the stuff I ask day to day any old sub 32B model can answer just fine.
>>107800624Aren't they made out of sand?
>>107800649Yeah I stopped my sub a long time ago. Even a year ago there were more than enough alternatives.
https://x.com/ltx_model/status/2008595989096177962