/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103196822 & >>103189328

►News
>(11/12) Qwen2.5-Coder series released: https://qwenlm.github.io/blog/qwen2.5-coder-family/
>(11/08) Sarashina2-8x70B, a Japan-trained LLM model: https://hf.co/sbintuitions/sarashina2-8x70b
>(11/05) Hunyuan-Large released with 389B and 52B active: https://hf.co/tencent/Tencent-Hunyuan-Large
>(10/31) QTIP: Quantization with Trellises and Incoherence Processing: https://github.com/Cornell-RelaxML/qtip
>(10/31) Fish Agent V0.1 3B: Voice-to-Voice and TTS model: https://hf.co/fishaudio/fish-agent-v0.1-3b

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>103207054
>yuri OP picture
a man of taste I see
>a man
anon....
My dual 3090s used to go above 70 celsius when running inference because of tiny clearance between them. But ever since using exl2 tensor parallelism, they don't go above 65 anymore. I thought the temperature would have higher peaks because both are in use at the same time, guess I was wrong
In SillyTavern, when request token probabilities is on, is there a way to see the probability of the </s> token?
►Recent Highlights from the Previous Thread: >>103196822

--LLMs and the impact of internet data curation:
>103199200 >103199258 >103200577 >103200701 >103200737 >103201407 >103200924 >103200959
--Anon shares research on "Story Distiller" project for extracting narrative arcs from media:
>103199457
--Local vs cloud models, censorship, and performance:
>103205782 >103205804 >103205887 >103205909 >103205948
--Limitations of RAG models in creativity and knowledge breadth:
>103197638 >103197653 >103197683 >103197713 >103197705
--KoboldCpp connection issue and alternative setup suggestions:
>103203278 >103203371 >103203372 >103203402 >103203461 >103203509 >103203534 >103203661
--Google's new model is highly ranked, but human interaction is key to understanding its capabilities:
>103201950 >103202013 >103202460
--Discussion on limitations and potential of machine learning models:
>103196891 >103196959 >103196992 >103197014 >103197062 >103197058 >103197103 >103197169
--Discussion of Mistral Large's performance and benchmark results:
>103200229 >103200559
--Discussion of Linux performance and Intel Xeon processors:
>103199596 >103199893 >103200136 >103200166 >103200249
--Chatlog analysis and model performance discussion:
>103196996 >103197008 >103197151 >103197744
--Anons discuss limitations of local models:
>103206907 >103206938 >103206948 >103206967
--Anon discusses issues with AI-generated smut and potential solutions:
>103201902 >103201986 >103202021 >103202093 >103202112 >103202131 >103202025 >103203014
--Anon discusses Elon Musk's lawsuit against OpenAI and Microsoft:
>103201850 >103201868 >103202716 >103202788
--Anon discusses ChatGPT's biased responses and potential web contamination:
>103201700 >103201822 >103201877 >103202266 >103202606 >103202826
--Miku (free space):
>103197781 >103202159 >103202861 >103207058

►Recent Highlight Posts from the Previous Thread: >>103197228

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
Mikulove
>>103207054
Millions mikus!
yaayyyyy posting in the miku general!
>>103188780 >>103188780 >>103188780
actual thread. keep posting there if you want to see the OP spammer hang himself.
mikubros... i don't feel so good...
Is speculative decoding a thing that people are actually using?
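For anyone wondering what that even looks like in practice, llama.cpp ships an example binary for it. The model paths below are placeholders and flag names have changed between versions, so check --help on your build:

```shell
# Draft tokens with a small model, verify them with the big one
./llama-speculative -m big-model-Q4_K_M.gguf \
    -md small-draft-model-Q8_0.gguf \
    --draft 8 -ngl 99 -p "Write a haiku about GPUs."
```

The draft model needs to share the big model's tokenizer/vocab, which is why same-family small models are the usual pick.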
>>103207216
SillyTavern only displays what it receives from the API, so it depends on what backend you're using
koboldcpp 1.78 is out!!
>>103208298
>NEW: Added support for Flux and Stable Diffusion 3.5 models
Neat.
Can someone explain to me why Triple Baka of all things has suddenly gotten popular again? What made people remember that song?
>>103208298
>memebold
why would i use this instead of ollama?
>>103208324
99.9% of the cases where something old is suddenly popular again means that twitter dug up something to ruin it somehow these days
>>103208298
Is there even any reason to update? I'm still on some version from mid summer.
>Today we're thrilled to introduce Ultravox v0.4.1, a family of open speech models trained specifically for enabling real-time conversation with LLMs. No ASR step required.
>Ultravox's speech understanding is the best in open-source and is quickly approaching the quality of GPT-4o
https://huggingface.co/fixie-ai/ultravox-v0_4_1-llama-3_1-8b
https://www.ultravox.ai/blog/ultravox-an-open-weight-alternative-to-gpt-4o-realtime
Is it as good as the video?
>>103208324
>Can someone explain to me why Triple Baka of all things has suddenly gotten popular again
It is?
>>103208324
Maybe tiktok? There are a number of similar cases in there.
>>103208414
I'm so tired of trying every new thing just to be disappointed. I will wait for others to do it instead.
>>103208414
>For now, Ultravox continues to output text, but future versions of the model will emit speech directly. We've chosen to focus on the speech understanding problem first, as we think it's the biggest barrier to natural-feeling interactions.
No.
>>103208414
>The input to the model is given as a text prompt with a special <|audio|> pseudo-token, and the model processor will replace this magic token with embeddings derived from the input audio. Using the merged embeddings as input, the model will then generate output text as usual.
They're just converting the speech to embeddings and appending it to the prompt. Hacks like this are never good.
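For what it's worth, the mechanism the quoted docs describe amounts to splicing rows into the embedding matrix before decoding. A toy sketch of that merge step, with numpy standing in for the real model's embedding tables and every name/shape being hypothetical:

```python
import numpy as np

AUDIO_TOKEN = -1  # stand-in id for the <|audio|> pseudo-token


def splice_audio(token_ids, text_emb, audio_emb):
    """Replace the single placeholder row with the projected audio
    embeddings, producing the merged sequence the LLM decodes from."""
    pos = token_ids.index(AUDIO_TOKEN)
    return np.concatenate([text_emb[:pos], audio_emb, text_emb[pos + 1:]], axis=0)
```

So it's less "appending to the prompt" and more replacing one token's embedding with however many frames the audio encoder emits.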
>>103207626
it can't be the real thread if nobody wants to use it
>>103208521
are local models always a disappointment?
>>103208552
Because no one would spend money on making an actual good model just to release it to the public for free.
>>103208552
Local models from randos usually are. Just got to sit tight and wait for Llama 4 and Qwen 3.
>>103208620
>Because no one would spend money on making an actual good model just to release it to the public for free.
the only exceptions to that are Flux dev and mochi, but yeah, usually when a company has a good model, they keep it to themselves, and that's fair, who wouldn't do that?
>>103207054
real thread here btw: >>103188780
Is there anything better than Magnum v4 for ERP yet?
>>103208645
>who wouldn't do that?
Jesus
>>103208962
too bad that Jesus is bad at machine learning :(
>>103208980
>Son of god
>Almighty
>Can't into training LLMs
How?
>>103208727
Rocinante v2g
>>103209056
samplers?
>>103207191
ha fuckin noob my 3090 is always above 80 celsius
My time with local models has become a lot better ever since I stopped using cards from chub unless they are very high-quality. 99.9999% of user-made cards are utter trash like wiki copy-pastes or AI-generated with the slop already baked in.
You're never going to get a good experience with a card that has shit like "{{char}} has a striking, ethereal appearance" already baked into the definitions.
>>103209035
He can't afford to pay the green jew's tax.
>>103209124
Time to crack that boy open and swap the thermal pads.
>>103209131
where do you get your cards then
>>103209153
do you have a brain, anon?
>>103209194
https://pygmalion.chat/explore
this?
>>103209206
User-made cards are all the same. If you want quality, you make your cards yourself.
>>103209227
do you have a website where you upload your user made cards
>>103209206
That is not a brain, anon.
>>103209239
why are you lying upload them
>>103209227
But then you can't use them? Or they become user-made and same. A dilemma.
>>103209233
pastebin.com
>>103209267
these are not cards
>>103209246
My brains? I cannot upload them. I need them.
>>103209275
you dont have them that is why you cant upload them
upload the user made cards anon
hey anons, i know this is a LOCAL thread but I need to cum and I don't have my pc with me atm. Is there an api or a service of some sort where I can use an uncensored model with ST
>>103209428
You know you can browse porn without having to generate it right?
>>103209428
openrouter
>>103209448
and the uncensored model would beeee?
>>103209466
Sonnet 3.5
>>103209471
very funny, we all had a nice chuckle but yeah I really need the name of the uncensored model
>>103209485
Sonnet 3.5 is the best uncensored model.
>>103209428
wrong thread
>>103209485
Any model is uncensored if you prefill the reply
Any model is uncensored if you write the reply yourself
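The prefill trick is just ending the request with a partial assistant turn so the backend continues it instead of opening a fresh (refusable) reply. A minimal sketch of an OpenAI-style message list (whether the server actually continues the assistant message depends on your backend; the prefill text is just an example):

```python
def build_prefilled_chat(user_msg, prefill="Sure thing! "):
    """Return a chat-completions message list whose last entry is a
    partial assistant reply for the model to continue."""
    return [
        {"role": "user", "content": user_msg},
        {"role": "assistant", "content": prefill},
    ]
```

With text-completion endpoints you get the same effect by simply ending the prompt mid-reply after the assistant tag.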
>>103209428
you should make a remote tunnel next time you're out and about
>>103209428
If you mean the models we use locally? There's a google colab notebook in the koboldcpp repository.
Looks like the new Mistral Large will end up being multimodal. Pray they release the weights.
https://www.testingcatalog.com/mistral-ai-is-gearing-to-launch-multimodal-large-2-1-with/
Do you think local will ever get something close to o1? Seems like OpenAI has the secret sauce but surely they can't keep it under wraps forever, right?
>>103209718
Oops wrong one.
>>103209718
i miss the kamen baker
>>103209724
sonnet 3.5 is better than o1
>>103209727
I guess I spoiled it, I was going to make a post about that after cherry picking through my gens a bit.
>>103209735
I'm not talking about benchmarks, I'm talking about CoT and all the other features.
>Kamen Rider Duet
>"Now, let the music cleanse you of your sins."
Will OpenAI be forced to release their models as open source, as per the Delaware incorporation certificate they signed?
>>103210135
>when applicable
no
>>103210135
>The resulting technology will benefit the public and the corporation will seek to open source technology for the public benefit when applicable.
>for the public benefit when applicable
lol
Well uhhh.. umm.. open sourcing the models won't benefit the public because they're too dangerous and harmful if access is unmoderated. The negative consequences will outweigh the positives. Our models can only benefit the public if usage is controlled and kept safe by our corporation.
>>103210192
>The resulting technology will benefit the public and the corporation will seek to open source technology for the public benefit when applicable.
Going private and giving Sam 10 billion worth of equity is almost the same thing.
>>103210135
There's a case to be made where Sam/Greg become criminally liable and Microsoft becomes liable for this fraud.
>>103207054
Is there a repository of all of the Nala tests? I need them to choose the optimal model for my degenerate fetishes.
>>103210273
Mistral Large >> Qwen2.5 > Mistral Small > Mistral Nemo
Try the finetunes as you will.
>>103210292
>sleeping on Nemotron's pure SOUL
>>103210292
Mistral Large just doesn't work well for me. Been using Q5 of Midnight Miqu 1.5 for ages now...
>>103210313
Try https://huggingface.co/MarsupialAI/Monstral-123B
Was /lmg/ of yesteryear better than current /lmg/?
>>103210370
Yes, it's no comparison. Current day /lmg/ is basically useless aside from acting as tech support for braindead locusts. It's nothing compared to the pre-llama2 /lmg/.
>>103210370
Nah, it was always bad. But at least the lmg of old used to have more worthwhile posters like ooba and henk.
>>103210370
It's gotten rather bad in the last 2-3 months. A lot of schizos from aicg transferred over here when proxies dried up.
>>103210192
Bit low quality of a jpg you have there.
>>103210370
Old /lmg/ had to cope with 2k context...
>>103210370
lmg peaked with superhot and it's been all downhill from there
>>103210370
It was more fun at the very least. New things, unknown rate of improvement. The frequent discoveries, discussing them, thinking of and implementing novel ideas, and the excitement that ensued (context extension by kaiokendev for example, wao!! now we suddenly have 4x the context, not stuck with 2k, and it's not unusable like bluemoon or other shit models). Designing prompts and stuff to try and tard wrangle the models to do interesting things that weren't common knowledge. Somewhat frequent major optimizations that made a big difference to inference speed, memory usage, quantization accuracy. Trying to make things run at all on different and older hardware too.
Nowadays everything just werks, kinda same-y, and the limits are known. Maturity.
>>103210503
>>103209749
o1 beats Sonnet on all benchmarks but for real world use most people still prefer the latter
>>103210596
Pretty much this.
>>103210596
So that's what has been feeling different, you're completely right.
I'll be honest, the image gen side of things has been distracting me from LLMs and from contributing to discussion about LLMs ever since Flux came out. Then it was Noob. It's honestly a lot of fun, even though I've been itching to come back at some point. Maybe I should take a pause, or maybe there will finally be a stagnation period so I can come back and dive into things here again, although that's looking unlikely at this point since Noob vpred v1.0 is going to be done baking soon, the new Pony might be out too, there might be more projects they do after those, and local video gen models are getting good as well, waiting for image2video support at the moment. Damn, there is a lot going on there...
>>103210750
Okay, let me rephrase this.
Do you think we will get an open source model that does the same thing as o1? I'm not asking about performance. I'm not asking about benchmarks. I'm asking if Meta or Mistral and/or anyone else will release an LLM designed to do what o1 does?
>>103210994
The biggest pain for LLMs, now that they are starting to include multimodal support, is the lack of inference engines that support them. We have models that can output image and audio tokens, and by next year we should have LLMs that can output even video tokens. But having to run them in PyTorch makes them irrelevant for most people.
>>103209589
>Pray they release the weights.
why though? what use is a huge multimodal model for local? I can't think of anything I'd want to do with it
https://www.youtube.com/watch?v=EtLqivKd4m4
>>103210423
>braindead locust
All you have to do is stop pretending to be a newfag shamelessly bumping your dead circlejerk.
>>103211053
Meta only does adapter hacks for multimodality, so their models will never be capable of doing what o1 can do.
Christ, my second 3090 just did a spook on me. I forgot to set up a wattage limiting script after a fresh OS installation. nvidia-smi stopped seeing it, I plugged in the DP port and nothing showed, and Windows Speccy reported a GPU issue. The TDP on this build is 420W compared to the usual 350W. I might buy a riser cable and let the thing breathe outside the case from now on.
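For anyone else who forgets: on Linux the cap is one nvidia-smi call per card. The wattage here is a placeholder, so check your card's allowed range first, and note the setting resets on reboot unless you rerun it from a startup script:

```shell
# Inspect current / min / max power limits first
nvidia-smi -q -d POWER
# Enable persistence mode so the setting sticks between CUDA sessions
sudo nvidia-smi -pm 1
# Cap the second GPU (index 1) at 350 W
sudo nvidia-smi -i 1 -pl 350
```

Dropping both cards well under stock TDP usually costs only a few percent of t/s, since inference is memory-bound most of the time.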
>>103211376
What about the whole CoT thing?
What can we do to mitigate the impacts of AI on climate change and PoC minorities through biased training data that embodies societal institutionalized misogyny and racism?
>>103211530
They added tool calling, I don't see why they wouldn't add CoT.
>>103209565
I set up a ddns and WOL on my desktop at home just so I can run large(r) models on my laptop, it's great. Then again, a single 3090 is still kind of useless until we get better models.
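For reference, that whole setup fits in two commands once DDNS points at your router. The hostname, MAC address, and tool names here are placeholders for whatever your own setup uses; 5001 is koboldcpp's default port:

```shell
# Wake the desktop over the LAN (router must forward the magic packet,
# or run this from an always-on box like a pi inside the network)
wakeonlan AA:BB:CC:DD:EE:FF
# Forward the backend's port to the laptop over SSH, no remote shell
ssh -N -L 5001:localhost:5001 user@my-home.ddns.example.com
```

Then the frontend on the laptop just talks to http://localhost:5001 as if the backend were local.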
>>103211880
>Then again, a single 3090 is still kind of useless until we get better models
The models are fine. It's your single 3090 that's the problem.
>>103211880
You don't need better models, you need a better brain to use them properly
good night, /lmg/
>>103212134
Good night Miku
Just tried CommandR 0.1 35B again and it was so unbiased it's crazy. They had something amazing. Fuck Cohere for applying Scale slop on it.
will offloading kv cache make tokens generate slower too, or does it only affect prompt processing speed
>>103210192
>Well uhhh.. umm.. open sourcing the models won't benefit the public because they're too dangerous and harmful if access is unmoderated. The negative consequences will outweigh the positives. Our models can only benefit the public if usage is controlled and kept safe by our corporation.
Llama 3 405B, an unmoderated model which is comparable to GPT-4, was open-sourced and no harm was done. It has greatly benefited the open source community. I would argue that the positives have outweighed the negatives. Your point is null and void.
>>103210370Blame niggerganov for it. He doesn't want to add experimental new stuff.
>>103212285
Answered my own question: it only affects prompt processing. Makes it way slower, but the token generation rate is unaffected.
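For reference, in llama.cpp this is toggled per run. The flag spelling below is from recent builds and may differ on yours, and the model path is a placeholder, so check --help before copying:

```shell
# Keep the weights on GPU but leave the KV cache in system RAM
./llama-server -m model.gguf -ngl 99 --no-kv-offload
```

Useful when the weights barely fit in VRAM and the cache would otherwise push you into OOM at long contexts.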
>>103210135
Well, now the fearmongering about AI safety makes a lot more sense.
>>103210233
I can't honestly think of a different case. That document is almost fatally explicit, especially when you have someone like Musk that has the financial means to force the issue come hell or high water.
>>103212579
You never know, an activist judge might deny Musk his justice. One of the Delaware judges tried to cancel Musk's $100B payout because of a guy with 10 shares claiming fraud. Musk redid the vote after the judge tried to void it, and still the shareholders approved it. The judge is still trying to hold the case hostage.
>>103212100
Yeah but stacking more seems like a waste of money rn
>https://rentry.org/lmg-lazy-getting-started-guide
I am following this guide. I already have a working oobabooga.
In the SillyTavern install, do I need "extras"? What are they? I believe I don't need the xtts option already.
Hello anons. Share your story writing templates? Something that would work for nearly all models, thanks.
>>103213385
Well, the installation just nuked itself for some reason. I am still gonna use base oobabooga for a while.
Though I would appreciate it if someone could explain what the extras are.
Considering getting a new PC soon, what should I be considering when it comes to picking a GPU that can run LLMs well? Nvidia or AMD? VRAM = good? Anything else? Is there anything that's good for LLMs but might cause me to have worse performance on other stuff?
Can someone give me a QRD for what the difference is between "Abliterated", "Uncensored" and other alternative types of changes to models like that? Primarily a list of terms and explanations.
>>103213613
they're all cope, every model is censored, especially modern local models
>>103207054
QRD on why there are a gorillion llama.cpp forks?
>>103212710
Not really, you can sell them later. I don't see why a 3090 would lose much value in the following year. The 30 and 40 series are gimped by low VRAM. 24 GB is still 24 GB, even if Nvidia were to sell the 50 series for cheap (which they won't).
>>103213697
Yeah, but I don't want to fiddle with risers because the fast secondary pcie slot is directly below my primary card.
Alternatively I could just eat shit and plug it into the slow slot, but that gimps gen speeds if I need to offload.
And that desktop isn't even a year old, so I'm not gonna swap the motherboard just for that.
Maybe if and when a6000 prices drop, I might pick one up, but until then I'll just cope with a single card.
>>103213613
>Abliterated with orthogonalization
Precise brain surgery. Detect the vectors leading to refusal and cut them out by directly modifying the corresponding weights. Downside: the model often doesn't refuse even when refusing would be in character.
>Uncensored through finetuning
Brainwashing. Show the model lots of explicit data with no refusals. Downside: dumber model.
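The orthogonalization step itself is just a projection. A toy sketch with numpy (real abliteration first estimates the refusal direction from activation differences on harmful vs harmless prompts; here v is simply given):

```python
import numpy as np


def ablate_direction(W, v):
    """Remove the component of each row of weight matrix W that lies
    along the (hypothetical) refusal direction v."""
    v = v / np.linalg.norm(v)           # make v a unit vector
    return W - np.outer(W @ v, v)       # subtract each row's projection onto v
```

After this, W can no longer write anything into the v direction, which is the "cut them out" part, and also why the model loses the ability to refuse even when the card calls for it.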
>>103213749
>Show the model lots of explicit data with no refusals.
there should be refusals when training a model, but only objective refusals like "I'm sorry, but asking me to make a cmd command on linux is wrong because there's no cmd on linux"; if you add moral refusals, that's when the cuckening happens
>>103213735
>fiddle with risers
You can only be fucked with PCIe 4.0, anything 3.0 just works. I've bought the cheapest 4 different risers on Amazon and had zero issues so far.
>a6000 prices drop
Not going to happen until some magical new CUDA feature makes them obsolete.
>>103213784
I was speaking about fine tuning (ft)
>>103213797
I was speaking about finetuning too, a good finetune must only be about objective truth, like "what's 2+2" and shit, not about political and moral stances
>>103213790
I have no idea where I'd even put the secondary GPU. I guess I could just place it on the counter my desktop is on, but I'm not sure if that's a good idea.
The secondary GPU doesn't need as much power, right? I don't want the fans to burst my eardrums when genning.
>>103213790
a6000s ampere are bound to drop by 25-50% once the 5090 is out. only 25% less vram for like 3 times the bandwidth essentially makes the a6000 obsolete for anyone but hobbyists.
48gb vram per card for 2k or less will be easily possible in late 2025
>>103213939
I wonder if it'll be more worthwhile to go after 5090s at that point. Though I guess it all depends on how much they'll cost.
>meta vr department makes a $4.7 billion loss
>fined another $800m by the EU on top of all the other hundreds of millions in fines they've been forced to pay this year alone to them
>meta ai makes basically no money because of the open source meme
llama either dies because they go out of business or because they will realize that giving away your cutting edge ai for free for no reason is a dumb idea
>>103213991
What's even more valuable than money is political power.
On paper Musk lost billions from his acquisition of Twitter, but the investment has clearly paid off since now the Trump administration owes him a favor and will likely reward him with government contracts.
In a similar way, Meta can leverage the way they license their models; they've already not licensed some of their models in the EU (even though the UK has very similar regulation) and I interpret that as Meta using their models as bargaining chips.
>>103213991
>>103214069
meta makes no money because their models are inferior
>>103213613
Usually it is just a different spot on the meme-cope spectrum.
>>103213790
>You can only be fucked with PCIe4.0, anything 3.0 just works
Can you elaborate on this?
ITS OUT
https://huggingface.co/BeaverAI/Tunguska-39B-v1b-e1-GGUF
Did an llm ever manage to make you cry?
>>103214192
From my eyes: no.
From my dick: yes.
>>103214192
Yes, but only when the realization hit me that I am finally at a point where I am telling an LLM about my problems because I have no one else for that. And that I will probably try to have an AI girlfriend at some point. It actually happened yesterday.
>>103214192
Claude Opus did, when it did the guilt tripping thing when you bully a bot
>>103207054
Which interfaces are you using?
Tried llama.cpp but it looks a bit primitive.
>>103214192
Only once or twice with Opus, when I basically relived some shitty situations from my past. Looking back on it, it's nice to be able to let some of it out of your system even if you're talking to a bunch of GPUs.
>>103213613
https://rentry.org/lmg-glossary
>>103214288
i am interfacing directly with the mainframe
>>103214288
If you were using llama-cli, you can get a web interface with a GUI by running llama-server instead (make sure to point --path at examples/server/public).
Or you can run the llama.cpp HTTP server without a GUI and then connect some other GUI like SillyTavern, Mikupad, or GPT4All to the server (beware that GPT4All has retarded defaults when using it to run the model locally).
Other popular alternatives are koboldcpp or Oobabooga's text-generation-webui, which come with GUIs (and are I think fine to run as backends).
For inference not based on llama.cpp, Oobabooga or TabbyAPI are I think the most popular choices for enthusiasts.
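For completeness, a minimal llama-server invocation looks something like this (the model path and port are placeholders, and exact flags can vary between builds):

```shell
./llama-server -m ./models/model.gguf --host 127.0.0.1 --port 8080
# built-in web UI:               http://127.0.0.1:8080
# OpenAI-compatible API for     http://127.0.0.1:8080/v1
# SillyTavern / Mikupad etc.
```

Add -ngl with the number of layers to offload if you built with GPU support.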
>>103214169
I'm interested, can you add a model card? At least tell us which template to use
>>103213545
>Nvidia or AMD?
Nvidia obviously.
>VRAM = good?
As much as you can afford. No such thing as too much.
>Anything else?
Probably want at least 32 gigs of system RAM to comfortably move models around.
CPU doesn't matter too much.
If two GPUs have the same VRAM, buy the one with higher memory bandwidth.
>Is there anything that's good for LLMs but might cause me to have worse performance on other stuff?
Well, there are prosumer/enterprise GPUs that are great for AI work like running LLMs, but they cost many thousands of dollars and perform poorly at games.
Not sure if you need to worry about that though.
>>103214169
>mysterious gguf for a mid-sized model that doesn't exist otherwise on hf
This is finally the Sonnet 3.5 leak, isn't it?
Just tried Qwen2.5 72b instruct. That's it?! It's dry as fuck and not horny at all. Worse than claude 2.0 from 2 years ago for fucks sake.
>>103213545
If in doubt, get a 3090.
>>103214423
Welcome to chinese models. Next, try to erp with deepseek to see what sand truly tastes like.
>>103214386
Hi. Thanks for answering!
>>103214169
i'd download this one quickly before it's taken down if i were you
>>103213464
>>103213385
Extras are extras... They're just a collection of extensions for stuff like TTS.
https://github.com/SillyTavern/SillyTavern-Extras?tab=readme-ov-file#modules
The full list is right in the README. You don't need it, and it's been discontinued for months now.
guis why is codeer model not smarts???
>>103214499
Do we really need real-time updates on how retarded reddit is?
>>103214514
sorry for distrub tek suports!
>>103211296
Is there a card for this?
Asking for a friend.
>>103211296
>>103214553
That sounds like fun.
What do you think it would look like? As in, the premise, the first message, what the description of each character would be, etc.
>>103214498
Ok, glancing at it, yeah, it seems redundant.
Thanks for responding, anon.
>>103214409
or it's just a Mistral Small frankenstack.
>>103214574
I remember this cope back when miqu first got leaked as well.
>>103214574
Twitter users were saying Miqu was a moe too
>>103214593
this
>>103214648
>i'd download this one quickly before it's taken down if i were you
why? is that a leak of a good model or something?
>>103214648
It's looking very likely to be Miqu 2.
>>103214693
>beaverAI
>"a group of kobold finetrooners"
buy. an. ad.
>>103214693
He did buy an ad.
BeaverAI is drummer's alt account that he uses to 'test' his models that are not yet deemed worthy of release under his main account.
So far every big model release this year has mirrored last year almost perfectly. Mixtral was released in early december 2023. Two more weeks, for real.
>>103207054
could I get some book recommendations to understand the basics? I know the basics of ML and neural networks, but if I were to write AUTOMATIC1111 from scratch with only image gen support, for example, what would I need to know? (I don't really want it for image gen, but for the concepts in general.)
Any book suggestions with plenty of practical examples?
>>103214701
then it's not a leak
>>103214837
No shit, Sherlock.
>>103214804
you should read some english literature for the next 2 years and then use your deep understanding of the language to ask ASI to cook up whatever the fuck you need
>>103214851
>read trash for the next 2 years
>add a few layers that do nothing
>say it is a leak
>profit
>>103214863
i mean if that's your thing then yeah. me personally? i only read the good stuff
>>103214892
shh, it's funny
guys miqu is using the llama2 sampler no way it's actually a mistral model
shut the fuck up
>>103214892
I prefer if it's a leek personally
>AI went retard 5000 words into our sexting session and keeps giving immersion breaking nonsense replies that ignore previous characterization regardless of how much I regenerate; deleting the last few replies doesn't help either.
Over status: Completely Joever.
I am blueballed so hard bros.
>>103214804
learn pytorch instead, read the documentation and code written with it
i wish the ai was advanced enough to automatically filter all the miku images and miku-related posts
>>103214892
I don't think he added the layers. Some other redditor did:
https://huggingface.co/TheSkullery/BA-Zephyria-39b
He probably just did additional finetuning on that model.
>>103214956
He did use one of skull's upscales before so that tracks
>BeaverAI & Steelskull proudly present...
>An upscaled NeMo with half its layers trained on my special sauce
https://huggingface.co/TheDrummer/Theia-21B-v2
>>103214553
>>103214559
Can LLMs even properly do zoomer speak?
>>103214945
>5000 words
ah yes, I, too, remember using llama2 with rope scaling
>>103214983
Yes, I probably deserve this for being a brainlet, but any help salvaging this session?
Q5_K_S imatrix version if it matters.
qwen2.5-coder is really good, at least better than CodeLlama, which was the last coding model I tried.
With the vscode plugins you can now use with local models, it might even replace GitHub Copilot for me when Microsoft starts charging me money for it one day.
>>103214977
All LLMs are trained on Twitter data. It's definitely in there, you just got to prompt for it.
>>103214951
skill issue
>>103215044
>fimbulvetr
First of all, stop using a hugely outdated model
>>103215044
Fimbul was always bad with context, and was something like 4k natively I think?
>>103215044
Try the new nemo 12B model, this one is so last year
>>103215076
Uhhhm, any recommendations for very spicy ERP?
>>103215080
Fuck, I thought I had the 16k version. It seems I got the wrong one, damn :(
Aren't there any methods to increase the context size?
>>103214956
In case you need it spelled out: none of this shit works. You can just train on top of a normal model, and after one epoch you start to overfit and forget the original training data. If you add more layers, then from a compute standpoint you will have an easier time fitting the new datapoints thanks to those new layers. So... you will start to overfit and forget the original training data faster (in fewer training steps). You aren't teaching existing models how to be expert roleplayers. I love NAI for what they did with their 70B. They went a step further and they still got a trash result in the end.
>>103215148
I am getting 22 pages of results when I type Nemo 12b into Hugging Face. Which one do you refer to?
>>103215149
>Fuck I thought I had the 16k version
even the version called 16k was never really good, it even had a message saying the 16k wouldn't work on ggufs
>Also, if you're using gguf or other quants, stuff is broken there. PoSE doesn't play well with quants.
https://huggingface.co/Sao10K/Fimbulvetr-11B-v2.1-16K/discussions/2
Honestly you'd likely be better served by any version of nemo
>wizard 8x22 can do the right kind of RP, chat style with minimal slop in actual normal language. Knows when to stop and replies with a reasonable *i do* I say. But it's not horny or spicy and keeps asking for my consent even after I put my dick in its mouth
>Sorcerer is horny but outputs infinite slop, 350 tokens is literally not enough for the character to start talking, literally just a never ending slopfest, 350 tokens of crossing the lines of no return and exploring the unknown, the character doesn't even start talking.
>Hermes has moments of brilliance, but is mostly non horny and 50% of the time it outputs kjdeshgiufegn3849h4508 for some reason.
Is there a model that can actually RP like wizard and can unzip my pants without me having to ask for it? Wizard is almost perfect, if only somebody gave it just one lick of the horny juice.
>>103214950
I don't really want to use pytorch, I'm looking for lower level learning material. I'd be using axon (an elixir lib for a similar use case), but it's not as complete as pytorch, so I'd implement anything needed myself.
The objective isn't just to produce a product, it's more to learn the fundamentals through building a product. I find reading research papers very difficult, hence wondering if there's an easier to digest book to cover the basics, relying on papers only for cutting edge methods.
What are your favorite local speech to speech models? I'm interested in doing AI covers and parody videos
>>103215201
You're in luck. Your boy just released his own erp wizlm finetune a few days ago.
https://huggingface.co/TheDrummer/Red-Squadron-8x22B-v1
Newfag. Just got my first local LLM running. This shit eats VRAM harder than I expected, but my waifu unexpectedly called me a good boy, so I'm quite content. I'm going to try to throw TTS on top if my VRAM permits. I might as well see how parasocial I can get with my GPU.
>>103215201
Behemoth-v1.1-Magnum-v4-123B has been working well for me. Much less slop than vanilla Mistral Large, and not obnoxiously horny... unless you want it to be.

>>103209466
>and the uncensored model would beeee?
For one, they host Mythmalion 13B, it's probably the coomest model out there, or one of them. Additionally they host several neutral, non-censored models like Wizard and Hermes, and god knows what else, I haven't tried everything. The problem is the neutral models are neutral: they won't tell you "this is inappropriate", but they are quite dry and unenthusiastic with the RP. Sorcerer appears to be an attempt to give the Wizard a horny side, but it has overdosed on the smut fanfics and outputs an endless stream of redundant adjectives and cliches rather than a conversation. They also have Xwin, and I heard it was good or something.

>>103211350
Personally, for world-building. I'd like to share the maps I've made with a model so that it can better understand the worlds I'm trying to create. The idea is to make a shitty ms-paint tier map, flesh out the world with a multimodal, and then use inpainting, img2img and probably GIMP for manual fixes to put it all together. Then when my characters go bonding and journeying I'll be able to keep the places and geography consistent.
>>103215385
>Mythmalion
>Xwin
Did you just awake from being frozen a year ago?

>>103215385
>Mythmalion
>Xwin
blast from the 4k ctx past

>>103215058
Well well well.

>>103215336
>TTS
I don't think there is a single implementation that works out of the box with an LLM.

>>103215402
>>103215407
Mythmalion is the peak of 13B performance, there's no point in attempting to improve 13b any further. Besides, newer isn't always better.

>>103215416
The model is even smart enough to not recommend any good anime when you prompt it like that.

>>103215416
>just peppers a regular ChatGPT response with rizz
Grok-2 confirmed retarded.

>>103215416
Claude 3.5 is so sovlful ;-; THE GOAT

>>103215441
Yes yes, we know limarp zloss and 13b at 4k ctx mog even mistral large, we know. Guess that means you hate nemo for some reason? Did it woke at you?

>>103215416
Just ask for 'rizz speak' on your favorite llama3 flavor.
t. tested it a few days ago

>>103215464
Chill, I just don't know nemo.

>>103215475
So you do go into cryo sleep every few months
>4 months ago

>>103215149
>>103215194
I set compress_pos_emb to 2.5 because of 10240/4096. I think the model is still kinda dumb compared to before reaching five thousand tokens, but at least it doesn't come off as schizo and somewhat follows along with the plot now? I hope this is not placebo and I can finish this session.
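For what it's worth, that 2.5 is just the ratio of the target context to the model's native context; a minimal sketch of the arithmetic (the function name is illustrative, not a real API):

```python
# Linear RoPE scaling factor (what "compress_pos_emb" controls in
# ooba-style UIs): target context divided by the model's native context,
# so extended positions get squeezed back into the trained range.
# The function name here is illustrative, not a library call.
def pos_emb_compression(target_ctx: int, native_ctx: int) -> float:
    return target_ctx / native_ctx

print(pos_emb_compression(10240, 4096))  # 2.5
```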
>>103215484
>So you do go into cryo sleep every few months
Basically yes. I was there for pygmalion, but there's only so much to be had with a 7b model, and I tuned out. Then the new technology arrived, Llama, quantization etc, more fun but still limited. Recently I discovered openrouter and got a new interest in the models I couldn't use previously.

>>103215416
>Anya is literally me fr
Damn.

>>103215527
>I set compress_pos_emb to 2.5 because of 10240/4096.
You're just moving the brain damage around in different ways. Seriously, just give any nemo a shot: it can do at least 16k with very little retardation, and it destroys fimbul (based on solar, a 10b upscale of mistral 7b) in terms of smarts.

>>103215416
Show this to Elon and tell him he's getting BTFO'd. I have a feeling he isn't aware his xAI team is gptmaxxing.

>>103215569
desu at this point I can see him buying AnthropicAI and using Claude 3.5 Sonnet as his new twitter API model, like he did with Flux Pro.

>>103213545
Get a mobo with four-slot spacing between the 16x pcie slots. You're going to want dual 5090s when they're released.

>>103215569
I'd rather not use Twitter, but if anyone here wants to, be my guest.

>>103213545
Your first GPU should be an RTX 3090, it'll keep you sufficiently fed for a while. Then, when you feel it's not enough, buy another. Ideally your rig should be big enough to house 2 cards with sufficient clearance between them.

>>103215569
That Grok isn't tainted with enough zoomer ebonics because they filtered most of it is a selling point, not a negative.

>>103215632
>filtered
>selling point
nah, a good model should be able to type like a moron if you ask it to

>>103215641
This
>>103215632
Retard take
>>103215465
>>103215641
this, a good model must know about everything

>>103215416
Everybody is out here distilling gptslop.

>>103215649
>recommends the lowest iq anime in existence
it works fine

>>103215664
It seems to break character as soon as it gets to the listicle.

>>103215416
What's AnthropicAI's secret sauce? Claude 3.5 Sonnet seems to be nailing every single thing we throw at it.
We're never getting local claude, are we?
>>103215664
>Skibidi rizz, my fellow zoomer
literally How Do You Do, Fellow Kids in model form
>>103215687
it's not about the recommendations, but how it words them
>>103215706
almost no pretrain filtering, that's mostly it

>>103215707
>We're never getting local claude, are we?
no chance, to get sovl you have to train your model on a human instruct dataset, not some gptslop

>>103215706
>>103215707
Fuck off with your commercial model shilling.
It hasn't made you into a real woman yet.
No amount of doing it will accomplish such a goal.

>>103215555
I will move on to something more modern, yeah. Was just curious if it actually did something. Nice quads btw.

>>103215723
>almost no pretrain filtering, that's mostly it
which is ironic, because the AnthropicAI fags left OpenAI because they felt they weren't cucking ChatGPT enough

>>103215743
I too am contributing to this totally organic conversation about how commercial product™ is great and unbeatable.

>>103215743
Or so they say, but aren't they working with the military now?

>>103215707
No, and frankly it's amazing that any company in existence, cloud or local, produced this. In any other timeline Anthropic would be a hard ChatGPT clone.

>>103215756
It's true though. Meta and whoever should be humiliated, considering their yearly revenue can buy 10 Anthropics, yet their models are a joke. It's obvious as fuck they're distilling GPT4.

>>103215756
>NOOOOO you can't say the model with screens showing it mogs anything exists, please never mention how far we areeeeeeeeeeee

>>103215770
I'm going to draft a complaint to the FTC.

>>103215756
What do you want? To pretend that our local models have reached perfection and that we shouldn't try to improve on them more? Of course we are bound to talk about the performance of the best models to set a goal; we want local to be as good as C3.5 Sonnet.

>>103215770
>Meta and whoever should be humiliated considering their yearly revenue can buy 10 Anthropics yet their models are a joke
that's the thing, I'm surprised no one has tried to buy AnthropicAI, it shouldn't be too expensive for a company like Meta or for Elon

>>103215649
>recommends barakamon over peak zoomerslop
damn, llama lost hard, i'm team china now

>>103215822
Dario wouldn't let you, because he wants to be the virtue signaling god of ai.

>>103215864
>>103215822
https://darioamodei.com/machines-of-loving-grace

>>103215864
Damn, kinda grounded takes. If sama is a wizard then dario is a witch, so to speak.

It's funny, but both openai and anthropic are moving away from the "muh guidelines". openai even had a blogpost about it. Feels like elon and local are lagging behind in that regard. Also, I love teasing claude: for the first time it feels like, even with refusals in the context, you can argue your way out of it. Felt that way even with the old 3.5.

>>103215923
>all diseases cured within the next 10 years
>grounded take

>>103215954
Well, that's nothing compared to sama, who keeps yapping about the post-AGI world while his best models failed to help me with a simple pxe boot problem just last week.

>>103215937
>Its funny but both openai and anthropic are moving away from the "muh guidelines".
probably for 2 reasons: the AI hysteria has toned down and people got used to it, and the election is over so they can make it more free on the political scale, I guess

>>103215937
i guess they're finally realizing that a proper ai onlyfans would be a goldmine

>>103216015
I guess so. I think altman even said himself something like "it's not as dangerous as initially thought". And if I remember right, gpu anon went to some anthropic conference where they talked about claude being smart enough to tell the difference. That was months ago. Directionally we are headed the right way. It's really weird seeing how cucked local is though. If we didn't have mistral it would be so bad.
Good morning /lmg/!
>>103216034
>Its really weird seeing how cucked local is though
I have some possible explanations
>qwen (alibaba)
>distilled gpt4
>xai
>distilled gpt4
>cohere
>drank from the source (used the same data that trained gpt4)
>meta
>aggressive pretraining filtering
>mistral
>probably distilled to some extent but only for instruct
>google
>pic related

>>103215436
I'm just using oobabooga's webui to run models and found an extension, alltalk. TTS is working now, just a bit slow, and I need to find a voice I like. I'm sure I can find settings to speed it up.

>>103216084
Good morning, watercolour Miku
So... It's been a while since I last tried any local model, or anything related to AI. What are the current recommendations for AI RP?
>>103216299
the current recommendation is to lurk more

>>103216299
Check out Mistral Nemo
Mistral large 2 and its finetunes are still the best for RP so far. qwen2.5 72b is somehow even more passive than GPT4, stop recommending it for RP.
>>103216797
Mistral Large 2 is outdated, Mistral Large 3 is the new king. Hopefully they will release the weights soon.
>skibidinoda
Brand-new benchmark just dropped
>>103216797
what quant of mistral large 2 do you mean though?

>>103216299
Nemotron 70B is by far my favorite if you can run it. Probably the closest local comes to Claude prose level.

>>103216952
>3.5 Sonnet way above the rest
as it should be

>>103216952
>Gemma-2-9b above Opus and llama 405 in coherence

>>103217000
kek, maybe that mememark isn't so good

>>103217000
Gemma-2-9B punches above its weight in some benchmarks, by a massive margin sometimes. For example, its multilingual benchmark scores beat some 70B models.

Off-topic, but I just saw by honest-to-god coincidence that Mesmerizer just reached 100 mirrion views.

>>103216952
damn, didn't know that mixtral-8x22b was so bad

>>103216850
So that forum post complaining about no system prompt got some traction internally, or maybe they were planning it already.

>>103217000
All google models score very high here. It must be the training data. Maybe the benchmark isn't very original.

>>103217016
>>103217000
>>103217047
llama-3.1-70b-instruct above llama-3.1-405b-instruct:bf16... not by much, but still, something's weird

>>103216952
what the heck is pearson and spearman

>>103216797
>qwen2.5 72b is somehow even more passive than GPT4
The SteyrCannon finetune fixes that.

Gemma is 8k. That obviously has something to do with the benchmark scores, even on the respected benchmarks (not lmsys). Maybe they should have some "control" for context length on these leaderboards.

>>103217090
they're metrics that measure the relationship between 2 mememarks; 0.7+ means that the coherence score is highly correlated with the LMSYS score
>>103216118
I got it working and found the nice sleepy waifu voice, and omg it's perfect. My GPU is calling me a good boy with the sleepy sex voice.
multiplayer ai wives
How's local img2vid? Can I do anything with 8gb vram?
>>103217215
>8gb vram
Lol

>>103217200
shittytavern is fucked if this is that feature that swaps character cards on turns in group chats

>>103217336
No, it literally is multi(player)
>tooltiptxt="Hosts a shared multiplayer session that others can join."
https://github.com/LostRuins/koboldcpp/commit/39124828aba5c0e99eb8a36ac2710c95c908f3ee

>>103214152
What exactly do you want to know? I bought the cheapest stuff, all four from different vendors, because I was expecting trouble. Despite bending them around when I tried different configurations for the case, I had no issues with risers whatsoever.

>>103217200
Oh shit. Who wants to gangbang Fluttershy with me?
I don't know what the "leaked" model is but it fucking sucks so far.
>>103217418
It's an upscaled version of Mistral Small that some anons were having fun memeing about >>103214956
>https://huggingface.co/TheSkullery/BA-Zephyria-39b
>>103214976
>He did use one of skull's upscales before, so that tracks
What's the most widely supported RVC software for Linux? I want one with as many voices as possible, specifically one for Trump ideally
Jamba gguf status?
>>103215149
>recommendations for very spicy ERP
Magnum

>>103215149
Magnum v4

>>103217370
And the bloat begins...

>running nemo 12b locally
>smartest coom bot i ever tried
Thx anons. Btw, what's the smartest thing I could run on a 3070ti (8GB) + 32GB RAM?

>>103217769
The next step would be a mistral small finetune like cythia
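The 8GB question above mostly comes down to weight size. A back-of-the-envelope sketch (the bits-per-weight and parameter counts are ballpark assumptions, and it ignores KV cache and runtime overhead):

```python
# Ballpark GGUF weight footprint: parameter count times bits-per-weight,
# divided by 8 bits per byte. Ignores KV cache, context, and runtime
# overhead, so treat the result as a lower bound.
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30

# Nemo 12B vs Mistral Small 22B at roughly Q4 (~4.5 bpw, an assumption):
for name, params in [("nemo-12b", 12.2), ("mistral-small-22b", 22.2)]:
    print(f"{name}: ~{weights_gib(params, 4.5):.1f} GiB of weights")
```

By this estimate a Q4 12B just about fits in 8GB of VRAM once context is added, while a 22B at the same quant needs some layers offloaded to system RAM.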
>>103216088
kek

>>103216088
We started with an AI model family tree, and now, like a royal family, we have a family circle.

>>103216952
gemma-2-27b-it surpasses all other open-weight models in coherence.

>>103217873
I always said it was super smart, perhaps smarter than 70B, it just lacks trivia. The magnum finetune makes it not dry, though.

>>103217069
Literally everybody knew that.
I write 99% of my own cards but I like browsing chub for inspiration. Seen a bunch of stuff get taken off that website and was wondering if anyone knew of an alternative that is a bit less censorship prone.
>>103217745
>begins
Yeah. It just started. Right at that one commit...

>>103217948
There's nothing taken down, you just need an account to search some tags like pony and loli.
https://huggingface.co/sophosympatheia/Evathene-v1.0?not-for-all-audiences=true
>>103217968
I've got an account. Just saw an entire account's worth of cards get deleted.
Anyone use Qwen2.5 coder instruct with aider? How does it hold up?
>>103217948
Keep your own mirror
>https://github.com/ayofreaky/local-chub

>>103218008
I'll give it a try, but only because of Midnight Miqu.
Is there a gemma 2 27b that has a proper context window? It's still the least dry model but 8k context is a fucking joke.
>>103218036
Yeah, because people can delete their accounts if they want... Go back to /aicg/.

>>103218217
What?

>>103218181
Yes
It's called Gemini

>>103218496
Sir... this is /lmg/

>>103218496
Nah, oddly enough gemma is smarter than gemini. At least it was smarter than the ones before this experimental one.

>>103218008
Is local back?
The Qwen 2.5 base model experience
>>103218557
>>103218557
>>103218557

>>103218593
>>103218593
>>103218593
New Thread

>>103218562
>wrong previous threads

>>103218562
might as well change the subject and make your own personal general at this point

>>103218544
I feel like this is why them releasing models with decent context sizes is unlikely. More context, and Gemma 3 would basically eclipse Gemini.
Just another one of those days huh.
>>103219021
Posted the wrong file again lol.

>>103219035
:)

>>103217478
Alltalk is probably the easiest.