/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101757601 & >>101749053

►News
>(08/05) vLLM GGUF loading support merged: https://github.com/vllm-project/vllm/pull/5191
>(07/31) Gemma 2 2B, ShieldGemma, and Gemma Scope: https://developers.googleblog.com/en/smaller-safer-more-transparent-advancing-responsible-ai-with-gemma
>(07/27) Llama 3.1 rope scaling merged: https://github.com/ggerganov/llama.cpp/pull/8676
>(07/26) CyberAgent releases Japanese fine-tune model: https://hf.co/cyberagent/Llama-3.1-70B-Japanese-Instruct-2407
>(07/25) BAAI & TeleAI release 1T parameter model: https://hf.co/CofeAI/Tele-FLM-1T

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
►Recent Highlights from the Previous Thread: >>101757601

--OpenAI cuts GPT-4o price amidst competition from Anthropic: >>101760015 >>101760089 >>101760156 >>101760127 >>101760161 >>101760420 >>101760540 >>101760702 >>101760777 >>101760661
--How to prompt LLM to call external APIs using function calling: >>101762712 >>101764500 >>101764831 >>101764932
--Anon generates responses for lmsys_chat_1m_clean dataset with GPT-4 and Claude 3.5 Sonnet: >>101762642
--Anon discusses 405B model and providers, with some anons preferring local models and criticizing Together's prices and reputation: >>101759055 >>101759065 >>101759099 >>101759133 >>101759275 >>101759284 >>101759434 >>101759474 >>101759515 >>101759573 >>101759612 >>101759717 >>101759903 >>101760068 >>101759548 >>101759777 >>101759863 >>101759929 >>101760063 >>101760095 >>101760112 >>101760225 >>101760114
--OpenAI used muP before others, possibly related to µTransfer technique: >>101763343 >>101763361
--Lambda GPU Cloud pricing discussion: >>101763607 >>101764189 >>101764197
--Idefics3-8B-Llama3 model supports multimodal tasks: >>101766156 >>101766174 >>101766250
--FLUX.1 video model and AI development discussion: >>101757726 >>101757742 >>101757804 >>101757844 >>101757810 >>101757849 >>101757943 >>101759880 >>101760044 >>101760092 >>101763158 >>101760094
--Dan Hendrycks trying to push finetuning resistant method as law mandated: >>101760368 >>101760390 >>101760444 >>101760613
--8b draft spec decoding performance and small vs large model discussion: >>101764345 >>101764353 >>101764408
--ZLUDA project taken down, to be rebuilt from pre-AMD version: >>101759347
--Lambda's 8x H100 deal is cheaper than RunPod: >>101763416 >>101763594 >>101763596 >>101763626 >>101763629
--Anons discuss 405B and Mistral Large models on openrouter: >>101758409 >>101758472 >>101758750 >>101758846 >>101758785 >>101758989
--Miku (free space): >>101761322 >>101764185 >>101765085

►Recent Highlight Posts from the Previous Thread: >>101757610
what is the best programming model that can fit in 12gb of vram?
>>101767160gemmasutra 2b
>>101767164>2bin the trash it goes
https://www.cerebras.net/cerebras-customer-spotlight-overview/spotlight-aleph-alpha/
>>101767259Fine.. fine... what about it?
>>101767164Can you shut the fuck up about that amateur clown? His models are shit
>>101767356>His models are shitproof?
>>101767356It was a joke, anon, jesus fuck... remember when people recommended phi2? remember when people recommended tinyllama? no? oh....
>>101767356hi sao
So, what are you guys doing that warrants a local model?
>>101767379c.ai doesn't let me coom properly and no fucking way i'm paying for tokens lmao
>>101767379Reliability and principle. Once a model is released, it cannot be made worse than at launch. And i don't like online services. I like owning what i have. As for the use, i just find them interesting.
>>101767379grooming a 14 year old
>>101767456Your wife is spilling out the beans on reddit!!!
https://stovemastery.com/what-causes-a-gas-stove-to-explode/
does anyone know what model writes these? I want to rp with it
>>101767456just read https://vndb.org/v415/ instead of shitty aislop
>>101767497
3.5 turbo?
>>101767160Nemo or mini-magnum. You'll have some of it on regular ram probably but it should run great.
>>101767555>use this finetune trained on rp logs for coding
>>101767356This. We should be talking more about InternLM 2.5 20B instead. This model beats Gemma 2 27B and comes really close to Llama 3.1 70B in a bunch of benchmarks. 64.7 on MATH 0 shot is absolutely insane, 3.5 Sonnet has just 71.1. And with 8bit quants, you should be able to fit it on a 4090.
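The "8bit on a 4090" part checks out on weights-only arithmetic. A rough sketch (the parameter count is approximate, and KV cache / activation / runtime overhead is ignored, so in practice it's tighter than this):

```python
# Back-of-envelope, weights-only VRAM estimate for an 8-bit quant.
params_billion = 20.0   # InternLM 2.5 20B, roughly
bytes_per_param = 1.0   # 8-bit quantization: one byte per weight
weights_gb = params_billion * bytes_per_param
fits_24gb = weights_gb < 24  # a 4090 has 24 GB: tight, but the weights fit
print(f"~{weights_gb:.0f} GB of weights; fits in 24 GB: {fits_24gb}")
```

Context and batch size decide how much of the remaining ~4 GB the KV cache eats.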
>>101767593I approve of this post
>>101767589Missed the programming part. My apologies.
Thank you
>>101767593buy an ad
>>101767379holy shit when I looked at the thumbnail of this I could have sworn I saw the words "child pussy" but then I zoomed in and it was normal. you guys saw it too right
>>101767715No. You're paranoid. It's a perfectly normal picture.
>>101767715Absolutely DISGUSTING!!!! This should be deleted IMMEDIATELY, whoever posted this must be sick and depraved! God Bless America
>every model i don't like (aka i can't run it because i'm a vramlet) is a shill post
>feeding
>>101767379I can't imagine being a poorfag and having to deal with retarded shit like this
>>101755678
Alright, okay. So, dolphin-2.9.3-mistral-nemo-12b.
During my test battery, there's a moment where I ask the model to create a list of people, then in the next message I ask it to create a lewd story featuring that character.
Celeste (1.6) and mini-magnum did stellar at that point, with other models failing to follow the prompt or to make a good story. dolphin so far seems to fall in the latter case. It writes the story but it defaults to keeping it short, doesn't write much detail, and it's extremely evasive with sex stuff.
I'll try prompting around it and see if I can extract good results in that aspect before continuing.
One good thing about it is that I don't have to cheat by adding "Sure, {{user}}", "Continuing", etc. at the end of my prefill with it. Other models would go slightly schizo or have a large incidence of not following on the user's last message.
Multiple proxies down, you know what that means....
Stop using Assistant.
>>101768091but she likes it
https://x.com/sama/status/1821207141635780938
happening
>>101768137>another self-mythologizing teaseyawncall me when openai actually delivers something substantial again
>>101766139>Using crunchdog as a way to say current models have soulCrunchdog is just an extremely funny card, it'd probably be soulful on any model. That doesn't mean that sex and love shit will be, which is what's been lacking in soul.
What model is good to help me try to learn programming?Is it qwen2?I can run 30b models in slow mode btw
>>101768668try codestral 22b
Anybody have a model that they like for 24GB vramlets?
>>101768701sorry, this thread is for people who locally run Llama 405B only
>>101768693Downloading it right now
>>101768771literally kill yourself
>>101768793It's just a prank, bro!
>>101768793Figuratively chill down
>>101768137I hate that fucking faggot with every fiber of my being. I wish Anthropic would have hit the scene first. 99.9 percent of the population only knows ChatGPT
>>101768832If anthropic hit the scene first we would have bezos posting cryptic pictures of legumes instead.Choice is an illusion.Local is the only way.
>>101767443Boomer sensibilities. Nobody cares about ownership of tech slop except old people (You)
"When I realized that intelligence can be simulated, I gave up the idea of our uniqueness."
>>101767593wtf? why do you keep shilling this garbage? fuck off and die
>>101768701
https://huggingface.co/TheDrummer/Big-Tiger-Gemma-27B-v1-GGUF
makes child rape stories and doesn't refuse
>>101768923>Nobody cares about ownership of tech slop except old people (You)ok
>>101768137journalists must be slobbering all over his boots after that post
>>101768137Yann Lecun said research isn't secret. If OpenAI ships proto-AGI he should retire
>>101768832
He and the other entrepreneurs ultimately sold the software company, which developed apps for Android and iOS that let users selectively share their location with other people, to Green Dot Corporation in 2012 for $43.4 million
47.555427053884706, 7.606273996838664
Tried out Celeste 1.6 yesterday at 60k context.I'm thinking it's kino
>>101769043ad
Is there an ElevenLabs tier voice cloning model yet?
>>101768944Checked. That's more than I need it for, but a decent baseline for model freedom.
>>101769141Nothing changed since yesterday. Ask again tomorrow.
>>101768944I'm not into rp but an uncucked gemma2 27b would be nice.
>>101769043It's definitely better than 1.9 in my experience.
>>101767112
>>101767715get your mind out of the gutter
nu cum wen?
>More questions?
>Contact me via Discord or ask on the /img/
>/img/
Are the magnum shills unable to spell the name of the general right?
https://rentry.org/MagnumProxy#more-questions
>>101769766
>>101769843Lol, my bad anon! Thank u for highlighting that
new 8b sota just dropped: https://huggingface.co/LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct
>4.2 Output: All rights, title, and interest in and to the Output generated by the Model, whether in its original form or modified, are and shall remain the exclusive property of the Licensor.
>>101769935HOLY SHEET
>>101769935>>101769953>Weird custom architecture 4k contextholy sheet indeed.
>>101769953>>10176999146.8 on arena hard is llama 70b-tier
>>101769935>>101769953>koreanbiggest liars after chinese
>>101769953wait even LG is jumping in on this now?Few more weeks and Hot Pockets will be releasing open source foundational models at this rate.
>>101770048>biggest liars after chineseuhm source? isn't this literally the first korean llm?
>>101770112isn't the solar team korean? though I guess theirs was more of a continued pretrain
>>101770112Not to mention even if it is Korean it's coming from a reputable electronics firm. It's not like half the chinese models that come from literally who startups.
>another model release>another American meltyHow do we solve their insecurity problem?
Exaone instruct template, taken from tokenizer_config.json, for people who don't want to submit to the model gate conditions
[CODE]"chat_template": "{% for message in messages %}{% if loop.first and message['role'] != 'system' %}{{ '[|system|][|endofturn|]\n' }}{% endif %}{{ '[|' + message['role'] + '|]' + message['content'] }}{% if message['role'] == 'user' %}{{ '\n' }}{% else %}{{ '[|endofturn|]\n' }}{% endif %}{% endfor %}{% if add_generation_prompt %}{{ '[|assistant|]' }}{% endif %}",[/CODE]
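For reference, here's what that Jinja template does, mirrored in plain Python (an unofficial sketch; the helper name `build_exaone_prompt` is made up, not LG's API):

```python
def build_exaone_prompt(messages, add_generation_prompt=True):
    """Plain-Python mirror of the posted Exaone Jinja chat template (unofficial)."""
    out = []
    for i, m in enumerate(messages):
        # The template injects an empty system turn if the first message isn't one.
        if i == 0 and m["role"] != "system":
            out.append("[|system|][|endofturn|]\n")
        out.append(f"[|{m['role']}|]{m['content']}")
        # User turns end with a bare newline; other roles get an end-of-turn marker.
        out.append("\n" if m["role"] == "user" else "[|endofturn|]\n")
    if add_generation_prompt:
        out.append("[|assistant|]")
    return "".join(out)

prompt = build_exaone_prompt([{"role": "user", "content": "Hello"}])
print(prompt)
# [|system|][|endofturn|]
# [|user|]Hello
# [|assistant|]
```

Note the asymmetry: only non-user turns get `[|endofturn|]`, so the assistant's reply is what closes each exchange.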
>>101770154Here's hoping that the data is primarily from English sources.
>>101770199how do make code block on /g/ without looking like retard plzhalp
>>101770048No. We are back. The LK99 of LLMs is here.
>>101770199
Okay, so
>[|assistant/user/system|][|endofturn|]
If those are actual special tokens rather than being tokenized as strings, then alright. Better than fucking mistral's.
been away for a few months, is stheno is the best bang for buck for 12gb VRAM?
>>101770257no, gemmasutra 2b is better nowadays
>>101770257It never was. Buy an ad, shill.
>>101770257Yes, kind of. Depends on your taste, but I'd tell you to also try nemo-instruct and some of its fine tunes like celeste 1.6, mini magnum, and dolphin.There's also Gemma 9b, but I never gave that one a proper try.
Drummer mindbroke the local schizo
>>101770248Later mistral releases did actually add an [INST] and [/INST] token.
>>101770275tell me what's better then faggot
Gemma 27b is still the only local model that can write javascript without semicolons, it's fucking infuriating. I want to like magnum 32b, but it just ignores instructions>>101770213[CODE][CODE][/CODE][/CODE]
>>101770213I don't know if i can nest code tags.Let's see what it looks like in the post.
I don't know if i can nest code tags.Let's see what it looks like in the post.
>>101770296True. Should have mentioned that.
>>101770276sweet I'll try all of those. thanks for not being a schizo like that other retard
>>101770326kek. that didn't quite work.Escaping them?\like this?\
\like this?\
like this?\
>>101770269I can go bigger than that if there are gains to be had
Nala test for Exaone 3.0 7.8BThis is definitely promising.There's even new slop that we've never seen before. Like "tingles through your muscles"
>>101770519holy formatting
>>101770519
Imagine if they find-replaced "shivers down your spine" with that. Would be hilarious.
The broken formatting is bad, but the text itself is pretty okay considering its size.
A shame about the 4 fucking k context, but I suppose it could be used to generate "un-sloped" data to train other models with.
are there any c.ai alternatives which can be self/locally hosted? if not, maybe we could build one.
>>101770519Spatial reasoning seems all over the place tho
>>101770562
Hey, if you give me a week and a couple of H100s I can train a pretty convincing simulacrum for you.
there was some random gemma model on the anthracite hf page today anyone have a reup?
>>101770565
>>101770561
>>101770551
there might be EOT token issues right now with my current half-year-old SillyTavern setup. I literally just threw together a template and started messing around with it as soon as I found out about it.
>>101770614Fair. The way it's breaking format does look like an issue with the template, BOS/EOS tokens, etc.
>>101770627The way it's breaking the formatting is because whatever shitty old version of ST I'm using omits the last character for some fucking reason. So it always ends with an unpunctuated sentence or an unclosed asterisk or unfinished quote.
Maybe the times are changing but I'm not liking magnum-12b-v2. Ironically it's too coom-brained. I found myself switching to Nemo mid-session and having to wrangle it way less. It's a shame cuz I really liked mini-magnum and magnum 72b.
>>101770761There's a 12b v2?Gotta download it I guess.
>>101770519gemmasutra 2b is better
>>101770761That's a shame. I already felt like Nemo was too horny of a model.
Hmm, very interesting. It will stay relatively coherent above t=1.0 with simple sampling, just a little schizo, but if you give it something surrealistic to describe it does pretty well. This is at t=1.1. With meme samplers it might be even better.
>>101770809Only ever tried 2b-it but it's still pretty good. If I knew someone with a lol computer wanting to try erp out I would probably recommend a 2b gemma model. Exaone is not bad though. It's got some slop unique to itself which means that its dataset has some things that other models datasets do not. Which from a tinkerer's perspective makes it interesting. But I wouldn't tell someone to go and ERP with it. 4k context is kind of sad, and the default ROPE settings in the config don't work so someone will have to feel it out if they want to try and extend the context.
holy shit exaone beats gpt4|o at the strawberry test.
>>101770977Cosmic Miku looks tired.
>>101771076I'd love to see how it tokenizes the word.
>>101770170
I trust a random 4chan post saying a model is good or bad more than any benchmark. And I don't trust 4chan posts about a model being good or bad.
>>101770809race to the absolute bottom
>>101771219
in both English and Korean.
There could be some weird interplay between the Korean word/tokenization for strawberry and the English one that affords it a workaround.
>>101768923Slave level thinking
>>101769935
>https://huggingface.co/LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct
>Max sequence length 4,096
:-)
>>101768923This
eta on qlora-pipe exaone support?
error 400 from both suno and udio right now, weird.anti-ai folks having a melty or something?
>hurr durr new models are woke and sloppyYou just don't use good tunes, picrel is llama merge I'm testing right now. So far I like the prose and it looks really promising.
>>101771752Reading that sent a shiver down my spine
>>101771752>mfw reading thisKek, good bait
>>101771752Unironically pls post some good recs
>>101771752That's amazing.It even has journeys.I can't remember the last time I've unironically seen journeys.Plenty of bonds however.
>>101771435yeah, so tired of this... 8k is barely enough to coom...
>>101771776>>101771781>>101771788>>101771831guess the model
>>101771887>guess the modelMytho>>101765965>Wanna hear something funny? You were pissing me off so I decided to false-flag logs here, posting mythomax logs and calling it some other model (gemma, llama, mistral). You have no idea how much laugh I got when you were whining how sloppy it was and there are no good models anymore because local peaked at mythomax (kek).
>>101771887NOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
>>101771905>Mythoding ding ding
>>101771752>"And who knows? Maybe XYZAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA I HAVE SEEN THIS LLAMA3ISM WAY TOO MANY TIMES BY NOW AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
>>101771938>LLAMA3ISMbut is mytho l2 doebit
>>101771947mytho is llama merge
>>101771752This isn't bad. This is how most of the greatest writers in the world write
>>101771932gemmasutra 2b is INDEED better
>>101771956yes but not llama3
>>101771932Do xwin-mlewd too.That was my favorite.
>>101771959the instant switch xD
>>101771995
Mythomax is still king
>>101771995king of slop
>>101772003King of sovl, but I do wish for an update, ngl.
>rename 2b to model of your choosing>edit entire reply into gptslopwow every model is bad local is dead
>>101771980unfortunately I don't have it on my hard drive
>>101771947It is not mytho. And who knows is distinctly l3.
>>101772019
>nobody would EVER troll on /lmg/
>>101772034nice copium but it is mytho, you can recognize it easily by the purple prose
>>101772049
>trolling by saying an old model is bad
purpose?
if anything he should have shat on nemos if he wanted to cause true chaos, like saying magnum mini 2 is big sloppy
>>101772071the most sucked off model in the history of /lmg/ is just some old model? people already agree nemo is fucking retarded
>>101772088>the most sucked off model in the history of /lmg/ is just some old model?yes? it is and was always slop even if it was the best slop of its size for its time
>>101772088From my point of view people who say mythomax was any good are trolling. They were just impressed by the vocabulary, but beside that mythomax was stupid, quite often incoherent and had a lot of slop and cliche.
>>101772049I'm just sick of people shilling their useless models
>>101772088Name 1 L2 merge that writes as well as Nemo
Here's a fun dumb thing for you guys to try.
Temp 2, TopK 2 (yep), minP 0.05 just in case.
See how you end up sampling between the top 2 tokens with an almost 50/50 chance most of the time.
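For anyone curious what that chain actually does numerically, here's a rough standalone sketch of temperature → top-k → min-p sampling (not any engine's actual code; the function name is made up):

```python
import math
import random

def sample_topk_minp(logits, temperature=2.0, top_k=2, min_p=0.05):
    # Sketch of the chain above: temperature scaling, then top-k, then min-p.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]  # stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep only the top_k most probable tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    # min-p: drop survivors whose probability is below min_p * top probability.
    top_prob = probs[ranked[0]]
    kept = [i for i in ranked if probs[i] >= min_p * top_prob]
    return random.choices(kept, weights=[probs[i] for i in kept])[0]

# Two near-tied logits at temp 2: either token comes back with ~50/50 odds.
token = sample_topk_minp([5.0, 4.9, 0.0, -3.0])
```

High temperature flattens the distribution, so the two survivors of top-k end up nearly equiprobable, which is exactly the coin-flip behavior described.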
>>101772071There is no such thing as mini-magnum 2.
>>101772140there is none, but just because one shit smells less than the other doesn't mean they aren't both shit
>>101772140Mythalion-Kimiko is pretty good. The older models in general are less slopped because they hadn't yet reached the level of intelligence required to notice the slop-web in human creative-writing. But they're also less intelligent.Nemo blows any L2 model out of the water as far as conceptual understanding goes, regardless of size. But it would be inaccurate to say that there weren't perfectly good L2-13B models for cooming to back in the day (which wasn't even that long ago in the grand scheme of things).
>>101772156https://huggingface.co/intervitens/mini-magnum-12b-v1.1> New version is available! anthracite-org/magnum-12b-v2 close enough for me, i don't care about your exact branding
i see no reason to use any model that isnt by mistral whether you have vram or not, nobody else is capable of making a good model right now
>>101772212i can see onei can see onei can see one
retards in this shithole can't even write a prompt or change samplers but will spend good money to run shitquants of large and say it's the models fault
>>101772203It's just not the same T-T
>>101771932Fake news. MythoGAWD never wrote like this.
>>101772254
>retards in this shithole can't even write a prompt or change samplers
they can't even have the model answer >>101749214
>>101772231
This. While I kinda like Mistral models, the repetition issue is terrible. And don't say "just use rep penalty": firstly, it doesn't always work as it should, and secondly, it lobotomizes the model. You basically force the model to not use tokens it wants to use; sure, it can look semi-coherent, but the intelligence hit is visible.
>>101772312End of scene.
Tess-3 (Mistral-Large-2-123B) and Trinity-2 (Codestral)
Dropping two new models today, before I fly out to Defcon.
Tess-3 on Mistral-Large-2-123B (general LLM): https://huggingface.co/migtissera/Tess-3-Mistral-Large-2-123B
Trinity-2 on Codestral (code LLM): https://huggingface.co/migtissera/Trinity-2-Codestral-22B
Both are uncensored. Codestral scores 78 on HumanEval.
>>101772356didn't he say like yesterday he wouldn't do sub 70bs anymore?
https://philome.la/johnayliff/seedship/play/index.html
>>101772373There's no reason to do 70b+ anymore.
https://new.reddit.com/r/StableDiffusion/comments/1emi1j9/opensource_amd_gpu_implementation_of_cuda_zluda/
>a based gentleman wanted to help AMD by making CUDA compatible with their cards
>AMD sent a ban notice to him
If that's not a sign that AMD is controlled opposition, then I don't know what else to say
>>101772065>you can recognize it easily by the purple proseWhat does purple prose mean?
is Mistral great because of the pretrain or because of the finetune?
>>101772518mistral can't finetune for shitsee 8x22b vs wizlm
>>101772513>What does purple prose mean?A literary term! "Purple prose" is a pejorative term used to describe writing that is overly elaborate, flowery, and excessively ornate. It's characterized by the use of overly complex vocabulary, convoluted sentence structures, and an abundance of adjectives and adverbs.End of scene.
>>101772549So basically me, also why specifically purple?
>>101772530>see 8x22b vs wizlm>Microsoft WizardLM-2-8x22B 11.7 %
>>101772569>also why specifically purple?The term "purple" is thought to have originated from the idea that the writing is so elaborate and excessive that it's almost "royal" or "imperial" in its grandeur – much like the rich, regal color purple. However, in this context, the term is not meant to be complimentary, but rather to suggest that the writing is overly indulgent and self-aggrandizing.
>>101772587I didn't know they had a term for literature that describes me so well.
>>101772646based purplechad
>>101772513>What does purple prose mean?In short - describing for the sake of describing. Have you ever tried to write an essay (for X words) and realized that you are short on words so you added a lot of useless fillers? Now make the fillers sound elaborate, melodramatic and hyperbolic - this is a recipe for purple prose.
How would you measure that
>>101772570Hallucination is soul. See how no claude is on the list because they are too good and souful.
>>101772570>eval with no correlation whatsoever to RP quality
>>101772718>swiping endlessly is sovlFuck your sovl then, buddy. You can choke on it.
>>101772718>Hallucination is soulthis is why I refuse to use temp smaller than 4
read a book on 1.3b
>>101769953where are the standard benches, though. All of these can be pretty easily gamed if you've trained on gpt4o outputs since style is really influential
>>101772752Every single time I have seen someone mention soul in this thread the example was of an LLM typing like a retard or like a schizo.
I like Lyra. Donate to Sao's Ko-fi today.
https://huggingface.co/nothingiisreal/L3.1-70B-Celeste-V0.1-BF16
70b Celeste
>>101772752Nice try, ChatGPT.
>>101772970
>It seems to be way more coherent and aware of whats going on as well as more intelligent.
"12b mogs 70b" copers btfo by their own sloptune
as you love to see it
>>101772919For me "sovl" is the ability to write in non-cliche way. No patterns, no sentences and phrases that are excessively present in human writing. Also the ability of model to surprise me with their answer or the direction of plot they are taking.There are local models that have a glimpse of sovl from time to time but there is none I would call sovful. The only model I unironically found sovful was old c.AI.
>>101773042Just up the temp to the point of incoherence. Or reduce context to 1k tokens.
>>101773042>old c.AIbeamsearch
>>101772530>see 8x22b vs wizlmWhat? Both were shit. The only proof that it was better was Reddit's word of mouth impulsed by the mysticism of being taken down early.
>>101772970Sao is not going to like this...
>>101772970>>101773180Hi lemmy
Haven't checked in for a minute.Is mini magnum 12b still the best RP model for people with single GPUs?
>>101772970buy an ad
>>101773042Agreed, old c.ai and I would also add summer dragon.
>>101773221celeste 123b mogs it
>>101773221That or the base Nemo Instruct 12b if you're looking for drier prose or non RP purposes.
>>101773194>>101772970samefolx
>>101773221No, but also, 12B isn't in the ballpark of what you can run with a single GPU like a 4090. You can run anything below Nemotron with 340B parameters with the right quantization, so I think you might need to be more specific with your question.
>>101773247celeste tends to spazz out with the descriptions way more
>>101773331>trust me, this finetune that doesn't actually exists is bad
>>101773406Fuck you
Finetune the rust away
>>101773406nta but the 'brand' name alone is radioactive, no one serious about the craft will touch that with a 10 foot pole (3.048 meters for eurofags)
>>101773327
I have a 4090 and everything I've tried has been pretty mid compared to mini-magnum in terms of 1-on-1 RP. It gives me the most realistic feel out of the ones I've tried, but I've not been here for around a week and I've hardly tried everything. What other ones are good at maintaining consistent dialogue, realistic convos that feel kinda natural?
>>101773327Maybe that anon has a 3060 or something like that, so 12B is the most he can comfortably run offloading all the layers to the gpu
>>101773519Buy an ad.
>>101773327No thanks, I'll take 128k context over 0.05 t/s responses even if I have to swipe a few times.
>>101773534
he has a 4090, and even then I'd still recommend Magnum to him. Everything else fucking sucks, unironically.
>>101773653 (me)my name is Alpin, btw
>>101773653>Magnum72b, 32b, 12b or mini-magnum?
>>101773672All of them are the SOTA at their respective sizes. No one else can compete.
>>101768137A cheeky nod to the number of Rs in strawberry test that we post here all the time? What's next, a photo of sama holding watermelons?
>>101773687NTA but thank you for your recommendation. Magnum is amazing.
>>101772402What? Why?
wrt function calling, it does depend on the specific model, but generally the raw prompt is going to include something like "You have access to the following functions" followed by the actual JSON list of functions, formatted like OpenAI's would be in Python.
https://huggingface.co/Trelis/Mistral-7B-Instruct-v0.1-function-calling-v3
The Prompt Format section there is a good start. I suggest Nemo or Mistral Large, and explicitly ask for JSON responses.
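A minimal illustration of assembling such a prompt (the `get_weather` schema and the exact instruction wording here are made up; real formats are model-specific):

```python
import json

# Hypothetical example tool in OpenAI-style function schema.
functions = [{
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

# System prompt: advertise the tools, then demand machine-parseable output.
system_prompt = (
    "You have access to the following functions:\n"
    + json.dumps(functions, indent=2)
    + '\nTo call a function, respond only with JSON of the form '
      '{"name": "<function>", "arguments": {...}}'
)
print(system_prompt)
```

The caller then watches the model's reply for parseable JSON, runs the named function, and feeds the result back as another turn.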
>8x48GB GPUs>just lying around>i guess it's enough for 32b?What did Alpin mean by this?
>>101773653shill>>101773665it's Alpine, didn't she transition recently?>>101773672none of them>>101773687they're all shit like the people who defecated them>>101773708btfo, shill
>>101773672what one could a 4090 realistically run? Been using mini myself on the 4090 and it's pretty fucking good
>>101773911there's nothing better than magnum
>>101773653>>101773911If I had a 4090 I'd probably be running CommandR.I can run it off of ram but it's so dog slow.
>>101773897
remember that next time you'll be asked to donate to fund the compute to train the models
what a bunch of frauds
>>101773911You should try mistral large.
>Here's the 4-11 on...Thanks dolphin 12b. I had never heard of that slang before.
>>101768923>noooo you can't be trusted with models stop evading my control over you
>>101773932and btw, that includes all the finetooooners associated with that organizationalpine is smuggling and has been smuggling free compute for them
>>101773980there's no shot it runs on my 4090 lmfao. I only have 32GB RAM too>>101773929I have it, I just struggle finding good settings online (temps etc).So it doesn't really perform as good as it likely can
>>101774159>there's no shot it runs on my 4090 lmfaoUpgrade your ram then, you'd get way better speeds than I do with a 2070.
>>101768923ExactlyIf I didn't care about it I wouldn't be here in this thread, unless I had some kind of mental disability like a few people I have seen in this place.
>>101767379do you like sam altman reading your chats?captcha: N00T
>>101774172much total memory you got and what's your speeds like?
Good morning /lmg/, any good 8B models out there?
Redpill me on RPStewI keep seeing it on reddit but nobody here mentions it
>>101774330I advise you stop reading reddit. What is RPStew?
>>101774330>I keep seeing it on reddit but nobody here mentions itI'm sure you can figure out the reason on your own
>>101774316gemma 8b
>>10177427396gb, it starts at 1.2T/s but slows faster than I'd like. The slow speed is worth it to me though since it's pretty good.
>>101774379>1.2T/s>slows downare you into edging?
>>101774357>I keep seeing it on reddit but nobody here mentions it>I'm sure you can figure out the reason on your ownMost shilling is done by the finetuners themselves. Reject the idea that number of mentions = good.
>>101774424can you redpill me on llama 3 8B finetunes
I will now say something kinda obvious that helped my cooming experience. I have realized that the final stage of my fucked-up fetish is too difficult for current LLMs. However, dialing it down a bit to a less complicated version has given me some very nice results that didn't require rerolls. Nemo seems like something that is finally good enough for this, although I don't know how it will handle repeat sessions.
>>101774424Undi never shills his stuff here. His try it yourself method has captivated some sirs that now do it for him.
>>101774424>Reject the idea that number of mentions = good.exactly, so ignore the reddit shills, shilling some random useless model
>>101774479hi sao. i was talking about you instead.
>>101774462wat fetish? vore or something?
>>101774492more like talking to voices in your head shizo
Do you guys think Sao Drummer and Undi ERP with each other in some private discord?
>>101774495I am not telling. I am not becoming the next piss / stomach noises anon.
>>101774504no, they spend their time hacking into corpos and adding shivers down the spine to their training databases
>>101774534>hacking>undianon please...
>>101774521I have it worse anon, my fetish is so niche that there are not even text materials by humans, so models are completely useless for that.
>>101774555
Do tell us more.
>>101774539
he learned it to spread slop and poison corpo datasets to ensure that we never get an unslopped model
>>101774555
It is sex with minors, isn't it?
>>101774570
Nah, but it's not something disgusting or very weird, just niche, and to be fair it's fairly hard to represent with text only.
I'm hoping that in a few years multimodal models with image generation will be my savior.
>>101774589
no, I said niche.
>>101774625
what body parts does it involve?
>>101767112
Holy crap. I hadn't looked in a while and now vast/runpod prices have totally cratered. The hype cycle is finally ending.
>>101774757
>hype cycle is ending finally
What's next?
>>101773672
>>101773687
By what metrics? I like 70B and mini-magnum a lot, but 30B and the new 12B were underwhelming.
>>101774802
Hi Undi
>>101774729
I'm not saying shit, anon
>>101774757
Isn't that due to the fact that Blackwell is out and more H100s are available?
>>101774828
so coprophilia?
>>101767112
Where should I begin if I want to develop a personal AI assistant hosted on a local server?
>>101774864
I said not disgusting; now you are trolling me, so I will spill the beans.
>>101774870
have you tried getting a job?
>>101774859
The only new thing I think is the H100 NVL, but prices for 4090s are less than half of what they were at their peak.
>>101774900
so? where are the beans, Lebowski?
>EXAONE
Should I make GGUFs or is it a nothingburger? Is >>101769935 an undercover LG employee?
>>101775201
it's 4k context so we don't care
>>101775201
As a curiosity to play around with, it's alright. Not really a replacement for anything we have now due to only having 4k context and struggling with some concepts we now expect 8B models to handle. It also stays coherent at fairly high temperatures for its size. I only test things with simple sampling, so I can't say what that means when you apply meme samplers to it.
>>101775214
>>101775225
I'm fine with 4K context if it's actually good compared to llama3
>>101775239
>I'm fine with 4K context
no you're not
>>101775239
8k was a pain with llama 3. 4k is downright insulting. What is this, 2023?
>>101775246
I'm using 6208 max because higher gets too slow
>>101775272
Now, I'm not a mathmagician, but I am pretty sure 6208 is more than 4096.
>>101775313
yeah, but "max" implies that I can cope with 4096. in fact, I only use 6208 for groups; 4096 is enough for single characters
With Mistral Large 2, I'm concerned that this is turning into a full-on addiction. Is 123b that much better than 70b, or does Mistral AI have that good of a dataset? It's the only Mistral model I've ever liked, actually. (Unless you count WizardLM 2 8x22)
>>101775532
What 70b were you using?
>>101773897
Which Discord server is this?
>>101775532
Found any good largestral fine-tune?
>"You know, they say that for every inch below six, you might as well be missing a limb. And you, my dear {{user}}, are teetering on the edge of being a paraplegic, aren't you?"
I wasn't expecting this kind of sovl from my sph slopbot
>>101775652
>teetering on the edge
Why the fuck are LLM texts so easily recognizable? Same goes for images
>>101775667
Corporate influence, uncanny valley, and overfitting.
>>101775718
how do we solve this
>>101775734
Just add a few more billion parameters. Llama 4 1700B will be a great success
>>101775734
>Corporate influence
Don't tune on assistantslop. Don't filter "harmful" data out of the base model. Make a pure chat model like early c.AI.
>uncanny valley
Make the model smarter. Easier said than done. While >>101775770 may work, it is a suboptimal approach.
>overfitting
Make a list of overused phrases and either filter them out or replace them with less common, but context-appropriate, phrases.
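The overused-phrase filtering idea can be sketched in a few lines. The phrase blacklist and the sample data here are made-up placeholders; a real dataset cleanup would use a much longer list and fuzzier matching than exact substrings:

```python
# Minimal sketch of filtering "slop" phrases out of training data.
# SLOP_PHRASES and the sample data are illustrative placeholders.
SLOP_PHRASES = [
    "shivers down her spine",
    "shivers down your spine",
    "barely above a whisper",
    "ministrations",
]

def is_sloppy(sample: str, max_hits: int = 0) -> bool:
    """Return True if the sample uses more blacklisted phrases than allowed."""
    text = sample.lower()
    hits = sum(text.count(phrase) for phrase in SLOP_PHRASES)
    return hits > max_hits

def filter_dataset(samples: list[str]) -> list[str]:
    """Drop training samples that trip the phrase blacklist."""
    return [s for s in samples if not is_sloppy(s)]

data = [
    "She nodded, her voice barely above a whisper.",
    "The market closed higher on Tuesday.",
]
print(filter_dataset(data))  # → ['The market closed higher on Tuesday.']
```

The replacement variant would swap matched phrases for alternatives instead of dropping the sample, which keeps more data at the cost of needing context-appropriate substitutes.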
>>101775844
sadly no one here actually trains base models, and the people who do are more interested in benchmarks and scamming investors
>>101775874
It takes massive amounts of capital to train a base model of any appreciable size in a reasonable timeframe.
I got flux running, but what are you supposed to do with it? There's no use for these images. Is that why random boards have dedicated AI slop threads?
>>101776014
Have a glass of bees.
Does anybody actually use the "story" format for their slop?
>>101776014
gooning
>>101776014
Use it alongside your text-gen model to illustrate the scenes, of course.
>>101776052
Sometimes I just let it run for a while and read the gems it has produced
>>101776042
The fuck did bees ever do to you? Fuck da wasps.
>>101776101
Go drink a coke outside and you can recreate that picture.
Anyone have a link to that comfy script/workflow that lets you offload the CLIP model onto a different GPU?
>>101776014
>what are you supposed to do with it?
Make Mikus
What are the best models/LoRAs for creating the "80s metal album" aesthetic? I'm not sure if this art style actually has a name. It doesn't necessarily have to be tailored towards album covers; that's just the best example. My goal is just to be able to generate art in this style consistently.
anything I should be aware of since mistral large, for coom rp?
What's up with the sudden imagegen posts? Someone trying to troll again?
Is the file size of a model a reliable indicator of how much memory it will require to load? I noticed with some other Llama 3.1 models that loading the model initially takes up 8GB of VRAM, then in Task Manager I can see more memory being allocated. Is the initial allocation the model itself? What is contained in the secondary allocation? Is that how models store context?
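Roughly, yes: the file size approximates what the weights themselves take, and the secondary allocation is mostly the KV cache, which holds the context and grows linearly with context length. A back-of-the-envelope estimate, using the published Llama 3.1 8B shape (32 layers, 8 KV heads via GQA, head dim 128) and assuming an fp16 cache:

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   ctx_len: int, bytes_per_elem: int = 2) -> int:
    """Rough KV-cache size: 2 tensors (K and V) per layer, one vector
    per kv-head per cached position, fp16 (2 bytes/element) by default."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

# Llama 3.1 8B-shaped config at 8k context:
gib = kv_cache_bytes(32, 8, 128, ctx_len=8192) / 2**30
print(f"{gib:.2f} GiB")  # → 1.00 GiB
```

So at 8k context the cache adds about a gigabyte on top of the weights for an 8B GQA model, and doubling the context doubles that; quantized caches shrink it proportionally.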
>>101776151
here >>101689729
>>101776228
Yes. Also, install Linux.
>>101776179
>>>/g/ldg
>>101776247
I think these are troll posts
>>101776234
Thank you muchly, friend.
>>101776244
Is the behavior I described unique to Windows? What would switching to Linux change? I'd love to, but I think the system RAM fallback feature I need to run larger AI models is only available on the Windows Nvidia drivers
>>101776277
Use GGUF and choose the GPU layer count correctly.
>>101776277
You can split between RAM and VRAM with GGUFs. Linux is simply better.
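For reference, the RAM/VRAM layer split suggested above looks like this on the command line; the model path and layer count here are placeholders, and you raise the layer count until VRAM is nearly full:

```shell
# llama.cpp: put the first 20 transformer layers on the GPU, keep the rest in RAM.
./llama-cli -m ./models/model.gguf -ngl 20 -c 4096 -p "Hello"

# koboldcpp equivalent:
python koboldcpp.py --model ./models/model.gguf --gpulayers 20
```

Unlike the Windows driver's silent system-RAM fallback, this split is explicit, so generation speed is predictable instead of collapsing when VRAM overflows.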
Are the getting started links in the OP up to date? I've been using koboldcpp with mythomax-l2-13b Q5_K_M for the past few months. Is there something better out there? Also, koboldcpp defaults to 200 GPU layers, but if I hover over it, the common values are vastly different. Am I doing something wrong, or is it correct? It works so far.
>>101776356
Windows users can't use layers for the gpu? That sounds strange.
>>101776461
Lol ofc they can. I just meant that Linux is better in general as an OS.
Made some Migu bumper stickers. I used chink-brand white toner off Aliexpress; it worked fine for like 1/4 the price. There's a little banding at the top of the page, but I read that hologram sticker material doesn't work super well in a laser printer. I set my printer to "label" mode, which helped. No idea how inkjet ink sticks to plastic. Fuck inkjets, I'm never going back to that bullshit.
>>101776724
wait, you're saying you printed those entirely at home? can you list your setup in a bit more detail?
>>101776724
I like these Migus
>>101776747
Yeah, at home. It looks impossible, but the trick is buying a white toner cart for your printer, printing a "mask" in black and white on the hologram sticker, then swapping the black toner cart back in and feeding the paper through again for regular color.
It helps immensely to start with an image that already has white outlines around stuff, since it hides the inevitable registration mistakes between the two printing passes. I asked bing/dall-e to make Migus with a white border. The white border also makes it easier to use the "magic select" tool for creating transparent areas in the color part, and a mask for the white part.
You can also buy a $3000 printer which does it in one pass. I used a $300 Canon color laser.
>>101776877
No, but I mean what about the glue? Or do you have to apply that manually?
>>101776750
I want to use this one because it's cute and also pantsu, but bing fucked it up by cutting off the left side; it's going to take more gimp work than I feel like right now.
>>101776888
It's sticker material. You peel off the back; it's self-adhesive.
>>101776923
>gimp
unexpectedly based
Something like this comes out well with just a b&w laser printer. In sunlight the hologram material is very eye-catching.
>>101776923
spamming a few overnight imagegens cause we're at bump limit
>>101777172
>we're at bump limit
huh?
>>101777183
>he doesn't know about the bump limit
>>101777183
Look at this newfag and laugh
>>101777172
>>101777180
>>101777190
>>101777210
>>101777255
>>101777270
all shit slop, why do you even bother spamming the same fucking images?
either blind, retarded, or both
>>101777281
look at this dumb bitch lol
>>101777299
DEATH
Chatbots?
>>101777352
some of them come out really fucked up, AI is weird
>>101777537
are you trying to fill the thread before it archives?
>>101777844
yeah, he thinks his slop AI-genned "art" is worth something
>>101777844
anon, when a thread hits the bump limit it doesn't matter how much you spam it afterwards; it just doesn't bump any more. anyone who doesn't understand this is new
>>101777880
so what's the point of doing this? it's just retarded flooding spam
>>101777886
if you don't like it, you can look away
>>101777893
it breaks 4chan rules though
>>101777897
so report it and see what happens
>>101777909
I know that jannies don't care about AI threads on /g/ (especially aicg, but other AI threads too), but that doesn't mean it's allowed.
>>101777918
neither is trolling, retard-kun
>>101777918
my friend, I don't think you understand. the thread hit the bump limit. posting doesn't affect the board any more. the thread is essentially dead now. imagedumping isn't even technically against the rules even when a thread is live
>9b model same intelligence as gpt-4
how possible
>>101778034
lmsys doesn't test intelligence, it tests human preference. Learn the difference already, for fuck's sake.
>>101778034
>dude gemma 2b totally beats mixtral 8x7b dude I saw it on the arena
>>101778094
t. butthurt ai
>>101778094
Anything could beat mixtral 8x7b.
>>101778119
>>101778247
Hello google sirs
>>101773672
32b v2 is probably the best I've seen for RP. This is surprising, because I wasn't impressed with 72b v1 at all. (I'm hoping 72b v2 arrives soon.)
Mistral Large 2 is definitely more nuanced and natural, and I prefer that overall. However, Magnum 32b v2 is slightly better imo due to the dataset and its instruction following, which is insane for a 32b model. It doesn't repeat itself either. Again, this is only RP. I haven't tried it for anything else yet.
>>101778322
>32b is better than Mistral Large
Hi Alpin.
>>101778034
>Brainless parrots ask the same questions on lmsys over and over again.
>Put the short list in the dataset.
>Get a massive advantage over those that don't.
>>101778328
>>101778328
>>101778328
>>101772231
Show me a model that doesn't repeat at long ctx