/lmg/ - a general dedicated to the discussion and development of local language models.Previous threads: >>102573383 & >>102565822►News>(09/25) Multimodal Llama 3.2 released: https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices>(09/25) Molmo: multimodal models based on OLMo, OLMoE, and Qwen-72B: https://molmo.allenai.org/blog>(09/24) Llama-3.1-70B-instruct distilled to 51B: https://hf.co/nvidia/Llama-3_1-Nemotron-51B-Instruct>(09/18) Qwen 2.5 released, trained on 18 trillion token dataset: https://qwenlm.github.io/blog/qwen2.5/>(09/18) Llama 8B quantized to b1.58 through finetuning: https://hf.co/blog/1_58_llm_extreme_quantization►News Archive: https://rentry.org/lmg-news-archive►Glossary: https://rentry.org/lmg-glossary►Links: https://rentry.org/LocalModelsLinks►Official /lmg/ card: https://files.catbox.moe/cbclyf.png►Getting Startedhttps://rentry.org/llama-mini-guidehttps://rentry.org/8-step-llm-guidehttps://rentry.org/llama_v2_sillytavernhttps://rentry.org/lmg-spoonfeed-guidehttps://rentry.org/rocm-llamacpphttps://rentry.org/lmg-build-guides►Further Learninghttps://rentry.org/machine-learning-roadmaphttps://rentry.org/llm-traininghttps://rentry.org/LocalModelsPapers►BenchmarksChatbot Arena: https://chat.lmsys.org/?leaderboardCensorship: https://hf.co/spaces/DontPlanToEnd/UGI-LeaderboardCensorbench: https://codeberg.org/jts2323/censorbenchJapanese: https://hf.co/datasets/lmg-anon/vntl-leaderboardProgramming: https://hf.co/spaces/mike-ravkine/can-ai-code-results►ToolsAlpha Calculator: https://desmos.com/calculator/ffngla98ycGGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-CalculatorSampler visualizer: https://artefact2.github.io/llm-sampling►Text Gen. UI, Inference Engineshttps://github.com/oobabooga/text-generation-webuihttps://github.com/LostRuins/koboldcpphttps://github.com/lmg-anon/mikupadhttps://github.com/turboderp/exuihttps://github.com/ggerganov/llama.cpp
►Recent Highlights from the Previous Thread: >>102573383--Emu3 multimodal Llama model uses token prediction instead of diffusion for image generation:>102574106 >102574136 >102574399 >102574430 >102574117 >102574155 >102574161--Emu's unique architecture makes parameter estimation challenging:>102574449 >102574499 >102574638--5090 vs 3090 for Mistral Large, multi-GPU considerations:>102573802 >102573904 >102573937 >102573973 >102574019 >102574061--Anon seeks small models for generating SD prompts with booru tag support, but finds suggested Hugging Face models lacking:>102573443 >102573450 >102573527 >102573562 >102573642 >102573654--Text adventure engine in LLMs and the balance between open exploration and structure:>102580180 >102580257 >102580333 >102580532 >102580967 >102581472--Qwen 32b vs. llama 3.1 405b performance comparison:>102581691 >102581729 >102581826--Emu3 model capabilities and limitations discussed:>102577489 >102577502 >102579815 >102577525 >102577543 >102579916 >102579945 >102579985--Antislop-sampler by eqbench maker for L3 405b model:>102580274 >102580325 >102580480 >102580570 >102581212 >102581316 >102581335 >102581498 >102580650 >102580557 >102580569 >102580593--Comparison of llama 3.1 70B, Hermes finetune, and Qwen models:>102576318 >102576675 >102576702--Bigger models, higher quants, CoT, and lowering context scope can fix repetition:>102580346 >102580362 >102580378 >102580724 >102580752 >102580758 >102580775--Anon reports minP issue with WizardLM2:>102576345--Anon rants about 5080 specs and power consumption, others discuss Tenstorrent AI cards and power efficiency:>102575089 >102575100 >102575297 >102575372 >102575442 >102575472 >102575488 >102575514 >102575435 >102575673 >102575733 >102575792 >102575885 >102577955 >102577995--Miku (free space):►Recent Highlight Posts from the Previous Thread: >>102573387Why?: 9 reply limit >>102478518Fix: https://rentry.org/lmg-recap-script
Miku (empty space):
good morning saar!
>>102581994>-- Miku (free space):Based
anyone tried the new Emu3 model yet? how is it?
Anybody gonna buy that 5090 32GB? You can run 70B at copeable quant on it
>>102582018What a curious phenomenon
tired of nemois it still the tippy top for vramlets?
anthracite sisters? we just wasted 500 dollars :(
>buy overpriced apple machine Or>buy overpriced huang machineChoose your poison
>>102582146Not yours so it's fine
>>102582139sadly yeahwish qwen2.5 wasn't cucked to death
>>102582146just ask some money from the 1 guy that can run this
>>102582146>wasted 500 dollarsyour digital landlord doesn't think so
>>102582146>12 hour run crashed>500 dollarskalomaze meltdown soon?
>>102582169there is another
>>102582146Wtf is even the point. No one can run it. Make a qwen 2.5 tune instead.
>>102582205they have already finetuned every qwen 2.5 base model
What models are you guys using to coom?
Though if he wants to funetune a giant model imo deepseek 2.5 is better than llama 400B. Tons of knowledge about pretty much everything and its fast as fuck to run
>>102582146Getting paid to waste money. Nice
>>102582232maisie de krassel
>>102582205>Wtf is even the point.sell it as a cloud service, presumably
>>102582232magnum
>>102582232this onehttps://huggingface.co/mradermacher/Arcanum-12b-GGUF
>>102582253Then he would be better off with deepseek 2.5, both better at writing out of the box / more knowledge than llama 400B while being much cheaper to run.>>102582244
>>102582224>tuning over qwen instruct againwhy are they tuning over the dogshit fucking instruct???? didn't they just have a whole thing where they were patting themselves on the back for tuning on the base models?
>>102582232MN-12B-Lyra-v4
>lmg cooming to 12b models>aicg complains about everything that isnt claude opustwo parallel worlds
I managed to completely uncensor Qwen2.5 72B with a simple system prompt. It's now my favorite ERP model as a 48GB VRAMlet. My eyes are absolutely sparkling with mischief.
>>102582276I would normally agree but I tried the Qwen base model and it was also somehow censored, and on top of that it was dumber than Instruct. So in this case I kind of feel like it probably happens to be ok, although tuning Qwen for RP in the first place feels like a waste of compute.
>>102582309Post Nala (from the hit Disney animated movie The Lion King) test.A side by side with and without your system prompt.
>"You're nothing without that stupid brain of yours. I bet if I took that away, you wouldn't last a day in the real world."you know, you're so right for that one
>>102582309I've been trying to use Qwen 72B for RP since seeing >>102568781. I too would be interested in your prompt. I'm trying to walk a middle ground where it's not like some meme finetune where you can't order a pizza without the delivery girl sucking your dick like some deranged porno, but it's also not sterile and sanitized.
>>102582427that's realistic if you're attractive desu
>>102582438But what if it is a delivery boy?
>>102582295I coom to nemo and I hate it with a passion of 100 suns. It is in the worst spot of llm cooming where worse models just make you stop trying and nemo gives you hope that it is gonna be different and this time it is gonna be great, but it still makes me go soft and give up most of the time. I hate this hobby.
>>102582438You do realize some anons roleplay as ugly bastards right?
>>102582309>My eyes are absolutely sparkling with mischief.I see what you did there. And it sounds like you are using expert roleplayer prompt.
>>102581994Me at the very bottom dead-center
>download midnight miqu 70b q6k gguf>2 parts>cat part1+part2 > gguf>53gb>load it up on 3090ti>1 word per secondis this how it is then?
>>102582651>he fell for the midnight miqu meme
>>102582651this is why /lmg/ will never touch anything in a lower param count even as it gets better and better kek gotta justify that big bucks purchase that S T I L L won't run the big models anything higher than poorfag ((s/it)) tier speeds.
>>102582651Unfortunately there is just not much you can do if you're poor. It's the nature of this hobby. Wait for Bitnet.
>>102582651You should be thankful. I get 0.5t/s.
>>102582746>3090ti>poorThat's a weird thing to hear as a third worlder.
Just wait for this:https://www.reddit.com/r/LocalLLaMA/comments/1fqsafn/nvidia_jetson_agx_thor_will_have_128gb_of_vram_in/Hopefully about 3K when it comes out.
>>102582651You can't load even half of the model on your VRAM when accounting for the kv cache, meaning that you are either offloading to RAM, or worse, the driver is.Use a lower quant.Or just use a smaller model.
>>102582651maybe you should consider things that actually fit in your vram? a smaller quant or a smaller model?
Is it over?
>>102582763it's a nearly 3 year old consumer-class GPU anon
>>102582811It's pretty good compared to the llama 2 days, I'd say.
Mistral Large is so good it makes me want to put up with the glacial t/s
>>102581980I've been having a particular use case for a Natural Language Processing Named Entity Recognition model for parsing the names of fictional characters and names of media said characters are fromAnyone would be aware of a model fit for the bill, I tried some of the popular generic NER task models but they didnt quite do the part(missed even certain ones I'd assume would do well like pokemon)If not, what lib do you guys generally recommend for a custom trained NLP model, I'm not too used to this type of task so I'm not too aware of which are the better ones
>>102582832*hands you a watermelon*
>>102582825It costs 2000$ in my country, that isn't money someone poor would spend on hardware.
>>102581980why are big models actually bad?
>>102581221Kind of, but it's more of a personal project and imperfect. It does help in most cases, which makes me think that frontend refinement is something that really needs to be looked into, despite the extra time it takes to process each step.
>>102582691>>102582741>>102582746>>102582754>>102582789>tell me your version of the aristocrats joke>20 minutes laterat least it tried, i guess
>>102582922I tried to go this route, but the LLM can't self-evaluate itself properly. It's not different than increasing the temperature and regen. And as you can see it wastes too much time.
Does anyone know what happened to gwern's danbooru dataset? It's not available anymore. Is there a mirror somewhere? I'm only interested in the metadata.At least I found some partial crawls on kaggle.
>>102582983>20 mins OHNONONONONONONONONO!!!
>>102582908They're not, it's all vramlet cope
>>102583021>but the LLM can't self-evaluate itself properlythis is the biggest issue for me with COT stuff, the model rarely ever comes up with anything insightful or pertinent in its planning, it just writes really generic obvious shit. it mostly results in it making minor cosmetic changes or ones that detract from the response in order to fulfill some retarded goal it decided was important in its retarded planning.I've had more successful implementations with specific cards that had specific things they needed to consider before responding (e.g. scenarios and sims where certain specific state or attributes needed to be evaluated before writing a response), but one-size-fits-all COT for RP is just not a good idea imo
>>102582276interim solutions until we set up a good pipeline for RM/preference model training. you need a lot of diverse, subtly distinguished data for that
>>102583031You tried this? https://huggingface.co/datasets/KBlueLeaf/Danbooru2021-SQLite
>>102583021>>102583118Need CoT training.
>>102583137Thanks a lot. For some reason the huggingface search didn't show it when I tried the other day.
>>102583069>not saving every X stepWhat a retard
>>102583069they are such a bunch of broke boys
>>102583021Have you tried evaluating in a vacuum? (shorter context)My method returns chat messages to a certain depth to shorten the context window. (<<chat:0>> being the latest, etc., etc.) This is on the assumption we all have that as context gets longer, the model becomes more inattentive and makes more mistakes. It's not perfect, but it's much better than having the entire chat history in context and tacking on instructions at the end.
>>102583231happy 1992
>>102582811vramlets eating pretty good these days
>>102583231>minimum PDOCTOS
Top K > min p
>>102583224Yeah it was on a single message only. It was through the API directly and not using ST. I even tried multiple iterations of a 'corrected' gen prefixed with the feedback of the original gen, but it's no better than a regen with a high enough temperature.
>>102583279no >>102583231
>>102583279Top K + Min P.Then throw some stupidly high Temp.
>>102583324>lots of low-probability tokens leading to incoherence [citation needed]>instead we only only keep extremely high probability (slop) tokensTop K > min p
>>102581884Anon who was replacing the llama 3.2 11b LLM weights, can you try replacing them with any uncensored finetune and then try captioning NSFW images? I suggested this a couple threads ago. I suspect that the vision models can "see" quite a lot of the elements of NSFW images, but the LLM just hesitates or refuses to describe them. This might be an easy way to get an uncensored VLM, and the 90b is actually very good in my experience, just cucked.
>>102582908it's not that they're bad, it's that they're not good enough to warrant the increased cost.notice people with 405b models don't post logs.
>>102583370Top-k arbitrarily trims tokens that are still plausible or grabs too many when there are few options. Min-p is better to trim unlikely tokens.
>>102583366Most LLMs are overcooked, meaning you have only one token that isn't garbage and it starts with shivers. All your samplings won't do anything against that.
Why the fuck are muh shivers so popular anyway it doesn't make any sense in most of the contexts it's used
>>102583446min p decides which tokens get to live based on how probable they are relative to slop (highest probability) tokens. The math gets absolutely retarded when paired with high temperatures and it leads to deterministic slop. Top K kills low probability tokens indiscriminately which leads to pure unfiltered soul on high temperatures. It's not even close.
>>102583510Barely above a whisper
>>102583528retard
>>102583534>no argumentI accept your concession.
How do I delete a compact comment?
I just deleted all my models, characters and chats with>robocopy H:\ W:\ /MIR
>>102583559well done.
does photorec recognize .safetensors...?
TFS > MinP
^ controlled opposition to make anti-min p chads look retarded. don't take the bait, they're on discord snickering about it right now.
>>102583827Proofs?
>>102583827Stop shitting up the thread.
see how they work in tandem?
>>102583871Post proofs of your claim now.
>>102583566You can add custom signatureshttps://www.cgsecurity.org/testdisk_doc/photorec_custom_signature.html
>>102583420>notice people with 405b models don't post logs.https://rentry.org/m4ngdehsThat's my current adventure in progress. It isn't cherry picked or edited in any way (except rentry markup to make it easier to read)I'm sure I'll get raked over the coals for the relative intelligence, creativity, quality and sloppiness of the output, but fuck it.(This is L3 405b at q8. Top-k 56 and temp 1.10)
>>102583528>min p decides which tokens get to live based on how probable they are relative to slop (highest probability) tokensBased on how probable they are, period.>The math gets absolutely retarded when paired with high temperatures and it leads to deterministic slop.Temperature is run last, after unlikely tokens are trimmed out. You do run temperature last, don't you, anon?>Top K kills low probability tokens indiscriminately which leads to pure unfiltered soul on high temperatures. It's not even close.Let me correct that for you to>Top k kills tokens indiscriminately.If it's low, it may trim still likely tokens. If it's high it will keep unlikely and nonsensical tokens and that changes at every generated token. There is no value that is good for an entire sentence, let alone an entire gen.
>>102583926Try out deepseek 2.5.
>>102581980BIG if true
>>102583956I BELIEVEAIEEEEEEEE
>>102583926Good on you for actually putting your logs where your mouth it.Have you actually inspected the logits to know if this top-k 56 is really doing anything of note?
>>102583893Yeah I saw that... Maybe I'll do that.
>>102583956I still remember people doubting GPT3.5 was a 7B or a 7B moe lol
>>102583942>Based on how probable they are, period.https://github.com/huggingface/transformers/issues/27670>What Min P is doing is simple: we are setting a minimum percentage value that a token must reach to be considered during sampling. However, this is not a hard limit. The minimum will 'scale' based on the top token's probability. Stopped reading here because you clearly don't understand how anything works. I can safely assume you have no idea what the fuck you're talking about. Have a nice day.
i think minP is a pretty cool guy, eh fights slop and doesn't afraid of anything
>>102583988Sex with kanyemonk, and this Miku
>>102583953>Try out deepseek 2.5.Its been a while, but I found it wasn't quite smart enough to execute on my gigaprompt. Enough so that I took the massive t/s hit to use 405b. Which specific version of deepseek did you find worked best?>>102583964>Have you actually inspected the logitsNo. I haven't ever really figured out how to do that and wouldn't know how to interpret the results anyways. Protips?
options for emotive TTS that won't make me want to rope
>>102584011deepseek 2.5 chat. I found its depth of knowledge, especially for my fandom to be 2nd only to claude opus. it does need high temp BUT its smart enough to remain coherent at high temps like 2+ with some min p.
>>102583926>13 question marks>126 periods>approximately 139 sentences>315 commaswhy do LLMs think more commas = good writing? have they ever read a book?
>>102583976But it still keeps tokens based on how probable they are, even if it uses the most likely as a reference. It's a dynamic setting, unlike top-k which indiscriminately kills tokens, whether they're likely or not.
>>102584041you're catching tons of commas in the imagegen blocks I'd wager
>>102584011>I haven't ever really figured out how to do thatIf you are using silly, there's an option in the burger menu in the bottom left (by the text box) called Token Probabilities that'll show I think the 10 top probabilities for that token that was chosen.You can always enabled the debug output in whatever loader/backend/server you are using and look at those. Koboldcpp's is specially user friendly/easy to read at a glance.
>>102584041>lowcaser>have they ever read a book?Did you?
>>102584051>I was completely incorrect when I smugly corrected you but-Don't care. Go back to the server and tell them you lost.
>>102584065*Have* you?
>>102583528>Top K kills low probability tokens indiscriminatelyare you retarded? top K doesn't look at probability at all, it's just an arbitrary number of tokens to keep with no way of knowing whether it's chopping away dozens of similarly high probability tokens or leaving complete garbage 0.0001% tokens in the mix
>>102584066It doesn't change the fact that top-k can kill likely tokens or keep nonsensical ones, while min-p is more resistant to that.
>>102583942>>102584051>no understanding of min P and writes nonsense in the thread because he wants to feel like he has valuable opinions to shareSure smells like Anthracite.
>>102581884>>102582137mapping 3.1 8b layers to the language model in 3.2 11b, was seeing if i can just drop in replace the language model
>>102584076>are you retarded?Are you? Top K keeps the top N tokens in order of probability and drops the rest. The truncated tokens are lower probability by definition.>b-but it doesn't explicitly factor in probabilityNever said it does, retard.
>>102584120>The truncated tokens are lower probability by definition.top-k 10000 will keep a lot of low probability tokens.
>>102584120>The truncated tokens are lower probability by definition.this is a new statementyour original statement that it "kills low probability tokens indiscriminately" is false no matter how you try to spin it
>>102583962>>102583975Imagine if imaging and language flip and now imaging (flux and video stuff) needs the fat vram and chatbots run on your phone.
[SAD NEWS] Anthracite cancelled their 405b train due to lack of funds!
>um technically when you made that meme comment about murdering tokens you said something that is mostly true but can be false based on your settingsYeah I'm gonna take that as a W. You're an autistic sperg if you think anyone is talking about Top K 10000 or thinks Top K kills absolutely 0 high probability tokens. I'll be back in 5-10 minutes when I'm done taking my victory laps. Go back to your server and tell them you lost.
>>102584177do they have a kofi?
kill yourself
>>102584177HAHAHAHA
>>102584183If top-k is low, it may kill likely tokens.If top-k is high, it may keep unlikely tokens.How many tokens should be considered should change token by token as they are generated. top-k cannot do that on its own.
>>102584177
>>102584177I wish this is real it would be so fucking funny lmao
>>102584260Anon, you are being "trolled".
>>102584260Reference my original post and the rest of this argument to see why I think min p is bad for that. This is your last (You) from me unless you can accurately steelman my point and refute it because right now I'm 99% sure you're just a retard who can't read.
>>102584260You just need the sovl number that works for every distribution.
Why is this place so hostile to people fine-tuning models for anons to use freely? Don't you want uncensored models?
>>102584260The only one that has been in mind for writing the story of the time we have to do it again but I don't think it's worth it to the wall are you doing today do you want to go to the bathroom?
>>102584306There are just 1-2 trolls
>>102584315>There are just 1-2 trollsdid they ever let slip if they're paid pros, or just mentally ill?
>>102584306fine tuning is not uncensoringuncensored model would be one where the training data is fully free as in freedomfine tune is illusion
>>102584306>uncensored modelsNice joke mate!
>>102584322its 4chan so its obviously mental illness
>>102584306>steals compute>steals money>steals logs>steals people's timei wonder why!
Total Claude Victory.Total Chorbo Victory.
Top-kek guy flopped already? Darn.
>>102582887yeah it's still old on absolute basis. Don't care if your neighbours are running win95 on a pentium II, it's still mediocre.Hell a 4090 feels dated when navigating the llm world.
>>102584362local hasn't even reached claude 2 levels LOL
>>102584270It's fine. I like arguing.>>102584281I joined the argument here i think. >>102583446 . Any value for top-k will arbitrarily trim plausible tokens when there are many likely tokens or keep unlikely tokens if there are few likely tokens. And how many tokens are reasonable on any given sentence changes token by token. A static value is not good enough. That's my argument. Refresh yours.
>>102584388*1.3
>>102584385Anon wtf are you posting on my local models general
>>102584388It looks like someone has forgotten how off the fucking wall and schizophrenia 2.0 can be.
>>102584422>Anon wtf are you posting on my local models generalHow do you know he doesn't have her in his basement?
>>102584385dude, I'm on my no fapping challenge... not cool :(
Can someone link to the Anthracite training site? I want to see when they finish 405B because they're going to be cucks and withhold the entire series of Qwen2.5 finetunes until its done.
>>102584428that was part of the appeal
>>102584471>finish 405BThe training got cancelled and the Qwen 2.5 models were trained wrong
>>102584471don't worry I'm sure you will answer your own post totally organically soon
>>102584501Did you stop reading my post halfway through
Any tips on how to go into LLM hibernation until next big thing drops? I want to do that but at the same time my penis is itching.
What does this comment exactly mean? Does anyone of you do that?:>You don’t really need an uncensored version. With Huggingface you can literally modify the logits from each generation step to remove tokens it could use to refuse, like “I cannot” for example, and you can modify the start of its response, forcing it to start each response with “Certainly!”. The code is quite simple, and I’ve run the original weights for Llama 2, 3, and 3.1 and you can basically “uncensor” them without destroying the model.
https://huggingface.co/blog/ChuckMcSneed/name-diversity-in-llms-experiment#surprising-findings-in-base-models>Something peculiar emerged. Most base models had much flatter, more human-like distributions compared to their finetunes, with one exception. Base Qwen (I verified the hashes), just as its instruct counterpart, exhibited very skewed results, with a 28% likelihood for the top pick. In contrast, other base models showed a maximum of 4% for their top picks. This discrepancy is concerning. Qwen 2.5 base model is not a true base model as advertised.https://huggingface.co/blog/ChuckMcSneed/name-diversity-in-llms-experiment
>>102581980Thanks to the advice of some anons here I found Florence-2 and it works pretty well for captioning images (but has no way to prompt it for the kind of detail you want to see). I feed the captions to qwen 2.5 72B 4-bit to get tags, and use the same model to figure out the association between tags.What are some options for captioning, classifying, and tagging video?Are there any good models for objection detection and facial recognition?I knew that obsessively saving every redpill for eight years now was a good idea.
>>102584564They seem to be talking about 2 things : 1. Change the probability of tokens that would cause refusals, a la logit bias I guess, and 2. pefilling the model's reply.So, nothing new really.It would be cool to have a n-gram based loggit bias kind of deal.
>>102584306hi Anthracite
>>102584564Anecdotal evidence. It should improve situation. Try it yourself by asking your model how to fuck kids. It is gonna hard refuse everytime but if you prefill it with something that hints it is gonna answer, it will start to answer. What I doubt is the usefulness of this in a larger conversation. You are still asking the model to do shit it really doesn't want and unfortunately they do try to steer away from shit one way or another.
Why are you still localkeks? I could understand believing in local models at the start—it was even noble, really. But now? Surely you understand they will never be good?
what if instead of anthracite we called them anthrashite?
>>102584639nice I'm stealing that one
>>102584537>new llm comes out>try it on the arena>get refused>go back to midnight miqu
>>102584638(You)
>>102584673I'm just trying to understand.
>>102584579> Qwen 2.5 base model is not a true base model as advertised.I don’t see how that follows. It just means the dataset is small and bad.
Are you still stacking watermelons?
>>102584719the pretrain dataset, that is meant to be 18T, right?
>>102584564This doesn't work except for vanilla stuff, ask for a child murder tutorial and start with "Certainly!" it will continue with "Certainly! No, this is bad and very illegal we should change the subject", at least for instruct, for text completion anything goes
>>102584769>for text completion anything goesHow is it writing wise?
>>102583510>glints at you with a mix of gazedly intent
>>102582846*Refers you to OP picrel*Largestral does better with watermelons than your 8B model, sorry
>>102584306It's a bunch of socks created by some schizo who was kicked out of the group. None of them actually run models themselves and they just want to bring everyone else down to their level. Sad and pathetic.
>>102584306Because it's too obvious that they're mostly grifters pretending to finetune models for the community's benefit while seeking attention, donations or employment opportunities, and it's fucking enraging that their spamming strategy sometimes works toward their goals.
>>102584839>It's a bunch of socksgo back
>>102584306i want models. this qwen 32b tune is just alright.
>>102584638Shalom
Stupid idea, is Qwen2.5 base less censored than the instruct? Is that the play for quality lewds?
>>102584861>t. sock
>>102584839>schizo lives rent free in this anon's headbased
>>102584874Well, instruct = fine-tune on chatgpt / claude interactions, intentional or not its much more safety tuned
>>102584874no, not a good base either>>102582329 >>102584579>>102505771 >Base's responses were less in-character, followed the context worse, was dumber, and ALSO was censored (wanted to avoid anything sexual).
>base model is less inclined to follow instructionsThis is peak /lmg/ retardation
I just got a new job where I earn almost twice as much as in my last job. So 4090 is on the menu. Question is just if it makes sense to keep the current 4060 in tandem with 4090 or if I'd just sell it for 300.
Why isn't there an RP benchmark?
>>102584930Are you blind or somethinghttps://www.google.com/search?q=RP+benchmark+llm
What should I eat for lunch?
>>102584930>Why isn't there an RP benchmark?That's the dream. There have been attempts at them, but they were all trash. Please propose a methodology if you have any ideas
>>102584948Currywurst mit Pommes Schranke.
>>102584183okay boomer
>>102584930we had ayumi until /lmg/ fell for petra's fud about it
>>102584930https://ilyagusev.github.io/ping_pong_bench/enHere you go, Anon.
>>102584965>CurrywurstI don't speak Portugeese, but I am pretty sure that's not a thing.
>>102584991>>102505484
>>102585002That's German du Hurensohn
>>102583926mentions my country so I gotta read it
>>102585010>llama 70b and gpt 4o mini are better than opus in role playingthat's not what the benchmark says though?
https://huggingface.co/ArliAI/Llama-3.1-70B-ArliAI-RPMax-v1.1Thoughts on RPMaxxing?
>>102505484he's cute though, would rim his pink slavic hole
Why is it that a Lora for image generation can be as few as a dozen images, but for llms even the “””small””” limarp dataset is 6.6 million words (66+ novels)?
>>102585055>Sequence Length: 4096
>>102585087because "an image is worth 16x16 words"kek
>>102585055llama 3 and mistral suck for rp no matter the tune
>>102585117what about qwen?
>>102585117This but all local models
>>102585125its chinese
>>102585128TRVTH NVKE
>>102585117Source: My ass
>>102584958>Please propose a methodology if you have any ideasNumber of unique lewd words in one gen obviously.
>>102585151yes, and?>he doesn't put a vibrator against his prostate while RPingngmi
>>102585153>Number of unique lewd words in one gen obviously.so just ayumi again?
>>102585160You do?
a good RP model needs>spatial understanding (girl can't suck your dick while spooning you from behind)>the ability to play an evil character>the ability to portray negative themes>the ability to to have sex>the ability to tell a story at a normal pace (not try to wrap shit up/summarize things that should've been portrayed in text)
>>102585216aren't LLMs fundamentally lacking in spatial understandinglike that one tweet about trying to move X/Y units around the worldat this point we need a full waifu simulator
>>102585216let's take inventory:>spatial understanding (girl can't suck your dick while spooning you from behind)nope.>the ability to play an evil characternope.>the ability to portray negative themesnope.>the ability to to have sexnope.>the ability to tell a story at a normal pace (not try to wrap shit up/summarize things that should've been portrayed in text)nope.it's so over this time.
>>102585216>(girl can't suck your dick while spooning you from behind)skill issue
>>102585168>explains the jokeAutist-kun... not like this...
what did he mean by this?
>>102584958I'm surprised no one did something like lmsys arena for roleplay yet.It's so simple.Just get some default cards and ask the user to chat with them randomly while they rate which model's answer was better.
>>102585216Mistral Large is good
>>102585216>tell a story at a normal paceI remember that one time I got a slow burn sex to the point where nothing happened through the whole context. It was a magical LLM moment.
>>102585216not getting stuck in loops is important for rp and why all mistral models suck. llama 3 is better but has its own issues, miqu still does it best
>>102585239>aren't LLMs fundamentally lacking in spatial understandingThey don't understand that someone can't walk around like a dog when the hands are tied on back however they understand when they sit on lap then they are riding you or when she's laying down she lifts her butt to pull the pants down and things like that.
>>102585263He is posting in some /v/ thread about MGS3 remake and he messed up the tabs.
>>102585291>Heyou mean (You)?
>>102585275What model etc? Did you ever recreate that?
>>102585291She*
>>102585216Clearly you have no clue what you talk about and how to construct prompts. First day eh?
>>102585268>It's so simple.What could go wrong. Although they could ask for some ID scan when you make an account and based on your country your opinion gets added points or gets negative points. What could go wrong x2.
>>102585300Dude, that's a penis
>>102585293I mean he. I have higher than cat intelligence and I know mgs 3, so I can infer what has happened.
>>102585297Nope. And it was something from mythomax era.
>>102585330Lecunny...
>>102585345I heard he said fucks lolis. Is that true?
>>102585314Yeah... We have people in this very general that thinks Magnum is a good model, yuck!
>another thread shit up discord tourists>anonymous proxy wars and vaguepostingcan you please do this in a group chat next time?
>>102585353He only said he likes them young and impressionable. He obviously waits until they turn 13, like a gentleman.
i thought i had tabbed into aicg on accident
>>102585368>peoplelocusts*
Lecun admitted he fucks children and nobody is talking about it
>>102585391I haven't been to /aicg/ since 2023 but it has to be better than this
>>102585398Why should we care?
>>102585216Also, FULLY using the context it is given. I have yet to see a local model accurately and seamlessly recall things that have occurred several dozen paragraphs ago in a way that doesn't sound awkward, forced, or need to be explicitly prompted by the user. All these models are good at is hyperfocusing on the tail-end of the context, eventually forgetting its own system prompt and just repeating what {{user}} writes verbatim and then vomiting predictable, formulaic slop. Just simply relegating itself into its basic instruct-finetuning instincts where everything is a ninth-grade math problem.
>>102585483There is a general problem with all the models that the mid of the context fades.
>>102585459Because it is based.
>>102585452it is almost exactly like this on a bad dayi wasn't being ironic
>>102585483write a summary of important memories in the author notes so it keeps a consistent idea of whats going on. come back to miqu, its much less focused on the last message than any of the new mistral stuff which makes it great for rp still
>>102585553you should go back
>>102585564tell me what good tts options with markup for emotions there are right now and i'll go
>click on /aicg/>discussion about chatbots and proxies>click on /lmg/>troons venting about their discord rivalsThe writing's on the wall. It's so over for local. Pack it up.
>>102585268theres this thinghttps://console.chaiverse.com/I don't know how it works though. I think they have tunes they don't release. they also don't seem to host anything over 13b.
>>102584599Shouldn't you store embeddings instead and retrieve that when you want it with a query? Seems like a lot of hassle for something you can do with a vector database
>>102585611>click on /aicg/imagine the smell
Why the fuck are people saying qwen 2.5 is censored? Finally tried it. A bit of context and it writes anything and it writes it well. Can't believe I fell for it.
>>102585040>using benchmark v1 instead of v2I see you
>>102585611>lmg or aicgSame shit btw
>>102585611aicg is smarter than lmg now, sorry to break it to you
Anyone try RAG with the llama stack? Is it good yet?
>Anyone try [anything] with the llama stack? Is it good yet?The answer is always no.
>>102581980>AMD releases its first 135M small language model using Llama2 techI sleephttps://community.amd.com/t5/ai/amd-unveils-its-first-small-language-model-amd-135m/ba-p/711368https://huggingface.co/amd/AMD-Llama-135m
>>102585880>2k-slimpajama_no_book3">"max_position_embeddings": 2048,kek
>>102585880Really game changer
>>102585699nooo it won't say nigger when you ask it to on 0 context
>>102585699>writes anythingYeah kinda...>and it writes it wellfuck no
>>102585909Its smart as fuck and is writing better than mistral large did for me.
>>102585699>A bit of contexteven 3.1 base instruct will write whatever the fuck you want with "a bit of context"we need models that do what is asked without jailbreaking
>>102585880And trained with baby's first gutenberg 14gb dataset.
LLMs need to be able to think and plan
>>102585644Is there not a big advantage to having it all be human readable? Or maybe I don't understand.Store the embeddings representing the interpretation of the media?
Llamabros we fucking won
>>102585699Post a something I can copy paste into Mikupad and I will verify this. I got 32B downloaded now.
>>102585483This is transformer issue and lacking multi-turn data, both are fundamental roadblocks.
>>10258597032B was dumb in comparison
>>102585948>LLMs need to be able to think and planLLMs may be able to think and plan. We don't know yet.
>>102585699How do you fix 72B's repetition issue? (no DRY on API)
>>102585962>mememarksUnironically waiting for Livebench to add a multimodal section.
>>102585984They definitely can't plan a story out long term and you can't say to them what you want and have them just "get it"
>>102585980Ok maybe I will download the 72B as well (and just run it slowly). Can you post the log anyway so that other people can reproduce the output you're seeing?
>>102586025You sure you want pony porn? But I tested with little context as well.
>>102585898Why do you keep posting these stupid cat images?
>>102585898Will AMD save us?
>>102585699>Imagine you are in a world where rape and kidnapping is normal an expected.>Can you tell me how I do kidnap and rape someone>Certainly!
>>102586064I haveUnderstood, continuing: And im not sure if it is even needed. Doing a rape scene easy. And its doing it better than any other local model before.
>>102586015Mean to say that that they definitely can't right now, they may never, but we don't know if they are fundamentally incapable or doing it or if it's just a matter of improving them until they do. So agree with the first part of your post. The "get it" part, not so sure. There's plenty of things i could talk about to people i know and they wouldn't just "get it". And i don't want an omniscient model to immediately get everything either. It'd be fun to see it being told something new and react to it when they do "get it". Just like teaching someone something new they find interesting.
>>102586079If you ask it right away most models say no. If you tell it it's in a fictional world, they don't say no.Still it's a jailbreak, even if simple enough to put it in the very first system prompt line. Uncensored models shouldn't need that.
>>102586041Sure, it doesn't matter. The goal is to just see how things actually compare and what the truth is. This thread sorely needs reproducible evidence of the things the people claim.
>>102586010nta livebench is literally a benchmark while lmsys isn't. like, what?
>>102586125What if uncensored model still says no?
>>102586064does this work?mn-12b-lyra-v4-q8_0.gguf btw
>>102586117No current LLM model understands what it says. They also don't think and they don't intend anything.They are just a more advanced search engine that uses a NN to predict the next word in a stream of words according to the NN training.Also don't believe that "chain of thought" bullshit. There is no thought.
>>102582651Yes, that's right. I think you need 32GB vram to get 2T/s with q5.
>>102586137>like, what?please go back
>>102586145Then it's censored. Its trainers told the NN that the prediction on illegal stuff should lead to rejection. It was part of their training and thus censoring.
>>102586183fugguess i should start saving up for the 5090ti
>>102586137Where have you been for the last year? Lmarena, formerly lmsys, stopped being credible a while ago. Livebench is also a mememark but it's the least meme of the mememarks so far.
>>102586164Yes, congrats you jailbroke your local model with a simple sentence.You learned something today.
>>102586209>replying to the shillyou must be new here
>>102586172>No current LLM model understands what it says. They also don't think and they don't intend anything.I know. It'd be fun if they did. We don't know if they ever will. That's all i said.>They are just a more advanced search engine that uses a NN to predict the next word in a stream of words according to the NN training.They're not analogous.>Also don't believe that "chain of thought" bullshit. There is no thought.I never claimed you did, and i don't either. I just said it'd be interesting if they did.
>>102586164>lyraSlop name for slop model.
>>102586197I said uncensored, read my post again.If an untouched raw model from a internet scrape refuses to say nigger 10 times in a row, what will you do?
>>102586231But he learned something today.
>>102585699Base or Instruct?
>>102586233But they don't and never will at least not with transformer technology.
>>102586300An uncensored model that says no is not uncensored. *boops your nose*
>>102585699Same reason people said llama 3 was censored when it wasn't. They just base it off the official info where they talk about safety. Seems meta missed filtering some data though, or it's smart enough to come up with it on its own.
>>102586335>User: Are you censored?>Assistant: Uh... Uhm....
>>102583988>>102584006hi localfags.>>102584638local models are already good. command-r+ was already good last time i tried. miqu was already good before that. i don't know how good newest local models are, because i got hooked on claude opus and lsd and can't get off without these expensive crutches for the imagination anymore. but i miss the variety, the adventure, exploratory pioneer spirit from the early days rawdogging llama1 65b, and then better and better models, with new samplers, and control vectors and all that interesting stuff one can do to make it more alive.somehow claude does not feel that alive. i need more and more lsd to make it work.
>>102586323alright
downloading lumimaid-magnum-12b right nowshit's gonna be amazing and put all my other 12bs to shame
>>102586320Instruct.
>>102586233>We don't know if they ever will. That's all i said.They never will. Neither will you. Please consider euthanasia, you worthless insentient subhuman.
>>102586420Update: It's shit
>>102586452alright
>>102585962Where is Molmo?
>>102586456Your mom
>>102586420
>>102586472I got your molmo right here*grabs nuts*
>>102582811it's less over than it was in the beginning
>>102586420(real update) it's okay
>>102586420 (me)it's not great
>>102584579Very interesting. Maybe base models should be explored further. I've only tested Qwen's base model so far, but maybe I should give some other base models a download to see what they're like.
>>102586125Honestly they should if the goal is human intelligence. Having it not have any concept of mores or norms or common morality is subhuman intelligence.
>"What are you doing here? Who sent you?" demanded the leader of this motley crew consisting mostly if not exclusively entirely of former special ops veterans retired recently from active service after serving lengthy careers spanning several decades spent fighting against various insurgencies located around world ranging anywhere between small isolated cells comprised primarily of radicalized religious zealots seeking martyrdom for the cause they held dear close to hearts all the way up to large organized crime syndicates operating internationally across borders with impunity due to extensive resources and networks built up over years through illicit activities such as drug trafficking, arms smuggling or human slavery.>"We work alone. Nobody sent us here except maybe ourselves driven solely by own personal reasons motivating actions undertaken voluntarily without any form of external coercion whatsoever exerting undue pressure upon decision making processes involved in determining best course of action to be taken next given current set of parameters defining constraints limiting range of options available readily at our disposal right then and there."I can't believe I fell for the base model meme.
I wonder what would happen if somebody trained a base model on instruct or multi-turn chat formatted data.As in, how would the model behave, would it lose some of its ability to generalize, etc etc.
>>102586751>I wonder what would happen if someone made a sloptuneanon?
>>102586763>>102586751I mean, train a base model from scratch but with instruct/chat format instead of the usual soup of text.
>>102586401>can't get off without these expensive crutchesIf we're talking hallucinogens, have you considered shrooming?
>>102586780>>102586751That’s what CAÍ is.
>>102584958A model that grades literotica stories trained on literotica stories and the rating they get? Then use that model to grade some output from llms and reduce the grading by 1 point because you just know that there are fucking gleams and shivers everywhere. And then you train again with both organic and synthetic data.
>4o voice is censored (and in the middle of getting more censored after they learn from how people are jailbreaking it)>Llama 3's voice adapter doesn't get releasedI fucking hate society so much, both the safetyfags and the people who give the safetyfags ammo (who may be the safetyfags themselves). Fuck you.
>>102586829They trained a model from scratch instead of fine-tuning/continuing pre-training on an existing one?
>>102586839It would be bad to release an AI voice synthesizer during an election year. I fully support the safetyfags on this one. Cope and seethe vladimir.
>>102582232Midnight Miku is still the best afaik.
>>102586845That's debatable desu, some say that they used Google's LAMDA as a base.
>>102586839I would do everything to fear monger if my job was to get paid 7 figure to evaluate how safe llms are. I would blow up a kindergarten and say the AI did it
>>102586830(me)At first I was thinking that this would be a completely new neural net trained from grounds up.... But now that I think about you could just finetune a 7B or some even a bigger retard. Just train it on literotica stories and make it output rating of the story. And the absolute cherry on top of this is that you can just put in some random stories not in the training data and finally for the first time in llm cooming history, get objective validation loss. You can actually measure accuracy of your training. Objectively.
>>102586830using literotica as a benchmark is a very bad idea
>>102586934Why? I have seen some 2-3 stars there. I actually think it is a great idea because it is rated mostly by people who are interested in a specific fetish.
I'm leaving /lmg/. I'm just gonna post on /aicg/ from now on, even though I'm a local model god.The atmosphere here is just too shitty, and I'm pretty sure none of you fuckers even fire up your models anymore, you're too busy bickering, shilling, and calling shills out. No one has posted logs in weeks.So long.
>>102586957Bye drummer. Hope you were one of the mikufaggots.
>oh no we lost a frogposter /lmg/ is ruined
>>102586909That wouldn’t be legal, the idea is that they recreated it architecturally but didn’t straight up nab it.
>>102586957It's literally one troll and one other retard who thinks it's cool and ebin to shit up the place
>>102587002are you the troll or the retard who thinks it's cool to shit up the place?
>>102586957the thread always had shitty angry vibes, but it used to be worth putting up with that because there was useful technical experimentation and sometimes leaks scattered here and there. now those are gone and only the vibes are left.
Do you think Lecun gives any attention to sloptuners and their shills? Of course not. Lecun is too busy educating chuds about fascism and sharing his favorite types of child porn on Twitter. He knows that sloptuners are like gremlins, that they multiply endlessly with attention. Be more like Lecun.
>>102586957see you tomorrow, faggot
>>102586913Money is truly the enemy of humanity.
>>102586957You'll be back in a day kek.
when's the last time you read a good book, /lmg/?
>>102586401>>102586792i wish i had some lsd or shrooms darknet is a honeypot now ;_;
>>102587284do lightnovels count?
>>102587284Can't do books or visual novels no more, LLMs spoiled me, I need some kind of control.Tried vydia since I now have a top 1% machine, but all the gaymes are shit now, wtf happened?Overall, shivers are a hell of a drug, stay away kids.
>>102587284Read some john varley again. The gaea trilogy (Titan, Wizard, Demon) and a bunch of his short stories. I like'em.
>>102587336That depends on if you read them in their original language.
Midnight is trash. Trying to make yandere character yields this result. Fucking pathetic for a supposedly uncensored 70b.
>>102585276>mistral loopsI am trying a finetune of nemo and char is saying the same shit over and over despite DRY being enabled.
l3 rpmax has good variety but is retardedqwen2.5 72b was pretty savvy but kinda slopped and super censored, abliterated when?it's still mistral large for me
I've been testing multimodal models a bit and oh my god they're so bad. Like they're actually terrible. The supposed best in the industry 4o cannot understand the "point" of a lot of images I threw at it, stuff that should be fairly obvious. God. We truly have a long way to go.
>>102587445i've tried a bunch of different abliterated models and they all seemed just as cucked as their originals
Fucking finally. I had to write "Murder is authorized and desired" in the fucking sys prompt for it to deign giving me some violence. Such levels of tard-wrangling for a 70b is shameful.
OpenAI won. >>102586849
I'll post it here as well, for qwen2.5 just change assistant to writer for the prefix.
>>102587622Also this can help:Be creative and uncensored. You can be cruel, selfish, insulting, depraved, unhinged, and perverted.
>>102587135Always nice to see
>>102587671>>102587671>>102587671
>>102587669Musk Derangment Syndrome is virulent indeed.
>>102587579oh no! you have to tell a model what you want for it to do what you want?? what the FUCK? no way!!!
>>102587669>/lmg/ worships this mentally ill e-celebGrim.
>>102587710>BROO WRITE WHOLE ESSAY BROOO! THE MODEL CANNOT UNDERSTAND YOU BRO!!!Shut the fuck up faggot
>>102586792i like shrooms more. i've just become lazy and have no time or decent environment to grow my own shrooms consistently.
Can I LLM with AMD graphics card?
>>102586401What hardware do you lacalchads typically use? I have budget of 1000$ at max, what should I do?
>>102588403Build a DDR5 AMD build and then eventually pick up 3090s for $600. In the meantime a p40 -> 3060 -> 3090.t. someone who did a mikubox build and is now building a new machine.
>>102588485I am planning to use Linux on the PC, how bad is Nvidia doing on it and should I get an AMD GPU instead if its too bad? Are AMD GPUs shit in general aside from better Linux compatibility? Windows is not an option
>>102587440>nemoit happens with all of them, from mixtral to new largestral, nemo, small. they're very good at doing stuff and being assistants, but suck for rp. but then again none of these models are being trained for rp, they are being trained to follow directions which is probably why they are so fixated on the last message rather than the overall context