/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102862101 & >>102849995

►News
>(10/18) New research, models, and datasets from Meta FAIR: https://ai.meta.com/blog/fair-news-segment-anything-2-1-meta-spirit-lm-layer-skip-salsa-lingua
>(10/18) bitnet.cpp: Official inference framework for 1-bit LLMs: https://github.com/microsoft/BitNet
>(10/18) DeepSeek releases Janus-1.3B with multimodal understanding and generation: https://hf.co/deepseek-ai/Janus-1.3B
>(10/16) Ministral 8B instruct model released: https://mistral.ai/news/ministraux
>(10/15) PLaMo-100B: English and Japanese base model: https://hf.co/pfnet/plamo-100b

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
►Recent Highlights from the Previous Thread: >>102862101

--Paper: Human data improves NLP model performance over synthetic data:
>102869101 >102869424 >102869479 >102869492 >102871323 >102871345
--Papers:
>102868813 >102869035 >102869230
--Comparison table of AI model training computers from LifeArchitect.ai:
>102875215
--Nemotron excels at RP, but has formatting issues. Llama 3.1 Instruct used with specific settings and rules for roleplay on SillyTavern:
>102862259 >102862990 >102863031 >102863176 >102863268
--Nvidia's Sana: High-resolution image synthesis with linear diffusion transformers:
>102867726 >102867759
--Meta FAIR research dump includes open source language models, object segmentation, and more:
>102874089
--Low quality erotica for training AI models, with mixed opinions:
>102864868 >102864913 >102864965 >102865273 >102865338
--Nemotron excels at roleplay and creative writing, not knowledge:
>102862255 >102862902
--Nemotron 70B: Unique prose, fun, but dumber than Largestral with logical errors:
>102865433 >102865448 >102865676 >102866355
--Nala test with bitnet inferencing has issues:
>102874688 >102874747 >102875041 >102875065 >102875112 >102875139 >102875221 >102875291 >102874871
--Meta releases new models and datasets, including a strong generative reward model:
>102875631 >102875682 >102875854 >102876015 >102876444 >102875768
--Importance of trivia knowledge in AI models for creativity and references:
>102864729 >102864831 >102864867 >102864870 >102864958 >102865244
--INTELLECT-1 training run pace increases:
>102867630
--Excitement over Janus-1.3B and BitNet releases:
>102873151 >102873169 >102873216 >102873238 >102873257 >102873267 >102873640 >102873335 >102875142
--Miku (free space):
>102871525 >102873858 >102874140 >102875545 >102876039

►Recent Highlight Posts from the Previous Thread: >>102862116

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>102876560
Chat, what does this mean?
https://github.com/xjdr-alt/llmri/blob/main/plots.ipynb
>>102876610
buy an ad
Australian spring will be LLM spring as well.
INTELLECT-1 at 11.03% complete
>>102876754
do i need an h100?
>>102864913
Where did the soul go...
Speculative decoding is a meme for local. The gains are only seen in coding or other repetitive contexts, while it wastes more energy. What we really need is a MoE model with a high number of small experts so that we can selectively quantize/prune/offload the experts to optimize the model for our specific use cases and VRAM/RAM level.
>>102876770
Yes. However, that doesn't matter since it is currently being trained as fast as it possibly can regardless.
>>102876754
>10B model
WooooW
Are they just compressing the internet over and over?
>>102876851
I'm still waiting for a Mixture of a Million Experts implementation. The most promising thing about that sort of model is how much easier it would be to add knowledge by training a few small experts instead of needing to finetune the entire thing.
So which is better, Rocinante v1.1 or v2g?
>>102876845
they are compressing reasoning
they do not have reasoning
it's a very lossy quant of reasoning
>>102876851
Yeah, that'd be an interesting experiment, though my guess is that the experts still need to be at least a little large for certain types of intelligence to be retained. I think 3B is probably the minimum. 30x3B could be an interesting balance and would fit into high-end consumer desktop setups.
are we back?
rightoid incel grok is dumber than Llama 3.1 70B, kek
>>102877049
I was just about to post that lol. But yeah, they have Nemotron too now.
https://livebench.ai
But it's lower than Grok 2. I haven't tested it to verify any of the claims of it being good or it being shit for RP though.
>>102877118
>it's lower than Grok 2
Well, training on obvious /pol/shit data is bad for any LLM after all.
>>102877049
>Grok mini that close to Grok
Super huge models are a meme
>>102858009
>>102860004
I can't believe I share the general with retards that don't understand what a reward model for RLHF is, going as far as trying to use it in koboldcpp... It's over...
>>102877049
Good job cropping out the meaning of those numbers.
>>102864868
>He doesn't know
>>102864965
>>102877181
Bigger number always means better, so why do you even need the meaning?
>>102877181
It's just livebench.
>>102877049
Based. Safe and diverse LLMs are our strength.
>>102877208
>Give it a shot
>Model becomes smarter, response length matches the previous responses instead of droning on, and it focuses more on details
What the fuck?
>>102877469
That was my idea and I can tell you... it is probably placebo.
>>102876583
sex with miku
>>102877571
this, so much this
>>102876583
miqu proves llama2 was peak
>>102876808
>>102876851
arctic snowflake
Why is INTELLECT-1 going with 10B anyway, instead of the more common 7B or 13B?
>>102878310
Snowflake sucks tho
>local: dead
>cloud: https://youtu.be/EwzhumHX_TE
Will Meta Spirit save us?
>>102879322
how long until I can practice my japanese with my local miku?
>>102879322
>Will Meta Spirit save us?
No.
>Nemotron IQ2-XS
Is this better than Nemo or Small at Q8 for a vramlet? 2.2 t/s at 10k context.
>>102879555
no
>>102879555
I have only tried IQ2_S, and that is far better than Nemo or Small. IQ2_S should fit within 24GB VRAM with 8k context, as long as you have the 4-bit cache and flash attention enabled.
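For reference, a minimal koboldcpp launch matching the settings described above (quantized KV cache plus flash attention). This is a sketch from memory: flag names can differ between koboldcpp versions and the model filename is a placeholder, so check `--help` before copying.

```shell
# Hypothetical invocation: a 70B IQ2_S gguf into 24GB VRAM.
# --quantkv 2 selects the 4-bit KV cache and requires --flashattention.
python koboldcpp.py --model Nemotron-70B-Instruct-IQ2_S.gguf \
  --usecublas \
  --gpulayers 99 \
  --contextsize 8192 \
  --flashattention \
  --quantkv 2
```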
>>102879322
Damn, it's really over
>>102876653
We haven't even had winter yet.
>>102879322
Maybe in 30 years we will get something similar.
>>102879322
currently doing this with localcope
>>102879322
I've always stuck with local so far, but if they find a way to make customized hentai ASMR with dynamic plap plap and dick sucking sound effects, that will be the day I become a cloudshitter
>>102879777
The cooming winter is atomic and eternal.
>>102876754
neat
>>102876754
why are they doing this? no one will give a fuck about a 10b model
>>102879904
>my dependency clusterfuck with fake-multimodality and huge latency is better and totally delivers 1 to 1 results!
Yawn.
>>102879940
you gotta start small
>>102879940
consider it's also related to contributing p2p experts towards a single output
>>102879968
Well, they're free to waste their time, but I ain't donating no compute until they start working on a large bitnet model with an uncensored dataset.
>>102879940
Proof that it really works and can produce large scale models?
The question is what happens after that. Does /lmg/ finally gather the uncensored and IP infringing dataset they always wanted and train that model?
>>102879988
>I ain't donating no compute until they start working on a large bitnet model with an uncensored dataset.
this, if we can now train big models, let's go for fucking bitnet and settle the debate once and for all
>>102880004
>Does /lmg/ finally gather the uncensored and IP infringing dataset they always wanted and train that model?
but everyone participating in that training will know it'll be an IP infringing dataset, no?
>>102880020
So? I doubt anyone able and willing to participate will care about that. Weights will be banned from Hugging Face, but torrents are better anyway.
>>102880035
>I doubt anyone able and willing to participate will care about that.
the authorities will care about that
>>102880051
are the 'thorities gonna kick my door down and confiscate my 3090 for participating?
/lmg/ will never gather around and train their own decentralized model. /lmg/ might have been able to do that a year or two ago but not today's /lmg/.
>>102880055
they're gonna cancel the training process and nuke the site down, how new are you?
>>102880058
You already got pygmalion
>>102880062
how are they gonna cancel a decentralized training? how new are you?
>>102880058
>/lmg/ actually decides to make a model
>anons actually are ready to contribute
>drummer and Undi are the ones to set up the model
>....
Yes anon. /lmg/ shouldn't make a model.
>>102880072
they can nuke the site that serves as a bridge for everyone during the decentralized training
>>102880058
Fuck you.
>>102880062
If the website is an issue then someone can just host it in a different country and they can't do anything about it.
>>102880072
The decentralized only means the training isn't happening in a centralized manner, but whatever orchestrates the machines is very centralized
>>102876583
does anyone know of any local programming-competent models whose instruct mode can be used as a programming assistant/tutor? something that is similar to if not better than copilot?
I've tried using the 13b echidna model, which crumbles when asked basic assembly language questions.
>>102880089
>but whatever orchestrates the machines is very centralized
>>102880085
>site that serves as a bridge for everyone
sounds like a design flaw
>>102880099
>sounds like a design flaw
it's not like they have much of a choice, innit? what other solution could there be? to participate in that training you need to know where it is, it must be public, and public means problems because the authorities can see perfectly well what you're doing, this shit is DOA
>>102880097
DeepSeek V2.5 is the best you can get right now.
>>102880058
That's possible, but /lmg/ will make this model lame and gay to own le chuds or something, you know, the usual /g/ stuff.
>>102880148
We really need to move to /sci/ or something.
>>102880142
I thought qwen 32B beat it
>>102879940
>why are they doing this?
"The longer term goal: scale to open source AGI models, continuously improving upon the best open source models in the world."
>>102880118
>this shit is DOA
Not necessarily. Huge corpos know that this shit isn't a competitor in the slightest. And officially none of the corpos are interested in making a cooming model. I think it is highly likely that both corpos and governments will ignore this because it is a waste of time to bust it.
>>102880172
>Huge corpos know that this shit isn't a competitor in the slightest.
and when we'll be competitive what'll happen? the government will pull the plug on that shit
>>102880148
There should be no emphasis to the left or the right. The priority should be a model with no "safeguards". One that will do everything within its power to do exactly what the User wants.
>>102880181
>when we'll be competitive
Pretty sure the first thing that will happen is a cooming model so, oh well.
>>102880185
Exactly, but you can't be sure with today's /g/ or anons, some of them will try to do bad shit out of spite.
>>102880170
>scale to open source AGI models,
>AGI
definitely DOA
>>102876583
Threadly reminder that Nemotron 70B is crazy good for RP
>>102880204
I am sure it will eventually be made to work even if bad actors try to sabotage it. Worst comes to worst, you can restore a checkpoint from before the point it got all fucked up.
>>102880242
>>102880185
>There should be no emphasis to the left or the right. The priority should be a model with no "safeguards".
that's a right thing anon, the left love censorship and hate freedom of speech
>>102880351
(You)
>>102880118
Yes, DOA just like piracy and torrent sites.
>>102880406
>piracy and torrent sites.
except that you're not sending your gpu power to those sites
>>102880406
For anyone under the age of 30, they definitely are.
>>102876754
>Python
>>102880351
No, le fucking american politics no matter what are pro censorship. In other countries the left was the Soviet Union, China, or North Korea. Zoomers don't know that in the 80s the censorship situation was literally the same, but since comics, video games, and anime were niche it didn't matter as much; now that they are popular, they get all the censorship.
Anons, is the ayyymd plus winblows combo still ass when it comes to localshit? I see that koboldcpp has rocm support now but does it work nice and fast like cuda?
>>102880441
>No, le fucking american politics no matter what are pro censorship
nuh uh, look at how censored the sites are when they're run by leftists (facebook, reddit, old twitter) compared to sites run by right wingers (new twitter, 4chan...)
It looks like there are some here who want to shut down the idea of distributed model training for some reason, with very lame excuses. Interesting.
>>102880471
>lame excuses
tell that to the governments who shut down every good idea, they're the ones to blame, they don't want us to get the power anon
>>102880471
Image model next, sex bot crowdfunding factory next, the people start getting what they want with just even a crumb of organisation
>No you cant do that nooo! we cant blackmail and throw all of you off the rooftops at the same time noooo!
>>102880491
they just need to destroy one person's life to scare everyone else, it's really not that hard
>>102880142
>>102880167
could i run either of these models on my 4090? hf uses an a100 for benchmarks but i assume it's not necessary to run these guys, right?
>>102880468
>twitter
I get banned just for saying kike to your people on twitter. Kike is the new nigger.
Also, there are not corpos in the left, that is anti nature. The zoomer left is what the Soviet Union was, or fascism; one is far left, the other is center left. The right has only two choices: conservative right (which only exists in countries with a monarch, like the UK or my country Spain among western countries, or the Arabs with their theocratic regimes) or liberal right (your country and the rest of the jews)
>>102880185
I think the best way would be to filter out leftism, because there's a shitton of it everywhere, and also remove all burger influence (right wing included). Everything else should be sane.
>>102880527
>there's social left and economical left, I was obviously talking about social left
zucc is a social leftist but economical right
>>102880441
>both are equally bad
Just clump everything together. Classic leftist playbook. Same as how they think dating a 17 y/o and a 12 y/o are the same, because faggots groom 12 y/os and want to call you out on the hypocrisy of dating a 17 y/o, because they're the same thing apparently
>>102879322
Local has been dead for a while now...
https://www.udio.com/songs/veDnd1Gx2BhkB4AsNdNSbh
https://www.udio.com/songs/dFTtQHCqxbHLyArX4vx6QZ
https://www.udio.com/songs/iu1381RxvjfzWznGHeVecV
When are we gonna get this locally? Never? We at least have some decent TTS and could be close to local advanced voice, but we don't have anything even remotely resembling this technology...
>>102880527
https://tower.jp/item/4492014
https://www.amazon.co.jp/kike-KOTORI/dp/B071XZ2YDY
>>102880536
>there's social left and economical left, I was obviously talking about social left
What the fuck, only liberals believe that. Alfred Marshall's theory is the only one that depicted this split between politics and economy, but it's false, zoomer: economics, politics, and culture, even religion, are bound together by the same structure, the state and regime. you cannot have two brains thinking conflicting thought, or two regimes in one. You literally propose an schizo state.
>>102880581
>both are equally bad
No, I said Americans don't have two sides, and what is happening is an American problem, so the enemy of nature and reason is ultimately the American order. Anglos are next after the jews.
>>102880674
>you cannot have two brains thinking conflicting thought, or two regimes in one. You literally propose an schizo state.
tell that to those leftists, they are retarded enough to go down that path, yeah
>>102876588
learn how 2 quote
>>102880662
And this is why Asians in general are better than western goyims, holy based.
>>102880692
go back, tourist
>>102880694
聴け！逃げろ！ ("Listen! Run away!")
>>102880509
the asian hemisphere is fearless of western posturing, fortunately
https://speechbot.github.io/spiritlm/
Why are all these examples cut so horribly?
https://speechbot.github.io/spiritlm/audio/expressive/T2S_sad_second_speaker.wav
>>102880814
>feb 2024
Yeah outdated af
>>102880853
Fuck yeah, jeb 2024.
so any local models better than mistral large quanted for 48gb vram now?
>>102880522
Look up quantization and GGUFs; you'll want a Q4 GGUF file, which you can run with kobold/llama/your choice of backend
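If you want a rough feel for what fits before downloading anything: GGUF file size scales with parameter count times bits per weight. A back-of-the-envelope sketch (the function name is made up for illustration, and real GGUF sizes vary a bit by quant mix):

```python
# Rough rule of thumb: file size (GB) ~ params (billions) * bits / 8.
# Add a few GB on top for KV cache and activations at runtime.
def approx_size_gb(params_b, bits_per_weight):
    """Approximate quantized model file size in GB."""
    return params_b * bits_per_weight / 8

# DeepSeek V2.5 (~236B params) at ~4.5 bits (Q4_K_M-ish) is way beyond
# a single 24GB card; a 32B model at the same quant is roughly 18GB.
print(approx_size_gb(236, 4.5))  # 132.75
print(approx_size_gb(32, 4.5))   # 18.0
```

So for the 4090 anon: the 32B fits with partial offload headroom, the 236B does not.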
How brain-damaged is IQ3_M for 70b exactly? Getting desperate here, bros.
Why does Nemotron 70b keep stopping at random intervals? This is the case even with the default llama3 instruct template and neutralized samplers. But other than that it's pretty good. Feels different, kinda like Command R+.
>>102881080
I've gone down to IQ3_XS on Mistral Large. That was enough for writing chat, but for knowledge tasks I don't trust it.
For Llama 3 70B kinds of models, they seem sensible on non-obscure knowledge tasks at Q5 and Q6.
>>102881111
Skill issue, probably. I don't have this issue.
>>102876754
I wonder what those graphs will look like once the training is done. The loss and perplexity numbers have gone down since this image was posted, the tokens per second have gone up, and the inner LR has remained exactly the same, other than the curve being a little bit longer.
Best model for 16gb RAM + 1060 6GB for roleplay purposes?
Right now I'm using https://huggingface.co/bartowski/Mistral-Small-22B-ArliAI-RPMax-v1.1-GGUF at Q4_K_S, but I really want to max out this machine. Tried the same one at Q5_K_L and it was unusable. Thanks in advance.
>>102881184
Ministral (after it gets proper gguf support)
>>102881184
oh man anon i feel your pain but i think it's time to start thinking about upgrading your hardware a bit
did sillytavern devs kill themselves or something?
>his voice a mix of boredom and intrigue
NO YOU RANCID PIECE OF SHIT, THERE IS NO MIX OF INTRIGUE. THAT UNDERCUTS THE ENTIRE PREMISE YOU RETARDED MACHINE. WHO THE FUCK SAID CLAUDE WRITES WELL?
>>102881401
* ServiceTensor devs
>>102881488
Shit in - shit out, sweaty :)
>>102881357
Yeah, it sucks and I'm well aware, upgrading is just not in the cards right now.
>>102881357
>Upgrading
>In this economy
Who do you think he is, Mr. Moneybags?
>>102881493
* ServiceTesnor devs
>>102880698
shut up newfag
https://x.com/rohanpaul_ai/status/1847277918243754156
nvidia's nGPT
https://arxiv.org/abs/2410.01131
>>102881926
Cool, now since it's so efficient and cost-effective to train, let's see an 8B of it.
>still no Ministral 100B
It's so over.
>>102881401
yes, it's just ghosts merging contributions now
>>102880606
That would be stealing from hardworking artists like Taylor Swift
>>102882180
>no bitnet
>not a single application of novel techniques
>we're still using the same pure transformerslop since 2 years ago
>the only difference is that everything got filtered and benchmaxxed to hell and back
This whole field is an nvidia grift
Very excited for Intellect-1 to finish so the decentralized training meme can finally die. Still very confused what you shills think the benefits are, as if anyone capable of hosting this infrastructure is going to let you train "Most Horniest Chudded Out Based Hitler 70B" on their platform.
Shills? For what? The resulting model, if it gets made, isn't going to be sold to anyone. And it's certainly not going to be a 70B when 10B takes such a long time to train already. At most I imagine that /lmg/ would do a continued pretrain of 8B or something, and probably for not very many tokens.
>>102882493
It was at 11% ten hours ago and is at 11.80% right now. Assuming we get an extra 1% every 12 hours, that is 2% every day. That means the model will finish training in about 44 days. I wouldn't consider that too much time for a 10B model.
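Sanity-checking that extrapolation (the helper function is just for illustration, not any official tooling):

```python
# Back-of-the-envelope ETA from the progress figures quoted above:
# 11.00% -> 11.80% over ten hours, extrapolated linearly.
def eta_days(current_pct, observed_gain_pct, observed_hours):
    """Days until 100%, assuming the observed rate holds."""
    per_day = observed_gain_pct * (24 / observed_hours)
    return (100 - current_pct) / per_day

# 0.8% per 10h is ~1.92%/day, so ~46 days remain;
# the rounded 2%/day figure gives the ~44 quoted in the post.
print(round(eta_days(11.80, 0.80, 10)))  # 46
```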
>>102882180
There will be an opus-tier cohere model soon
>>102882530
I'm accounting for the compute /lmg/ specifically has, which I imagine does not include people with free access to H100s.
>>102882493
Continue imagining, retard, I hate idealistic faggots like you.
>Shills? For what?
PrimeIntellect is a cloud compute provider. The only way you can contribute to Intellect-1 is to rent an H100 from them.
>>102882565
>Continue imagining, retard, I hate idealistic faggots like you.
Did you not read what that guy wrote? They were clearly being pessimistic, you illiterate fuck.
>At most I imagine that /lmg/ would do a continued pretrain of 8B or something, and probably for not very many tokens.
>>102882565
For now, my best guess would be that they want to see how the first model trains on the decentralized network and see if anything breaks while they are doing it. Or they could release it before, or they could never release it, who knows? Point is, current indicators show that it will be a possibility in the future.
Now that the dust has settled, what went so terribly wrong?
nala test for the native 3b bitnet model. I mean.. it's about what you would expect for a 3B model. Except it's less than 1 GB.
>>102882633
They didn't consult the machine spirit properly, instead they just put a Ouija board on each server used to train it and called it a day.
>>102882591
It's ok anon, you didn't have to respond to that post for me. We all know it was nonsensical.
>>102882633
Nothing really. Their main goal was just to get good PR for continuing to release old research while the newer research is held back because of muh politics and muh stocks (which are the real issues behind the "muh safety" excuse that lies on the surface; none of these corpos would give a shit about safety if they could get away with it).
>>102876583
Retard-kun here.
What's the best model for me to play with if I want something to occasionally bounce ideas off of and help me edit writing, but also be able to do some steamy ERP?
I have a 3090/24GB VRAM.
Just name a few models and I'll start doing some research on how to run them. I only have experience with SD/image generation so far, so this will be new to me, but I wanna see what models you guys would use with a GPU as strong as mine, since I know there's a lot of poorfags/third worlders here.
>>102882434
>as if anyone capable of hosting this infrastructure is going to let you train "Most Horniest Chudded Out Based Hitler 70B" on their platform.
Isn't the whole idea that it's not centralised? I thought the biggest problem was faggots agreeing on a dataset.
Training will obviously only get faster. If a couple thousand coomers with a 3090 for a month is enough, we could easily do it. Even if a central spot with a website is needed, that's not even illegal. People who host much more compromising stuff exist right now.
It seems whenever decentralized training is discussed, a guy like this pops up. There are only benefits to this first test run. Isn't johannes also making training code for llama.cpp? How can you not be excited? Very weird.
>>102882838
Old command-r
>>102882838
>24GB
There are no good models for that. But if you really want to try something, you could start with Mistral Small with the Q8_0 quant. Use Kobold.cpp hooked up to SillyTavern. There is some setup you will have to do, and it will take time to learn as you go. Get some cards from /aicg/ and chub like this one and go to town with your steamy ERP.
https://characterhub.org/characters/boner/daisy-2c9fdbb8
>>102882633
Ever since llama1 it's been downhill for meta. Every one of the following models was worse than the previous one: smarter, but also much more cucked and less creative. Google and the chinks make better assistant models anyway. Imagine if we didn't have mistral, for example. It would look bleak with only meta.
I wonder if they'll finally release a model that supports image output now that there's Janus 1.3b pressure. It looks like shit, but that's better than cutting it out.
>>102882979
Mistral really did come out of left field back in the day and cause a big splash. The more competition there is, the better things will be. I am glad they exist.
>>102882540
Are you an insider or just speculating?
>>102882979
There is no pressure from that tiny shit model, so I don't think so. And it's less pressure and more justification/precedent that they're waiting for. They very much want to release these models but can't, just like how OpenAI can't really let 4o be totally unfiltered.
Anyway, it's fine, we have a range of models for different purposes. On one hand we have the (relatively) uncensored Mistral, then Llama is more censored, then Gemma (although it only goes up to 27B and only up to 8k context), and then Qwen. And even Qwen is not too bad with a JB, you just have to know how to prompt it, use samplers, the {{random}} function, etc.
>>102882540
You keep saying that. It keeps not happening.
ah weekend hours so the thread goes to dogshit
Who is 3b Ministral even marketed for? It would make sense for Largestral to be proprietary, but who is gonna pay for a 3b model when there are 3b llama and qwen?
>>102883136
The French work in mysterious ways.
>>102882540
No one believes this now after the recent slopped+retarded update to CR+. Cohere fell off
>>102882540
They're not getting Opus by training on the same Scale AI slop that OAI trains on
>>102883154
S-surely they have seen that their update was a sloppy job and they'll do better with the next model.
>>102883270
The worst part of it was Cohere's CEO bragging about how people liked their models because they used human data for training, and then completely flushing their only advantage by using GPTslop. I still can't believe it happened. What were they even thinking?
>>102883136
if I could run it on a shitty android phone I'd maybe use it for when I'm camping
>>102883270
>human data
I don't consider pinoys and nigerians humans
>>102883270
Yeah it's dumb as fuck. Corpos not realizing what customers actually liked about their product and ruining it out of ignorance is really common, but as you said, in this case Cohere actually DID know. But did it anyway.
>>102883299
>customers
>>102883317
>>customers
Yeah, they must have lost them with their shitty sloptune. Why use cohere when there are plenty of other options with long context?
>>102882838
nbeerbower/Stella-mistral-nemo-12B-v2
At q8, 16k context. Hook it up to SD and bust fat nuts. Reason - because I said so.
>>102882225
>merging st after death
that's just called hell, anon
im so tired of nemo
feels like i keep talking to the same characters over and over again
it doesn't follow prose either
dumb as hell too
considering divorce
>>102883964
ask it to write in a low-quality style in the last assistant prefix
What's the best local model for coherent erotica and worldbuilding that I can fit on a 3090 ti with only 24GB VRAM and 32GB RAM?
>>102884219
no
>>102884219
Gemmasutra-2b
(If you want something good, buy more RAM, it's cheap. Get 128GB(for 300 USD), you can then run Mistral-Large for SFW and Behemoth-123b for NSFW.)
>>102883154
CR+ was already slop compared to base CR.
>>102884291
Based gatekeeper.
Why are so many models tweaked over anime shit? Why don't you guys like normal smut?
>>102884291
>128GB(for 300 USD)
Try 1000 GBP, I've got trident 3600MHz RAM sticks I had to import because they didn't sell them here.
>>102884454
What kind of normal smut do you mean? Harlequin romance novels? Ao3 fics?
>>102883270
I know nothing, but maybe the engineers insisted on more data and therefore used even more synthetic slop?
>>102884459
>GBP
>British pound
Why the hell is ram not being sold to the British?
>>102884500
they dared contest germany's rule of europe
>>102883252
Unfortunately, benchmark numbers are easier to point to than style.
>>102884500
The gold trident 3600MHz model isn't (or wasn't, I haven't checked recently) sold here for some fucking reason.
>>102884507
Fool, Europe has always belonged to the Franks!
>>102884481
literotica
>>102884509
Benchmark numbers SUCKED THOUGH.
>>102884514
>gold trident
I'm going to assume you had no other choice, because this shit is horrendous to look at
>>102884459
Oh damn, I feel bad for you. I didn't know that Anglostan was doing so badly economically, besides being the most cucked country in Europe.
>>102884754
Get a whole rig like that and you'll get bling kino
>>102884754my RAM goes inside a plain steel case and I want zero of the price of it going into looks or especially RGB lighting
I'm getting higher t/s on IQ4_XS than on IQ3_M, while having more layers on the GPU with IQ3. What the fuck is going on here?
Aside from the layers, all the other settings are the same. Getting 1.4 t/s on IQ3 vs 1.8 t/s on IQ4_XS.
I am currently using a Mistral Large 3.0bpw exl2 quant for both RP and general use, fits in 48gb vram.
Anything better?
>>102884988
4-bit data has less overhead to unpack vs. 3-bit because you can efficiently pack two 4-bit values into a single 8-bit value. So even though the amount of data is larger, you need fewer memory accesses.
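A toy illustration of why that is: two 4-bit values sit exactly in one byte, so packing and unpacking is a single shift and mask, whereas 3-bit values straddle byte boundaries and need more fiddly bit twiddling per weight.

```python
# Two 4-bit "weights" per byte: unpacking is one shift + one mask.
def pack4(a, b):
    """Pack two 4-bit values (0..15) into a single byte."""
    return ((a & 0x0F) << 4) | (b & 0x0F)

def unpack4(byte):
    """Recover both 4-bit values from one byte."""
    return (byte >> 4) & 0x0F, byte & 0x0F

assert unpack4(pack4(13, 5)) == (13, 5)  # round-trips exactly
```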
what are some good uncensored models under 13B that are good for erp? is mistral nemo good or are there any better models?
>>102885120
>under 13b
>good
lmao
>looking through old datasets for something to use as a framework for a synthetic single-turn Q and A dataset
>open up a random json from the unpacked leaked undislop dataset
>notice something peculiar (picrel)
>the slop is coming from the human side of the conversation
There are probably text renderings of some of these proxy logs in the pile, but try-hard redditors who put on their sunday best to ERP with a fucking bot are the ones who put the slop in there.
>>102885171
>who put on their sunday best
but it's saturday
>>102884754
I like the tacky 90s gold aesthetic as opposed to RGB lighting alone. If I could make my entire computer look like gold plastic 90s shit I would. The rest of my computer is just all black.
>I'm going to assume you had no other choice
Mostly. That c16, 3600MHz, 2x16GB kit was, at the time, one of the few RAM options available. The tacky gold look was only £10-40 extra.
>>102885171
Ever considered this is just someone using impersonate to have the llm write for him?
>>102885197
>someone finds evidence that reddit ruined the internet
>immediately jumps into the fray to play devil's advocate
Gee, I wonder where this guy came from.
>>102885171
Did you just realize this? The training data for the foundation models is all slop too.
>>102885211
Well, I always knew it came from human writing, I just assumed it was all from novels, not a bunch of reddit gooners LARPing as Charles Dickens for their waifu.
>>102885210
Why are you baiting by pretending to be retarded?
>>102885171
anon, that's 100% llm-generated on both sides
The timeline adds up too. The initial commit for the OAI key proxy is December 2022.
All the most un-de-sloppable models have knowledge cutoffs a few months beyond that (affording time for the logs to end up on the internet). Key proxy locusts unironically ruined llms for everyone.
boring schizo larp.
Are there actually any RP models that don't have characters be cumdumpsters by default? I feel like whenever I do something like walk up to a girl and slap her ass, the result is that she gets mad, then there's a line break and it goes DESPITE HER ANGER A THRILL RAN THROUGH HER yadda yadda. It never simply ends with the character being mad, as it should. Hell, I could probably whip out my dick and cum all over her face without any warning and it would end the reply with something like "Despite the disgust and humiliation, a part of her felt excitement at the taboo nature of such an act"
>>102876754
Very cool. We are a couple of improvements away from being able to do this with 3090s. The main problem is efficiently updating the weights across a thousand GPUs in a short amount of time given a mediocre internet connection.
>>102885355
largestral
mistral small maybe
But sometimes you've got to just delete the offending line, write a reply you'd expect so the model can continue it, or modify the card
>>102885355
write your character cards better, if the model has no info it is probably just going to try to guess whatever you want the result to be
>>102885396
Yeah, if I rewrite the character's reaction it usually sets a precedent for how she should act in the future, so there's that at least. Maybe I should upgrade my setup for better largestral speeds, right now it's too slow to really bother with.
>>102885479
Fair enough, but it also feels silly that I have to write that the character doesn't like it if a stranger cums on her face. Though I suppose I should try to make that happen by describing her overall personality in better detail. I'll try to improve my cards, obvious but good suggestion.
>>102885519
it shows a wider weakness of LLMs, these models always try to please the user, i.e. it's really hard to get an LLM to give actual criticism because it will always say your retarded ideas are amazing, or at worst, interesting.
>>102885519
It's all about the character description.
I remember Nemo once had a girl shove a guy to the ground and kick him for groping her. Scenarios like that don't always end the way you expect them to.
>>102885538
Why is this?
>>102885606
well that's above my pay grade, training stuff
>>102884733
That's why there is an incentive to destroy the model with slop.
>>102885355
No, all the datasets that sloptuners use are full of smut. "RP" models are actually just smut models. Use official instruct tunes.
Does everyone here literally have 128GB RAM?
>>102885683
I only have 96GB VRAM
>>102885683
I have 64GB of RAM + 8GB of VRAM.
I do need to overclock my RAM.
>>102885683
256GB RAM, 96GB VRAM
Any work going into Plamo-100b? Morbid curiosity prevails; I want to see what translation ability it has outside of the limited demo site.
>>102885355
>>102886014
Kobold kiddies are prob all cross-eyed with a layout like this.
>>102886014
Yep, Rocinante is better
>>102885840
The anticipation for Plamo is leaving me utterly electrified as well, the prospect of a translation model sending shivers down my spine.
>>102885683
64GB RAM + 24GB VRAM. A comfy setup from before I used LLMs.
>>102885683
64GB RAM and 64GB VRAM, so technically yeah
>>102885683
32GB DDR4 RAM and 8GB VRAM here
>>102886363
So what model do you use?
>>102886389
Probably nemo, right?
>>102886389
Rocinante 12b v2g q4_k_m right now, but I change up frequently
>>102886411
Is it good?
Fimbulvetr-10.7B-v1-Q8_0
mixtral-8x7b-instruct-v0.1.Q5_0
mythomax-l2-13b.Q5_0
Toppy-M-7B.q8_0
These are all the models I've run so far on my 3090ti 24GB VRAM (and 32GB RAM). Are there better models for coherent erotica and worldbuilding?
>>102886439
base miqu
>>102886439
Rocinante-12B-v1.1
>>102886417
I'm enjoying it, had some fun RPs. I'm not as demanding as a lot of others here though.
>>102886439
I'll also vote for >>102886452 although I haven't tried >>102886411
In your place, actually, I'd probably try >>102886451 or some mistral-small fine tune.
When is Arthur going to release the fp16 Miqu weights?
>model-00001-of-00005.safetensors
Wait, do I have to merge these files now? What happened to single ggufs?
>>102886485
look up (model) gguf instead
>>102886485
bruh
>>102886485
hello sir
I understand you are having some trouble with the models
>>102886485
>safetensors
>gguf
Those are different things.
>>102886485
sir...
>>102886485
ggufs haven't been a thing since llama.cpp died
>>102886529 >>102886526 >>102886525 >>102886514 >>102886513 >>102886507
Motherfuckers, I've been gone for a while. I still have koboldcpp.exe
>>102886540
Koboldcpp still works, and it never ran .safetensors as far as I'm aware.
>>102886540
Sir, nobody uses koboldcpp or llamacpp or any other meme backend anymore
we are all sitting on the servicetesnor foss backend
>>102886557
>not running a custom backend made entirely in RPGMaker MV
NGMI
>>102886540
>not using bitnet.cpp by now
You're not gonna make it.
>>102886540
.exe
>>102886540
kobold died with llama.cpp
>>102886585
the best model for that is a 3B, and not a current-gen 3B-tier model either.
>>102886540
go back to its model card, click pic related to see gguf quants
also grab this
https://github.com/LostRuins/koboldcpp/releases/tag/v1.76
>>102886584 >>102886585 >>102886588 >>102886597
So what's the thing to run a model these days?
>>102886613
vllm
>>102886610
nta but thanks.
>>102886585
Totally lossless quality btw
>>102886658
>>>>>>3B
>>102886613
llama.cpp
>>102886677
But they just said...
>>102885355
Seems like just adding "despite" to banned strings works to counteract a solid amount of that, but obviously banning this word across the board might cause problems elsewhere. Haven't noticed any yet in my RPs though.
>>102886687 >>102886677
>bu-bu-bu-bu
I'm still running my models by banging two rocks together
>>102886014
Thanks anon, rocinante is actually good, the best model I've tried in a while. I've tried a lot of shit, even the 40Bs, but this one actually feels like it's responding to instructions properly.
Would encourage anyone to get the Q8 or full precision if you have the vram.
Any decent ERP models these days for 24GB VRAM bros?
Or is it still that Cydonia/Mistral Small or Rocinante (or whatever the fuck)?
I like checking in every week or so to see if something's popped up
Where are the layerskip implementations for existing models? I need layerskip nemo immediately.
>>102886810
wtf is layerskip
>>102886810
Get implementing it!
>>102886820
When the model skips layers
>>102886807
>I like checking in every week or so to see if something's popped up
You and about 20 other casuals who drop in weekly to ask to get spoonfed about this exact configuration.
>>102886834
yes, I don't really care to converse with schizos as a daily thing. It's the weekend and I'm in the mood to coom
>>102886834
>most common consumer setup
>keeps asking to be spoonfed sota model
hm
it's almost like someone out there wants to keep tabs on the competition
>>102886790
rocinante
How do I run chatgpt for sexy on rtx2060? (keep in mind, strong rtx and not weak gtx gpu)
>>102886790
>Would encourage anyone to get the Q8 or full precision if you have the vram.
Why? Q8 vs Q4 is the same in actual use
>>102886105 >>102886411 >>102886452 >>102886790 >>102886849
>>102886790 >>102886849
what's so good about this model btw? Why is it better than mistral small or just cydonia? It struck me as your typical Nemo finetune from Drummer but less smart than Cydonia (the same overly NSFW horny model). Genuinely asking as I wanna know if my brief experience was just shitty cards/prompts, but I see it mentioned everywhere.
>>102883280
I don't consider anglos and kikes human either, but models have to tend to be neutral, with enough data to speak as every possible human, even the retarded ones.
Which distro is best for local models Just Werking? Fed up with everything breaking when I update.
I feel bad for localkeks
>>102886514
I want to speak to your manager.
>>102887071
not them, but I didn't like the tune; seems too unhinged, like the data it's trained on is all over the place rather than focused. try lyra4 gutenberg
>>102887172
>berg
>>102887071
i didn't really like the original rocinante
rocinante 12b v2g (aka UnslopNemo-12B-v3) is great though.
the model doesn't aggressively try to fuck you, doesn't shiver often, and follows along with the story pretty well.
>>102886879
I know you're responding to a shitpost, but still no. In my actual use, asking Mixtral 8x7b Instruct to write a short story, adding the sentence "Use vivid and descriptive language" to the instructions dramatically and consistently changed the way it wrote at Q8. (I'm not saying it was better or worse, I'm saying it was very obviously different.) It inconsistently changed the way it wrote at Q5, and at Q4 it was hard to distinguish from placebo. If I bought the bullshit about low quants being the same because of perplexity graphs, I'd have misconceptions about what kinds of instructions had an effect.
>>102887129
no need, everyone here is using claude for actual usage anyway
>model bad
>limit output to 100 tokens
>model good
hm
>>102887278
>generate 1 token at a time and reprocess prompt after each token is generated
>agi achieved
>>102887271
The cutoff point is Q6
>>102887291
AGI was already achieved internally.
>>102887278
models like to try to tie things up, that's why you end up with 'as the days passed' and stuff at the end of messages, like a conclusion. the patrician way is to let it write for 300 tokens and then trim the message to where it starts to talk like that. after you get a bunch of messages like that into the context, it'll start to write more like them anyway and leave the ending of a message more open ended
>>102887278
There was a time I felt dropping the last sentence, two sentences, or even paragraph of a reply was generally an improvement. Setting a token limit shorter than the typical reply and selecting "trim incomplete sentences" worked for me back then.
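A rough sketch of what "trim incomplete sentences" amounts to (frontends have their own implementations; this is just the idea, in plain Python):

```python
import re

def trim_incomplete(text: str) -> str:
    """Cut the reply at the last sentence-ending punctuation mark so a
    hard token limit doesn't leave a half-finished sentence behind."""
    # Match ., !, or ?, optionally followed by a closing quote/bracket.
    ends = list(re.finditer(r'[.!?]["\')\]]?', text))
    if not ends:
        return text  # no complete sentence found, leave as-is
    return text[:ends[-1].end()]

assert trim_incomplete('She frowned. As the days pas') == 'She frowned.'
```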
Moore's law for vram when?
It's ridiculous that cards are held back by vram when those vram chips are cheap as shit.
>>102887337
Not as long as Jensen has his monopoly and his cousin Lisa is helping him keep it.
>>102887336
Yeah, that's what I'm doing now, happy with the results. Is there a way to bias the (end of sentence) token somehow?
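For what it's worth, "biasing" a token mechanically just means adding a constant to its logit before sampling; a negative bias on the end-of-sequence token makes the model less eager to stop. Many backends expose some form of logit bias, but the EOS token id below is a made-up example, since the real id varies by tokenizer. A minimal sketch:

```python
def apply_logit_bias(logits: list[float], biases: dict[int, float]) -> list[float]:
    """Return a copy of the logits with per-token-id biases added."""
    out = list(logits)
    for token_id, bias in biases.items():
        out[token_id] += bias
    return out

EOS_ID = 2  # hypothetical: the actual id depends on the model's tokenizer
logits = [0.1, 3.2, 5.0, 1.4]
biased = apply_logit_bias(logits, {EOS_ID: -4.0})
assert biased[EOS_ID] == 1.0   # 5.0 - 4.0: now much less likely to be sampled
assert logits[EOS_ID] == 5.0   # original list untouched
```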
>>102887312
Anon, that's wrong. q5_k_s is the smartest quant.
>>102887365
Maybe the mainlanders will destroy them
>>102883280
Get out.
>>102887427
He's right though. Look what Nigerians have done to GPT models. Animals, all of them.
>>102883270
They needed to put something out to stay relevant after Elon insisted that they couldn't mention their involvement with Column-R, which he bought off them and released as grok-2.
no one wants to answer my question :(
>>102885034
>playing around with the settings in kobold
>randomly decide to crank the max output up to 512
>suddenly nemo starts to spit out claude level gems
wtf is this sorcery?
>>102887541
The answer is not really, unless you want a slutty whore for a language model
>>102887541
Have you tried CR+?
>>102885034 >>102887541
I dare say Nemotron 70B is better.
>>102887541
For both RP and general use, no.
>>102887541
What speeds do you get?
>>102887365
I'm hoping for companies like Groq to tear those chinkoids a new one.
>>102887574
--Nemotron 70B: Unique prose, fun, but dumber than Largestral with logical errors:
>>102865433 >>102865448 >>102865676 >>102866355
>>102887586
I've been using it for the past week and I still haven't encountered a single instance where it made a logical error; I think that anon is using meme sampler settings and blaming the model. But even if that was the case, Nemotron 70B has such a deep understanding of RP that it's definitely worth checking out anyway.
>>102887632
Still, it's lacking a lot in general knowledge
>>102887579
like 8 it/s, pretty fast
>>102885034
bigger quant?
>>102887632
I tried it briefly for some text adventure type shit, but it kept trying to add headers and a bunch of asterisks to its responses. This happened even when continuing long sessions from other models (Mistral Large). Was just using temp between .5 and 1 with min p between .01 and .03. Have you had any formatting problems?
Genuinely, do you think LLMs, or transformers, can lead to AGI?
>>102887842
no
>>102887800
NTA, but I've seen similar. Continuing an ERP from another model, Nemotron will start responses with things like "**Explicit Content Warning**". Not always, but often enough to be annoying. It will also frequently want to end responses with an ellipsis, like a mini-cliffhanger. All the preference RLHF seems to have heavily biased it towards certain types of formatting.
>>102887842
>can lead to AGI
Yes. Eventually some company will put their 50k servers to work throwing random algorithms at the wall and something will stick.
>>102887884
Haven't seen that, but I do have a system prompt telling it to always remain in character.
>>102887842
Nah.
The current implementations are based on the hope that statistical correlation will create reasoning as an emergent property, and that this will lead to super reasoning and superhuman iteration which will eventually achieve superhuman capabilities.
If AGI really is possible at all, there will probably be a module/block/something oriented towards actual thinking as a primary feature.
LLMs might be a component in the architecture/system, but we will need something new that's not simply language based. Hell, maybe even the idea of tokenizing shit will go out the window, who knows.
What will that look like? I have no idea, otherwise I'd be a billionaire, lol.
>>102887842
Since many people disagree on the definition of AGI, you must first define which one you are asking about.
>>102887800
Yes, the model definitely has formatting issues. It seems to always write with asterisks even when the past messages weren't like that. I also noticed that the model likes to use ellipses a lot.
>>102887884
I never got this "explicit content warning" even when playing loli scenarios, you probably have something weird in your system prompt.
>>102887800
>it kept trying to add headers and add a bunch of asterisks to its responses.
arenamaxxed to the very core
>>102887842
I think it will cause your agility stat to decrease from sitting at your computer too much
>>102887924
I don't think you need to; we are both conscious humans and know what that means without being able to define it. "We'll know it when we see it"
>>102887949
A conscious human often ignores what he sees, definitions are necessary for stuff like this.
>>102887949
A man who does not know what he means is not speaking as a conscious human. The animal mind feels. The rational mind reasons.
>>102883280
Well then actually don't use my tunes, if ever.
>>102888187
>The animal mind feels. The rational mind reasons.
Farts are meant to be huffed.
Anyone have Emily's gallery before it got deleted?
Which is best?
nemo
405B
mixtral large
midnight miku 103B
<other big boi>
>>102888658
StableLM-7B
>>102888658
Starling-7B
>>102888694
>>102888694
>>102888694
>>102888658
nemo or mixtral large
>>102877046
hi. i am back. have not been around for a few months on this board.
>>102889114
welcome back, we missed you
>>102876754
>/1T tokens
lol, lmao