/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101567223 & >>101560013

►News
>(07/24) Mistral Large 2 123B released: https://hf.co/mistralai/Mistral-Large-Instruct-2407
>(07/23) Llama 3.1 officially released: https://ai.meta.com/blog/meta-llama-3-1/
>(07/22) llamanon leaks 405B base model: https://files.catbox.moe/d88djr.torrent >>101516633
>(07/18) Improved DeepSeek-V2-Chat 236B: https://hf.co/deepseek-ai/DeepSeek-V2-Chat-0628
>(07/18) Mistral NeMo 12B base & instruct with 128k context: https://mistral.ai/news/mistral-nemo/

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
►Recent Highlights from the Previous Thread: >>101567223

--Mistral Large 2 performance and open source models: >>101568726 >>101568758 >>101568821 >>101568864 >>101568759 >>101568762 >>101568793
--Groq Inc. tweet compares Llama 3.1 70B, GPT-4o, and GPT-4o Mini in Street Fighter gameplay: >>101569802 >>101569864 >>101569849
--Multimodal AI capabilities and expectations: >>101568555 >>101568568 >>101568570
--Anon releases mpt-30b-chat q8 GGUF quant with faster inference: >>101567424 >>101567470 >>101567503 >>101567560 >>101567596 >>101567615 >>101567628 >>101567690
--VRAM requirements, cpumaxxing, and GPU acquisition strategies: >>101567467 >>101567667 >>101567716 >>101567788 >>101567818 >>101567891 >>101568000 >>101568043 >>101567882 >>101567905 >>101567921
--Sam Altman's opinion piece on AI's future and OpenAI's challenges: >>101570031 >>101570067
--Q3_K vs Q3_K_L: >>101568052 >>101568076 >>101568118
--Legal consequences of AI-generated CP and privacy concerns with OpenAI: >>101568949 >>101568977 >>101569002 >>101569034 >>101569056 >>101569217 >>101569239 >>101569529
--Gemma 2 9b and model reviews and recommendations: >>101569685 >>101569762 >>101569781 >>101569794 >>101569757 >>101569786
--GPUs and RAM for Mistral Large: >>101569885 >>101569896
--Factors influencing VRAM amounts on consumer GPUs: >>101568585 >>101568660 >>101568718 >>101568704 >>101569337 >>101569360 >>101569469 >>101569518 >>101570192 >>101569599
--Best NSFW model for 12GB VRAM: >>101568547 >>101568552 >>101568564
--Affording super AI cards: >>101568355 >>101568376 >>101568377 >>101568407 >>101568426 >>101568575 >>101568399
--Mistral NeMo 12B sampler settings and instruction following: >>101570059
--Mistral Large preset: >>101567703
--Anon shares a potential fix for Nemo repetition issues: >>101568590
--Miku (free space): >>101569616 >>101570324 >>101571277

►Recent Highlight Posts from the Previous Thread: >>101567235
waiting for cohere
waiting for agi
using largestral
>no major model drop today
It's so over
using kobold and getting my nuts slobbered by waifus (i have numerous)
>>101571373
>literally each entry is both somehow bloated with irrelevant replies and missing replies in a reply chain
Grim.
>>101571408
They didn't release it today to one-up Meta and Mistral. What if it's not as good as we hoped?
>>101571430
surely that means we'll get two tomorrow
Can mistral large be merged with COPY? I downloaded q5 and merged it with llamacpp's gguf-split properly and it worked and all that, but then i wanted to try q4 to compare the speeds, and for some reason it won't merge properly. Tried 5 times; the output file comes out smaller than the parts and won't launch.
>inb4 why merge
I wanted to use it in kcpp for convenience.
>>101571441
you can use split files in koboldcpp
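For reference, a minimal sketch of the usual merge path, assuming the shards were made with llama.cpp's gguf-split tool (the filenames here are hypothetical, and depending on build the binary may be named gguf-split instead). Shards produced by gguf-split are standalone GGUF files with their own headers, so a raw byte concatenation like Windows COPY /B will not produce a loadable model; use the tool's merge mode:

```bash
# Merge gguf-split shards back into a single file.
# Point it at the FIRST shard; the tool locates the rest itself.
./llama-gguf-split --merge \
    Mistral-Large-Q4_K_M-00001-of-00003.gguf \
    Mistral-Large-Q4_K_M.gguf

# Alternatively, recent llama.cpp/koboldcpp builds can load the first
# shard directly, no merge needed, if all shards share one directory.
```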
>>101571440
Have they ever released on a Friday?
>>101571455
Groq was released on a Friday I think.
>constant instability during training
Guess I'll update... Wish me luck...
>>101571494
I thought we were talking about cohere
>>101571508
Oh well, I thought you were talking in general
Yeah, that's it for this week but there'll be three big releases next week. One of them will be by a very surprising source.
>>101571531
Applebros, we are going to be so very back!!
Largestral... 0.4t/s... Comfy...
>>101571531
that's bullshit but i believe you anyway
>>101571504
NOOOOOOOOOOOOO
>>101571531
amazon...
>>101571531
I don't think that's bullshit, but I'm not believing it.
does nvlink speed up inference when using tensor parallelism? or is there still not much data being transferred between cards?
I just got xtts up and running. Are there any archives or repositories for voice samples, like Chub?
>>101571531
I don't think I would be surprised by a PornHub LLM.
>>101571658
NVLink should help quite a lot with tensor parallelism. I have never built a system with it myself but I've received a user report saying it makes a large difference.
>>101571531
NovelAI...
What speed are people getting with 4x3090 and Mistral Large?
You'll be able to film movie skits that look real using video
>>101571698
Why not make your own? You need just a few seconds/minutes, right?
>>101571747
I mean yeah, but I'd love to just have a convenient library of any character imaginable like Chub does.
Does anyone have an estimate of when Llama will be made available with vision features? I estimate in half a year maybe?
>>101571704
How large we talkin'?
>>101571791
Are you in a hurry? Let's say by Nov 21st... yeah... that sounds right...
>>101571704
thanks for the input. i have an a6000 and two 3090s and am considering replacing one of them with another a6000 since i can get it locally for cheap-ish. figure if there's any possibility of it speeding up inference for models that fit within the two a6000s, i may as well pick up a bridge for another $200.
>>101571738
with 1x a6000 + 2x 3090 for the same amount of VRAM, on mistral-large-instruct-2407 5bpw on exllama2 i get:
>Metrics: 264 tokens generated in 37.01 seconds (Queue: 0.0 s, Process: 19 cached tokens and 4597 new tokens at 543.18 T/s, Generate: 9.25 T/s, Context: 4616 tokens)
>Metrics: 242 tokens generated in 95.18 seconds (Queue: 0.0 s, Process: 1178 cached tokens and 30830 new tokens at 508.06 T/s, Generate: 7.02 T/s, Context: 32008 tokens)
have not tried other formats with this model yet. i imagine 4x 3090 would be similar-ish in speed, since my understanding is that while the a6000 is a little slower (lower memory clocks/bandwidth), splitting layers across a fourth 3090 might introduce more overhead.
>>101571822
I don't remember the specific numbers (and it was months ago anyways) but (for llama.cpp with --split-mode row) it was basically the difference between effectively unusable and faster than a single GPU.
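To make that comparison concrete, a hedged llama.cpp sketch (binary name per recent builds; the model file is a placeholder). --split-mode layer keeps whole layers on each GPU with little inter-GPU traffic, while --split-mode row shards each tensor's rows across GPUs, which is the case where NVLink bandwidth reportedly matters:

```bash
# Default split: whole layers per GPU; PCIe is rarely the bottleneck.
./llama-cli -m mistral-large-q5_k_m.gguf -ngl 99 --split-mode layer -p "test"

# Tensor parallelism: every matmul is sharded across GPUs, so partial
# results are exchanged each layer; NVLink vs PCIe matters a lot here.
./llama-cli -m mistral-large-q5_k_m.gguf -ngl 99 --split-mode row -p "test"
```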
>>101571831
I wanted to buy extra RAM just for the new llama models but now that they don't have my favorite feature I'm shelving the plan.
3.1 70B seems retarded. Like way dumber than 3.0 at the same quant. I'm using exl2 so it's not a llamacpp issue. But maybe exl2 inference is broken as well? dunno.
If not, this model is shit and a big step down in intelligence from the previous version, regardless of what benchmarks say. Thank god for Mistral I guess.
>>101571884
>If not this model is shit and a big step down in intelligence from the previous version
That's simply not true, you just have false expectations.
>>101571878
May as well get the ram anyway. Even if it's not that useful now, it will be in the future.
>>101562692
Does that really work with ST? I'm trying to get it connected but it's just not connecting.
largestral has that llama1 vibe
>>101571911
In the future it will be obsolete. Imagine stacking up P40s last year and now they're relics. Better off saving up and purchasing whatever the cheap option is when you need it.
So? How slop is it?
>>101571951
Explain every part of this meme
>>101571951
>A new general purpose instruction dataset by kalomaze was added
>>101571950
We're talking about ram, not gpus. If you need new gpus in the future, whatever you spent on ram would be negligible.
>>101571944
is this a good thing? it doesn't sound like a good thing...
>>101571951
Surprisingly good, for a small model of course. It's still 12b so don't expect any fireworks here.
i have an i7 4790k and an rtx 3060, what should i upgrade to run bigger llms? i am sick of running small models
Are all finetunes the same?
>>101571951It's alright. It's extremely fast while still being fairly coherentUnlike Gemma 2 it's not broken, but that might just be the quant I downloaded
>>101572108>Gemma 2 it's not brokenNot broken gemma 2 when?
>fell for the 64GB of VRAM meme>can't run largestral without lobotomizing it in perplexity or throughput
>>101572125time for another 3090
>>101572129two more 3090s
>>101572101saochads how did we lose to fucking drummer
TM3090s.
>>101572028
Same thing applies. No sense in stocking up on DDR4 now when DDR5 is already out and getting faster.
>>101571531
sad larp but I'm still ODing on that hopium
>>101572163
I mean DDR5 isn't backwards compatible, so if DDR4 is what your motherboard takes then that's what you have to buy, there's no choice involved
>>101572061
What's better about it over regular nemo-instruct?
>>101572134
nine more can't hurt
>>101572171
Forever locking yourself to sub-1 t/s generation speeds. See why this is a bad idea? You buy DDR4 now, it will always have the same speed and lose value on top of it. Or you can save your money and purchase DDR5 next year for the same amount. Even if you need a new mobo you come out ahead.
>>101572163
I feel it's the same mentality as the people 2 days after the 4090 released asking if they should buy it or wait for the 5090. If you're still on ddr4, spend that money upgrading to something with ddr5. Or wait until motherboards with ddr6... and then you may as well wait for ddr7 or whatever... if you want to spend only a reasonable amount of money, upgrading ram is an easy choice.
My point is that you should upgrade if you can, in any way you can. If you have enough for a ddr5 setup, do it. If not, you probably won't do it either in 3-4 months. By then everyone will be waiting for ddr6 or whatever new shiny thing is in the making.
>>101572171
Again: if you need another CPU+mobo for ddr5, whatever you spend on a few sticks of ram will be negligible. You can sell the old pc whole.
>>101572204
It's roughly as smart as Mistral's instruct, but it's smuttier and more Claude-like, more creative. That's an achievement because models often get dumber when smut-tuned. This one didn't.
>>101572252
Obviously this only applies if you're stocking up on server ECC ddr5 ram and not the consumer stuff. You're never going to run a >70B model on your 256GB ryzen build at acceptable speeds even with ddr5.
>>101572299
>256GB ryzen
Can I really use 4 sticks of ram? I heard there were issues with it. I have 2x48 at 6000 now.
>>101572101
>no bobby sinclair
garbage
>>101572274
I feel like you missed the part where the feature anon wants isn't out yet. It's one thing to buy and use now, versus buying now and sitting on it for a year, versus just buying then.
the 70b llama 3.1 is really good. I am using a 4.5bpw quant and it is the best local model I have used. As a sillytavern chatbot it is even outperforming the miqu mixes.
>>101571959
[left]: A feline-like creature known as petra (/lmg/'s mascot) confidently strides in front of Alan Turing
huh, nemo is the first model next to CAI's that I've seen use OOC notes. neat.
>>101572367
bullshit
it's shit, way dumber than 3.0
>>101572386
Yeah, why can't they release an 8x12b nemo? Then I could finally replace wizard 8x22b.
>>101572337
Then what difference does it make *when* it will release? What matters then is *what* it needs to be usable. If the point of saving money is to buy an extra gpu later, buy the gpu later. If the point of saving money is to upgrade to cpumaxx, cpumaxx later. I guess i'm just dumbfounded by poorly worded, round-about questions.
>What do you guys think the requirements for llama-vision are going to be?
is a better question. But even then, I'd object to the usefulness of the question when nobody can know, barring some leak or whatever.
>>101572386
It responds really well to OOC: instructions too.
But yeah, first local model to do it unprompted to me; it complained that my replies weren't detailed enough.
>>101572448
>it complained that my replies weren't detailed enough.
sovl...
>>101571586
The first LLM trained exclusively on fake pajeet product reviews.
>>101571842
I think I was getting slightly below that on 4x3090. So yeah, it's a pretty good comparison.
Wow Erebus sucks ass
>>101572502
why are you using erebus in the year of our lord 2024
>>101571531
Looking forward to seeing the first open-weight LLM release in years from OpenAI.
>123B
>405B
How am I supposed to use these huge models (locally) in a cost-effective way?
>>101571531
come on, leakers are more anonymous here, you can be a little more detailed than jimmyapples/flowersfromthefuture
Nemo 12B is unironically smarter than L3 70B 3.1
>>101572516
>Cost effective
You don't.
>>101572510
I googled sexo models and it was the first result
>>101572502
It's hard to believe how far we've come since I tried Erebus back in 2022 after c.ai got lobotomized for the first time. I wrote off local models as an option at the time.
>>101572536
This is some bullshit. How much for a server with 6 TB of RAM? GPUs are clearly out of the question.
>>101572516
>cost-effective way
>>101572528
Yeah, but the old llama 3 70B is smarter than nemotron. So meta fucked something up there. Also 3.1 8B has a better conceptual understanding than Nemo. I'd still more likely use Nemo for rp though because 8b is slopped to hell.
>>101572557
who is that? it's jensen's clothes but the face is clearly a different guy
>>101572560
Maybe the distillation process is flawed?
>>101572557
I feel like the more I buy, the more cost-effective it becomes.
>>101572552
I did some testing of q4xs 405B on my home rig, which has 8x DDR4 2666, and it took nearly an hour to generate a 512 token reply. Like I said: you don't. Either rent cloud for it, or someone probably has some kind of proxy setup that exploits the hf chat endpoint.
>>101572560
>Yeah but the old llama 3 70B is smarter than nemotron. So meta fucked something up there.
I've been saying this but this thread is being belligerent about it for some reason.
405B is good but the smaller 3.1 models have something badly wrong with them, they're all significantly dumber than the previous version.
>>101572589
>8x DDR4 2666
What kind of speed is that? I think the CPU guide has DDR5 at about 800 GB/s or something.
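Nobody answered this in-thread, so a back-of-the-envelope sketch (the DIMM speeds and model size below are assumptions): theoretical bandwidth is channels × MT/s × 8 bytes per transfer, and a loose upper bound on generation speed is bandwidth divided by the bytes read per token:

```bash
# 8-channel DDR4-2666: 8 * 2666 MT/s * 8 bytes ~ 170 GB/s theoretical.
echo $(( 8 * 2666 * 8 / 1000 ))    # -> 170 (GB/s)

# 12-channel DDR5-4800 (Epyc Genoa class): ~460 GB/s, not 800.
echo $(( 12 * 4800 * 8 / 1000 ))   # -> 460 (GB/s)

# Upper-bound t/s ~ bandwidth / model bytes: 405B at ~4.5 bpw is ~230 GB,
# so 170/230 ~ 0.7 t/s best case; 0.14 t/s real-world is believable.
```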
>>101571531
you can say this for every week this year, between all the actual releases, copypaste chinese followups, placebo finetunes/merges, and cope papers with halfbaked githubs attached that faggots get excited about for one day
>>>(MIKU SYSTEM NOTE: THIS RECENT HIGHLIGHT THREAD SUMMARY WILL BE IN ALL CAPS.)
>>101572591
8B is definitely smarter but more slopped than the old one. But it doesn't respond well to additional fine-tuning. 70B is just lol.
>>101572641
Here's Teknium (the guy who makes the hermes models/dataset) saying 8B 3.1 is doing worse for him on every measure. He has other posts about it too:
https://twitter.com/Teknium1/status/1816514230595784969
I apologize for not searching well enough, but do you guys know what's the best "chinese" model around 7-10b?
I'm trying to learn it and it would be really helpful to have a few cards to help me out with it (coherence, pinyin, meaning, etc).
I know about Qwen2, Deepseek v2 lite and this finetune:
https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat
But I'm not capable enough to know which one to pick or if there are better alternatives.
>>101572733
Try Yi and GLM.
Anyone know how to get xtts streaming working in ST? It appears as if ST waits for the entire gen to finish, then sends that off to the xtts server, and then the server sends over the result whole. When I select the streaming option in ST, I just get silence. And when I use the streaming flag in xtts, it gives me an error saying something about invalid sample rate.
>>101572634
>THREAD SUMMARY WILL BE IN ALL CAPS
And in Russian.
>>101572746
I suppose that asking for a definitive answer is a pretty tough ask given how niche my question is + how muddy the meaning of "best" is in this context.
Well, I'll give all of them a try then. Thank you for the suggestion, anon.
llama 3.1 70b felt smarter than base llama3 for me but also infinitely worse when it comes to SHIVERS and the like.
https://x.com/InfernoOmni/status/1816492686087508174
>guys this is actually INSANE. a former employee of a multi-billion dollar company, Runway, confirmed that they mass downloaded YouTube videos in order to feed their AI. there's a spreadsheet with NOTES showing HOW they swiped videos. Nintendo was on the list.
whats going to happen to them bros??
>>101572829
Time to mysteriously disappear from the face of the earth.
>>101572829
Nothing. We knew about this years ago. Google were doing it all along.
>>101572844
I wish I was a sociopath CEO so I had the balls to do stuff like this myself
>>101572589
>an hour to generate a 512 token reply
>0.14 token/sec
That might be usable depending on the quality and what I need from it. How much did that system cost?
>>101572847
>Google mass downloaded YouTube videos in order to feed their AI
Oh, they're in for it now.
>>101572829
WTF this is bullshit, none of those youtube authors consented to having their content watched and learned from without even paying them for it
lawsuit time!!
>>101572829
Despite all the confident talk there STILL hasn't actually been a legal test case to determine and set precedent on the question of whether showing copyrighted media to a transformer or diffusion ML model counts as copyright infringement.
Both sides of the issue talk about it as if the question's been settled in their favour, but they both know it hasn't been. That's just what you do when a legal question is still open: pretend it's closed in order to appear confident.
>>101572875
The videos literally belong to them.
>>101572876
I consented. I have a youtube channel with two videos from 10 years ago with 300 views.
>>101572876
Now that I think about it, is using someone's youtube videos actually disallowed in any way? Google says they're copyrighted.
>>101572890
>upload video to youtube (for free)
>youtube creates subtitles for you
>download subtitles and use that for your dataset
The perfect crime.
>>101572885
I mean, copyright probably shouldn't exist. Japan allowed it, I think. In a civilized world it would be allowed.
>>101572904
I'm sure if google is using them, then somewhere in the terms of service they agreed to is an agreement that google can use the videos any way they want.
>>101572890
Not my videos. I put a disclaimer in the description of all my videos that I do not consent to giving ownership to YouTube. They are merely a hosting provider.
>>101572929
I mean using someone else's videos
>>101572944
Again, I'm sure part of the TOS is that google can use the videos any way they want, and that includes selling rights for other companies to train AI off of them. Nothing is free; people just don't bother reading terms of service and don't realize that they are the product.
been away for a bit
llama 3.0 70b q4 -> 3.1, is it worth the upgrade at all? can't be bothered to redownload that much if the upgrade is marginal
also the new mistral large, any means to run it on a 3090 & 64 GB RAM, and is it better than llama3?
>>101572977
There's some question whether it's better, as some people are getting bad benchmark scores from it. May or may not be loader bugs. However, it is not made for RP. Mistral Large does RP. It can be loaded with a Q3 quant if you want, but it will not be fast. You will probably get like 1 t/s or less.
>>101572936
Not into reading ToS, I suppose. You don't need a disclaimer for that, but they can still do pretty much anything with them. Especially serving them, unless you private them.
>By providing Content to the Service, you grant to YouTube a worldwide, non-exclusive, royalty-free, transferable, sublicensable licence to use that Content (including to reproduce, distribute, modify, display and perform it) for the purpose of operating, promoting, and improving the Service.
>>101573016
meh, i already get slow ~1T/s speeds with llama3-70b; as long as the results are good and the model doesn't need tardwrangling too much i can be patient
i'll try mistral out then, thanks anon
>>101572977
Been running it at Q2_K on 64 GB RAM, it's less retarded than I expected and I get about the same speed as a 70B at Q5. I think it might be my favorite RP model and replace CR+ for me.
Anyone with 48gb running Mistral Large EXL2? Which quant and settings? I can't get this fucker to load with more than 3584 context size.
>>101572970
I just didn't (and still don't) get what will happen to me if I use someone else's copyrighted video content.
>>101573059
Made with imatrix?
I wish my eyes had a gleam in them...
>>101573071
kek
>>101573071
lmao
>>101573068
Didn't check, quant from https://huggingface.co/MaziyarPanahi/Mistral-Large-Instruct-2407-GGUF
>>101573060
3584 is all you need
>>101573060
Have you considered being less poor?
>>101571366
What's the best local model for ERP with 8GB VRAM?
>>101573107
Snowflake Arctic Instruct
>>101573059
Also something weird: it seems to be extremely deterministic, not sure if it's because of the quant. I'm using 4 temp and 0.03 minP to get some variety between rerolls.
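For anyone wanting to replicate those settings, a hedged llama.cpp sketch (binary and model filenames are placeholders; koboldcpp and SillyTavern expose the same two knobs in their UIs). In llama.cpp's default sampler order min-p is applied before temperature, so the 0.03 min-p prunes unlikely tokens first and the high temperature then flattens whatever survives:

```bash
# --min-p 0.03: drop tokens below 3% of the top token's probability.
# --temp 4.0:   near-uniform sampling over the tokens that survive.
./llama-cli -m Mistral-Large-Instruct-2407-Q2_K.gguf \
    --temp 4.0 --min-p 0.03 -c 8192 -i
```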
>>101572298
its actually too horny imo and a lot dumber in certain situations
>>101573107
With 8GB you're better off using ram as well.
>>101573157
Won't replies be really really slow?
>>101573167
Not if you don't offload too many layers to RAM.
Nemo 12GB at Q6 should be quite usable in llamacpp with partial cpu offload.
>>101573215 (me)
oops, *12B, not 12GB
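A minimal sketch of that setup (the layer count is a guess to tune against your 8GB card; Nemo has 40 transformer layers, and a 12B Q6_K file is roughly 10 GB, so most but not all layers fit):

```bash
# -ngl = number of layers kept on the GPU; the rest run on the CPU
# from system RAM. Raise it until you're just short of filling VRAM.
./llama-server -m Mistral-Nemo-Instruct-Q6_K.gguf \
    -ngl 30 -c 8192 --port 8080
# Then point SillyTavern's llama.cpp backend at http://127.0.0.1:8080
```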
>Karpathy and other niggas been shitposting about tokenization in Transformers, spamming that "is 9.11 > 9.9" meme like it's the fucking "arrow to the knee" of AI
>Llama 3 rolls out with some new tokenization shit, claiming better text compression but makes L3-405b parse Markdown like a down syndrome kid
>FAIR was already on this shit a month ago with multi-token prediction in Meta Chameleon, but the Llama 3 paper doesn't even acknowledge it
Are these AI labs just circlejerking about scaling and dataset quality instead of actually fixing tokenization?
whatever happened to chameleon? has anyone written about using it for anything?
>>101573356
Tokenization is not a fixable problem. Different problems require different tokenization. We cannot have one to rule them all.
https://www.reddit.com/r/LocalLLaMA/comments/1ebz4rt/gpt_4o_mini_size_about_8b/
>gpt-4o mini is 8b
foss j33ts lost.
>>101573501
Go back
>>101571366
>>101573501
brain dead redditard
Go back
>>101573501
>8b
The journo just made that shit up.
>>101573060
I suggest getting 2 more GPUs.
What kind of videos would you gen now if you could?
>>101573675
Full anime episodes from story descriptions.
>>101573675
I'd use AI to rewrite Eragon to not be shit (no turning into an elf, he fucks the dragon) and use that script to make a whole damn series.
>>101573356
I have a possibly too simplistic thought that a model figuring out on its own that strawberry has 3 r's in it would be a nice intelligence milestone. It knows what the alphabet is, so it should just apply that knowledge. I would also be happy with an LLM saying that it actually doesn't know because it uses tokens instead of letters. But we will probably never see a next-token predictor get to this level.
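The count itself is trivial once you work in characters instead of tokens, which is the point of the milestone; a one-liner to check it (plain shell, for illustration):

```bash
# Count the r's in "strawberry" character by character: prints 3.
# A BPE model instead sees a couple of multi-character tokens
# (something like "str" + "awberry"), so it can't just look.
grep -o "r" <<< "strawberry" | wc -l
```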
Would Mistral Large with the lowest 1 bit quant still be good compared to Nemo?
I like Nemo. Wish it was smarter. Frogs make a moe out of this please.
>>101573772
No
>>101573784
But the filesize is still larger: 25 GB vs 12 GB (Q8).
>>101573772
>bitnet
Only thing that would be better is a miqu bitnet.
>>101573799
Oh, no, I'm talking about the IQ quants. 1 bit was probably the wrong wording.
>>101573799
1 bit quant != ternary != bitnet
>>101573772
why don't you try it out
>can run CommandR
>can't run Mistral Large
AAAAAAAAAAAAAAAAAAAA
>>101573832
Cohere will save us next week, believe it!
>>101573811
Rest of the range is debatable.
>>101573829
Ok fine. Downloading.
yolo
>>101572448
It swiftly becomes obnoxious after the first OOC. The model takes the online RP forum schema too seriously, and because mistral does not specify roles in its prompt formatting, it begins criticizing both the user and itself over and over again. I had to add 'OOC' to the stopping string list.
>>101573832
>can probably run it at exl2-4.0bpw, but with only 2GB free
No... it's over...
>>101573832
Just stop being a ramlet
What preset should I use with mistral nemo?
>>101573167
Define slow? If you're happy with 1-8 T/s depending on model size then you should be fine.
>>101574063
>What preset should I use with mistral nemo?
have you tried the vg/aicg anon's preset for mistral large? >>101567703
>>101574140
I'm using the shittier vramlet version
>>101574158
your shittier vramlet version is hornier, which is a good thing
you can use presets that are designed for different models. you don't really lose too much other than a bit of context applying a jailbreak to a model that doesn't need it
What's the best uncensored model currently? NAI doesn't cut it anymore, i also have a 4090 so i could do local if i have to
>>101574239
>What's the best uncensored model currently?
claude 3 opus or sonnet with a prefill/jb from /aicg/
>>101574239
>i could do local if i have to
a single 4090 doesn't cut it anymore, but you can try out a gemma-27b or a mistral-nemo model
but if you're gonna use claude you won't be able to handle local's retardation
>>101574263how do i use claude uncensored? Can i just sub on the site or is there a diff version somewhere? Can it handle lolis?
>>101574239
The question is not "what is the best uncensored model", it is "what is the best model", and for your hardware that's this one:
https://huggingface.co/bartowski/gemma-2-27b-it-GGUF/blob/main/gemma-2-27b-it-Q4_K_M.gguf
>>101574063
The template in its tokenizer_config.json.
>>101573832
You can certainly run the model below, and it's a good one.
https://huggingface.co/bartowski/gemma-2-9b-it-GGUF/blob/main/gemma-2-9b-it-Q4_K_S.gguf
>>101573781
Good prose, poor intelligence and world knowledge.
>>101573772
Likely not. Models degrade too much below 2-bit no matter how good the quantization algorithm.
>>101573675
Pornography of the same type that you can find on xvideos. And I would feel like a vermin about it.
>>101573501
Redditors are retarded and can't read.
>>101574330
Gemma is dogshit
>>101574341
You just don't know how to use it with whatever bastardized prompt you are giving it.
>>101574283>how do i use claude uncensored?openrouter if you want to pay, or use a proxy from /aicg/>Can it handle lolis?its the smartest and most creative model out right now. it still doesn't make children act realistically but if you like the idea of horny brat children its fine
>>101574349
>SARR YOU ARE REDEEMING IT THE WRONG SARR
>>101574349
NTA, but anons always say that and then never provide their prompt. A model that needs some magical mystery prompt to be good (?) is no use.
>>101574330
>The template in its tokenizer_config.json.
Presets can also mean sampler settings. I *wish* those were in models. Or at least recommended starting points.
>>101572386
Theme pls
>>101574439
They recommended temp at 0.3 or 0.4. They cannot recommend other settings for every inference program out there. It's not reasonable and nobody would agree with them if they were included.
>>101572448
are you telling me that "ahh...ahh...mistress!" isn't good enough? preposterous
>write author's note saying character has k-cup breasts
>tease bot about her huge breasts
>they say they are not that big, only a K-cup, not even C-cup yet
So this is the power of AI roleplay...
Anyone using nemo on sillytavern? What context and instruct template do you use? Just Mistral?
>>101574566
yes
>>101574330
>Pornography of the same type that you can find on xvideos.
Except it could be any anime character you want.
where can i generate porn videos?
>As for the uptime of your computer, I'm pleased to inform you it's been running for a lovely 6 days, 4 hours, and 44 minutes. You know what they say: "Idle hands are the devil's playthings," but in this case, an idle computer is merely a testament to its owner's questionable life choices.
kek, i'm enjoying this discovery too much, it even knew that my load average was low
>>101574655
>where can i generate porn videos?
nowhere good, but klingai.com is your best bet for free text2video of any kind
>>101574655
In like a few months. Or maybe a year?
>>101574675
Quit posting these goblins
>>101574695
i'll get bored in 2 weeks or when they tighten the filter as a result of my degeneracy, whichever comes sooner
I put on my robe and wizard hat