/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>101880989 & >>101872662

►News
>(08/12) Falcon Mamba 7B model from TII UAE: https://hf.co/tiiuae/falcon-mamba-7b
>(08/09) Qwen large audio-input language models: https://hf.co/Qwen/Qwen2-Audio-7B-Instruct
>(08/07) LG AI releases Korean bilingual model: https://hf.co/LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct
>(08/05) vLLM GGUF loading support merged: https://github.com/vllm-project/vllm/pull/5191
>(07/31) Gemma 2 2B, ShieldGemma, and Gemma Scope: https://developers.googleblog.com/en/smaller-safer-more-transparent-advancing-responsible-ai-with-gemma

►News Archive: https://rentry.org/lmg-news-archive
►FAQ: https://wikia.schneedc.com
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Programming: https://hf.co/spaces/bigcode/bigcode-models-leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
►Recent Highlights from the Previous Thread: >>101880989

--Papers: >>101885744
--Mi100's 32GB VRAM is better than 7900xtx's 24GB for AI: >>101881481 >>101881591 >>101881835 >>101881888 >>101889409 >>101889750 >>101884525 >>101884618 >>101887361 >>101887402 >>101882009 >>101882084
--Llama 3.1 or 4 to have multimodal capabilities, including image, video, and audio: >>101882382 >>101882743 >>101882833 >>101882863 >>101883073 >>101883279 >>101883343 >>101883085 >>101883164 >>101883052
--Imatrix quants are better than static quants, but lose advantage when offloading: >>101885206 >>101885237 >>101885243 >>101885253 >>101885289 >>101885479
--Grok-2 Beta Release and its performance on the LMSYS leaderboard: >>101883319 >>101883353 >>101883356 >>101883383 >>101883403 >>101883471 >>101883592 >>101884328 >>101883517 >>101883533 >>101883559 >>101883631 >>101883646 >>101883784
--Anon considers making a local image search app using CLIP, but others suggest using Hydrus instead: >>101888292 >>101888383 >>101888460 >>101888672 >>101888945
--Flux prompting discussion, effectiveness and loss of art: >>101885635 >>101885737 >>101885994 >>101888259 >>101885738
--Creating a videogame companion AI using LLMs and screenshot descriptions: >>101881870 >>101882090 >>101883194
--AMD Ryzen 9 9950X and 9900X outperform Intel in whisper.cpp, but RAM is the bottleneck: >>101886652 >>101887030 >>101887480 >>101889342
--No need to wait for 50series, 3090 is sufficient: >>101884118 >>101884173 >>101884392
--Companies use cloud GPU providers or dedicated servers for local LLMs: >>101884028 >>101884054
--Anons discuss using lore books and world info with AI models: >>101882635 >>101882698 >>101882680 >>101882707
--Anon's idea of using model hallucinations as input is not novel, already done in autoregressive models and CoT: >>101886098 >>101886235 >>101886260 >>101886993
--Miku (free space): >>101884729 >>101885994 >>101886525

►Recent Highlight Posts from the Previous Thread: >>101881001
>>101891620
>AMD Ryzen 9 9950X and 9900X outperform Intel in whisper.cpp, but RAM is the bottleneck:
At first I thought the recap bot had gained multimodal capabilities, but it was in the Phoronix link.
Who's ready for AGI (strawberry) tomorrow?
>>101891848
I'm ready to laugh at all the schizos who fell for this meme
>>101891848
As long as it can't leave scamman's basement it's useless.
>>101891848
Fake or gay. Call it.
>>101891848
It's going to be a precursor to AGI, not actual AGI.
The moment actual AGI is developed, the military is going to get involved.
>>101891848
Meeeeee
>>101891848
>it will just be grok 3
why do you fags only seem to use these for coom? which is best for non-erp uses?
>>101885379
Dude, I struggled for an hour with various not-big models not taking the hint.
>>101891983
In contrast, Sonnet and CMDR got it with non-story-based corpospeak.
>>101891982
Coding, general assistant, translation. I use it for the latter; I love playing Japanese VNs and don't wanna use software that requires internet.
>>101891848
AGI is an openai bailout scam.
Every post mentioning AGI or strawberry is made by petra, by the way.
>>101892077
hi petra
>>101892089
hi sao
>>101892077
every post mentioning petra is agi
>>101892096
hi lemmy
>>101891848
Already outed as niggerjeet shit
>>101892181
Strawberryman already called them grifters
>>101892002
I tried it with llama 3.1 and Takashi brushed off the woman, got excited from holding all the fruit and went home to relieve himself of that excitement.
>>101892002
I found adding "Ara ara~" before "how reliable" got Nemo to take the hint 1/10 but then it's not really the same prompt. A lot of the innuendo really rides on the image itself and a LM description helped a bit without straying away from the original setup, especially when it mentioned the generous cleavage.
Now with Strawberry details leaked, how do we cope? They are unlikely to release the technical details of this technology. It's over.
>>101892683
Every LLM that we got is going to feel like a toy.
VRAMLETBROS WE ARE SO BACK
>>101892683
Leaked...? When?
>>101892736
Is it 4k context like big nemotron?
>>101892736
nothingburger.
>>101892744
Do you take any bait that hangs in front of you or are you a bot as well?
>>101892016
Any guides on the latter, anon? I'd rather run everything locally if I can, so using this to translate VNs would be great.
>>101892736
>Let's remove even more of the already filtered info the model has, benchmarks are the only valid use case of models, please give me models with zero world knowledge outside of benches!
>>101892765
>https://huggingface.co/nvidia/Nemotron-4-Minitron-4B-Base/blob/main/config.json#L12
4k
>>101892744
>>101892775
>It's so fake that every researcher left because they were pissed at Altman releasing it, thus breaking ClosedAI AGI policy
C'mon guys...
what's a good tts for a japanese woman with a broken engluwh accent
>>101892803
I sure hope it's the most generally intelligent and capable model in its weight class then.
>>101892820
https://www.reddit.com/r/LocalLLaMA/comments/1esadlh/nvidia_research_team_has_developed_a_method_to/
>>101892736
>mini troon
I need an offline AI that can help me with programming, what do you guys recommend?
>>101892820
wat
>https://huggingface.co/nvidia/Nemotron-4-Minitron-8B-Base/discussions/2
>>101892962
Just use a rubber duck, it's the same thing.
>>101892969
Was that comment written by the 4b model?
>>101892962
The least you can do is post your specs. Biggest thing you can run, i suppose. DeepSeeker seems to be ok, but i haven't tried it. Or gemma-2-2b, whatever...
>what's a good car?
>>101892969
hell yeah new robert tier schizo
>>101892962
Codestral.
>>101892998
I hope so. If there's humans that write like that we're fucked.
>>101892969
A genius already noticed by Clem!
>>101893043
I write like that every time I get hammered.
>>101893059
>>101893061
Help me understand. Did this guy just quote himself? Am i being thick?
>https://huggingface.co/LeroyDyer/_Spydaz_Web_AI_ChatQA_BASE?not-for-all-audiences=true
>>101892962
>>101893181
>This Expert is a companon to the MEGA_MIND 24b CyberSeries represents a groundbreaking leap in the realm of language models, integrating a diverse array of expert models into a unified framework. At its core lies the Mistral-7B-Instruct-v0.2, a refined instructional model designed for versatility and efficiency.
>Enhanced with an expanded context window and advanced routing mechanisms, the Mistral-7B-Instruct-v0.2 exemplifies the power of Mixture of Experts, allowing seamless integration of specialized sub-models. This architecture facilitates unparalleled performance and scalability, enabling the CyberSeries to tackle a myriad of tasks with unparalleled speed and accuracy.
>Among its illustrious sub-models, the OpenOrca - Mistral-7B-8k shines as a testament to fine-tuning excellence, boasting top-ranking performance in its class. Meanwhile, the Hermes 2 Pro introduces cutting-edge capabilities such as Function Calling and JSON Mode, catering to diverse application needs.
>Driven by Reinforcement Learning from AI Feedback, the Starling-LM-7B-beta demonstrates remarkable adaptability and optimization, while the Phi-1.5 Transformer model stands as a beacon of excellence across various domains, from common sense reasoning to medical inference.
>Experience the future of language models with the MEGA_MIND 24b CyberSeries, where innovation meets performance, and possibilities are limitless.
>https://huggingface.co/LeroyDyer/Mixtral_AI_Cyber_5.0
>Mistral-7B-Instruct-v0.2 exemplifies the power of Mixture of Experts
How come everyone is sleeping on this?
>>101893181
https://huggingface.co/LeroyDyer/Mixtral_AI_SwahiliTron_7b
>>101893181
wow an actual schizo
you don't see them much anymore
>>101893301
We are so back it's unreal.
>>101893262
>>101893181
>>101892969
That reads like it was written by AI then rewritten by somebody who can't into English very good.
>>101893262
It hurts to read, man...
>Alright, llm. Say this model is good by using as many cliches as you can. GO!
>>101893301
>this model has been updted for rag and tasks in swahili or english as well as prgramming and other school works stuff and bible ! as well as other sacred historical texts also !
Finally a swahili llm...
>>101893334
Ah. Another man of refined taste.
>>101893355
He's Swahili dude
>>101893334
>>101893301
>>101893355
And also has a dodgy keyboard.
>>101893370
Yes, I'm aware.
>>101893384
Also possibly that, yes.
>>101893400
>>101893384
Would you risk medical advice from one of his models? https://huggingface.co/collections/LeroyDyer/medical-series-66156c5406749e833c4f408e
>>101893370
That's fine. My english is a bit scuffed at times but that doesn't explain this >>101892969
>>101893416
Also love how he's calling Mistral 7b "Mixtral", that's a nice marketing touch
>>101893460
>Im not sure if Lora actually works when you save them
Truly a weapon to surpass all sloptuners.
https://huggingface.co/LeroyDyer/SpydazWeb_AI_CyberTron_Ultra_7b
>>101893416
I wouldn't trust a boiled egg recipe from it. Instant mustard gas.
>>101893521
He's a coomer like us tho.
>even sexy books have been highly tuned into the model
Reminder that
>an account on twitter starts hyping Strawberry/Q* and OpenAI stuff
>OpenAI people interact with him and follow him
>he starts making big claims/predictions
>they turn out wrong
>it's revealed the strawberry leaker man was actually an agent developed by a pajeet company (unknown how much human intervention was involved in the show)
>admits to having no insider info
>called his creators pathetic grifters
>called Sam a deeply harmful hype troll
>now hypes for Elon and hates Sam
>OpenAI accounts stopped following him
This is the timeline we're living in.
who?
>>101893553
>deliberately ignoring lmgroids falling for this scam
you tried.
>>101893537
>self rag
Man. His models are something else... they're just ragging themselves senseless while the user isn't looking.
>>101893553
Thank you for the summary, actually. I have ignored the whole thing as much as I could.
>>101893553
>it's revealed the strawberry leaker man was actually an agent developed by a pajeet company
huh?
>>101893622
>Artificial inteligent brain designer: Creating the intligence behind machine minds: The AGI will not be created by a Neural...
Truly amazing to witness such talent.
https://huggingface.co/LeroyDyer
But his Github is 404, guess he was too powerful for them.
https://github.com/spydaz
>>101893616
Because no one that's regularly in this thread cares about it, it's just people or bots trying to 'engage' to make it appear like people are all hyped up.
>>101893677
>>Artificial inteligent brain designer: Creating the intligence behind machine minds: The AGI will not be created by a Neural...
Rest of cut off section:
>The AGI will not be created by a Neural network design But it will be a combination of self improving code and self training neural network : it must be free to roam the internet and pick and choose information for itself and not be given brain dumps which confuse the models anyalitical progres so it right now basically repeating wht it knows but not generating its own formulated answer based on its knowdge base : hence we need code to extract these various data from its mind then reframe the same data then reinsert the knownedge in a fully structured and methodolgy !
>hence the model can extract the data from its moind and reorganizes it better than we can ! so self teaching is the first step forwards :
>>101893731
>it must be free to roam the internet
He predicted strawberry agent!
is there any performance difference between ooba and koboldcpp? 3060ti with 8gb vram btw
>lmgroids now doing free advertisement for some random jeet
>>101893616
Oh I forgot to include community reactions. Here's a revised version.
Reminder that
>an account on twitter starts hyping Strawberry/Q* and OpenAI stuff
>OpenAI people interact with him and follow him
>he starts making big claims/predictions
>he gets tons of followers and people going along with the hype
>multiple posts in the threads hype it (unknown how many were legitimate/honest), many telling them to fuck off though
>the predictions turn out wrong
>it's revealed the strawberry leaker man was actually an agent developed by a pajeet company (unknown how much human intervention was involved in the show)
>admits to having no insider info
>called his creators pathetic grifters
>called Sam a deeply harmful hype troll
>now hypes for Elon and hates Sam
>OpenAI accounts stopped following him
This is the timeline we're living in.
>>101893659
See >>101875488
It looks like he deleted the tweet though. I guess he realized it wasn't a good look that his bot went off the rails.
>>101893778
>not taking the piss out of a schizo
Are you gonna say 'buy an ad' now?
>>101893731
>anyalitical
What's Anya got to do with AGI?
>>101893778
>No please don't talk about people revolutionizing the LLM Sphere, please discuss random slop tunes and openai instead.
>>101893801
buy an ad
>>101893795
>It looks like he deleted the tweet though. I guess he realized it wasn't a good look that his bot went off the rails.
i'm pretty sure that was just a joke to advertise their release at the expense of the latest drama
>>101893010
>specs
i9 9900K, GTX 1080ti, 32GB RAM
>DeepSeeker seems to be ok, but i haven't tried it. Or gemma-2-2b, whatever...
>>101893027
>Codestral
Thanks, I'll look into it
>>101893824
Yeah that's also possible. Either way, it gives his shit a bad image.
>>101893829
>GTX 1080ti
rip
>>101893771
ooba can use different backends, can it not? if so, it depends on the backend. if you use llama.cpp or kobold.cpp as the backend, it should be the same. Better to just try and use whatever you like best. If you offload to cpu, your only chance is using kcpp, lcpp or ooba with either of them as a backend.
Just use llama.cpp and remove as many middlemen as possible.
>>101893771
Potentially. It's easy to compile koboldcpp, whereas with ooba it's a huge pain to recompile the llamacpp module for it since it's this big conda mess of bullshit I don't want to deal with. That is, if you want to add avx512 support; not even sure it makes a difference though.
>>101893840
he's a massive grifter apparently
there was a post about how he lists teaching a class at stanford on his resume while in reality he stole the credit of the guy who actually did all the work
>>101893847
I'm poor bro
>>101893901
???
They are both super easy to compile.
>>101894212
Well, I'm too stupid to compile the llama_cpp_python whl file it always fetches and replace it. It's easier to just use either llama.cpp or koboldcpp directly and not bother with the stupid conda environment.
>>101894281
But why do you bring up conda? You need conda as much as you need it on llama.cpp or koboldcpp.
>>101894306
I don't use conda for that, I just install stuff regularly and then compile it with nvcc/gcc.
>>101893908
Wouldn't be surprising. Many such cases.
tight thighs
>>101889342
AMD Ryzen 9 5950X
Either I'm retarded, don't read this thread carefully, or I'm just extremely lucky with what I was after.
But for people who don't have a bunch of GPUs (I have a 4090 + 32GB RAM, obviously a super high end PC but it was made for gaming exclusively), is Noromaid Mixtral 8x7b not just one of the better ones for pure 1 on 1 RP?
I've been running it for an hour testing it and it's super fucking good. I'm downloading RP Steve because it's another one I missed but see it mentioned a lot too. I have obviously seen Mixtral 8x7b mentioned before, but nowhere near as much as Nemo, or Stheno or Mini Magnum or shit, even Gemma 27B, which I've all tried and which all fucking sucked (* in comparison with what I've been looking for, which is 1 on 1 conversational RP).
>>101894690
For a vramlet that is still the best, probably. People just don't talk about it because it's old, but all real improvements recently have been in the fuckhueg range.
>>101894778
is that flux?
>>101894745
yea, I'm new to this world; like I said, I bought my PC for gaming. Discovered a new world of cooming with AI chats, found Character AI, got triggered by the filter, found SillyTavern, and basically spent the better part of 2 weeks trying out all of these fine tunes and bullshit only to find out they all kinda sucked. I then broadened my horizon to older ones and have now heard of:
>Mixtral 8x7B, which I'm using now
>RP Steve
>Midnight Miqu (but there's no way I can run this on my setup right? Or at least, not with response rates that aren't super slow; it's a 70b model or some shit)
Only thing that got me thinking to try older ones was Command R probably being the best one of the other ones I had tested, and it being relatively old(er).
I just falsely assumed that newer models would be better lmfao, but they all utterly suck
>>101891613
Do you guys use Nemo, or is this a 70B zone?
>>101894886
Miqu would be smarter, but you'd have to not have it all on your GPU, so it'd be slower. I just cope with the slowness; a few T/s is enough for me. I prefer it to Command R, which I prefer to Mixtral. As far as newer models go, I've had decent luck with llama 3.1 70b despite what others say, but I still mostly use miqu.
>>101892683
hi petra
>>101894929
With how hard "people" shilled Stheno, it's an 8B zone. If these people were real at all.
>>101895375
Hi Lemmy
I'm at 70B+ now, it's like finding purer and purer heroin, it just escalates.
>>101895077
How does that work?
I'm a little interested but I can't run flux since I don't have a video card.
Maybe I'll play with stable diffusion again.
Does this look ok for catch-all settings?
>>101894929
If I'm going to coom I use 70B, but I finetune the smaller ones so the plebs have fun toys to play with. Have a MN model cooking as we speak. 6 hours to go.
>>101895512
Just set temp to 0 for a generic preset. Unless doing rp shit, it will only make your output worse.
>>101895538
But all I want to do is rp shit, anon.
>>101894986
How slow are we talking for shorter responses (think character AI type responses)?
24GB VRAM + 32GB RAM
Can anyone explain all this "Offloading blah blah" shit to me?
>use kobold
>have 24GB VRAM
Does this just mean I can't run any models above like, 32Bs or something? I truly don't understand this shit, but this is what I get the gist of.
>>101895874
Offloading keeps part of the model in regular RAM where it's computed by the CPU, making it slower, but it lets you load bigger models than GPU RAM alone would allow. Quantization further reduces the model size, helping keep the whole thing on the GPU, making it faster, but slightly dumber.
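A back-of-the-envelope sketch of the offload split described above (assumes all layers are roughly equal in size, which is only approximately true for real models; the numbers are illustrative, not exact):

```python
# Estimate how many of a model's layers fit in VRAM when offloading.
# Rough sketch only: real layers aren't all equal-sized, and the KV
# cache for context also competes for the reserved space.

def layers_on_gpu(model_gb: float, n_layers: int, vram_gb: float,
                  reserve_gb: float = 2.0) -> int:
    """How many of n_layers fit in vram_gb, keeping reserve_gb free for context."""
    per_layer_gb = model_gb / n_layers
    fit = int((vram_gb - reserve_gb) // per_layer_gb)
    return max(0, min(n_layers, fit))

# e.g. a ~39GB Q4 70B with 80 layers on a 24GB card:
print(layers_on_gpu(39, 80, 24))  # 45
```

So on a 24GB card a bit over half the layers of a Q4 70B land on the GPU, which is why anons report real speedups only once ~75% of the model fits.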
Is this a good card format?
{{char}}'s personality: bam + bam + bam + bam + bam + bam
{{user}} persona: bam + bam + bam + bam + bam
Scenario: prose
Examples: <START>yada yada <START>yada yada
Special note (if any): prose
>>101894886
>>101894690
>>101894986
I literally have the same PC (DDR5 RAM though, you didn't specify).
Which one did you download?
https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF
https://huggingface.co/NeverSleep/Noromaid-v0.4-Mixtral-Instruct-8x7b-Zloss-GGUF
Which one is best? pls help srs
>>101895975
No, you have to put actual descriptions and stuff; you can't just say "prose"
>>101893795
>now hypes for Elon and hates Sam
Is it all the musk fans in the training data? Can someone with an open llm check if models have bias for musk? Like ask it which billionaire is the nicest guy or something.
>>101895973
So what's the rule of thumb? I only use kobold, so ggufs are my only option apparently (unless I should be using something else?)
Basically, how big of a file size am I looking at?
Should I aim for ~24GB models to match my VRAM, or can I push for, say, as high as a 40GB model and just offload as many layers to my GPU memory till it's at like 90% capacity, with the rest using system RAM?
What would I use if I wanted to have my own imouto AI and also have her send me lewds?
>>101896043
rope
>>101896043
>>101896054
You'll have to configure Rope using the Noose paradigm
>>101895803
Short responses? Probably a minute or so at 2T/s. I only have 12GB vram though so you'd probably get a bit faster. 24GB is enough to put almost half the 70b in vram but the real speedup happens at 75%+. You'd probably have to use q4 or maybe a bit smaller unless you up your ram though, so it'd be a bit faster again.
>>101896033
Keeping as much of the model in VRAM as possible is best. Offloading almost half the model to CPU will make it very slow. Aim for a model and quant that fits completely in your VRAM with some extra space for context. For example, an 8B model quantized to Q8 takes about 8GB of VRAM; quantized to Q4 it takes about 4.5GB, and so on. Context also needs to be in memory, so it's good to have a few GB free for it (plus whatever your OS needs to function).
You can balance model parameter count against quantization, i.e. a low parameter count at Q8 (very low quality loss), or 32B at Q5 with slight quality loss over the original model but overall better. Bigger models are less affected by aggressive quantization, so a 70B is usable at 2-3bpw (bits per weight) while an 8B becomes very dumb.
You'll have to experiment with different models to see whether you value output quality over speed. Mistral-Nemo is probably a good place to start.
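The arithmetic above is just parameters × bits-per-weight; a minimal sketch (the bpw figures are rough effective values for GGUF quants, not exact file sizes):

```python
# Approximate weight size in GB: billions of params * bits-per-weight / 8.
# Real GGUF files run slightly larger (metadata, some tensors kept at
# higher precision), and the KV cache for context comes on top of this.

def model_gb(params_billion: float, bpw: float) -> float:
    return params_billion * bpw / 8  # 1e9 params and 1e9 bytes/GB cancel out

print(round(model_gb(8, 8.5), 1))   # 8B at ~Q8_0   -> 8.5
print(round(model_gb(8, 4.8), 1))   # 8B at ~Q4_K_M -> 4.8
print(round(model_gb(70, 4.8), 1))  # 70B at ~Q4_K_M -> 42.0
```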
Do you guys also keep a separate instance of tavern opened during rp for assistant related tasks?
>>101896143
why not use booba? it's already open anyway
>>101896003
bump :(
>>101896043
You can't. Local models are too dumb. But you can chat with your imouto and gen her nudes separately.
>>101896054
>>101896102
hi, reddit!
What happened in /aicg/ why are there so many newfags
so are local llms safe to feed private information to? chatgpt must use user conversations to further train their models; is it somehow the same here?
>>101896236
Apparently api keys are being deleted. Case in point >>101896265
>>101896265
Depends on your inference program. Most are safe. Avoid shit that uses python. Use llama.cpp or kobold.cpp if you want to make sure.
>>101896265
Just run it under another user and disable internet access for that user if you're paranoid.
>>101896043
>>101896054
>>101896102
Rooooppppeeee
github down?
>Inago-chan is a desperate locust girl who begs for API keys and then eats through them in minutes if she gets them.
>Inspired by a true story.
>https://www.chub.ai/characters/anonemouse/c76b2b11-5e5b-47b2-bdf4-cde5adbb6fb5
>https://files.catbox.moe/cfc2ga.png
Oh no. She is gonna starve! Or she will have to change her diet to a diet of cum.
>>101896566
github down.
>>101896233
How do you have it reliably gen images and make the character look consistent throughout?
Much sadness. End of an era.
Who let the piss drinkers in
Anyone have that meme graph of /lmg/ incrementally growing and /aids/ going up and down on a rollercoaster? Needs an update.
>>101896706
It doesn't need an update because nothing happened in /aids/ in a year.
>>101896674
Do not give them tech support local saars. Ask for a piss drinking video with a timestamp first!
>>101895975
>>101895874
>>101893771
>>101896222
Piss drinking video with a timestamp or GTFO
>>101896733
hi petra
>>101896706
why does he >>101896265 care about private information if he was running proprietary models?
What's going on?
>>101896765
Thank you.
>>101896753
Je suis petra.
>>101896780
A guy came here yesterday saying aicg isn't sustainable anymore cause keys get reported instantly. More of them are flooding in.
>>101896725
I got them confused, but can you blame me? Aids sounds more like aicg than aicg does.
>>101896265
It's always a bit hard to understand what these models do even if you are running them locally. I've seen mine try to connect to random IP addresses on the internet in the middle of the night all by itself, but it might just have been it looking up things on its own in preparation for the next session.
>>101896787
Might as well just merge /aicg/ and /lmg/.
>>101896209
It's worse?
Drink your piss today(on camera)! Drink it to help your pest brethren!
>>101896838
Drink to earn!
>>101896806
That's a sign you aren't giving it enough stimulation, run a few new scenarios through it every couple days and it should chill out.
>>101896765
>>101895400
If you can run 70B, you can run Mistral Large IQ2_M. Before anons go crazy about running a quant like that: it's still the best model I've ever used, better than 4-bit quants of any 70B. Just use minP ONLY.
>>101896943
>>101895400
I wanna desperately try Midnight Miqu; apparently it's super good. How would it compare to Mixtral 8x7b? I can run that pretty damn fast with only 24GB + 32GB RAM at Q4_K_M.
But using huggingface's VRAM calculator, it looks like at best I can do a Q2 GGUF for Miqu (which I have no idea how slow that would be, never mind how butchered it will be).
Any anons with similar setups who have tried it here to confirm?
>>101897008
Sounds like the calculator only wants it in VRAM? Miqu is 39GB at q4, so that should fit in 24+32 with room left over for context.
>>101897083
How many layers do you reckon I would need to make the t/s not unbearable? I'm downloading it as we speak btw, just wanna prepare myself so I know what to do when I load it.
Also, why do other people use llama.cpp or whatever and other things? Am I right to be using Kobold?
I went with Q4_K_S btw.
>>101896765
How do you guys have the audacity to make an image like this when the best local model can't really run locally and isn't even better than sonnet? Give it a few months more.
>>101897115
Well, 'unbearable' is relative, you'll just have to see if you like the speed or not with how many layers fit. Going to a smaller quant would probably not be good though.
>>101896614
>Large is simply incapable of mimicking prose in the greeting or being told to write a certain way, so it's meant to distract from that as best as possible.
What went wrong with Mistral Large?
>>101897083
I would call you a nigger but in the end I want /lmg/ to die for all the mikuspam so letting the swarm in is actually a good thing.
>>101897317
Instruction tuning
>>101897333
Miku will live on even if /lmg/ dies.
>>101885618
anon. i love you. thank you.
>>101897591
I love you too, Anon.
>>101897445
Wonder if they'll ever release the base model.
Guys gpt-5 tomorrow and im not joking
>>101897742
Everyone's been burned too many times to believe even legitimate leaks now, anon
>>101896765
/sdg/ needs to be updated; it plummeted long ago.
>check thread
>mikutrannies
>7B RELEASED THIS IS IT
>glint in her glinty glinted eyes sends a glinty shiver down your spine
so glad i became a chad and moved on to claude. you don't fully realize how pathetic and time wasting llms are until you're outside looking in.
>>101897515
I hope so. I like the music. I just hate faggots spamming her here.
Other than that weirdness, it picked up on and utilized the nuance of the user starting the scenario face down, which is very rare. It also utilized the detail of the shotgun, although it didn't utilize it well. How would it end up in my bag if I got knocked over while holding it? Could be quantization loss, though, since I'm testing it in Q8 instead of fp16.
>>101897839
this general was infested by redditroons a long time ago.
Tess 12B is very good
>post logs
No
>>101897902
i'll download it and nala test it.
what is the best aidungeon equivalent on a 24GB vram card? I used mythomax but I am sure there is something much better nowadays, right?
>>101897888
Looks more like bad fine tune parameters or just your usual merge things.
What model is that?
>>101897964
Mistral Nemo 12B or Gemma 2 27B.
>>101897888
it's writing Nala as if it's an anthro, but the snippet is too short to tell for certain. Also, what's up with quoting 'need'.
tess 12b. repetitive retarded slop.
>>101898032
>shivermaxxing already
>also whispermaxxing
As expected from pure gptslop. Save yourself some time and skip all Tess shit from now on
>>101891983
>Just need to make sure i have enough here for the 280 yen apples and 160 yen oranges, patting his bulging wallet
holy kek im dead
>>101898032
>repetitive
Is the first message in the screenshot the greeting? That already starts every sentence with "She [verb]"
>>101897976
A work-in-progress Mistral-Nemo tune. Still has 2 more epochs in the oven scheduled. It's already been through 2 other datasets and a SLERP merge, so there's still some reconciliation going on with some of its task vectors it seems. At epoch 1 it came up with this song. Prompt was just to come up with a metal song initially.
https://suno.com/song/eb672bd8-e2fd-48b7-a981-e105f27b7552
I asked it for a lot of creative direction.
>inb4 rhyming
A lot of actual rock/metal songs only rhyme on the chorus so that could be a win depending how you cut it.
>>101898096
i don't care. it said "her eyes flashing a dangerous teal color" three times in one response. it's dogshit.
grins evilly. low and husky. nibbles ear. whispering constantly. purring seductively. smirking. jolts of pleasure. purring multiple times. all the normal slop. somehow managed to hit every single one in one response.
>>101898129
You sound very desperate. I guess I'm downloading Tess now. Thanks for the recommendation, petra.
>>101898129
Sounds typical for anything nemo based. Upgrade your computer to run real models.
>>101898158
i don't use anything besides cr+ regardless.
>>101898158
hi petra
>>101898185
So you're some guy with a ton of vram who's made it his mission to shit on small models?
>>101898336
True VRAMchads tune models for the little guy to enjoy.
Post-Training Sparse Attention with Double Sparsity
https://arxiv.org/abs/2408.07092
>The inference process for large language models is slow and memory-intensive, with one of the most critical bottlenecks being excessive Key-Value (KV) cache accesses. This paper introduces "Double Sparsity," a novel post-training sparse attention technique designed to alleviate this bottleneck by reducing KV cache access. Double Sparsity combines token sparsity, which focuses on utilizing only the important tokens for computing self-attention, with channel sparsity, an approach that uses important feature channels for identifying important tokens. Our key insight is that the pattern of channel sparsity is relatively static, allowing us to use offline calibration to make it efficient at runtime, thereby enabling accurate and efficient identification of important tokens. Moreover, this method can be combined with offloading to achieve significant memory usage reduction. Experimental results demonstrate that Double Sparsity can achieve 1/16 token and channel sparsity with minimal impact on accuracy across various tasks, including wiki-2 perplexity, key-value retrieval, and long context benchmarks with models including Llama-2-7B, Llama-2-70B, and Mixtral-8x7B. It brings up to a 14.1× acceleration in attention operations and a 1.9× improvement in end-to-end inference on GPUs. With offloading, it achieves a decoding speed acceleration of 16.3× compared to state-of-the-art solutions at a sequence length of 256K.
https://github.com/andy-yang-1/DoubleSparse
git isn't live yet. might be cool, big claims.
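The abstract's two ideas can be sketched in a few lines of numpy. This is NOT the paper's implementation (their repo isn't live yet); the channel-selection heuristic here is a stand-in for their offline calibration, just to illustrate the mechanism:

```python
import numpy as np

# Toy sketch of Double Sparsity's two ideas:
#  - channel sparsity: score keys using only a few "important" feature channels
#  - token sparsity: run full attention over only the top-k tokens found that way

rng = np.random.default_rng(0)
d, n_kv = 64, 256            # head dim, cached KV tokens
k_channels, k_tokens = 4, 16  # 1/16 sparsity in both dimensions

q = rng.standard_normal(d)
K = rng.standard_normal((n_kv, d))

# Stand-in for offline calibration: pick the channels with largest mean magnitude.
channels = np.argsort(np.abs(K).mean(axis=0))[-k_channels:]

# Cheap approximate scores: dot products over 4 channels instead of 64.
approx = K[:, channels] @ q[channels]

# Keep only the top-k tokens for the full computation.
top = np.argsort(approx)[-k_tokens:]
scores = K[top] @ q                 # full dot products, but only 16 of 256 keys
weights = np.exp(scores - scores.max())
weights /= weights.sum()            # softmax over the retained tokens only
```

The speedup claims come from this shape change: most of the KV cache is never read, and the cheap score uses 1/16 of the channels.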
>>101898336
no. there's a few small models i've tried and said decent things about. they're retarded, but at least they're not repetitive, or turbo slop. if you regen a few times, you can get something that's fun and semi-enjoyable if you're poor.
if a model is retarded, slopped, and repetitive then i'm going to call it trash and unusable. because it is. don't hate me for saying the truth.
I was catching up on some other AI threads and these webms really made me think. The future is going to be insane once good, low VRAM, local video models get invented.
>>101898553
most impressed by the flopping around of the laces on her shoe here
how the fuck do the weights learn that much physics just from watching videos
A Spitting Image: Modular Superpixel Tokenization in Vision Transformers
https://arxiv.org/abs/2408.07680
>Vision Transformer (ViT) architectures traditionally employ a grid-based approach to tokenization independent of the semantic content of an image. We propose a modular superpixel tokenization strategy which decouples tokenization and feature extraction; a shift from contemporary approaches where these are treated as an undifferentiated whole. Using on-line content-aware tokenization and scale- and shape-invariant positional embeddings, we perform experiments and ablations that contrast our approach with patch-based tokenization and randomized partitions as baselines. We show that our method significantly improves the faithfulness of attributions, gives pixel-level granularity on zero-shot unsupervised dense prediction tasks, while maintaining predictive performance in classification tasks. Our approach provides a modular tokenization framework commensurable with standard architectures, extending the space of ViTs to a larger class of semantically-rich models.
https://github.com/dsb-ifi/SPiT
no code yet. very much a research attempt but cool. patch-based tokenization really seems inadequate with how far things have advanced
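For anyone wondering what "content-aware tokenization" means in practice: instead of fixed 16x16 patches, you group similar pixels into regions and make one token per region. A toy flood-fill version (the paper uses proper superpixel algorithms and learned features; this only illustrates the idea):

```python
from collections import deque

# Toy illustration of content-aware tokenization, NOT the SPiT algorithm:
# flood-fill pixels of similar intensity into regions, one token per region.

def superpixel_tokens(image, threshold=0.2):
    h, w = len(image), len(image[0])
    label = [[-1] * w for _ in range(h)]
    tokens = []  # (mean intensity, pixel count) per region
    for sy in range(h):
        for sx in range(w):
            if label[sy][sx] != -1:
                continue
            region = len(tokens)
            seed = image[sy][sx]
            label[sy][sx] = region
            queue, pixels = deque([(sy, sx)]), []
            while queue:
                y, x = queue.popleft()
                pixels.append(image[y][x])
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and label[ny][nx] == -1
                            and abs(image[ny][nx] - seed) <= threshold):
                        label[ny][nx] = region
                        queue.append((ny, nx))
            tokens.append((sum(pixels) / len(pixels), len(pixels)))
    return tokens, label
```

A flat image yields one token while a busy one yields many, so token count tracks content — unlike a fixed grid, where a blank sky costs as many tokens as a face.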
ah man openai is cooked.
testing gpt4o-latest. i mean they did hype it up, "its beautiful" etc.
it seems good with akinator tests and guessing a character. "feels" better than the previous one for natural language. actually called a character sexy, thats a new one from openai.
sonnet 3.5 totally destroys it with code though. how have companies not picked up on this yet.
i wanted gpt4o-latest to make me a html page with a background image from one of those example pic sites, a cube in the middle, and a 3d space where i can move the camera around, like a video game.
gpt4o-latest still had the problem where after 5-6 tries it makes things worse and causes new problems. hallucinates the pictures (404) etc.
then i just do a @claude-sonnet3.5-200k "please just give me the solution". does it perfectly first try with minimal code. lol
https://jsfiddle.net/7w1tmvpL/
hope zucc is paying attention and has his spies at anthropic or pays some people to leave.
>>101898909tokens are a scam
its over for aicg
>>101898492That sounds like that'd make models even worse at making proper use of their context windows.
>>101898943
>actually called a character sexy, thats a new one from openai.
chatgpt-4o-latest writes smut without ANY jailbreak at all in my testing, which is a first for a big company model I think. It sucks at it, of course, because it's too dry and positivity-biased. But regardless, OpenAI deciding not to train their models to refuse smut anymore is a big development that people should be paying attention to.
>>101898943
The 2023-2024 trend of trying to make LLMs into tools for programmers and nobody else was a bad direction to go in, and if OpenAI are moving away from it, that's actually a good thing. If you're trying to get normies more interested in AI than they have been, then you can't just focus on what makes programmers happy.
>>101898981It looks like AWS shutting off the proxy owner's key tugboats has them eating each other out of paranoia. Now they're poking at the chinese proxies.
>>101899002
they did a blog post months ago where they talked about allowing gore and erotic content, basically recognizing they are overzealous and that the refusals should sound less judgemental.
this is very good for local. local always lags behind a couple months. llama got worse with each version. if we step away from the hardcore alignment (smart enough to see "actual" harmful content), thats very good. i am for completely uncensored, like a tool, but this is a huge step in the right direction. maybe we start to see the change from all those alignment firings as well.
that being said, i would be very careful sending questionable shit to openai. whats legal today might not be tomorrow.
>>101898943Maybe Anthropic really has some secret technique there. Maybe they're guarding it closely.
>>101899070
must be. i know they have the huge prompt with the hidden <antthink> tags, but i doubt thats all it is.
sonnet 3.5 does something differently with the context, and from my experience it really starts searching for out-of-the-box solutions to get the job done if things get difficult. very impressive because it doesnt trip up.
also had a case where i was like "no, thats wrong, gimme solution X", and sonnet was "please double check, i think something is wrong on your end in your environment". and it called me out correctly, so it doesnt follow blindly.
i swear it feels like they dumbed it down a bit in recent weeks though. but still the king.
https://huggingface.co/anthracite-org/magnum-12b-v2.5-kto
slop or kino? call it.
Recommend me a model that writes a lot and has sexy details about people's body.
>>101899155Slop.
>>101899155but I didn't put nothin' up
>>101899155
>kto
Now I'm interested.
>>101899155interesting experiment, gonna download and test at fp16
>>101899183>>101899210samefag
>>101899155wtf is even the difference between KTO and DPO and why does it matter
>>101899155FUCK YOU
I've never hated anything as much as the anti-anthracite schizo hates anthracite(and yes, it is just one guy)
>>101899276No, it's at least two.
>>101899276
>org
>aligned to lmg values
>and pursues total slop death and kino generation
>yet schizoanon still attacks it
Why
>>101899154Remember that groups who pool resources to make cool shit are a gross violation of /lmg/ standards and should get a dedicated schizo posthaste
>latest models are using Celeste slop in their mixes
It's over. Say goodbye to adherence.
>>101899331They aren't 100% transparent. So all that doesn't mean anything.
Two giant leaps for vramlets in /ldg/:
GGUF format used for txt2img models, werks just fine
flux loras proven trainable on 3060 12GB
>>101899360
What transparency do you think is lacking? They don't have closed datasets (everything they have is just filtered C2 data), and their new model explicitly lists the methodology used. Pardon me if I look like a shill, but what exactly isn't being transparent here?
>>101898981How does that work anyway? The people providing the proxies really pay for all those degenerates?
>>101899385Schizo's gonna schizo, pay him no mind
>>101899360precisely they are just slopmakers
>>101899377You can't use ram to train loras?
>>101899385>They don't have closed datasets (everything they have is just filtered C2 data)Why is that not open?>and their new model explicitly lists the methodology used.Only listing the methodologies is also not being open.
You know you have succeeded when you get a dedicated 4chan schizo.
Anthrociter are slopermakers
Are there any good "long context models" like llama-3.1-8b-128k, but trained with more than 8b parameters? Preferably something that can fit on a 24gb card.
>>101899385
>What transparency do you think is lacking?
is the data open or not
>>101899414>Why is that not open?Because it's just a mix of openly available data. There isn't anything in there you can't find in normal C2, it's just pruned and filtered.>Only listing the methodologies is also not being open.There is nothing more to list. The Axolotl config is in the repo, the datasets are openly available, and the method used to make the model is right there on the model card.
>>101899444they could be lying
>>101899425Isn't Nemo 128k as well?
>>101899451
>ANTHRACITE LIED
>PEOPLE DIED
>>101899444>it's just pruned and filtered.And why do they keep that private?>There is nothing more to list.The RLHF step also needs a dataset.>The Axolotl config is in the repoIt's not in the repo.
anthracite is a bunch of attentonseeking slopmakers and nothing more
>>101899454I'm downloading that one now and will check it out. I gotta say, it's frustrating that it's easy to find things by quant and parameters, but nobody talks about context.
>>101899475>"The RLHF step also needs a dataset.">thinks kto and rlhf are the same thing So I see you're actually an incompetent fagnigger. You'll get no more replies from me.
Finetuning is a meme anyway. There hasn't been a worthwhile finetune for a year
>>101899484stop spamming
What's the difference between static and weighted quants? Which are better?
That's a lot of false-flagging and damage control.
>>101899489It's not like most models actually are able to perform once you go beyond the 16k/32k mark. https://github.com/hsiehjackson/RULER
>>101899497exactly anthraniggers should just leave they arent doing anything new or impressive
>>101899557
>they arent doing anything new
What were the notable KTO finetunes before this
Not transparent = supporting the group and not the ecosystemAnd I don't trust Anthracite to give support to just the group.
>>101899474shut up troon get anthracock out of your mouth
>>101899529
>that Gradient 1M -> 16k
jesus
DPO is all you need.
>>101899586You have a distinctive writing style.
>>101899629gonna slerp all that cum up or save it for the rest for th cocksuckersd tou have employed
You WILL have reinforcement learning on all your tunes, and you WILL enjoy it.
Celeste bros I think people are catching on...
and now the anthraniggers stay silent i have won go back to your fucking homes and rope yourselve
>>101899517Anyone?
>>101899717you sound mad
>>101899717did you really win if no one wants to play with you anymore
>>101899711>the Sao defense force on Reddit
>>101899733
Static just blindly sets the precision of your model's weights to that of the quant. Weighted quants use a calibration dataset to see how quantizing each layer affects the model, so that it can prioritize 'important' layers to quant less strongly than 'unimportant' ones.
Weighted quants are better on paper but are strongly affected by the chosen calibration dataset and the other settings the quanter uses. Static quants are foolproof.
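A toy illustration of the difference — not llama.cpp's actual quantization code, just the general shape of "static picks a scale from the weights alone; weighted searches for the scale that minimizes error where the calibration data says it matters":

```python
# Toy illustration of static vs. weighted (imatrix-style) quantization,
# NOT llama.cpp's actual code. One scale per weight group, 4-bit symmetric.

def quantize(weights, scale, bits=4):
    qmax = 2 ** (bits - 1) - 1
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]

def static_scale(weights, bits=4):
    # Static: scale comes from the weights alone (fit the largest one).
    return max(abs(w) for w in weights) / (2 ** (bits - 1) - 1)

def weighted_scale(weights, importance, bits=4):
    # Weighted: try a few candidate scales and keep the one that minimizes
    # reconstruction error weighted by per-weight importance (which, in the
    # real thing, comes from activations on the calibration dataset).
    base = static_scale(weights, bits)
    best, best_err = base, float("inf")
    for f in (0.6, 0.7, 0.8, 0.9, 1.0):
        s = base * f
        deq = quantize(weights, s, bits)
        err = sum(i * (w - d) ** 2 for i, w, d in zip(importance, weights, deq))
        if err < best_err:
            best, best_err = s, err
    return best
```

With an outlier weight the calibration data doesn't care about, the weighted scale shrinks to preserve the small weights the data does care about. And a skewed calibration set skews the importance values, which is exactly the downside mentioned above.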
>>101899760im not mad i am just sick of anthraslop>>101899765your mad that you make slop anthranigger
>>101899711Paid shils from (((Anthracite)))
https://huggingface.co/TheBloke/UNA-TheBeagle-7B-v1-GGUF
It's still a tiny bit sloppy, but this is amazing; it's easily the best 7b I've ever used. It's soooo detail oriented.
Techbros are so far into delusion it stops being funny and just starts being sad
>>101899331Eddie isn't remotely sane, Anon. Questioning his behaviour is completely futile.
>>101899816
>UNA
Wasn't that the chinese scam thing?
>https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard/discussions/444
>UNA models are not like non-UNA models, their properties are unique and this is now known
lmao
>>101899806all of these peopled defending anthracite are actually being paid what a sad life they must have'
>>101899155I'm sure this is just a funny one off swipe but damn.
>>101899895This is pretty repetitive, Anon. I tend to look for non-repetitive word use. It's not bad though; I've definitely seen worse. What model is it?
>>101899876Damn, I wish I got that kind of easy money.
>>101899876
This, but unironically. You can tell Celeste is not part of the clique.
So, what's the reason we need all this special custom formatting instead of just chat completion? Like, it seems like with instruct-following models, it creates a message with the history, then asks the model to come up with the best response. In chat completion, you'd just pass in the conversation as role+content.
I'm writing a custom app that has some chat capabilities, so I'm just wondering if it would make more sense to do what roleplay UIs do, or just keep using chat completion.
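Worth noting the two aren't really different things: every chat-completion API just flattens the role+content list into one formatted prompt string behind the scenes. Assuming a ChatML-trained model, the flattening looks roughly like this (template varies per model):

```python
# What a backend does with your role+content list before the model sees it:
# flatten to one prompt string. ChatML shown; other models use other templates.

def to_chatml(messages):
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # cue the model to respond
    return "".join(parts)
```

The custom formatting the roleplay UIs expose is just manual control over this template, which matters when the backend doesn't know (or gets wrong) the template the model was trained with.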
>>101899816
>>101899165
Oh, you guys decided to try beagle too? I'm sad there doesn't seem to be a 12B using whatever it is they used.
>>101899895Make sure EOS is not banned. You might also want to add a newline after the "user" and "assistant" fields in your ChatML config
>>101899951Im not being ironic sir they actually sit in their rooms and on their fat asses all day and shill their moded;s and ciritcuze anhyone who doesnt like them
Any good largestral tunes?
>>101899895
>randomly changed the novel format of the greeting to internet RP style
That's a red flag.
>>101899992lumimaid
>>101899415I know what you mean, Anon. There's a perverse satisfaction in antagonising a member of the dying alone demographic, to the point where they devote the entirety of their worthless excuse for an existence to trying to "expose" you.
>>101899994Claude literally does this thoughbeit?Like father, like son
>>101899941The one I quoted.>>101899965Already did. After a few swipes to get it into the "style" of the first message, it seems to be doing better now.
>>101899992I tried Lumimaid but I didn't like it so I'm still sticking with vanilla
>>101900018No, it's the other way around. It defaults to novel format.
>>101900016cry harder
>>101900030If Mint is your character card, can you put her on catbox for me? I'd like to have a talk with her.
>>101900018the 12b sucks cope harder
>>101900054You are the only one crying.
>>101900054We both know that I'll never cry harder than you.
>>101900070All 12b models suck.Retvrn to Largestral, white man.
>>101900074>>101900077>>101900081samefag from anthracite cope harder you made slop congrants
>>101900095Yes, that's right, schizoanon. There's only one of us. Just like there's only one voice inside your head.
>>101900095>congrantsLearn to write properly, schizoid.
>>101900074
>>101900077
I'm still waiting for the explanation of how they're transparent while keeping the datasets and the training config private.
>>101900138Nice bait, anon. Catch a lot of fish today?
>>101900145
It's not coming across as transparent to be defensive about these questions...
>>101900183I understand you'd like their "secret sauce" all to yourself, Lemmy. The feud between you two (mostly on your part) is publicly known. But alas, it is just a mix of public data on HF, derived from the Stheno dataset (which is also made of entirely public data).They're not going to take your hand and walk you through how to filter a dataset because you're too blindingly incompetent to do it yourself. Consider the rope, perhaps.
>>101900214
So they aren't transparent, and they're afraid of others copying them, because they're doing it mostly for personal gain. Got it. Being afraid of others using your stuff is incompatible with being open source. For that reason, I'm not going to support Anthracite.
>>101900285
>strawmans out the ass
yes that is what he said good job anon you debunked all his points
>>101900296
No transparency = no support
Keep crying, shill.
>>101900104im not being schizo oyu and all your friends are just mad that you made slop slop slop>>101900119all of anthracite are indians b theywa y
>>101900214who the fuck is lemmy who is the schziod now
>>101900296keep crying you made shill anthracite faggoot
>>101900330HIII PETRA
>>101900344im not petra stop malding and shitting yourself
>>101900369hi false-flag petra
>>101900376is everyome in this thread paid off by anthracite
>>101900382hi false flag petra, why are you pretending to make typos?
anthracite is on the same level of slop as neversleep
>>101900060She is. And sorry, anon, she is sort of my mascot character so there is some personal stuff included in the character card. I can however give you the art when I get home. Her personality is basically just deadpan but loving so there's no science behind it. I never did get the AI to stop her from moaning and getting less deadpan during smut either so that was annoying.
qrd petra?
>>101899965Alright, after trying it a bit, even with the eos unbanned, it tends to want to keep going and going. This was with several different cards and setups. It has an alright writing style, but it just needs to fucking stop. It'll also switch gears and talk for the user, or stop after ending the assistant message with "user", or start up again with "{character's name}:" at the end. I'm using tabbyapi as my backend, running the model unquanted.
>>101900412
It looks like your formatting might be removing the EOS after each generation; I don't see the end token in your stop sequence, and you are setting it during the assistant message prefix instead of the user suffix (?)
Try adding <|im_end|> to the user message suffix (make sure to remove it from the assistant one) and setting your Tabby settings to picrel.
>>101900485*by "Assistant one", I mean the assistant PREFIX, keep the stop token for both suffixes.
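To summarize where everything should go, assuming ChatML (field names here are generic illustrations, not exact Tabby/SillyTavern keys): the end token belongs in both role suffixes and the stop list, never in the assistant prefix.

```python
# Illustrative ChatML instruct settings (generic names, not exact
# Tabby/SillyTavern fields): <|im_end|> lives in BOTH suffixes and the
# stop list, never in the assistant prefix.
chatml = {
    "user_prefix": "<|im_start|>user\n",
    "user_suffix": "<|im_end|>\n",
    "assistant_prefix": "<|im_start|>assistant\n",
    "assistant_suffix": "<|im_end|>\n",
    "stop": ["<|im_end|>"],
}
```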
>>101900504That did it (unchecking the "Skip Special Tokens" option). And adding in the stop sequence.
>>101900548Nice, does the model seem overly repetitive or schizo in any other ways (now that you fixed that)
Mini-magnum just called me Anon. I can't believe I just got cucked by the hacker known as 4chan.
>your existing code...thanks
>>101897152
>there are fags that unironically believe this
I guess if your best piece of electronics is a six year old iphone, that's half true.
>>101900685C2 dataset was largely generated by 4channers using Claude
>>101899002
>writes smut without ANY jailbreak at all in my testing, which is a first for a big company model I think
Claude 2.0 and maybe even earlier Claudes (wouldn't know, didn't use them) already did that. All prompts blank, just chat context — unless you're talking about writing smut from the get-go without context.
>>101900850Claude 2 still needs you to use a prefill, which I am counting as a kind of JB (I don't know why you wouldn't count it)
whats this speculative decoding thing, and can I use it with llama.cpp and ooba?
>>101900406>>101824680
>>101900870Not using any prefill and 2.0 writes it just fine.
>>101901034Oh you're just lying then, sorry my bad for falling for it
I finally got back from a vacation where I basically didn't touch electronics. I'm trying to get caught up on the llm scene and have run my coding and recapbot tests on Mistral Large at Q8. The coding tests appear to be too easy for the newer models... they all ace them. I'm going to have to come up with something more complex until they start failing again.
In my opinion Mistral Large's singleshot recapbot is also getting close to the level of quality of actual recapanon's specialized multi-pass recapbot.
>>101901052
You're either trying to write cunny, writing without any chat context, or not using the API, none of which were specified in your post. Cope.
>>101900885
It's a way to speed up generation by using a faster method to generate several tokens worth of a plausible continuation and having the slower model you actually want to use check its answer. When the average batch has at least a few matching tokens it's a big speedup, and there's no quality loss since you always end up with the big model's choices. The technical reason it works is that checking several tokens at once is faster than generating that many tokens one-by-one. It takes advantage of the fact that the "next X tokens" to guess are in most cases actually quite easy for smaller models to get right, and the key parts they fuck up (e.g. forgetting the position of someone in a scene) will be corrected by the smarter model to set them back on track. Basically, it exploits the fact that some tokens are much harder than others and lets the small model handle the easy ones.
Its main benefit comes when you are using a very large model and have a much smaller one that can guess for it. Cpufags running 405B claim to about double their speed with an 8B draft to speculate for it.
llama.cpp has a demo implementation of speculative decoding through the llama-speculative binary. It's not very useful in practice because it can only generate one response and then exits, but it's good for one-off tasks or for testing how well it'd perform for your setup if it were ever implemented in a usable form. There's also a PR for adding a different type of speculative decoding into the server (by guessing next tokens via prompt lookup, without a separate speculator model) but it hasn't had development in a while either, unfortunately. I don't think we'll see much interest in the feature or anything similar until it becomes more common for the large and slow models to see use. Llama 405B isn't considered worth the speed hit, with or without speculative decoding, compared to using, say, Mistral Large/CR+ if you don't have some datacenter supercomputer.
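The draft-and-verify loop described above can be sketched like this. It's a toy greedy version for the idea only — real implementations verify the whole draft in one batched forward pass (which is where the speedup actually comes from) and do probabilistic acceptance rather than exact matching:

```python
# Toy greedy draft-and-verify loop. "Models" here are just next-token
# functions over a token list; a real implementation checks the whole draft
# in one batched target forward pass (that batching is the entire speedup).

def speculative_decode(target, draft, prompt, k=4, n_new=8):
    tokens = list(prompt)
    rounds = 0  # each round = one (conceptually batched) target pass
    while len(tokens) < len(prompt) + n_new:
        rounds += 1
        # Draft cheaply proposes k tokens.
        ctx, proposal = list(tokens), []
        for _ in range(k):
            t = draft(ctx)
            proposal.append(t)
            ctx.append(t)
        # Target verifies; keep the longest matching prefix.
        for t in proposal:
            if target(tokens) == t:
                tokens.append(t)
            else:
                break
        # Target always contributes one guaranteed-correct token.
        tokens.append(target(tokens))
    return tokens[:len(prompt) + n_new], rounds
```

With a good draft you finish in a few target passes; with a useless one you degrade to one token per pass — slower, but the output is identical to what the target would have produced alone.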
Fellas. Newfag to local models here, so I bring a big titty Miku as an offering.
What's a good model for translations? English to Japanese, and English to French?
P.S: Flat, thin Miku is actually best btw.
I admit that I am not going to go into detail here, because I don't want to get seethed at by self-righteous, mentally ill, arbitrarily skeptical fuckwits; but I encourage anyone who is willing, to do a search for "UNA gguf" on HuggingFace, and do your own experiments with the listed models. I'm seeing a level of mathematical ability with the 7B that I've hardly ever seen before, and this is in casual conversation with coombots from Chub, as well.
>>101901211Mistral-Large, few shot.
>>101897896>do I fit in yet?
>>101897896I have far fewer problems with them, than with you.
>>101901243
>123b
A-anon, I have a 12gb vram card, 64 gigs of ram, and a 12700KF. I probably should've mentioned hardware.
>>101901290Not him, but mistral large Q2K fits in 64+8gb with 30k context. I can't speak to its translation accuracy at a low quant though.
>>101901290Perhaps trade some time for being able to run a smarter local Miku? Gemma2 27b is also an option for French should you prefer not to wait. Again, with examples.
is anyone hosting local models for free or am I stuck with agnai? I remember there was a friendly autist who avatarfagged with a nickelodeon character who always hosted for the homies. Where is he?
>>101901383I don't mind waiting a bit. I figured I could not run Mistral Large or any of those 70b models, but my understanding of system requirements is admittedly dogshit.
>>101901413For a while model makers used to host on cloudflare tunnels when they wanted anons to test their models. Haven't seen it as much recently though.
>>101901242I believe you, but what's the real world use case for doing math with an LLM? There clearly is one since I see AI people increasingly talking about and benchmarking on math skill, but it's a mystery to me
>>101901413i remember that guy visiting one of the image gen threads a few months ago asking for help with generating futa images of said nickelodeon character. wonder what he's upto these days.
>>101901478How much do you know about Advanced Dungeons and Dragons, Anon?
>>101901242lmg really became Times Square huh? Except you don't even have to pay to post your ads
>>101901553You can leave anytime.
>>101901478
NTA but I tried those models back when they came out and they were the only 7Bs that could do stat blocks properly, for example. Their writing was pretty decent too; they felt coherent in a way. I posted about it in this thread way back then. I remember metamath cybertron starling was impressive too, although that was not UNA I don't think.
mistral jb, sillytavern conf
>>101901413Jennyfag from aicg? I'm not them, but I used to host slocal a long while ago. I just assume these days anyone left in lmg has enough compute to use what they want and aicg is too deep in corporate models to care about local.
>>101901598Oh yeah I did. I guess it was clever but sloppy.
>>101901659That's better prompt adherence than some modern models lel
>>101900916kek
>>101901478I apologise for the rhetorical answer about AD&D being unclear, but the point is that roleplaying games, which is something that a lot of Chub cards are trying to be, rely very heavily on mathematics to determine basically everything, from who moves first in a fight to who wins, based on damage count versus armour values, etc. If you play the Borderlands franchise you can see that very clearly in the game mechanics.
>>101901691>>101900916xhe's right THOUGH
>>101901243How do you few-shot translations?
>>101901876Put some example translations of the same source/target language in the prompt.
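Something shaped like this, for instance — the example pairs and wording here are made up, so swap in pairs that match your domain and register:

```python
# Made-up example of assembling a few-shot translation prompt.

def fewshot_prompt(pairs, source, src="English", tgt="French"):
    lines = [f"Translate {src} to {tgt}.", ""]
    for s, t in pairs:
        lines += [f"{src}: {s}", f"{tgt}: {t}", ""]
    # End on the target-language label so the model completes the translation.
    lines += [f"{src}: {source}", f"{tgt}:"]
    return "\n".join(lines)
```

The examples both demonstrate the format and nudge the model toward the tone you want (formal vs. casual, etc.), which matters more for translation than for most tasks.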
>>101901725Oh so math just means like arithmetic? For some reason I assumed people were talking about more advanced theoretical stuff when they discuss models doing math
>>101901919The more complex stuff is based on arithmetic.
>>101900396Same guys + their friends, what do you expect?
>>101902149>>101902149>>101902149