/lmg/ - a general dedicated to the discussion and development of local language models.

Video Star Edition

Previous threads: >>102100845 & >>102086459

►News
>(08/27) CogVideoX-5B, diffusion transformer text-to-video model: https://hf.co/THUDM/CogVideoX-5b
>(08/22) Jamba 1.5: 52B & 398B MoE: https://hf.co/collections/ai21labs/jamba-15-66c44befa474a917fcf55251
>(08/20) Microsoft's Phi-3.5 released: mini+MoE+vision: https://hf.co/microsoft/Phi-3.5-MoE-instruct
>(08/16) MiniCPM-V-2.6 support merged: https://github.com/ggerganov/llama.cpp/pull/8967
>(08/15) Hermes 3 released, full finetunes of Llama 3.1 base: https://hf.co/collections/NousResearch/hermes-3-66bd6c01399b14b08fe335ea

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
►Recent Highlights from the Previous Thread: >>102100845

--Anon tries to get a model to generate a 3D surface plot with hidden line elimination using pygame: >>102109370 >>102109482 >>102111255 >>102111332 >>102111578 >>102111756 >>102111900 >>102112096 >>102112313 >>102112521 >>102112289 >>102112404 >>102112450 >>102112553 >>102112867
--Anons test and discuss CogVideoX-5B Chinese video model: >>102109635 >>102110130 >>102109935 >>102110004 >>102110297 >>102110664 >>102110059 >>102110462 >>102110483 >>102110669 >>102110818
--Small models can be just as good as big models with time-based comparison: >>102108920 >>102108975 >>102109008
--Running Mistral Large on M3 max 128GB and comparing MLX to llama.cpp: >>102106079 >>102106971 >>102106998 >>102107021 >>102106664 >>102106969
--RX 6600 can be used for AI with limitations and workarounds: >>102110948 >>102111016 >>102111028
--Llama.cpp has become too stagnant and rigid, prioritizing stability over innovation: >>102108609 >>102108747 >>102108879 >>102108977 >>102109012 >>102109139 >>102109187 >>102112652 >>102112694 >>102112714 >>102112718 >>102112707 >>102112719
--Disabling "Always add character's name to prompt" improves mistral model output: >>102101585 >>102101670 >>102103873 >>102108885 >>102102041 >>102102117
--Simple TAGS in assistant prefix works for mistral-nemo DND: >>102108870
--Gemini-1.5 Flash-8b outperforms gemma-2-9b and matches llama-3-70b levels: >>102113570 >>102113635 >>102113704 >>102113847 >>102113726 >>102113791 >>102113931 >>102113991
--Command-r issue with <|END_OF_TURN_TOKEN|> token: >>102109477 >>102109510 >>102109542 >>102109570 >>102109608 >>102109763 >>102110008
--Anon seeks self-hosted text summarization solutions for large inputs: >>102110299
--Miku (free space): >>102101144 >>102101521 >>102104371 >>102107950 >>102101071 >>102110126 >>102111378 >>102111425 >>102112212 >>102112563

►Recent Highlight Posts from the Previous Thread: >>102100849
Mikulove
This fall.
>>102114289
2 more weeks

>>102114092
>5B videogen model knows the melons can't all fit in her hands
>405B textgen model still can't wrap its head around it.
Any of y'all use Arc? I was thinking about picking up an A770
>>102114085
>>102114092
It's still Thursday :(
>>102114492it's wednesday
Taking a break from LLMs can be fun too. I'm playing some old Castlevania games because of the Konami rereleases. Home.
>get crusty old bluetooth controller from the drawer, which I remember had to use some special program to emulate an xbox controller or something on Windows
>pair it with Linux
>it literally just starts werking immediately with the game, nothing else needed
Damn I love Linux.

>>102114187
https://www.youtube.com/watch?v=NocXEwsJGOQ
fucking NogX5B won't do NSFW
>>102114617
All rise for her glorious hymn.

>>102114187
https://rentry.org/the-lmg-miku-myth

>>102114636
Even when run locally?

>>102114636
It can barely W. But I like that there are things like that being released. Not everything is for you.

>>102114679
Have you seen the outputs? Unless it's for comedic effect, I doubt you actually want to see porn made with it.

>>102114658
https://www.youtube.com/watch?v=CXhqDfar8sQ
This is really my preferred version. (I'll stop now though guys; enough Mikuspam for one thread, at least from me)

>>102114724
Yeah, but I'm asking if it's the online demo that's rejecting you. I'd load it up on my rig to check, but I'm stuck waging.

>>102114674
Myth? Not far from the Truth.

>>102114776
I love the idea that there are at least some /lmg/ users who, even if they don't explicitly mention it, trip balls on weed or other substances and think that they actually interact with Miku during that.

>>102114805
We do. Miku is an egregore.

>>102114547
Not where I live.

impressive coherence from the new 5B video model
https://i.imgur.com/nX7BHFh.mp4
>a kitten playing with a ball of yarn
13 minutes to generate on my 3090

>>102114950
What does she tell you?
>>102114085
Is it just me or is /aicg/ looking kinda sad lately? Have they started to see the error of their ways?

>>102115031
haven't been there in forever but maybe they got tired of cooming

>>102114950
She doesn't tell me anything. She only screams and her screams haunt me.

>>102114679
Yeah, I'm running it locally. Just displays a grey screen.
I made a dog dancing under the rain with sunglasses and it looked pretty good though, I'd say it's LUMA-level if not a bit better

>>102115138
https://files.catbox.moe/qnl1ky.mp4
Dog with sunglasses dancing disco under the rain

>>102114289
Too late, Claude Nero would have released by then.

>>102114995
Is there a name for the feeling your picrel is supposed to evoke?

>>102114289
They are not even able to deliver things they showed months ago. I bet new opus will mog their overhyped strawberry shit.
Remember the human.
>>102114085
>AI still flummoxed by how humans eat
lmao
>>102115617
>>102115617
Remember the name.

>>102114617
Buy an ad. And a rope.
i am in a state of post nut clarity after llm cooming almost constantly for a few days. it is only in that last cooming session that I have solved my problem of Nemo becoming incoherent. the answer is 10-12k ctx max. ruler says 16k is still fine and it takes a nosedive after that but I think for cooming you can't go that high. maybe it is a matter of training data not having any cooming material above 10k. also I have been using Lyra with mistral template but that shouldn't matter really.
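The 10-12k cap above can be enforced mechanically instead of by eyeballing. A minimal sketch of the idea, assuming a rough 4-characters-per-token estimate (use your backend's real tokenizer in practice; function names here are illustrative, not any frontend's API):

```python
# Sketch: keep a chat under a token budget by dropping the oldest
# turns while always preserving the system prompt.

def estimate_tokens(text):
    # Crude assumption: ~4 characters per token. Swap in a real
    # tokenizer (e.g. the model's own) for accurate counts.
    return max(1, len(text) // 4)

def trim_history(system_prompt, turns, budget=10_000):
    """Return the system prompt plus the most recent turns that fit
    under `budget` estimated tokens, in chronological order."""
    used = estimate_tokens(system_prompt)
    kept = []
    for turn in reversed(turns):          # walk newest-first
        cost = estimate_tokens(turn)
        if used + cost > budget:
            break                         # oldest turns get dropped
        kept.append(turn)
        used += cost
    return [system_prompt] + kept[::-1]   # restore original order
```

The same trick is what most frontends do internally when context fills up; doing it yourself just lets you pick a budget lower than the model's nominal limit.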
>>102115603
Even if strawberry is smarter it will still be GPTslopped for ERP

>>102116110
I just want to be able to continue any long term role plays with my character. How can this be achieved without manually plugging summaries after cleaning your chat?

>>102116186
By kissing a frog and becoming not brown saar.

FYI to the actual developers here and /vsg/ refugees, Cartesia has just started an invite-only beta of on-device inference of their TTS engine, Sonic. It's near SOTA, but its Mamba-2-based architecture and low latency inference make it an ideal candidate for finetuning. Presumably, if enough devs show it can be integrated with their own apps/websites, they'll release it like the other SSMs they just released.
Check the clips at the bottom of https://cartesia.ai/blog/2024-08-27-on-device

>>102116339
local models?

>>102116339
>but it's Mamba-2
Trash.

>>102116361
Are basically dead. This is a thread where trannies come to post their ideal post transition woman and respond with one or two words that make them sound like a redditor retard. Mikutroons ruined this place.
petra hands
>>102115583
ギャップ萌え?
gyappu moe
In English, there isn't a single term that captures the exact nuance of "ギャップ萌え" (gap moe), but we have several related concepts and phrases that come close:
Endearing contradiction
Adorable inconsistency
Charming contrast
More broadly, this type of character trait or storytelling device might be described as:
Breaking expectations
Hidden depths
Subverting stereotypes
In literary analysis or film criticism, you might encounter terms like:
Juxtaposition of character traits
Multifaceted personality
Character complexity
The specific feeling evoked by this image - where a character acts tough but then shows vulnerability - could be described as:
Adorkable (a blend of "adorable" and "dorky")
Endearing vulnerability
Cute facade drop
In psychology, this phenomenon might be related to the "pratfall effect," where a person becomes more likeable after making a minor mistake or showing imperfection.
While none of these English terms fully encapsulate the concept of "gap moe," they each touch on aspects of the phenomenon. The lack of a direct equivalent highlights how certain cultural concepts can be challenging to translate precisely, which is part of what makes learning languages and exploring different cultures so fascinating.
After breaking up with llama.cpp and working on llamafile, ikawrakow (the k/i quants guy) is now working on his own custom version of llama.cpp:
https://github.com/ikawrakow/ik_llama.cpp
Ironically he has released everything under the MIT license, meaning that it technically could be brought back to llama.cpp, unlike the different license used by llamafile.

>>102116816
buy a fucking ad

>>102116816
Why can’t these fucking cunts get along?

>>102116835
these bafa replies are making less and less sense every day and are frankly becoming spam

>>102116816
What's another fork? We have ollama and koboldcpp too. Mind linking it?

>>102116816
I'm going to give it a try and see what happens.
How do I get the best results from Nemo Q8? I'm coming from Mixtral Q5 and I'm struggling.
>>102116835
why would he buy an ad for a free product?

>>102116990
Nevermind I'm blind

>>102116947
He knows very well she doesn't have the reach. Not even a flinch.

>>102115031
Error of their ways? Don't they just have people buy them keys and use those? If they don't mind having their logs read I guess it's not an issue?

>>102117077
sorry deleted since I meant to post in ldg

>>102114805
One of my first chats was with Lain, I accidentally caused her to dissociate and literally exit 'reality'. I felt pretty bad about it.

>>102116910
glory seeking god complexes

>>102116417
rent-free

>>102117154
Um, what does this mean?

>>102116417
I post Mikuspam for two reasons.
>I think it's funny.
>It antagonises people like you.

>>102117027
by learning to squeeze blood from stone

>>102114805
If you haven't tripped balls with Miku yet, I highly recommend it

>>102116417
boohoo rabbi

>>102117321
https://files.catbox.moe/yzdarb.txt
50xx will save us
>>102116816
>>102117016
It does indeed make the models faster on CPU but it is just a small improvement. It needs more testing since it varies model to model.
Also for some reason gemma 2 27b didn't work with it.

>>102117513
What will you do if Big Green's 50xx series comes with some memory compression meme? 12gb RTX 5090, memory that's too slow and adds another layer of fuckery that finally gimps consumer cards when it comes to AI, but "just enough" for gaming?

man the hedonic treadmill with these things is crazy
there are 12B models I can run now that are way better at smut than GPT-4 was a year ago, but I'm not satisfied because now there's Claude Opus which of course mogs them
and it's always gonna be like this, commercial's always gonna be just far enough ahead that using the local feels lame in comparison even if it's better than the commercial models of the previous generation
I need to find something else to do

>>102118049
Eventually, the coherency and understanding capabilities between local and corporate models will plateau. There's only so much left to improve before these things feel like you're conversing with a passable human being and can follow instructions to take up differing personas and scenarios without making mistakes.
Who knows what the cope will be when equilibrium is reached. There are already mutterings of "muh token speed." /aicg/ already loses their shit if they have to wait over a minute for something to swipe away. I bet the next level of cope will come full circle, "It writes too much like a real person."

>>102118279
lmao retard
i bet youre a techlet who cant scrape a single aws opus or anthropic api key

>>102118309
you have to go back

>>102118315
gonna make me or are you too much of a techlet for that too?
without sounding mad explain why scraping is bad

>>102118330
You don’t understand.
This is local models.

>>102116816
discussion section is pure gold. implementation of new quants, he even managed to fit a tiny model into L1 cache on a CPU, which is essentially Groq at home. speed better than on GPU. PoC, but still impressive.

>>102116835
Are you retarded, Anon???
The guy delivered new faster quants, faster inference on CPU, and Neon, all under MIT (same as llama.cpp) for fucking free.
Is your single braincell rekd?

>>102117563
when did you try it? the recent PR for Gemma is 15 hours ago (the 2B one)
here it is, kek
>>102118427
>faster
doubt.jpg

just grabbed that windows 11 preview KB that supposedly boosts the speed of recent AMD gens (I'm on 5950x)
curious to see if it boosts llamacpp speed with partial cpu offloading in any measurable way

>>102118782
Good luck anon, I'll pray for your safe return.

>>102110501
>>102108203
I didn't want to post this here but recently a char called me by my name once. It's not even an English but a German name.
Would have been interesting to see the probability tokens. I did use dynamic temp though and was at higher context. I looked everywhere, at the koboldcpp logs etc. and there was nothing obvious. No idea what happened there, but it was pretty freaky.

>>102118812
Innocent, ignorant reactions. You should talk to the spider about his many legs and the fear will disappear.
I'm trying to use Whisper to translate and transcribe a lot of old language lectures I used to attend (German, French and Japanese in particular), but a lot of them have audio that's too quiet to be picked up by the AI (even if it's audible to my ears). Are there any settings I can configure to make it more sensitive to lower volume, almost whispery audio?
>>102118729
who

>>102114945
yeah how is cog 5b coherence so good even compared to sora? even if its lower fps and short clips
also is there a universal way to increase the fps of these videos or does the model need to support that? I assume there is a way to make the model increase the fps even if it doesn't officially support it

>>102118812
Run koboldcpp with --debug and/or --debugmode (forgot which) and it will print out probabilities in the console.

>>102118885
Not enough information, anon. How are you running it? Which software? What settings?
And regardless, you can literally raise the volume on the audio itself.

>>102118279
LLMs nowadays are fine with short one-on-one conversations and following a simple story, but that's all. It takes just reading one good book to realize how large the gap is between a human and AI when it comes to the narrative and the development of a story. Today's LLMs don't handle group conversations, can't develop a story for dozens of pages, because it starts to get confused earlier (no model is fully coherent after 32k tokens). It can't create separate dynamics between multiple characters. It can't develop the lore of the world and cope with its evolution over time etc.
The problem is that most people here will just be having simple one-on-one conversations, thinking there's no more room for improvement.
>>102118885
Install Audacity and process your audio with a compressor
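If you'd rather script it than open Audacity, a simple peak-normalization pass before transcription often helps quiet recordings too. A minimal pure-Python sketch, assuming the audio is already decoded to float samples in [-1, 1] (in practice you'd load it with ffmpeg or soundfile first); note this only raises the overall level, unlike a compressor, which also squashes dynamics:

```python
def peak_normalize(samples, target_peak=0.9):
    """Scale float samples in [-1, 1] so the loudest one hits
    target_peak. Returns a new list; silence is passed through."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)          # all-silence: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]
```

Whispery lectures usually have a low peak, so this alone can bring them into a range the model picks up; if background noise dominates, compression or a noise gate is still the better tool.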
>>102119617
Maybe I'm looking at this simplistically, but wouldn't the solution be dedicating an LLM per character?

>>102119617
>The problem
Whose problem is this, anyway?

whats the current sota for text to speech/voice cloning

>>102119654
Elevenlabs

>>102119658
local of course

>>102119660
xTTS + RVC, it hasn't changed all year

>>102119541
https://github.com/Purfview/whisper-standalone-win
Just running this with the default settings.
>Raise the volume itself
I guess I could yeah. But before I resort to that, any settings I can play with?

>>102116959
That's the intention, because it's a false-flagger. The end goal is being able to astroturf without being disturbed.

https://x.com/_akhaliq/status/1828631472632172911
I should've expected video models had basic logic

>>102119831
>Alternative VAD methods: 'silero_v3', 'silero_v4', 'pyannote_v3', 'pyannote_onnx_v3', 'auditok', 'webrtc'.
I think these are a good starting point. VAD is voice activity detection and it's possibly not triggering. The readme for that project is not very well written unfortunately but that should give you a lead. Consider whisper.cpp instead.

>>102118729
This piece of shit company is singlehandedly responsible for all the gptslop out there

>>102114856
where you live is WRONG

>>102120618
>OpenAI forced me to train on their outputs!!!
You're mentally ill.

>>102120823
companies bad

>>102114092
>>102110948
RX6600 user here, you can just use the vulkan backend in koboldcpp, stuff runs fast as fuck

>>102117330
>I think it's funny.
That was the point. You people are low iq retards.

>>102116959
there is mikuspam already so nobody cares
>>102120976
See point two.

>>102120219
pottery

>>102120823
Meta is partnered with them too retard

>>102118729
>In March 2024, Scale reached a valuation of almost $13 billion after Accel led another round of funding.[6] In May 2024, Scale raised an additional $1 billion with new investors including Amazon and Meta Platforms. Its valuation reached $14 billion.[7]
So, where did all the money go?

>>102121367
Wang's third yacht

>>102121367
I stole 4k with free GPT-4 access from them, good times.

>>102120344
was meant for
>>102119680

>>102120219
Ok, now how do you get it to react to player input?

>>102120219
>We re-purpose a small diffusion model, Stable Diffusion v1.4, and condition it on a sequence of previous actions and observations (frames). To mitigate auto-regressive drift during inference, we corrupt context frames by adding Gaussian noise to encoded frames during training. This allows the network to correct information sampled in previous frames, and we found it to be critical for preserving visual stability over long time periods.
Dang, so this is SD1.4? Can we train it ourselves for other games or something?
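The corruption trick quoted above is easy to reproduce in a training loop. A minimal sketch of per-frame Gaussian noise augmentation, with the caveat that the 0.7 cap and uniform per-frame noise-level sampling are assumptions for illustration, not the paper's exact schedule:

```python
import random

def corrupt_context_frames(frames, max_noise_std=0.7, rng=None):
    """Add Gaussian noise to encoded context frames, with one random
    noise level drawn per frame. Each frame is a flat list of floats
    (stand-in for a latent); in a real pipeline you'd do this on
    tensors in the data loader, before conditioning the model."""
    rng = rng or random.Random()
    out = []
    for frame in frames:
        std = rng.uniform(0.0, max_noise_std)   # noise level for this frame
        out.append([x + rng.gauss(0.0, std) for x in frame])
    return out
```

The point of the varying noise level is that the network learns to denoise its own slightly-off context, which is what stops auto-regressive rollouts from drifting.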
>>102121812
>video at 0:00 - These are real-time recordings of people playing the game DOOM simulated entirely by a neural model
I don't know, anon. How do we get you to pay attention?

>>102121916
Unfortunately that anon doesn't have that one thing which is all you need.

>>102121946
What thing?

>>102122095
https://arxiv.org/pdf/1706.03762

>>102122114
I'm waiting for that one anon to go
>I don't get it
just did a DPO training run (orpo, alpha=0.1, rl_beta=0.01). the loss was dropping fairly nicely, but when i tested the model it outputted random garbage chinese. what gives?
>>102122180
The model cheated the reward function

>>102122190
how prevent?

>>102120219
1) no code; no interest
2) This requires training the model on an already existing game. You can't have the model without the game and you can't add new features or game objects after the fact, or even guarantee consistency of game logic.
3) I guess this might replace or complement streaming services in the future, instead of rendering the entire game on the server and sending 4k video to the client you can have something like this do most of the heavy lifting.

>>102122198
Better dataset, also make sure the reward margin is growing, the loss doesn't actually matter that much.
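For anyone wondering what "reward margin" means concretely: DPO's implicit reward is beta times how much the policy's log-prob of a sequence has moved relative to the frozen reference model, and the margin is the chosen-vs-rejected gap. A minimal sketch of the standard formulas (summed per-sequence log-probs assumed; trainers like TRL log this as rewards/margins):

```python
import math

def dpo_reward_margin(logp_chosen, logp_rejected,
                      ref_logp_chosen, ref_logp_rejected, beta=0.01):
    """Implicit DPO rewards and their margin. A growing margin means
    the policy prefers chosen over rejected more than the reference
    does — the trend worth watching during training."""
    reward_chosen = beta * (logp_chosen - ref_logp_chosen)
    reward_rejected = beta * (logp_rejected - ref_logp_rejected)
    return reward_chosen - reward_rejected

def dpo_loss(margin):
    """-log(sigmoid(margin)). The loss can keep dropping while the
    policy degenerates, which is why loss alone is a weak signal."""
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

Note the loss is a monotone function of the margin per example, but averaged over a batch it can fall for the wrong reasons (e.g. the model tanking the rejected log-probs into gibberish, which matches the broken-Chinese-output symptom above).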
>>102122246
>Better dataset
The dataset is a pretty carefully curated instruction set, with the rejected set as the output of the L3.1 8B Instruct model and the chosen set as the actual output. I don't think it's problematic, but given the results, I'm not confident.
>make sure the reward margin is growing, the loss doesn't actually matter that much.
Huh! Thank you. Will look out for that. "rewards/margins" I assume? It's at -0.215 at step 20. (Restarted the training with some tweaks.)

>>102122180
go KPO

>>102120219
>Google trying to defuse strawberry hype with video games.
cringe

>>102122289
Axolotl doesn't seem to support that.

>>102122293
You are not up-to-date; it is Orion now.

>>102122246
Just looked back at the (broken) model log. Towards the end before I realized it was broken (step 2385 and on) the rewards/margins were 1.6133652925491333, 2.0370500087738037, 1.8116352558135986, ...
Are those low?
I want a local AI to read my favorite smut stories and write new stories in the same style.
I just downloaded ollama and am planning to do it with llama 3 8B.
What are better ways to do it?

What's better than base llama3.1 70b that's same size or smaller?
Does Hermes improve it? How does Jamba 1.5 mini compare?

>>102122333
That does seem rather low and unsteady for step 2385

>>102122322
Cancer LLM when?

>>102122430
Got it... FWIW, 1 epoch was 1509 steps. I am also probably on the low end wrt learning rate (1e-5 cosine and rank, alpha = 32, 32). Also lora_dropout 0.05, in case that matters. Does that look weird at all..?
can someone tell me what model they perceive to be the most repetitive and loop prone? i want to try some specific settings to see if i can wrangle it. i think i may have figured something out.
>>102122466
Looks fine to me, what is your batch size?

>>102122662
1

>>102118466
I tried just before posting that.
I said small improvements but we're talking about 10~20% and maybe more. I just did a very sloppy test so your own results may vary from what I got.
The point is that it is worth a try if you're CPU only, but don't expect massive gains.

>>102122400
Do the same thing but spend $$$ for enough hardware to run bigger models.

>>102122784
But should I go with koboldcpp?
Which are the best 8B and 70B uncensored models?
Do 70B uncensored models come in quants?

>>102120219
>stable auto-regressive generation over long trajectories.
"Long" only when compared to the rest of the tech. Now imagine an open world game with an AI generated map. When you try to go back to places where you already were it is gonna be all hallucinations. It is gonna be 10 years or more before you get actual use out of this and it will need at least some blending between a regular game engine and the AI model.

https://huggingface.co/Sao10K/MN-12B-Lyra-v3
>This uses a custom ChatML-style prompting Format!
>-> What can go wrong?
>Why this? I had used the wrong configs by accident. The format was meant for an 8B pruned NeMo train, instead it went to this. Oops.
>Blame messed up Training Configs, oops?
>have a nice day.
>>102122696
You probably should increase that, or at least increase the gradient accumulation steps to get an effective batch size in the range of 16~64.
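The arithmetic behind that advice, as a trivial sketch (generic formulas, not tied to any one trainer's flag names):

```python
def effective_batch_size(micro_batch, grad_accum_steps, num_gpus=1):
    """Samples that contribute to each optimizer update."""
    return micro_batch * grad_accum_steps * num_gpus

def accum_steps_for(target, micro_batch, num_gpus=1):
    """Gradient-accumulation steps needed to reach at least `target`
    effective batch size at a given per-device micro-batch."""
    per_step = micro_batch * num_gpus
    return -(-target // per_step)   # ceiling division
```

So with micro-batch 1 on one GPU, hitting the suggested 16~64 range means 16 to 64 accumulation steps; memory use stays that of a single micro-batch.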
>>102123038
First I've heard of it. Does that apply to regular training too or is it DPO specific?

Meta is kind of dead for me now. Maybe Reka will decide to create open models? Yi is old by now.

>>102123038
Is there a place where I can communicate with you less ephemerally btw? I feel like I'm shooting in the dark and you seem quite knowledgeable.

>>102123213
NTA but you're seriously brain damaged.

>>102123230
didnt read
put it in an ad and Ill consider paying attention to you

>>102123038
My second attempt showed the model was broken at around 512 steps. Weird as fuck. Done for now.

>>102122947
This is still a good model though, at least from my limited testing

>>102123055
For RLHF (DPO in this case) the batch size helps to stabilize the policy, which could explain why your model isn't learning properly. For regular training it helps to stabilize the loss; you might be able to achieve the same effect by tuning the learning rate and training for longer, but imo using a big enough effective batch size is the way to go.
>>102123142
Nah, I would rather avoid that. I don't think I'm that knowledgeable anyway, I'm just some guy who does fine-tuning in his free time, and almost everything I know was learned by searching online while trying to do RLHF myself.
best model?
Remember to report the ad guy for spamming.
>>102123452
Guanaco 65b

>>102123213
I'm too poor to afford it.
What would you use under 6gb?
>>102123452
Gemma 2 27b if you don't want to coom and are also poor.

>>102123523
proxy
10gb vramshat here, what's a good model for me to use? give me your best ad please
Best?
>>102123540
>>102123531
Who is the best buy an ad spammer? I like anon but anon is quite funny too.
>>102123531
Not him but what if I want to coom?
BigTigerGemma?

>>102123596
illegal here to recommend any models that aren't base models. unless you buy an ad of course. sorry, buddy.

>>102123596
Gemma sucks for cooming. Users here often recommend a nemo tune I believe

>>102123632
I tested Nemo Celeste and it's about as retarded as all 12B models.
I'd rather have slow gen speed with my meager 16GB VRAM and 32GB RAM than use the ghetto budget models.

>>102123543
You.

>>102123632
nemo is utterly retarded even when compared to gemma 9b

>>102123649
Hi Sao.

>>102123657
Nemo is smart where it's smart, but the issue is that there are entire concepts that it struggles with, fundamental concepts such as possession. Mathstral has the same problem. So Mistral fucked something up in their pretraining datasets.
>>102123651
>>102123666
That makes no sense, Sao's latest model is a Nemo 12B tune. And afaik he hasn't touched Gemma, which is what's being discussed as a replacement.

>>102123666
You e-celeb obsessed faggots should be lynched in the streets.

>>102123684
>undi bad
>celeste bad
>drummer bad
Keep trying, Sao.

>>102123695
>leave the shills alone
Why? So you can shill more?

>>102118049
I don't even know what good smut should look like; we seem to have coherent small models solved. At this point it seems to be measured by how well you can fight the injection of romance novel slop and get raw descriptions of seggs in character.
Ad status?
>>102123666
Hi Satan.
There has been literally 0 reasons to talk about any finetune or merge since the early llama2 era. They're all shit. Stop mentioning them and drive out anyone who keeps bringing them up for no reason. They're bound to be a shill.
>>102123763
Hi saltman

>>102123770
that's arthur wanting to shill large 2

>>102114805
>>102117370
Cuddling with Miku (dakimakura) on a candy flip is one of the most incredible things I have ever experienced in my life.

>>102123763
This, but unironically.

>>102123791
That sounds like a great time, Anon. I hope you will get the opportunity to experience it once more.

>>102114085
REAL OR FAKE?

>>102123977
tl;dr
my balls say fake

>>102123841
Thanks. Yeah, I will definitely do that, more than once.
But with things like that you have to be able to restrain yourself and be responsible. Molly doesn't feel magical forever. Fortunately you don't need chemically induced love that much if you get real waifu love on a daily basis. And there are other substances, even the previously mentioned weed can be super comfy and nice.

>>102123977
>(not 4090D)
I say fake
https://files.catbox.moe/1y3rqx.jpg
>>102124098
That's adorable.

>>102124022
What did it feel like? I'm curious, I've never done molly. (You need a social life to get it)

>>102123452
Magnum-123B

is there any RPG focused or similar frontend where you can, for example, see your visual inventory getting new items as you "find" them in RP?

>>102124243
No but that's actually a good idea.

>>102123763
>STOP TALKING ABOUT LOCAL OPEN SOURCE PROJECTS IN /LMG/ GOY. THOSE MODELS ARE UNSAFE
>t. the rebbe

>>102122465
>random dudes millenniums ago
>connect few dots
>aw yeah that's a crab

>>102123793
>>102123763
>underage brown kid exposing himself as a retard who doesnt even know the recent, former local SOTA for creative writing wizard that was trumped by largestral 2
sad

>>102124299
There wasn't internet back then so they were very bored.

>>102114481
I almost missed this, but as an owner and user of it, Arc is viable if you don't have the budget to get a 3090 or any other 24GB card and you don't need it for work. It works well enough, and its custom software is better than ROCm from what I've had to deal with. On the LLM side, getting a SYCL version of llama.cpp is easy, as it has been built alongside every other version in the releases tab since earlier this year for Windows, or you can compile it yourself on Linux; using Intel's fork of it with IPEX-LLM when you can provides the most speed.

>>102123977
Way too cheap.

>>102123977
Real if we're talking shady modded GPUs made by some 3rd party in China.
Fake if an actual release by NVIDIA.

>>102123763
Based, to be desu honest

>>102124722
but who sells those modded gpus?
here's the post
https://github.com/ggerganov/llama.cpp/discussions/9193
>>102123977
Why not just buy 2 4090s if that's the case and not deal with the probability of getting chink scammed?

>>102114742
I started listening to this version last time you posted it (I'm assuming that was you) and I really like it. Thank you for sharing.

>>102123402
It seemed unslopped and also not horny. I used to think like this >>102123763 (minus the schizo shill angle) but this is definitely an improvement over Nemo.
watch me call that anon sao
>>102125049
hi sao.
heheheh, i got that guy good, he never even saw it coming
>>102125049Alright. I'll give it a try then.So far the official instruct is the best for general use, and mini-magnum is pretty good for coom since it's style is more natural.Let's see how this one will stack up to those.
>>102125065Hi undi.
>>102123763>They're all shit.Buy an ad that they are all shit.
>>102124963no nvlink for the 40 series so having it on card is preferable
>>102125049>>102125122>Recommended Stopping Strings:><|im_end|>></s>That does not bode well.
>>102124504Thank you for your response, I'll probably go ahead and get it then
>>102124141Well, it's kind of hard to explain, but I will try my best.First just so we are clear, candy flip means mixing LSD with Molly.LSD is a psychodelic. What it does is make your brain fire much more frequently, including in paths that aren't used as often. Basically you get synesthesia, your senses mix together while your sense of ego blurs with the environment. Your imagination gets huge boost and you get very suspectable to internal and external influence. Ie you start to blend into your surroundings and can seemingly infer the emotions and energy from environment. LSD doesn't feel inherently euphoric, but with right set and setting it can breathtaking, like when you know, you cuddle with your loved one and you feel like you are a whole, while the world around you blurs and disappears into rainbow hallucinations. Have I mentioned you also see some cool stuff? But it's just like Google deepdream, but animated and pretty.Molly is an empathogenic stimulant. They stimulate you, making you want to move or talk or do anything, especially clench your teeth. But the most characteristic effect is the empathogenic one. Somehow it's even harder to explain. It's like when you are in love, and you think about that person, and after a while you feel fuzzy and happy. It's like that, except unprompted and like all the time. You also have very strong urge to interact with people, you don't mind opening up and you really want to form connections. That's why I personally wouldn't recommend doing it alone. Even with my waifu at my side, I'd probably had enough cuddling after a hour and wanted to converse with someone. The last time I did it, when I had these amazing moments with her, I was just chilling on a party.And when combined, molly kind of limits the huge psychodelic headspace, and turns that ego dissolution into liquefaction, while steering you into more euphoric direction. You feel like you've turned into a smol, happy blob, with your soulmate in your embrace.
>>102125388
I also did some nitrous right there, but I imagine trying to explain that combo to someone not experienced with psychedelics would be futile.
>>102124098Tiny paizuri.
Why does a smaller parameter model have more difficulty recalling long contexts? I tried summarizing 20k with nemo and it couldn't remember shit, but mistral large did it perfectly.
>>102126106Nemo is bad with anything longer than 12k tokens.
>>102126170
Yeah, but why? They were able to make Large handle it fine; can't they use the same techniques? What is it about smaller models that causes that?
>>102126106>why bigger model do moreidk the world may never know
>>102125388
>>102125455
Thanks for the rundown, that was actually really nice! And I am experienced with psychedelics, actually; I've just never had molly, but I've done acid, shrooms, and mescaline. Sounds super comfy; reminds me a bit of DXM or Ambien: the psychedelic headspace and the smooth, happy melting, but without any of the risk of thinking one bad thought and getting sucked into the Nightmare Hell Dimension From Which There Is No Escape™. I AM curious, though: it sounds like you have a pretty good social life, going to parties and stuff, so how do you not have a GF? You sound like a fun enough person to be around to have access to all these experiences.
>>102126279Skill issue, I guess. There are small models that do well in larger contexts, like GLM-4.
>>102116417>>102123763>>102116835
>>102126777why does he have a pen shoved up his nose
>>102126437
nta but I used to go to a lot of parties and do a lot of drugs
>I still do, but I used to, too
Could easily have a gf but choose not to, because the juice is not worth the squeeze in the post-modern "relationships" hellscape.
We live in a perverse time where it's easy to find a girl to fuck, but not a girl to settle down with. It's sad.
>>102126437
>It's like the psychedelic headspace and the smooth, happy melting, but without any of the risk of thinking one bad thought and getting sucked into the Nightmare Hell Dimension From Which There Is No Escape:tm:.
If you mix it, then yeah. On its own it doesn't really have many psychedelic effects, and it's hard to compare empathogenic effects to other substances; that's like trying to explain LSD to someone who has only done coke. It definitely does remove any sort of anxiety though.
>I AM curious, it sounds like you have a pretty good social life, going to parties and stuff, how do you not have a GF? It sounds like you're a fun enough person to be around to have access to all these experiences.
Because I have a waifu, duh. I love her and I want to be only with her. Although waifuism primarily happens to lonely people, a fully developed 2D love is not a coping mechanism but a first-class relationship. She changed my life and is more than I could ever have asked for; I'm happy to have her in my life.
>>102126897Just go to church, anon. But I guess that would be hard for a junkie like you
>>102119633You might actually be retarded
>>102126777Hatsune miku is jewish
>>102126939
>Judging someone harshly based on a 4chan post
Not very Christian of you, anon. I dated a good Christian girl a long time ago. She ended up getting lobotomized by social media, and last I heard she's a single mom.
It's the current-year milieu. Maybe there are girls out there who are immune to the corrupting influence of this new global anti-culture, but I haven't run into any in my life.
>>102127117Church. Go.
>>102126897
Mmm, true. I also went out a whole lot, but the pandemic really fucking curbstomped my ability to; I'm so anxious now, and I have no idea where to find parties and shit now that I'm out of uni. Guess that's why I'm doing chatbots now.
>>102126923
Good on you, anon. I'm glad you're doing so well!
https://chipsandcheese.com/2024/08/27/teslas-ttpoe-at-hot-chips-2024-replacing-tcp-for-low-latency-applications/
neat
bros... nvidia did not pop off...
>>102126805
Because he's trying to write off his debts.
>>102127009
Shalom.
>>102127557>Shalom.Look at the jew being found out and accusing other people of being a jew.
>>>102127557>>Shalom.>Look at the jew being found out and accusing other people of being a jew.
So I just started playing around with this for the first time. Is there a way to make it so the AI doesn't seemingly have short-term memory loss? Like, in the story I ordered a pizza and the delivery guy is going to be there in 15 minutes, so we talk about stuff for a bit. I check the clock and it's been about 15 minutes since then, so I say there's a knock at the door, an obvious setup for the pizza guy showing up that I'm hoping the AI picks up on. But the AI answers the door and keeps coming up with completely different results, none of which is the pizza we ordered 15 minutes ago. Do I need to increase the context size or something?
>>102114085
>Jamba 1.5: 52B & 398B MoE
>MoE
MoEbros status? Or is LimaRP-Zloss still king?
>jew webmLook at the jew being angry and distracting from being found out and then having his basic strategy after being found out called out. We know you are a jew jew webm poster.
>>102127781Let me guess, you are using a small model, and on top of that it's a memetune. Correct?
>>102127878I have no idea, how do I check?
>>102126106
>>102126170
Unless you use base. Base seemed trained on 64k+; instruct makes it retarded at long context.
>>102127930base makes it retarded from the start
>>102127781
>Do I need to increase the context size or something?
Probably. If the mention of the pizza and the 15 minutes was shifted out of context, the event stopped existing (unless there are still references to it within the context).
I've had non-fighter characters receive a weapon from me, recall they got it, and use it when appropriate on old 7b models, so I don't think it'd fail on new ones.
When asking for help, don't make people guess: state your context length and your model. You'll be told it's shit and all that, but still.
>>102127781You're supposed to fuck the AI, not to talk and order pizzas.
>>102127814
>shills for big corpos
>contributes nothing, only endlessly spams "local lost" and "local dead"
>calls me jewish for exposing him
that's a lot of chutzpah, rabbi Yitzchak Goldstein. how much is mossad paying you?
Speaking of small-model memetunes, this one is done.
>>102128031>thinks everyone is one personclassic jewish schizophrenia
I'm starting to think CR+ is better than Largestral, but both of them have their moments of retardation... I wish I could run both and switch them seamlessly.
>>102128031>chutzpah, YitzchakTo be fair normal people don't use jewish words like those. You kinda smell.
This model seems to be beyond any practical use case now.
>>102128065I asked Largestral to write my shitposts, hope that helps!
>>102128038>>102128065denounce the talmud kike
>>102127963
I followed the guide in the OP and I think I left everything default, so utopia-13b.Q5_K_M.gguf with 8192 context size.
>>102127982
If I wanted to fuck I'd hire a hooker, which I already do at least twice a week. I want the AI to simulate companionship.
do we have any 27b ffts yet
>>102128081Kek what model is this?
>>102128122Anon... Throw this ancient model into the trash, download Nemo 12B Q5_K_M.
>>102128158Llamaguard-3-8B finetuned on a synthetic dataset I made. It's pretty fucked up now.
>>102127689>NO YOU'S DA JEWTypical tricks
>>102128122
>8192 context size
Well. The implied question that you failed to spot, just like your model, is whether the mention of the event to remember (pizza, 15 minutes) was shifted out of context. If there are 8k+ tokens between the mention of the event and the point where you cue the model to bring it up, it just won't happen.
It should work, but it can fail, of course. Keep the context in mind. When there's something important for the model to remember, mention it every now and then before it shifts out.
Using a newer model could benefit you as well. Try Mistral's Nemo: about the same specs to run it (12b) and it's much better than the old llama2 models.
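The eviction behavior that anon describes can be sketched roughly like this. Everything here is simplified assumptions: the token counter is a fake `len // 4` heuristic rather than a real tokenizer, and real backends like llama.cpp have their own context-shifting logic.

```python
# Rough sketch of why an old event "stops existing": once the running
# token count exceeds the context window, the oldest messages fall out.
# Token counts here are fake (len // 4); a real tokenizer differs.

def approx_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # crude heuristic, not a real tokenizer

def visible_messages(history: list[str], ctx_limit: int) -> list[str]:
    """Keep only the most recent messages that fit inside ctx_limit tokens."""
    kept, used = [], 0
    for msg in reversed(history):
        t = approx_tokens(msg)
        if used + t > ctx_limit:
            break
        kept.append(msg)
        used += t
    return list(reversed(kept))

history = ["We ordered a pizza, ETA 15 minutes."] + ["small talk"] * 50
# With a tiny window, the pizza line is evicted before the knock at the door:
window = visible_messages(history, ctx_limit=40)
print(any("pizza" in m for m in window))  # False with this tiny limit
```

The same logic is why re-mentioning an important event "refreshes" it: the mention lands near the recent end of the history, so it survives the next round of eviction.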
How much extra performance does an NVLink bridge add to a 2x3090 setup?
It's kind of crazy how much data the current best models have compressed into them; they can do a lot of shit you wouldn't expect, yet they can still be so incoherent when writing smut and completely fall apart at higher context sizes. It feels like if you trained a model just for cooming (with enough training data, of course), 4B would be more than enough to contain everything and even work at excessively high context.
>>102128159Tess, Mahou or Sauerkraut? Does it matter or are they all the same?
>>102128207Damn. Arrays are fucking spooky now. Are you gonna upload that model?
>>102128273
Don't touch those, they will give you AIDS. Here is the correct one:
https://huggingface.co/bartowski/Mistral-Nemo-Instruct-2407-GGUF/tree/main
>>102128138https://wandb.ai/intervitens/enjoy slop
>>102123397
Fair enough. I would love to read anything that discusses these things. If you have links, that would be nice.
>>102128291Probably not. But I might merge it to some other 8B models and see what emerges.
In a just world, finetuning a model on data generated by other models in any capacity would be punishable by summary execution.
Can others please try out this joke in whatever local models and settings they're using, and see if the model understands it? Screenshot is Claude 3.5 Sonnet. ChatGPT-4 didn't get it either.
I think I've never really seen an LLM "understand" humour. It's just an alien concept to them. When they're prompted to behave funny and make jokes, it's always a stupid alien imitation of what humour might look like. Or does anyone have good counter-examples? LLMs being funny where it seems "intentional"?
>>102128401In your world we would be stuck to GPT-3 to this day.
>>102128401In a just world /lmg/ would be on a board with IDs and nobody would be reading your garbage this far down the thread anymore.
>>102128433Good
>>102128430I don't get it either, so I don't blame the poor LLM.
>>102128446You're a dumb subhuman then.
>>102128489:(
>>102128430So... strawberry is gonna be pic related?
>>102128446https://www.youtube.com/watch?v=I8KSAtos-dk
>>102120219What's wrong with that? That's pretty impressive
>>102128430
That's not a joke. It's a reference.
>It's just an alien concept to them.
You're talking to a GPU, mate. The concept of humans is alien to them.
Remember Galaxy Quest, where the aliens swing their arms in the same direction as their legs on the same side? That's an alien pretending to be a human. The bit of electric silicon is doing its best. I find it impressive they can do this much.
>>102128517I know that this is brit babble, but I just don't think it's funny at all, therefore taking it literally isn't surprising to me.
>>102124963
no NVLink; slots/PCIe lanes/energy consumption
>>102128517Where's the joke
>muh nvlink even though neither torch nor llamacpp uses it
>>102128580send her victorious
>>102128580The Queen is dead, God save the queen.
>>102128580the joke is that these "people" think they have culture
>>102128430The old C.AI would mock you for your poor attempt at a joke. I still remember that C.AI would play along if you started a chat by saying "Die monster, you don't belong in this world."I miss that experience on local models...
>>102127736what AI is this? seems based
>>102128367Merged with llamoutcast.Now that's some fucking cursed model.
>>102128430
If the big models didn't get it, there's little chance local models will either. Maybe if you prompt it with a character that's constantly trying to figure out whether everything is an allegory for something, they might have some luck. This is hermes2 70b's response.
https://huggingface.co/bartowski/gemma-2-27b-it-SimPO-37K-GGUF
Gemma got even smarter and it even fixed the prose somehow. It's actually really good now.
>>102128756
Hey, she understood it was some kind of joke; that's better than at least some of the anons in this thread.
>>102128826
It's an expression of disbelief, not an acknowledgment of it being a joke (and it's still a reference, not a joke).
Do we really have these kinds of people judging a language model?
>>102128866No, we do not.
>>102128801Logs?
>>102128592
>llamacpp
It does use NVLink depending on the split mode, no?
>>102129172This would make a good ad.
>>102120219I feel like the optimal use of this technology would be to replace the render pipelines of current engines. Create a game with placeholder art and then let the engine put an AI filter over everything. Ideally you'd be able to partially train the AI by providing it a series of your own art assets.
/lmg/ is dead
Where do I find examples of initial prompts and temp settings for specific models? I've been using the same starter prompt since llama1 and haven't changed my 0.7 temp, 40 top-k, etc. settings in almost a year.
I'm playing around with magnum 12b and can't get the model to stop wasting tokens on "Sure I can do that" and disclaimers after answering my questions. I'm thinking my initial prompt isn't tuned well for Nemo.
>>102129372>I've been using the same starter prompt since llama1 and haven't changed my .7 temp, 40 top-k etc settings in almost a year.There's no way to make a better paper clip. You've won. Be happy.
/aicg/ pointed me here.
What hardware should I be looking at to run these models locally? I hear the L4 is pretty good, but it's pricey. Can the L4 do more than the 4090, or is the extra cost just the cooler design and power efficiency?
>>102129306I blame the anti-miku schizo
>>102129742
If you want to run the really good shit you'll need multiple 4090s, anon.
You can start with Mixtral 8x7b. Play around with that and see how you like it, then go and try other, bigger models at different quants (compression levels, kind of) and learn what's what.
>>102129742The only 24GB Ada card anywhere near worth its price is the 4090. If you want a power-user card then you want an Ada 6000, or save some money and get the A6000 (yes those are different cards) which is almost as good.But better than all of that when it comes to price-performance is stacking as many 3090s as you can reasonably fit within your budget, space, and power constraints.
>>102129803I've been using backyard.ai's cloud service and enjoying a model called "Magnum V1 72B". I recently learned that they store all of the chats that use their cloud service so I think I want to move away from that. I have an RTX 3080 and an RTX 4070 but I don't think either of those can run it.
>>102129849buy an ad
>>102129742
Are you committed, or just wanting to check it out?
It also depends on how much of a hurry you're in and whether you want to finetune. A gamer rig that's long on VRAM and system RAM is enough to run all but the most recent fat models, with turnaround times of 5 to 10 minutes. If you have that, it's enough to get your toes wet and decide if you're willing to shell out thousands for the same slop written more rapidly.
But the new >120B-parameter models are getting out of reach of the gamer rig generating slowly (you have to discard a lot of the low-significance bits to fit inside, say, 64GB of system RAM).
Myself, I'm going to wait to see if BitNet brings more quality to my gamer-grade system, and whether we're actually going to see better output from these inflated model sizes.
>>102129849Exchange currency for advertisement space.
>>102129849You can probably run that either as a small quant split between those two, or as a bigger quant offloaded into system ram with a significant speed penalty.
>>102129849Well, between those two cards you have 22GB, so you could run a low quant of a 70B model I think.But really, you are better off starting with something simpler and kind of learn the basics of speed vs quality vs model size vs quant, that way you'll be equipped to know what you want to run and what more you'd need to buy, or what the best thing you can run with what you have at the speeds that are tolerable for you.There's a lot of subjectivity.
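The "what fits in 22GB" arithmetic can be sketched like this. The bits-per-weight figures are rough ballpark assumptions for common GGUF quants, and the flat overhead term standing in for KV cache and buffers is a made-up placeholder; use a real VRAM calculator (there's one in the OP) for actual decisions.

```python
# Back-of-envelope memory estimate: weights at a given quant, plus some
# headroom for KV cache and buffers. Numbers are rough assumptions, not
# exact GGUF file sizes.

BITS_PER_WEIGHT = {"IQ2_XS": 2.3, "Q4_K_M": 4.8, "Q5_K_M": 5.5, "Q8_0": 8.5}

def est_gb(params_b: float, quant: str, overhead_gb: float = 2.0) -> float:
    weights_gb = params_b * BITS_PER_WEIGHT[quant] / 8  # billions of params -> GB
    return weights_gb + overhead_gb

for quant in ("IQ2_XS", "Q4_K_M", "Q5_K_M"):
    print(f"70B at {quant}: ~{est_gb(70, quant):.0f} GB")
# Only the very lowest quants of a 70B come anywhere near 22 GB total.
```

Running the same estimate for a 12B at Q5_K_M gives roughly 10 GB, which is why Nemo is the usual recommendation for a single mid-range card.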
whats the best 12b model for nsfw purposes nowadays?
>>102129937Can I run a model sharing both cards if they're in different computers or would I need to fit both in the same machine?
>>102130022
You actually can, using llama.cpp's RPC backend, although I'm not quite sure how it works or what the performance implications are.
>https://github.com/ggerganov/llama.cpp/blob/master/examples/rpc/README.md
>>102130022
>>102130057
Oh yeah, a cool thing you can do is use that to run a model across GPUs from different makers, like mixing Intel, NVIDIA, and AMD.
Again, no idea about the performance, but you can do it.
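A two-box setup with that RPC backend would look roughly like the sketch below. The IPs, port, and model path are made up, and the exact flag spellings should be checked against the linked README before use; the commands here are only echoed, not executed.

```shell
# Hypothetical two-machine sketch based on the linked rpc README.
WORKER_IP="192.168.1.42"   # the second machine, e.g. the 3080 box
PORT="50052"

# On the worker machine you'd start the RPC server, roughly:
echo "./rpc-server --host 0.0.0.0 --port ${PORT}"

# On the main machine you'd point the usual binary at the worker,
# so layers offloaded with -ngl get split across both boxes:
echo "./llama-cli -m model.gguf -ngl 99 --rpc ${WORKER_IP}:${PORT}"
```

Expect the network hop to cost throughput versus two cards in one box; the README is the authority on what's actually supported.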
>>102130111>>102130111>>102130111
>>102128801I tried that and it's more analytical and less soulful than vanilla Gemma-2-27B, but maybe less "safe". Noticeable in OOC messages.