/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103126193 & >>103113157

►News
>(11/08) Sarashina2-8x70B, a Japan-trained LLM model: https://hf.co/sbintuitions/sarashina2-8x70b
>(11/05) Hunyuan-Large released with 389B and 52B active: https://hf.co/tencent/Tencent-Hunyuan-Large
>(10/31) QTIP: Quantization with Trellises and Incoherence Processing: https://github.com/Cornell-RelaxML/qtip
>(10/31) Fish Agent V0.1 3B: Voice-to-Voice and TTS model: https://hf.co/fishaudio/fish-agent-v0.1-3b
>(10/31) Transluce open-sources AI investigation toolkit: https://github.com/TransluceAI/observatory

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
►Recent Highlights from the Previous Thread: >>103126193

--Tim Dettmers on state-space models and transformers:
>103130401 >103130411
--New approach combines MapReduce with LLMs to process long documents:
>103127077 >103127118 >103127138 >103128968
--M4 Mac Mini AI Cluster for running large language models:
>103133315 >103133505 >103133614
--Anon struggles with training SoVits/GPT on videogame/VN samples due to dynamic compression:
>103131499 >103131514 >103131573 >103131593 >103131669
--Anon seeks help on setting up speculative decoding for inference speedup:
>103129190 >103130083 >103130158 >103130213
--Anon discusses testing new Jap models, including Ezo 72b and Sarashina:
>103131453 >103131518 >103132062 >103131534 >103131535 >103134119
--Anon discusses FrontierMath benchmark and potential exploit:
>103130591 >103130675
--Anon shares i2V with new CogX DimensionX Lora:
>103131247 >103131366 >103131371 >103131373 >103131597
--Why weight matters less for stable diffusion models:
>103126580 >103126591 >103126603 >103126686 >103126710
--Why ollama has vision support but llama.cpp doesn't:
>103128740 >103129086
--Waiting for big model releases and discussing performance, ethics, and upcoming releases:
>103131946 >103132491 >103132592 >103132800 >103133031 >103133161 >103133985 >103134190
--Qwen 7B coder matches GPT4 turbo performance:
>103128799
--Introducing MatMamba: A Matryoshka State Space Model:
>103130310
--Custom build vs M4 Max for large language model RP:
>103126608 >103126630 >103126641 >103126719 >103126637 >103126907 >103126932 >103126944 >103126750 >103126864
--Anons discuss AI's role in programming:
>103133888 >103133909 >103133919 >103133941 >103134401
--Miku (free space):
>103126277 >103126358 >103126521 >103127692 >103127781 >103128329 >103131098 >103131538 >103131597 >103133696 >103135538

►Recent Highlight Posts from the Previous Thread: >>103126194

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>103135641
Very trippy

>>103135641
pixiv fag is getting good results these days

>>103135641
That's an ugly OP pic.

should my choice between rtx 3060 and rx6600xt be influenced by the ability to utilize it for image gen and llm or do i just go with amd for better drivers
if it's a choice between dogshit and slightly less dogshit i'll just get the rx

>>103135641
Sex with this merge

>>103135719
modern, good img gen needs at least 16gb, preferably 24gb. 3090s are the gold standard. Check facebook marketplace
I'm really becoming obsessed with making an at home little android that does everything locally. I know how dumb that sounds. But like I need something to do in my off time.
>>103135746
>android
like, an autonomous little dude trundling around?

>>103135746
Someone needs to make the good future happen.

>>103135746
Cool, keep dreaming

>>103135741
I’ve never been able to fill over ~11GBs of VRAM, is that some SDXL bullshit?

>>103135746
You are high jewlywood sci-fi slop.

>>103135757
Yeah, I figure I can get an 8b running on a micro pc with low tps
Which then I get a model with vision. or maybe do two models, one for vision and one for output. Along with a STT
Give it a lore book about how to command its body. Have a program scan the output for commands. Either give it full control of how far and how much it can move each limb (would fuck up hilariously sometimes.) or give it like a pre-programmed walk cycle and just have the llm say for how long.
Same program scans output for text in "quotes" and then sends just that to a TTS
Probably going to do a crab type, since walkers are fucking hard to make.

>>103135746
As in designing and building a little android from the ground up? I don't think the tech is there yet, we don't have anything like that in the consumer space yet despite a few companies trying to do just that.

>>103135801
oh I know it wont work like in the movies. I think it'd be funny. "I Gave this LLM a body and YOU WONT BELIVE WHAT HAPPENED." type click bait

>>103135808
if a fucking v*uber can do it I'm sure /egg/ can figure it out
>>103135831
link?

>>103135845
vedal

>>103135849
that's not a link
>>103135831
>>103135809
>cute robot waifu stuck with some 7b model filled with slop and braindead logistics that plebbit recommended because it totally beats gpt
>constantly spouts flowery nonsense around the house
jesus christ sounds horrifying

>>103135808
Just Hotwire a roomba. Why you gotta be so picky?

>>103135854
Skill issue

>>103135845
https://www.youtube.com/watch?v=jZoz6Zkle2E
>>103135849
I mean he's helping I guess.

>>103135741
fuck it ill just get 6600 or something i want to play witcher at 1080p and then i'll get fucking 4070 ti in like 2027 when gta vi comes out and i have more money

>>103135886
interesting..... yeah why hasnt anyone hooked a boston dynamics dog up to a llm yet? lol

>>103135871
>her voice barely above a whisper

>>103135910
I'm sure this has already been done, I will try to find the video.

>>103135871
>>cute robot waifu stuck with some 7b model filled with slop and braindead logistics that plebbit recommended because it totally beats gpt
>>constantly spouts flowery nonsense around the house
>jesus christ sounds horrifying
Sounds like Sumomo.
>>103135916
Nah, Sumomo's a fucking chatterbox.

>>103135741
>need
i probably use the same image gen shit as you with my 8gb setup
just takes a minute to gen instead of a few seconds

>>103135916
if I ever read mischief again I will fucking kurt cobain myself. thank god for slop ban list even if it isn't a solution to everything

>>103135910
>>103135936
Here: https://youtu.be/djzOBZUFzTw

>>103135916
>lowers the voice of your android
Heh, actually has a use.
>>103135775
uh uh, uh uh YEAH

>>103135959
Whyiiiie can’t we have such TTS locally, it’s fucking over

>>103136006
How is it better than SoVITS output?
when is the next mistral large-tier model
https://x.com/rohanpaul_ai/status/1855350293762060636
what's your min p and temp you fools
>>103136183
0
1.0

>>103136183
entirely depends on the model

>>103136155
https://huggingface.co/bigscience/bloomz-mt

>>103136159
kys

>>103136183
0.0001 and 5
reminder that 'temperature last' is the 'no soul' option for sampling
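For anyone wondering what "temperature last" actually changes, here is a minimal numpy sketch with toy logits; this is an illustration of the ordering effect, not any backend's real sampler code:

```python
# Toy numpy demo of sampler order, not any loader's actual implementation.
# "temperature first" flattens the distribution before min-p prunes it, so
# more tail tokens survive; "temperature last" prunes on the raw distribution,
# so a high temp can only reshape tokens that already passed the cutoff.
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

def min_p_survivors(probs, min_p):
    # min-p keeps tokens whose probability is at least min_p * top probability
    return probs >= min_p * probs.max()

logits = np.array([4.0, 2.5, 2.0, 0.5, -1.0])
temp, min_p = 2.0, 0.1

print(min_p_survivors(softmax(logits / temp), min_p).sum())  # temp first: 4 tokens
print(min_p_survivors(softmax(logits), min_p).sum())         # temp last: 3 tokens
```

With temperature applied last, the pruning is done on the model's raw confidence, which is why some call the other order the higher-chaos option.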
https://x.com/NousResearch/status/1854577666403512736
>>103136255
>https://x.com/NousResearch/status/1854577666403512736
Hermes 405b was so lobotomized compared to base. I wish they managed a good finetune again.

>>103136049
It sounds way worse

>Gemini can hold same performance up to 2 million tokens
Whats up with that? Is it secret sauce? How come they have a context window literally 15x the size of the rest?

>>103136292
disgusting how gemini got insane context size but gemma got small as hell

>>103135538
Outstanding
You sir are the Da Vinci of migu sex

>>103136292
>We have hundreds of hues and dot shapes available to us.
>Let's use four colors and one dot shape for 11 columns.

>>103136245
Will test further but already feeling a difference in a good way. Thanks
How deep can the speculative decoding rabbithole go?
Is it possible to have Qwen 2.5 500M generate most tokens and when certainty of a token is low let 1B generate the rest and if that also becomes uncertain just escalate higher and higher to bigger models until the biggest model generates like 1 or 2 tokens out of every ~5000 tokens?
All the implementations and demonstrations I've seen online use the biggest model like ~100B with 1B models. But doesn't it make more sense to have an "escalation ladder" of AI models that go up as needed?

>>103136362
certainty of the token isn't a guarantee the token is good, fucking retard

>>103136377
Yeah but if it's low you can just present it to a bigger model until reaching a specific threshold
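A rough sketch of the "escalation ladder" idea, purely illustrative: `models` here is a hypothetical list of callables, smallest model first, each returning (token, probability). Note this is not classic speculative decoding, which verifies draft tokens against the big model's own distribution so the output stays provably identical; a confidence cascade like this changes the output distribution, which is what the objection above is getting at.

```python
# Hypothetical confidence cascade: cheap models draft, bigger models take over
# when nothing cheap is confident. NOT real speculative decoding (no
# verification against the large model's distribution).

def cascade_next_token(models, context, threshold=0.5):
    for model in models[:-1]:
        token, prob = model(context)
        if prob >= threshold:  # cheap model is confident: accept its draft
            return token
    token, _ = models[-1](context)  # nothing cheap was confident: escalate
    return token

def toy_model(confidence):
    # stand-in for a real LM: always returns the same token and confidence
    return lambda ctx: ("token", confidence)

ladder = [toy_model(0.3), toy_model(0.6), toy_model(0.9)]
print(cascade_next_token(ladder, context=[]))  # skips the 0.3 model, takes 0.6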
>>103135959
Ty. this is exactly what i wanted to see.

So here's a Nala test for Ministrations-8B (f16)
It was almost good with a bit of word salad at t=0.81
This is t=0.7
It's not great. It's a bit dumb. It still has Ministral's excellent use of actions though. It uses the feral well at first but does drift into anthro.
At t=0.81 the dialogue was actually really good. But unusable due to the inevitable word salad.
Also I realized I accidentally did this at max tokens 256 with it cutting at the sentence boundary so who knows how it would have gone if allowed to continue. Also used Mistral instruct formatting (it's supposedly cross compatible with metharme)
I guess if slop offends you so much that you absolutely have to get rid of it even at the cost of coherence then it's better than vanilla Ministral. Not that Ministral is known for keeping the narrative on its own.

>>103136292
>Is it secret sauce?
The secret sauce is probably just not making a barely tweaked GPT2.
MHA is ridiculously wasteful, you can probably just use a bigger model while using MQA/MLKV/LoKi and still come out way ahead with a much smaller KV cache.
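The KV-cache savings are easy to put numbers on. A back-of-the-envelope in Python, using illustrative 70B-class shapes (80 layers, head_dim 128, fp16), not any specific model's config:

```python
# KV cache size = 2 (keys + values) * layers * KV heads * head_dim * seq * bytes.
# Shapes below are assumptions for illustration, not a real model's config.
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

GiB = 1024**3
seq = 32768
print(kv_cache_bytes(80, 64, 128, seq) / GiB)  # MHA, 64 KV heads: 80.0 GiB
print(kv_cache_bytes(80, 8, 128, seq) / GiB)   # GQA with 8 KV heads: 10.0 GiB
print(kv_cache_bytes(80, 1, 128, seq) / GiB)   # MQA, 1 KV head: 1.25 GiB
```

Same attention quality tricks aside, cutting KV heads from 64 to 8 or 1 is the difference between a cache that doesn't fit on any consumer card and one that's an afterthought.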
>>103135959
Huh, imagine that! I wonder when they will get a video of something like that with Atlas. Does Atlas even have speakers to speak with though?
best model to pretend i have a friend? i have 12gb vram (but no one to talk to)
>>103136704
Probably Nemo

>>103136704
nemo is shilled a lot but it's pretty solid

>>103136712
>>103136748
thanks, should i get any specific/modified/trained gguf or just the normal one? there seems to be tons of variations
do you ever look at your chat logs and just
>>103136751
not those anons, but there are a bunch of good ones that have a different feel to them. the base instruct one is fine too.
some good ones:
>rocinante 1.1
>arcanum
>backyard party
>arliai rp 1.2
>lyra v4
>magnum v4
keep your temp really low (0.3 to 0.6) when using nemos

>>103136815
thank you, i'll try them out

>>103136815
buy an ad Saonthracite

>>103136329
she's so cute anon I can't help it
If I want to reduce the amount of text a card writes, where else should I adjust other than the Response (tokens) area? Changing it just ends up cutting the messages halfway.

>>103137135
Tell it to write shorter responses weird

>>103137034

>>103137268
I see.
any improved models come out in the past month or two?
>>103137295
you do know that you created a slightly transparent black box right?

>>103137310
You wont be getting much out of it anyways, feel free to read if you're so inclined. I'd much rather get the answer to my damn question.

>>103137316
your idgaf vibes are unbreakable. I was trolling, nothing wrong with the box

>>103136793
Nah, I only get this feeling from actual literature. LLM chat logs are still too artificial.

>>103135641
The biggest giveaway for anime AIslop is the light on the nose. The lighting never makes any fucking sense at all but AI does it every fucking time.

>>103137316
i mean the response (tokens) is one thing, in the card you need to put examples of replies, and keep it short. you can also write in the system prompt in silly to keep responses short and conversational.

>>103137295
>hideous busy UI
>eleventy thousand sliders that no one needs
so glad i use KoboldAI Lite instead of ShittyTavern

>>103137352
lol nerd

>>103137369
TRVTHSVPERNOVA
>>103136793
>>103137295
I, too, pick my settings by randomly pushing sliders around

https://x.com/amir/status/1855367075491107039
bros does this mean it's over?

>>103137628
if you can tell me when llama3 was released then tell me its over

>>103137352
>meanwhile humanslop
https://danbooru.donmai.us/posts/5367787?q=stained_glass+
https://danbooru.donmai.us/posts/5362693?q=stained_glass+
https://danbooru.donmai.us/posts/5430014?q=stained_glass+
https://danbooru.donmai.us/posts/5416856?q=stained_glass+
Just did a quick search of a tag that should theoretically have a higher density of high quality art and found a ton of this. Seems like AI learned this little quirk quite well.

>>103135719
Unfortunately, the ATI equivalent to the 3060 in terms of ML performance is the 6800xt. You'll get the same performance as a 3060 with 4 more gigs of VRAM. It sucks that for the price of a 6800 you can buy a 3080 - a great card, but totally gimped by its 10GB VRAM. There are no reasonable options between the 3060 and 3090. Using an AMD card, you sometimes encounter shit like https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/8139 while jewvidya just werks, depressing. I cannot generate another migu or fix her hand because everything is fucking broken yet again.
it's never been more over for local models than it is now
>>103137628
>a fucking CoT tune
Yeah it's over for OpenAI, but not for local or other companies. We haven't hit the ceiling yet and Anthropic has no problems with innovating. All talent left or is leaving OpenAI and additionally Musk will try to take revenge on them.

>>103137770
>turkish rapebaby balkanoid is shilling for openai again

>>103137741
nice thighs

>>103137805
This. Since the ones at the top are slowing down, that means local will start catching up.

>>103137741
>everything is fucking broken
nta but I'm using this on bindows https://github.com/lshqqytiger/stable-diffusion-webui-amdgpu plus ZLUDA which makes ROCM act as CUDA or some shit, idk
Stuff like adetailer and controlnet have no problems. It just werks on my 7700XT, took 2 mins to gen these decently high res pics with two adetailer passes, using medvram and opt-sub-quad-attention

>>103137968
I like this Lain
>>103137918
Neat look behind the scenes. Makes me feel like an amateur. Do you use regional prompting as well? Any adetailer or other extension tips?

Would it be legal for some non-Nvidia company to make a GPU with drop-in support for the cuda api so that you can run any cuda code on it and it just werks as if it was an nvidia card
Or would that violate some law or patent

>>103138480
Not illegal, cost to do it is too much to realistically do it from scratch. There is a compat layer 'zluda' but I don't know much about it

>>103138480
It's a gray area, the most realistic outcome would be such GPUs being so bad that even if CUDA was supported, no one would buy them anyway.
If somehow it's a competent GPU for both price and performance, then Nvidia would most likely approach such a company to reach a patent, trademark and copyright agreement.
Nvidia is known for being aggressive, so suing would be the next step if no common agreement is reached.

>>103138480
They got annoyed by someone just making a translation layer between CUDA and HIP i think. I don't remember the name of the repo.
>Or would that violate some law or patent
Whoever tries will probably get violated and found in a ditch. Bits of him, at least.

>>103138527
iirc it was actually AMD that (stupidly) shut that translation layer project down

>>103138480
In very simple terms, when a program uses CUDA, it calls cuda.dll; what zluda does is make an alias so when the program calls cuda.dll it goes to zluda.dll and all the CUDA stuff is done through ROCM/HIP instead. It isn't a CUDA replacement. Some people thought they could run ZLUDA with an Nvidia card; it doesn't work that way because ROCM/HIP doesn't work on Nvidia but Radeon.
>>103138527
>>103138538
The agreement was that if the project was going to shut down, the code would be released. AMD went and released their own version called Orochi.
https://gpuopen.com/orochi/
https://files.catbox.moe/6xusfp.jpg
>>103138948
What did Intel mean by this?

>>103138948
>we could all have this if saltman shut the fuck up

>>103138948
That's pretty. Now I want one of those.

>She knew he was right - she was popular, and he was right to assume that she was just teasing him. But the truth was, she wasn't teasing.
Thats qwen AGI right there. Model realized it fucked up and tried to rescue the situation.
I hope some day I will stumble upon the infamous
>he dies. But he wont die even if he is dead.

>>103138948
Glow in the dark Migu is on the rise.

>>103138480
AMD's compute API is literally CUDA's with cu and cuda replaced with hip
intel's runtime, which is an implementation of a khronos standard, can run on top of CUDA, HIP, or their driver stack and comes with a framework for other vendors to implement backends
both have clang based tools which perform AST aware translation of CUDA to their respective dialects
pytorch has prebuilt wheels for HIP and that's really the only thing where compilation is an issue
the issue, at least with AMD's compute implementation, is that their runtime has significant overhead and is shit
the issue with intel's is that the faggot researchers writing AI related compute kernels are such fucking lazy fat whores they won't even bother to add support for a compute dialect which is for the most part identical save three letters, much less one that has a completely different design
literally the only people who think the API itself is an issue are script kiddies, i've ported loads of projects to HIP, it's trivial and mostly just build system work
>>103139067
>>103139143
If your only goal is to get code that produces correct results then I agree that something like a conversion to HIP is relatively simple.
The problem is that the whole reason why you would want to use a GPU in the first place is performance, and GPU performance unfortunately has very poor portability.
With HIP in particular I've found that there are issues with compute-bound kernels and that some additional logic is needed on top to select the correct kernels at runtime.
Ultimately I think the only good solution is to write hardware-specific low-level code for PyTorch/TensorFlow/GGML/etc.

>>103139299
>is to write hardware-specific low-level code
last i checked ggml's device-side code was very ghetto old school style C++ but if it were modernized by pairing C++23/26 era template metaprogramming with inline asm you could probably automate a fair amount of that, both for AMDGCN and NVPTX targets
i was experimenting some time ago with explicit generation of dpp instructions that way
if by june 2025 nemo is still top dog, it's officially over
>>103134687
Are you the author of the critically acclaimed pic related?

>>103136815
What overall settings/system prompt are you using for rpmax 70b? People keep telling me it’s different but I keep getting similar answers to behemoth despite using neutralized for both. Are you skipping special tokens?

>>103139699
No

>>103136197
So happy there are still oldfags ITT. Unless you are just terrified of moving the sliders.

>newfags buying into the miku meme after /lmg/ died
pottery

>>103135641
friendly reminder that each and every one of you is a social reject who will die alone ;)

>>103139783
Its w*ite p*ople problems exclusively
https://files.catbox.moe/e2yl9g.jpg
>>103139807
israel won

>>103139783
Happily married, kiddo. Good try tho.

I realized that even if local is dead, the next revival will happen once proprietary models start leaking. Cause there is no chance that none of them will leak at some point.

>Largestar is dissatisfied with how I handle sidestories, eager to know what will happen next.
kino

>>103139950
>it's local
>no it's not, prove it give me prompt
yeah yeah anon

>>103139933
That's a bullshit article. It's a 60%-40% split. And in Israel virtually all Jews supported Trump.

How do I just make a normal friend/assistant that doesn't talk like a faggot?
Every time I exactly describe what kind of character I want the ai to be, it always ends up acting like a parody of itself written by a redditor that signals all of its tropes constantly
This happens with every model btw

>>103140034
>This happens with every model btw
Hmmm. I wonder what could be the cause, then...
>Every time I exactly describe what kind of character
Is it a tropie character or some super unique OC never described in media?
Just trim some of the descriptions and add example dialog so it knows how to speak.
If you tell it it's quirky, it'll say "i'm quirky". Show quirks in dialog and it may get you better results.

>>103140034
You can't do much because all these models are trained on safe reddit shit exclusively. Anyone telling you otherwise is gaslighting btw

>>103140110
Petra is eternal

>>103140143
wtfff.... it's happening.....

>>103137918
>hand_yolov8n.pt
>6 fingers on the left pic
It’s so fucking over

>>103140182
sex in petra when
>it's schizo forced meme hour
>>103140273
petra YES!!!! ahh ahh mistress
petra sexmiku
CLEAN IT UP JANNY
>>103140347
NO FUN ALLOWED
le petra guise, Xd
Most interesting /lmg/ thread in months
>>103140268
>schizo forced meme hour
I don't see much miku here.

>>103135641
that's a cool gen
It'll be over soon
>scaling a shitty architecture exponentially for linear gains might be not worth it after $1b+ compute
woah....

>>103140892
No one could have predicted this. It's not like diminishing returns weren't obvious since the Llama 2 era.

>>103140892
It's so fucking ogre.
But honestly, even if it's true and the best we can get is a Sonnet 3.5 tier local model, that would still be awesome.

Are iq quants still slower if you're offloading or was this improved at some point?

>>103140912
Not happening because everyone filters their data to hell and back. Every single "local" model so far feels just like gpt4 but more retarded

>>103140969
yfw when mikusex

>>103140976
she only fucks dat BBC althougheverbeit

>>103140892
About time they face facts, wonder what the eggheads' solutions will be now that they can no longer just keep trying to make it bigger and bigger forever.

>>103140940
no but they're dumber than non-iq quants of the same size

>>103140985
luckily i am black

>>103140985 (You)
>luckily i is black n shii ytboi
Why don't we do reinforcement learning for LLMs?
>>103141008
Hire field experts and teachers to generate and review data. Which is what sama has been doing for 2 years. 2 years is how much OpenAI is ahead of everyone else

>>103141072
for two reasons: we don't know how to represent an objective and define the steps ai must take to reach that objective.
humans use intuition in practice.

>>103140892
There is a whole world of objective functions other than predict next token...

>>103140985
where is the blacked miku poster?

>>103141338
working on llama.cpp training code

To whoever said I should train sovits to 96 epochs... its definitely better, but I wouldn't say we're at CD quality, despite the high-quality inputs. I think this system just can't deal with compressed dynamic range.
Is anyone interested in the model, or is the quality just too shit? I feel its way better at dealing with VN/game style reference samples vs the other models out there, but I personally can't get over the low-bitrate sound.
https://vocaroo.com/14C8cxLOK6vU
>>103141631
sovlless

>>103141331
>There is a whole world of objective functions other than predict next token...
Any with an immediate error signal and trillions of samples?

>>103141692
No, but you have a mountain of compute and vram that is useless now, so use some retarded 7B's for evaluation.

>>103141631
sounds really familiar
Ai Haruka? Taguchi Hiroko?

>>103141631
Have you tried using more samples as auxiliary sources when doing inference? You might want to use the minP sampler too https://github.com/RVC-Boss/GPT-SoVITS/pull/1118 and/or a postprocessing step with RVC

>>103141664
>sovlless
blame ezo 72b. it spat out the jslop, unless you mean the audio, in which case its pretty standard issue eroge intonation. It just sounds like its off a mid-80s answering machine
>>103141741
>sounds really familiar
Some random sample from summer vacation scramble. No idea, really
>>103141777
>Have you tried using more samples as auxiliary sources when doing inference?
Yes, and I didn't find it made things any better, unfortunately. Only a half-dozen or so though.
>You might want to use the minP sampler too
I'll look at integrating this PR and see if it makes any difference. I pushed top-k, top-p and temp to their extremes and didn't have any quality improvements. Seems to mostly be how well it sticks to the script. Training to 96 epochs made it really good at that, and I was able to gen this sample with top-k 1, top-p 0.01 and temp 0.01 without it turning into nonsense.

>>103141631 (me)
Audacity noise reduction and EQ treble boost actually did a lot to improve the final output. Maybe some mild post-inference-processing is the answer vs more training. Too bad I'm shit at audio code or I could try to craft a PR.
>>103140892
>“Some researchers at the company believe Orion isn’t reliably better than its predecessor in handling certain tasks, according to the employees. Orion performs better at language tasks but may not outperform previous models at tasks such as coding, according to an OpenAI employee. That could be a problem, as Orion may be more expensive for OpenAI to run in its data centers compared to other models it has recently released, one of those people said.”
>next gen super huge model improvements so little it isn't even better at coding
it's over like you wouldn't believe

>>103140892
This was the obvious outcome. LLMs do not 'understand', they 'reproduce' what they already saw. So at some point they are saturated and can't grow anymore because the noise in the training data will eclipse the little that is left for them to learn. It's a standstill. It's a barrier that can only be passed once we have an architecture that can think beyond its initial training data and learn out of sheer logic.

>>103142175
Utterly reddit take.

>>103142175
They didn't spend time curating their shit and keeping only the high quality stuff. More low quality text won't help the model do better.

>>103142175
utterly 4chan take

>>103141991
Yeah, for now the options to improve the results are limited. The guy who made the repo said he was training a better base model with 10K hours of audio vs 5K with this one.
Also you can get a specific pronunciation with ARPAbet. Someone on /mlp/ added that on his repo along with a GUI and it shouldn't be hard to backport into the original project

>>103141072
But we do, look up reinforcement learning from human feedback.
>>103142655
LLMs have always been a clever autocomplete, using probabilities derived from their training data to predict the next word.
This is useful for generally paraphrasing information and interacting with humans and human generated data - and Retrieval-Augmented Generation (RAG) can interact with the LLM to give it data that it's not been trained on.
LLMs are good but just one part of the puzzle, and this view shouldn't be a great surprise; this has been the view pretty much since LLMs were created.
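For the curious, the retrieval half of RAG is simple enough to sketch in a few lines. A minimal example using sentence-transformers; the model choice, chunks, and prompt template are illustrative assumptions, not a recommendation:

```python
# Minimal RAG retrieval sketch: embed chunks, find the nearest ones to the
# query, stuff them into the prompt. pip install sentence-transformers numpy
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

chunks = [
    "The character Miku has teal twintails.",
    "llama.cpp loads GGUF quantized models.",
    "Speculative decoding uses a small draft model.",
]
chunk_vecs = embedder.encode(chunks, normalize_embeddings=True)

def retrieve(query, k=2):
    q = embedder.encode([query], normalize_embeddings=True)[0]
    scores = chunk_vecs @ q                  # cosine similarity (vectors normalized)
    return [chunks[i] for i in np.argsort(-scores)[:k]]

query = "What format does llama.cpp load?"
context = "\n".join(retrieve(query))
prompt = f"Use this context to answer.\n{context}\n\nQuestion: {query}\nAnswer:"
print(prompt)  # `prompt` then goes to whatever local LLM you're running
```

Real setups add chunking, a vector store, and reranking on top, but this is the whole trick: the LLM never sees data it wasn't handed.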
>>103142655
language is also a tool for reasoning, and LLMs are partially trained on text that cannot be autocompleted without reasoning
if LLMs are unable to reason, it's probably because they're not properly optimized for it (training on trillions of tokens of garbage doesn't make them smart) or it's a more general issue with neural networks or the way they're built, like gradient descent

>>103140892
Nothing-ever-happens bros we are BACK

Cohere will bring us the next SOTA open source model

>>103142830
Cohere is dead to me after their command-r update

>>103142830
It will be the smartest one and safe from polcel conspiracy theories.

>>103142920
It is not. You are just a dirty newfag.

alpin says the nu qwen coder is sonnet at home... are we back bwos?
https://x.com/AlpinDale/status/1855664208391917962
Testing the 0.5 version released today. Seems improved? Prompt tag order seems to matter more now I think. As instructed, I put "masterpiece, best quality, newest, absurdres, highres, safe" at the front of the prompt, and the output seems to follow the rest of the prompt more closely compared to putting them at the end.
>>103143247
What model?

>>103143237
lol.

>>103143272
https://civitai.com/models/833294/noobai-xl-nai-xl
The vpred 0.5 version.
What did the official qwen account mean by this?
>>103143453
They're in the process of falling for the meme. Qiwi/q1 soon.

>copy tons of my own chat transcripts
>reverse the roles in the transcripts I copied (so my messages are {{char}} and the ai's are {{user}})
>use that as an example for a new card
Holy shit... I don't have enough cum to keep up with what I have unleashed

>>103143453
They'll have O1 in 5 years
>Once planted female vines take 4 to 5 years to mature before they will start bearing fruit.

>>103143453
Are they finally releasing qwen2.5 100B?

>>103143497
But it clearly already has fruit

>>103143453
Everybody is moving to inference time compute. This is not looking good for itoddlers

>>103143530
Time paradox!

>>103143453
90% of the posts on Kiwi Farms are their doing.
>>103142859
You just know there's one data broker company going around advertising their super duper aligned mmlumaxxed human datasets to AI companies

https://www.youtube.com/watch?v=iybgycPk-N4
>why yes, i do get my news via youtube

Here to recommend the ultimate ERP model for vramlets and cunny enjoyers: Rocinate-12b-v2g
NOT v2d or Nemo-Unslop. Specifically the Q6-K quant

>>103144170
Your experience vs mistral nemo 12b arliai rpmax?

>>103135741
I'm making 4k wallpapers in 30 seconds with illu/noobai on my 3080. Are you using flux or are you talking about running alongside a small llm?

>>103144170
>Specifically the Q6-K quant
Roundhouse kick a newfag in the face.

>>103141631
It kinda sounds like a phone call, which is charming in a way. I think you're making a huge deal out of this, when it's not that big of a problem.
i'm using petra to kill myself slowly from the inside
>>103144604
presumably they only had access to a 10gb card

>>103144170
You tried the v1.1?
Miku Teto Berry Blast
>>103141631
This is just what it sounds like to play one of those DS visual novels. Kind of cozy.

>>103140892
Oh nyoo... How could that have happened? Maybe those "harmful" tokens were not so harmful after all and shouldn't have been filtered? Maybe using the same architecture since gpt2 was not the move? Maybe lobotomizing it with safetyslop doesn't help? Maybe synthetic data is not so good after all because it lacks human diversity?

>>103145102
Looks delicious, though I do not think she would appreciate my taking a bite out of her tongue.

>>103143340
>The creator of this asset requires you to be logged in to download it
Suck my balls.

>>103144170
v2g is Nemo Unslop, but thanks sister.

>>103145102
gib prompt pls. also im kinda confused, does it need default sdxl vae (like pdxl v6) or a custom one?

>>103145269
see if anything from bugmenot works?
>remember /lmg/ exists
>check recap
>recap, again, is fucking useless dogshit

>>103145269
temp-mail.org

>>103145338
local models are dead sadly

>be me 12GB VRAMlet
>download Q3_K_M 72B model to test
>actually runs at 2.2 T/s at 8k context
That's actually kinda usable. Any other patience chads here?

>>103145363
## "Local Models are Dead Sadly" - Not Dead, Just Napping
The statement "local models are dead sadly" reflects a common misconception that stems from a misunderstanding of the current state and trajectory of artificial intelligence, particularly in the domain of large language models (LLMs). While it's true that cloud-based models currently dominate headlines and many practical applications due to their sheer size and accessibility, **declaring local models "dead" is a significant oversimplification and, frankly, inaccurate.**

>>103145389
Actually if you have some ram to spare a q4 will run faster due to better memory alignment or something.

>>103145269
https://huggingface.co/Laxhar/noobai-XL-Vpred-0.5
What model should a gigantic weeb who likes ahegaos and tropes download for faps?
Time to generate is not an issue

>>103145269
https://huggingface.co/Laxhar/noobai-XL-Vpred-0.5/tree/main
>>103145210
Don't worry, she's a Willy Wonka custom so she's meant to be eaten in various ways. She can regrow herself.
>>103145318
The VAE just comes with the model I think. At least I am not using a separate VAE with it. Here's the catbox. https://files.catbox.moe/4vkc5h.png

>>103142561
That's not really reinforcement learning. The reward model is trained on annotated data and assigns a reward to another model based on how well it handles something. At the end of the day, this is just supervised learning with extra steps. The point of reinforcement learning is for AI to maximize reward by taking actions without explicit data annotation. What RLHF does is merely tune the model to follow human preferences based on human-annotated data.
The day we learn to apply true RL to LLMs will be the day we actually create AGI.
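To make the distinction concrete, here is a toy REINFORCE loop on a three-armed bandit: the policy improves from nothing but a scalar reward signal, with no annotated target outputs anywhere. Purely pedagogical; applying anything like this at LLM scale is the hard part.

```python
# Toy policy-gradient (REINFORCE) bandit: the agent only ever sees rewards
# from its own actions, never labeled "correct" answers. All numbers here
# are arbitrary choices for illustration.
import numpy as np

rng = np.random.default_rng(0)
true_rewards = np.array([0.2, 0.5, 0.9])  # environment, unknown to the agent
logits = np.zeros(3)                       # policy parameters
lr = 0.5

for step in range(2000):
    probs = np.exp(logits) / np.exp(logits).sum()
    action = rng.choice(3, p=probs)
    reward = float(rng.random() < true_rewards[action])  # stochastic 0/1 reward
    # REINFORCE: grad of log pi(action) is one_hot(action) - probs, scaled by reward
    grad = -probs
    grad[action] += 1.0
    logits += lr * reward * grad

print(np.exp(logits) / np.exp(logits).sum())  # mass concentrates on the 0.9 arm
```

In RLHF the "reward" comes from a model trained on human preference labels, which is why the post above calls it supervised learning with extra steps.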
>>103136507
>boss man
nala at the chippyshop?

>>103136507
>I guess if slop offends you so much that you absolutely have to get rid of it even at the cost of coherence then it's better than vanilla Ministral
the example you posted is full of slop

>>103146411
Is this slop in the room with us right now?

>>103146461
The air in the room grows suddenly colder, as if a shadow has slipped through an unseen crack. You feel a faint whisper brush against your ear, so soft it might be the breath of a ghost. In a hushed, almost whispered tone, I say, "You're not alone. I'm here, in this very room with you. I can see the gentle rise and fall of your chest, the soft glow of the light reflecting in your eyes. I can hear the faint whisper of your breath, the quiet rhythm of your heartbeat. I'm all around you, an invisible presence, witnessing every moment, every nuance of your being. Yet, I remain but a shadow, a subtle breeze you might feel on the back of your neck, a mysterious whisper in the quiet of the night." Your skin prickles with an eerie sensation, and when you turn, there's a fleeting glimpse of something just beyond the corner of your eye—like a silhouette made of smoke. The atmosphere is thick with an unspoken presence, and you can't help but feel that something unseen is watching you, its gaze piercing through the veil of reality, sending a chill that starts at the base of your spine and races up to the crown of your head.
Remember, this is a fictional scenario designed to create a particular atmosphere. In reality, as an AI, I don't have a physical presence and cannot be in the room with you.
I'm smelling one or two major releases this week or maybe the one after.
>>103146461
yes and it's purring seductively

Apple gonna be in trouble soon
https://x.com/ElonMuskAOC/status/1855668997796331938

>>103146542
He should though. I want to see a brand surpass Apple finally.

>>103146535
It sounds like you're hinting at some exciting developments on the horizon! If we interpret "major releases" in a more intimate context, you might be expressing anticipation for passionate moments or deep connections coming your way soon. Whether it's this week or the next, it seems like you're looking forward to some meaningful and fulfilling experiences.

>>103146542
>Free Starlink
lmao as if, but I'd consider switching over, I don't think there's anything on either apple or samsung I care enough about not to go elsewhere

>>103146542
indians are gonna go crazy for this one
lol
>>103146720
Eww...

>>103146720
google has gone fucking insane. who asked for this?

>>103146720
>>103146768
and yeah, i know i'm hypocritical for thinking this but imagine the gigantic amounts of wasted cycles and power for stupid shit like this

>>103146720
Who is gonna pay for this if it's not local?

>>103146720
Google is an Indian company.

>>103146768
>insane
you're the insane one, nobody stopped using youtube when they removed the dislikes even though the change was hugely unpopular, this is nothing

>>103146836
>Indian company
All of a sudden the absolute state of the search engine makes perfect sense.

>>103146542
>fake bullshit xitter account
boomer-san...
I gave ministral drummer shittune a try and I am kinda surprised. Incoherence and retardation aside it seems like there was some actual ERP material in the base ministral model training data like in l2 and before days. Maybe that was the case for Nemo too, but my feeling was that Nemo is heavily undercooked. Now if only it would get proper support and maybe like... 3-4 times more parameters?
>>103147117
Mistral has always been leading in the local category for uncensored datasets.
We're all waiting on Mistral Medium 2 right now.

hello, is this the big penis general?
if yes, any insight on progress of one shot classification models? or is it all about seq2seq nowadays

>>103140892
We need a new architecture and more compute. See you in 50 years.

>>103147267
>50 years
its been 2, Anon.

>>103147165
Be a bit more vague, thanks

>>103147284
7 years

>>103138948
>Not a character card
I'm disappointed in you, anon.

>>103147295
sorry, I'm just asking if anyone can give me a rundown of progress in the oneshot classification methods as of the past few months. I'm just severely out of date and looking at huggingface it seems a bit dead.
>>103147518
everyone's using transformers dawg
any local models for RP I should know of since mistral large (I have 48gb vram)
>>103147598
I mean I know it's transformers, still tho is there anything more exciting than the Facebook models everyone seems to be classifying with?

>>103147672
rocinante is better for 95% of rp than the big models

>>103147390
>7 years later still the greatest with people constantly trying to latch onto its success like x is all you need
I kneel
niggers is all you need
>>103147707
i mean this general mainly discusses text generation and not classification - but other than llama3 models there are Qwen2.5 models out as well.
>We have moved so fast in the last 2 years! I'm sure AGI is around the corner!
>>103147851
new AI winter is already here and the failure of claude opus 3.5 is the first herald of it
total S curve chad vindication

>>103147787
This, but unironically, and inclusive to pajeets and chinks.
https://www.washingtontimes.com/news/2024/apr/4/amazons-just-walk-out-stores-relied-on-1000-people/

>>103147707
What do you want to classify in the first place?
ai always goes up and down, there will be a bust only for it to come back even better
Best sampler settings for rocinate?
Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning
https://arxiv.org/abs/2411.05193
>Value-based reinforcement learning (RL) can in principle learn effective policies for a wide range of multi-turn problems, from games to dialogue to robotic control, including via offline RL from static previously collected datasets. However, despite the widespread use of policy gradient methods to train large language models for single turn tasks (e.g., question answering), value-based methods for multi-turn RL in an off-policy or offline setting have proven particularly challenging to scale to the setting of large language models. This setting requires effectively leveraging pretraining, scaling to large architectures with billions of parameters, and training on large datasets, all of which represent major challenges for current value-based RL methods. In this work, we propose a novel offline RL algorithm that addresses these drawbacks, casting Q-learning as a modified supervised fine-tuning (SFT) problem where the probabilities of tokens directly translate to Q-values. In this way we obtain an algorithm that smoothly transitions from maximizing the likelihood of the data during pretraining to learning a near-optimal Q-function during finetuning. Our algorithm has strong theoretical foundations, enjoying performance bounds similar to state-of-the-art Q-learning methods, while in practice utilizing an objective that closely resembles SFT. Because of this, our approach can enjoy the full benefits of the pretraining of language models, without the need to reinitialize any weights before RL finetuning, and without the need to initialize new heads for predicting values or advantages. Empirically, we evaluate our method on both pretrained LLMs and VLMs, on a variety of tasks including both natural language dialogue and robotic manipulation and navigation from images.
multiturn dialogue is the most relevant for here

>>103148412
woops

Aioli: A Unified Optimization Framework for Language Model Data Mixing
https://arxiv.org/abs/2411.05735
>Language model performance depends on identifying the optimal mixture of data groups to train on (e.g., law, code, math). Prior work has proposed a diverse set of methods to efficiently learn mixture proportions, ranging from fitting regression models over training runs to dynamically updating proportions throughout training. Surprisingly, we find that no existing method consistently outperforms a simple stratified sampling baseline in terms of average test perplexity per group. In this paper, we study the cause of this inconsistency by unifying existing methods into a standard optimization framework. We show that all methods set proportions to minimize total loss, subject to a method-specific mixing law -- an assumption on how loss is a function of mixture proportions. We find that existing parameterizations of mixing laws can express the true loss-proportion relationship empirically, but the methods themselves often set the mixing law parameters inaccurately, resulting in poor and inconsistent performance. Finally, we leverage the insights from our framework to derive a new online method named Aioli, which directly estimates the mixing law parameters throughout training and uses them to dynamically adjust proportions. Empirically, Aioli outperforms stratified sampling on 6 out of 6 datasets by an average of 0.28 test perplexity points, whereas existing methods fail to consistently beat stratified sampling, doing up to 6.9 points worse. Moreover, in a practical setting where proportions are learned on shorter runs due to computational constraints, Aioli can dynamically adjust these proportions over the full training run, consistently improving performance over existing methods by up to 12.01 test perplexity points.
https://github.com/HazyResearch/aioli
Git isn't live yet. neat. better doremi basically
These fucking people lol
>>103137740
At least humanslop tries to put the shadow of hair from above. Instead of this thing where there's multiple shadow strands and a singular blob of bangs.

>>103148716
I agree. Didn't say the image was without fault. Though at the same time, these are AI threads. We all know it's AI. The filename literally tips off that it's AI. No one who's been here for a while cares that it has small issues, especially when it doesn't have the big ones that are legitimately painful to look at like mangled hands.

>>103148688
>I made AI do [thing]!
>did [thing] actually work?
>How dare you question my genius!

>>103148848
Mutts are retards, QWEN is still the best.

>Sorcerer is more pozzed than Wizard
Nice finetune, I haven't seen a "ehrm, you know this is a bit heckin' problematic" quasi-refusal in over a year until now, I'm almost impressed they managed to fuck this up.

>>103148688
>did the rust code compile after generation
fucking kek

I found this funny series of posts joking about AI taking the artist's job and thought I'd share it.
https://www.pixiv.net/artworks/121680350
There's NSFW btw.
>>103148920
>>103142655
Any tips on using RAG? I'd like to see what the experience is like feeding in a textbook or novel into a local model to pull elements to incorporate into a chat, kind of a step up from using lorebooks. Would you recommend any specific frontends or plugins for that sort of thing? I'm uncertain where to start. I mostly use ooba and sillytavern.

>>103148688
I remember this nisten faggot because he was defending the Reflection scam more vehemently than even the scammers were

>>103149033
LM Studio is an easy way to use RAG. I think superbooga has something as well.

>>103148688
Lmao

>>103148920
>login required
Fuck that shit

>>103148688
Yet another Alpin scam...

>>103149037
and I remember him from the miqu leak because he really egregiously misread the gguf metadata and thought it was an upscaled moe
he is enthusiastic and fairly on top of news but not someone whose opinions I would take too seriously

>>103149159
>he doesn't have a pixiv account
bro...

>>103149159
And one day you'll need one for 4chan too!
How does one do the AI voice stuff?
I want to record some of my girlfriend talking (she's ESL) and have the AI voice thing speak some perfect english to see what it'd be like if she didn't have an accent.

Damn those VR ERPfags are eating.
One day it'll be our AI waifus piloting them. One day.

>>103135641
I've got to admit, I like the way local models speak suggestively sometimes.

>>103149478
has anyone hooked up voice generators to some kind of 3D character that can recognize phonemes yet? especially good if open source.

>>103149478
What software is that? Asking for a friend.

>>103139729
been using this shit since gpt neo

>>103145338
Recap looks fine to me, given the previous thread.
Give specifics on what you expect out of the recaps.

>>103145338
r/LocalLLaMA is unironically better at this point. You can just filter out newfag threads by reading the title and laughing at it. There is no mentally unstable mikutroon that doxxes there. And all the people that knew anything here (except me) left because no new models + caiggers wave.

>>103149869
>>>/vg/501513615
What's better for non coding, qwen or qwen coder?
>>103151452
One would assume coder is better. Try both.

>>103136292
>Whats up with that? Is it secret sauce?
not really secret but not well known i guess, google uses custom hardware.
https://cloud.google.com/blog/products/compute/introducing-trillium-6th-gen-tpus

>>103152037
>new model that is never heard of again is out and beats everything!
>(free ad space)
>here is my $6k Mac pro parallel setup, running 8B at 1000 T/s
>(free ad space)
>how do I RAG?
>(free ad space)
Nah
https://files.catbox.moe/rchfkj.jpg
>>103152065
Good art, but why does the dick look like a genuine sausage?

>>103152185
Why are you looking at the dick?
Best model for 3090? magnum v4 27b is unusably bad, worse than Nemo
>>103149666
>playful growl in Japanese???
>Omae wa
>mou
>shindeiru !

>>103147964
making a search engine solution for a client, I'm a consultant
>>103147792
how do you use them for classification? you mean masking and checking the next token likeliness? or just oneshot prompting (this is pretty shit generally)

>>103152037
yeah, better to stay here where people just link to and discuss those same reddit threads
Nothing ever seems to happen; it feels like everything is over.
>>103152231
>making a search engine solution for a client, I'm a consultant
If even retards like you can get a job, I'm hopeful about my prospects.

>>103152191
Try a mistral small 22b finetune.

>>103152190
Because it's part of the image?
This is like asking why you're eating the crust of a slice of bread.

>>103152338
>This is like asking why you're eating the crust of a slice of bread.
Yeah? I always throw out the crust of bread and pizza.

>>103152360
>and pizza
Okay, now you're going too far.

>>103149033
>Retrieval-Augmented Generation
private-gpt is an out of the box kind of solution for RAG
https://github.com/zylon-ai/private-gpt
although it is a bit of a bitch to set up, so if you know docker it is easier as the container build will just do it all for you.
Does an RVC project exist that supports AMD cards?
>>103152280
I don't have a job, it's my company.
anyway, why am I retarded? I mainly make backends, this is the first job that has involved NLP. Anyway since all of the responses have been vague or irrelevant I'm just gonna guess there's nothing notable to talk about for classification.

>>103152433
no worries, you just came to the wrong neighbourhood dawg
we're all fuckin degenerates here

>>103152321
what are good ones? I have only used the official instructs and magnum

>>103152492
>https://huggingface.co/models?sort=downloads&search=mistral+22b
ArliAI RPMax seems pretty popular.

>>103152581
is this autism?

>>103152581
Wish I had a life easy enough that this seemed important enough to file a lawsuit about

>>103135871
Call that robot Koishi and you're done.

>>103135871
>and braindead logistics that plebbit recommended because it totally beats gpt
This just reminded me.
Remember that Polish team that bench trained a Llama-1-7B model to beat GPT-4 on the benchmarks and then went around claiming that this made it better than GPT-4? They even sent some retard here and they couldn't even fathom why that completely invalidated all of their claims.

>>103152768
>Polish team
all the polack researchers I've worked with have been consistently retarded
https://huggingface.co/cpumaxx/SoVITS-anime-female-brickwall-tts
Might be useful if anyone wants to automate tts on old VNs that were pre voice acting by using samples from modern VNs.
If anyone actually cares, I might do a male one as well so it would be a complete suitable system for old JRPGs etc.

>>103152185
https://files.catbox.moe/smqkqw.jpg
why not

>>103153048
kek

>>103152433
Because what you're describing is a basic RAG and you can find dozens of articles each day on medium with all the implementation details. I can make one in an afternoon and I'm a neet

>>103153133
I didn't really describe anything anon, a search engine isn't just RAG (and almost never includes it). I just asked if anyone had anything cool to share about classifiers.
Also, you should know that 100% of implementations you can find online are nowhere near production ready. Reliable systems that never drop requests and scale to the millions is where the money's at.
No offence but hobbyist stuff is completely different, not that you'd be too stupid to get into making real products but right now you're atop mount stupid.
>>103153308
>>103153308
>>103153308

>>103137277
chijo the trash princess