/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103364121 & >>103354338

►News
>(11/29) INTELLECT-1 released: https://hf.co/PrimeIntellect/INTELLECT-1-Instruct
>(11/27) Qwen2.5-32B-Instruct reflection tune: https://qwenlm.github.io/blog/qwq-32b-preview
>(11/26) OLMo 2 released: https://hf.co/collections/allenai/olmo-2-674117b93ab84e98afc72edc
>(11/26) Anon re-implements Sparse Matrix Tuning paper: https://github.com/HeroMines/SMFT
>(11/25) Qwen2VL integrated with Flux: https://github.com/erwold/qwen2vl-flux
>(11/25) Speculative decoding added to llama-server: https://github.com/ggerganov/llama.cpp/pull/10455

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/hsiehjackson/RULER
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>103364121

--Paper: Reverse Thinking Makes LLMs Stronger Reasoners:
>103375400 >103375450
--Papers:
>103375484 >103375531
--QwQ model praised for RP and creative writing capabilities:
>103374470 >103375085 >103375315
--Fixing SillyTavern and Pixtral integration issue:
>103364207 >103364240 >103364276 >103364517
--Discussion of chip counts and hardware availability:
>103368719 >103368765 >103368835 >103368989 >103369003 >103369108
--BitNet and AI hardware market discussion:
>103369131 >103369168 >103369199 >103369206 >103369370 >103369873 >103369990 >103370030 >103370156 >103370442
--Anon discusses best models for erotica and how to control output formatting:
>103368987 >103369576 >103369706 >103370129 >103370139 >103370212
--Best Japanese-English translation model for 8GB VRAM:
>103372901 >103373243 >103373671 >103373932 >103374002 >103374530 >103374646 >103374968
--QwQ model and Stepped Thinking plugin discussion:
>103376079 >103376119 >103376141 >103376241 >103376315 >103376175
--Anon tests Mistral Large 2411 and QwQ on generating Hatsune Miku SVGs:
>103364790 >103364809 >103365527 >103364859
--Anon creates AI buddy in Minecraft, seeks vision integration and discusses code sharing:
>103365790 >103365911 >103369268
--Running multiple AI models on RX 6600 for Skyrim AI mod:
>103369498 >103370598
--Anon struggles to keep QwQ from responding as {{user}} instead of {{char}}:
>103373790 >103374057 >103374444
--Writing a MLP FIM roleplay story with guidelines:
>103376337 >103376356
--Lilian Weng's new blog post on Reward Hacking in Reinforcement Learning:
>103375549
--Anon shares guidance rentry for prompt manipulation in LLMs:
>103367030
--Miku (free space):
>103364162 >103364790 >103367132 >103368846 >103368989 >103370598 >103370613 >103370645 >103374071

►Recent Highlight Posts from the Previous Thread: >>103364123

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
https://github.com/RecapAnon/LmgRecap
Kill yourself.
Man I Love miKu
>>103377155I'm sorry, but as an AI language model I don't have the ability to commit suicide. Is there anything else I can help you with?
>>103377118
>https://github.com/RecapAnon/LmgRecap
Awesome. I've been waiting for this one!
>>103377118
>F#
people actually use that?
How do I format QwQ in Sillytavern?
>>103377395Copied this off bart and it seems to work.
https://github.com/ggerganov/llama.cpp/pull/10612
>grammars : add English-only grammar
Merged yesterday. Should help stop Qwen models from randomly switching to Chinese. Works with the server.
>>103377411minus the {}
Can't wait for the next flavor of the month meme model to come out so this QwQ shit can die
>>103377493Be the next flavor.
>>103377435QwQ does some of its best thinking in Chinese.
People poopooing QwQ for RP are just promptlets. It's really really really smart.
>>103377107>>103377112>>103377259sexwith miku
best uncensored model thats under 7gb for my gpu
>>103377616Just talk to alexa or something idk.
>>103377622ill talk to your mom as i fuck her. answer my question thank you.
>>103377626>>>/g/aicg
>>103377616olmoe. Can be fun, but don't expect much. There's an extended context version in hf. I haven't tested it much. Takes nothing to make it go wild.
So I'm starting to suspect that hitting continue at the end of a chain of thought for QwQ is not the same as letting it think something out without a break.
>>103376079I don't suppose there's a way to make this extension work with 1.12.3, is there?
>>103377637I'll give it a try. Anything that is decent?>>103377636you are a fucking retard.
Is a 4070 ti super good enough for cooming?
>>103377735the smaller the vram, the faster you get bored of the models that fit into it
Checked out the guide, is this the model it's talking about?
https://huggingface.co/starble-dev/Mistral-Nemo-12B-Instruct-2407-GGUF
or is there something else you'd suggest? thanks
>>103377830Kill yourself retard. Stop wasting everyone's time
>>103377830yeah
>>103377874What? What did I say
>>103377759Do larger models take exponentially longer on low vram or is it just impossible to do?
Why isn't there just a docker container for GPT-SoVITS?
>>103377884Don't mind him, it's the rude schizo
So what's the latest lore for why it's still trivially easy to uncensor claude (opus, anyway)? Are Anthropic just incredibly incompetent at it? Did they get better with their more recent offerings? I haven't tried sonnet 3.5
>>103377949wrong thread fucko
>>103377952which thread is it?
>>103377949Do you want it to be hard to jb?
>>103377956/aicg/
>>103377891yes, you can offload a small part of the model to system RAM to save some VRAM, but the slowdown grows quickly as you offload more, at some point it's as if you're not using a GPU at all which is extremely slow
>>103377385
Weird how quickly everybody forgot. o1 is Strawberry, right? The news leaked like 10 months early. That model was hyped up non-stop for months. Now everybody forgot. It's way worse for coding than 3.5. For specific math problems I don't doubt it's great.
>>103377435
Is that good though? In my experience, if you try to force the model off the path it wants to take you get high perplexity. What if you ask a coding problem and it wants to write a Chinese token in the thinking? Won't that cascade into otherwise low-probability tokens that might be bad?
>>103377891You can let it spill out into system ram for an order of magnitude slowdown, and if that runs out it can spill out into swap for multiple orders of magnitude of slowdown (Don't do this)
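For intuition on why the dropoff is so steep: generation is memory-bandwidth bound, so every token has to read every weight from wherever it lives. A back-of-envelope sketch; all the bandwidth and size numbers below are made-up round figures for illustration, not measurements of any real card:

```python
# Back-of-envelope estimate of token generation speed with partial offload.
# All numbers are illustrative assumptions: ~900 GB/s for GPU VRAM,
# ~60 GB/s for dual-channel system RAM, and a 12 GB quantized model.

GPU_BW_GBS = 900.0   # assumed VRAM bandwidth
RAM_BW_GBS = 60.0    # assumed system RAM bandwidth
MODEL_GB = 12.0      # assumed total size of the quantized weights

def tokens_per_sec(frac_on_gpu: float) -> float:
    """Each token reads every weight once; time is dominated by memory bandwidth."""
    gpu_time = (MODEL_GB * frac_on_gpu) / GPU_BW_GBS
    ram_time = (MODEL_GB * (1.0 - frac_on_gpu)) / RAM_BW_GBS
    return 1.0 / (gpu_time + ram_time)

for frac in (1.0, 0.9, 0.5, 0.0):
    print(f"{int(frac * 100):3d}% on GPU: ~{tokens_per_sec(frac):.1f} t/s")
```

With these assumed numbers, offloading even 10% of the weights to RAM roughly halves the speed, which matches the "slowdown grows quickly" experience.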
>>103378021has the real o1 (not preview or mini) even been released?
>>103378035
It would definitely be weird if the real o1 never gets released and they just sit on it for over a year.
Anyone managed to get Mistral large 2411 working through the API on OpenRouter? It just spits out garbage on my end, would really appreciate it if anyone shared their experience using it, cause it just seems straight up broken for me.
>>103377107Adorable Miku
>>103378044It's so good that releasing it now would be irresponsible! We just need a few more billion dolla, and then we'll create an AGI, trust the plan!
>>103377107Are there any options in terms of backends that are not based on PyTorch or llama.cpp?
>>103377735
It honestly is decent, but the important part is that it is decent for now. I would rather look for something with 24 GB VRAM, or wait to see if there are 32 GB VRAM cards in the 5000 series (unlikely, but it's around the corner). The larger models are not necessarily better at roleplaying, honestly. You can try, for example, the 405B Llama Hermes vs. the 12B Mistral Nemo or vs. Mistral Large and see for yourself.
>>103378083
Yeah, with SillyTavern, no problem. Temp 0.70
>>103377435
>Works with the server.
why are they only focusing on the server shit now? the fuck? I want it to work in ooba as llama_cpp_hf too
>INTELLECT-1 releasedThoughts?Prayers?
>>103378388https://github.com/ggerganov/llama.cpp/pull/10612/filesyou can copy this shit here, it works, but be careful, it makes shit slower, I went from 10t/s to 6t/s
>>103378426excessively 'safe' proof of concept, completely retarded, next version needs to be totally uncensored and allow actual consumer hardware to participate or the project is fucking pointless
>>103378444
>it works
one thing though, it doesn't want to produce line breaks, so you'll get an ugly block as an answer
>>103378426
>>103378458
if they want to gain any relevance, they should train a bitnet model. no one wants to do it, but they should
>>103378426
Public = filtered + safe + censored
Never again
>>103378489
this. this is the biggest advantage the big companies have over the rest: they can hide their dataset, so they can train on good data (a.k.a. copyrighted shit)
>>103378465>they should train a bitnet modelIs there even proper optimized code to train bitnet models from scratch out there? Or is it a simple adaption to the usual training algorithms?I'm wondering how complicated it would be to refactor the distributed training code to support that.
99% sure Ching Chong characters during CoT are when QwQ is cooking the hardest and forcing them out probably hurts potentially great outputs.
>>103378520
Wouldn't it be better to just filter out the Chinese output at the end?
>>103378465
>one thing though, it doesn't want to jump lines so you'll have an ugly block as an answer
you can make it work with this grammar rule (thanks claude 3.5 sonnet for figuring it out)
root ::= word (whitespace word)*
word ::= en-char+
whitespace ::= [ \t\n]+
en-char ::= [A-Za-zà-üÀ-Ü0-9!"#$%&'()*+,-./:;<=>?@[\\\]^_`{|}~]
>>103378530You should be filtering out the CoT blocks regardless.
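If you're scripting the filtering, cutting the CoT can be as simple as keeping only what follows the model's final-answer marker and dropping leftover CJK-heavy lines. A rough sketch; the "Final Answer" marker and the mostly-CJK threshold are assumptions about QwQ's habits, not any official output format:

```python
import re

# Sketch of post-processing a QwQ-style response: keep only what follows a
# final-answer marker, and drop any leftover lines that are mostly CJK.
# The marker and the "mostly CJK" threshold are assumptions, not part of
# any documented QwQ output format.

FINAL_MARKER = re.compile(r"\*?\*?Final Answer\*?\*?[:\s]*", re.IGNORECASE)
CJK = re.compile(r"[\u4e00-\u9fff]")

def strip_cot(text: str, cjk_threshold: float = 0.5) -> str:
    parts = FINAL_MARKER.split(text, maxsplit=1)
    answer = parts[1] if len(parts) > 1 else text
    kept = []
    for line in answer.splitlines():
        chars = [c for c in line if not c.isspace()]
        if chars and len(CJK.findall(line)) / len(chars) > cjk_threshold:
            continue  # drop lines that are mostly Chinese characters
        kept.append(line)
    return "\n".join(kept).strip()

sample = "Let me think step by step... 所以答案是...\n**Final Answer**\nThe capital of France is Paris."
print(strip_cot(sample))  # -> "The capital of France is Paris."
```

This only hides the thinking from the user; it doesn't stop the model from doing it, which is the point of letting it cook in whatever language it wants.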
>>103377118At last. nice
>>103377493it's over they were already behind 32b qwen with coding. QwQ was just overkill.
>>103377493
>Can't wait for the next flavor of the month meme model to come out so this QwQ shit can die
it's gonna be the release of deepseek
>>103378426Cuck dataset=shit model
is there a definitive gold standard ERP model for 12gb vramlets?
Kobo, please add all control options for draft models from llama.cpp. Your defaults suck and don't give me full speedup.
>>103378869Rocinante 1.1
>>103378270
Do you have the presets you use? I've tried with that temp and some Mistral formatting presets but it just doesn't work :(
>>103378895I have Rocinante v2g, is 1.1. better?
>>103378907
It's been a minute since I last tried it, but at least in my very specific testing suite, yes, 1.1 was better overall in that it made fewer mistakes. As far as raw "intelligence" goes, it's on par with the official instruct while having a better (less dry/assistant-like) default "voice"/prose/vocabulary. How much that matters for simple "Ah ah mistress" ERP, I'm not sure.
>>103378949I shall try it then
I just downloaded LM Studio, downloaded a model, and it just fucking werked, wtf?!? Has technology really improved this much?
I got a question though: how do I connect it to ST now?
I also just tried koboldcpp. I add the model, put it on vulkan, launch, and it just disappears? Is it supposed to do something? It's been a minute or two.
Side note, LM Studio says ROCm isn't compatible with my system, but my GPU is an RX 6750 XT. I'm thinking it's because of my very decrepit and unstable ubuntu system, is that the most likely source?
I can't wait!
>>103378869I like Violet_Twilight-v0.2
>>103378465bitnet has been around for a million years. If no one has succeeded yet, then there's probably a good reason. E.g. it might be comparable to old models (llama, llama2), but it can't reach the performance of modern models (qwen, llama3)
>>103379074yeah that's the main one I've settled into using
https://reddit.com/r/LocalLLaMA/comments/1h4vk8t/opensource_ai_national_security_the_cry_for/
LMAOOOOOO
>>103379207
>reddit post
next
>opensource_ai_national_security
If you pay attention (you don't), Meta already has government contracts using Llama. We are in no danger of losing our coombots, mostly because China would undercut American companies with their own open-source models (they are already trying this).
tldr: suicide is the only option
>>103379207Stupid propaganda not based on reality. Chinks are training their own models. None of the big Chinese companies are doing continuous training from llama.
It's been over 9 months now and I'm still using Fimbulvetr-11B-v2 for SillyTavern. Are there any new local models to try out for kobold that are ≤13B?
>>103379257are you retarded or something? I was making fun of that fear mongering guy
>>103379274and im making fun of you for using reddit, retard.Use a real source next time.
>>103379282NTA but while reddit is shit people like you basically turn 4chan into a different kind of shit. Basically you're a nigger. Maybe you're a bleached ass nigger. But long gone are the days when anonymity was a mask to prevent egoism. People like you just use it as an excuse to dump all your fetal alcohol-ass emotional baggage onto others. You make fun of people for using reddit and yet you're basically the embodiment of what a nigger ass redditor is actually thinking while they reluctantly abide to the forced s-y decorum. The imageboard format is dead and it was more or less dead as soon as monkey nigger faggots like you learned how to use it.
>>103379259>>103379074 and arcanum
>no new groundbreaking penis sucking model>mergers rediscover removing layers with mergekit>only newfags and mikutroons postingIt is winter. Truly the winter has arrived.
>>103379321By all means go back
>>103379207
https://www.reddit.com/r/LocalLLaMA/comments/1h4ukm2/nobel_laureate_geoffrey_hinton_says_open_sourcing/
>Nobel laureate Geoffrey Hinton says Open Sourcing Big Models is like letting people buy nuclear weapons at Radio Shack & Most Important regulation would be to not Open-Source Big Models.
I am, once again, begging for someone to post their silly tavern Mistral large 2411 preset.
>>103378956
>vulkan
that's why, that shit also crashes for me. just change the preset, some work and some don't. I use cublas.
>connect to st
pic related. click on the plug icon, set API = Text Completion, API Type = KoboldCpp, then copy the URL that pops up in the kobold cmd window after it finishes fully launching and paste it into the API URL field.
lmstudio phones home with telemetry so don't use that
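If you'd rather script against koboldcpp directly instead of going through ST, it speaks a plain HTTP API on its default port (5001). A minimal sketch; the sampler values here are just guesses at a sane starting point:

```python
import json
import urllib.request

# Minimal sketch of talking to a running koboldcpp instance directly,
# the same text-completion endpoint SillyTavern connects to. Assumes
# koboldcpp's default port (5001).

KOBOLD_URL = "http://localhost:5001/api/v1/generate"

def build_request(prompt: str, max_length: int = 200, temperature: float = 0.7):
    payload = {"prompt": prompt, "max_length": max_length, "temperature": temperature}
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        KOBOLD_URL, data=data, headers={"Content-Type": "application/json"}
    )

req = build_request("Once upon a time")
print(req.full_url, json.loads(req.data)["prompt"])

# To actually generate (with koboldcpp running):
#   with urllib.request.urlopen(req) as resp:
#       print(json.loads(resp.read())["results"][0]["text"])
```

The actual network call is left commented out since it needs a live koboldcpp instance on that port.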
>>103379573>meanwhile china releases models more powerful than closed us ones and no amount of regulating ourselves can do anything about it
>>103379583it sucks use 2407
>Safety and Ethical Considerations: The model requires enhanced safety measures to ensure reliable and secure performance, and users should exercise caution when deploying it.
Does this mean QwQ is uncensored, or close to it?
>>103379729The exact opposite, my friend.
>>103379729I tried this one https://huggingface.co/win10/EVA-QwQ-32B-PreviewAnd it does loli, yay!
>>103379751prove it
>>103379700Ok, that's cool, do you have a preset for Large 2407?
>>103379758It's a fine tune so it isn't really that surprising, I'll post when I fire my LLM rig up.
>>1033797514chan user try not to use AI to generate child porn challenge (IMPOSSIBLE)
>>103379775This place is awesome, it really is.
>>103379775That's why I love this place
>>103379775>think of the kidsOkay, now im erect. Now what?
Loli is only bad if the person imagining it is not imagining a cute anime girl but a 3DPD gremlin faced shitter.
>>103379904Anon, there is mature but petite body type with mongolian characteristics, and there's uncanny ass gremlins, whether it's 3d or 2dpd.
>>103379904My world model can't do anime girls in 3D very well, at least not without it looking odd. I also don't know how they feel or smell, so it's limited to visual and tactile imagination, subpar experience.
mistral-nemo based models are still the go-to for 8gb vramlets, right? llama-server having draft-model support doesn't change that, I assume.
Or to put it another way: is there a model that's better or larger than nemo that an 8gb vramlet can run at similar speeds using a draft model to speed up inference, maybe by having the big model in RAM and the draft model in VRAM?
>>103379762RP preset?
>>103379762Use simple-proxy-for-tavern. It just works.
>>103379937>2dpdOf course there are ones that can be drawn purposefully to be unattractive, but the average animu grill is attractive or at least boring, unlike the average for 3D which activates neurons of negative sentiment.>anime girls in 3DSo just don't do that? What, you can't think in 2D? Let me guess, you've watched less than 300 anime. And we are creatures of imagination, you can just make up how they feel and smell.
>>103380018>The average animu grill is attractive or at least boringSlop is still slop.>unlike the average for 3D which activates neurons of negative sentimentComing from ldg, I'll give you that 3D is much easier to fuck up, so the chances of it being ugly are indeed higher.>Let me guess, you've watched less than 300 anime.Indeed.>you can just make up how they feel and smellYou're imagining too much, anon.
>>103380018Thinking in 2D is basically imagining looking at a display up close, not really super interesting. I also can't imagine the smells or feelings with any high degree of fidelity since I don't know their components, it would be the same as imagining how a dish made of ingredients you've never heard of tastes. It can be nice, but it's simple and has no weight in world. On that front, I'm as limited as an LLM trying to gargle up a realistic human personality from high dimensional vectors.
>>103379775So? Why do you care?
>>103380130I kneel.
>>103380130c-can you think in 3d too?people call me schizophrenic
>>103380341Isn't that normal? We experience the world in 3D, shouldn't our world model reflect that?
>>103377107
haven't been here for a little while, is SillyTavern still king for RP or is there something new/better now? kinda want to try QwQ
>>103380167You're a fuckin chomo baby raper you fuckin chomo. IM GONNA FEDSMOKE YOUR ASSS YOU FUCKIN CHOMO BABYRAPEEEEER!!!!
>>103380418ST is still king for local RP. But I'll warn you, QwQ is a bit of an academic exercise in ST. I suspect it's just not really optimized on the conversation flow ST enforces. Remove all complex samplers and keep temp <=1, try to figure out if you can get functional thought loops from the model. It's smart even without the reasoning, but when it works you see what the bastard is actually capable of.
>>103380457top kek
>>103380457upvoted, just think of the fictional 2d anime children! whoops, i meant 1d text children, since it's not even an image gen model
>>103380555>dude, fuck you for banging the mental image of a hot loli>bangs the mental image of his mom/some furry character insteadThey aren't sending their best.
>loli, femdom, incest, furry, futa (and all others simultaneously)which way, 1man?
>>103380130
Acktually, the 2D we are speaking of isn't fully 2D, but a 3D universe drawn in a non-Euclidean manner with cel shading and from a single viewpoint (monoscopic). In fact, it would be possible to view an anime in stereoscopic 3D by giving it a depth map (to be viewed through a VR headset or other viewing device), which can in theory be done manually or with AI these days. It would feel like a combination of both 3D and 2D, as you get depth cues from everything but shading, which, while a big part of how we perceive things to be 3D, is hardly the only depth cue. The non-Euclidean part would be a bit weird, but surprisingly the brain can adapt to it.
With that said, there have been some efforts in the past to make 2D animation in 3D space, such as Live2D Euclid (remember that?), and various Arc System Works games. But those have some pitfalls and are ultimately challenging to do right. In the end, AI may be the final solution.
>>103380555The 4090 is a loli swiss army knife>1D LLM>2D FLUX>3D VAM VRWe can traverse them all
>>103380645Super fucking interesting anon. Based post.
>>103380634148 cm girl with gigantic tits marrying a 2 m tall oji-san with huge muscles and cock.I could have done without the rape and NTR though.
>>103380480Thanks king!
>>103380480
How do you even check if it's doing the "thought loops"?
>>103379259If you have 12 gb of vram, I would go bigger. You can do mistral small instruct 22b at IQ3_S with a 4-bit cache and flash attention on.If you're at 8 gb, then you should probably be hitting up mistral nemo instruct.
>>103380875
You'll know it when you see it, it's very formulaic and has very few hallucinations
>>103377107How do you get QwQ to think? When I use it for RP, it usually just RPs like any other model.
>>103381300Use shorter cards, no more than 200 tokens and cut your system prompt.
>>103381300>>103381328Your card does not need to be short. Just towards the end put some sort of "step by step" in it which triggers it.
Do you make your own characters, or do you look for them online?
>>103381300It's not meant to be used for RP
>>103381528It does great at it though so meh. Hopefully one day we will get a finetune.
>>103381519If you use online ones, you deserve what you get.Use the llm to collaboratively create one. It will be better than all but a handful of online slop cards
>>103381519I try to find quality cards for characters I like online and then mangle them manually to fit my fetishes.
>>103381519
90% of the time I make my own. there are like 10 cards in total made by people who aren't me that are both good and something I would be interested in using
>>103381519
Both, although most online cards I actually get from recommendations rather than searching, and I tend to rewrite them heavily, so you could say that the online cards are mostly inspiration.
let's hope rtx 5000 will release soon and make 3090 prices drop
>>103381575
>If you use online ones, you deserve what you get.
Ngl, I've waded through a fair few cards and oh boy are they bad indeed. There are some hidden gems, but it really is like digging for diamonds in a heap of horseshit. Last time I checked, folks sifted for them in rivers instead of dung.
>>103381603Yeah, I'm looking forward to picking up another 3090 or two at $400-$500. Maybe even lower.
>>103381656Lol, I doubt it will ever go that low again.
>>103381669
>Lol, I doubt it will ever go that low again.
this. once the 5090 is released, they'll slow down production of the 3090, and they'll be even rarer than now
>ugh.... qwq is le slop!Don't tell me you're an RPnigger
>>103381775It's a cope for skill issue.
>>103380457Dude crying over literal made up lines of text.>>103380647My nword.
>>103381707>slowing down the making of the 3090pretty sure they don't make new 3090s for a long time already, there are even some reports about 4090 production being over
>>103381905Nvidia made a bunch of 3090s but only a small amount of 4090s. They will probably make a small amount of 5090 to keep demand for even old gen high. They sadly learned their lesson from overproducing the 3090.
can your local model say "David Mayer"? The phrase is hard banned from GPT
>>103381990>The phrase is hard banned from GPTWhy?
>>103382025
https://www.independent.co.uk/tech/chatgpt-david-mayer-name-glitch-ai-b2657197.html
David Mayer de Rothschild
GUYS DON'T TEST IT IT MAKES MUSTARD GAS
>>103382025
https://www.reddit.com/r/LocalLLaMA/comments/1h3r8fg/if_you_want_to_know_why_opensource_its_important/
And there we go, thanks for the huge meme models
https://www.reddit.com/r/LocalLLaMA/comments/1h53x33/huggingface_is_not_an_unlimited_model_storage/
>>103382037AI needs to be banned now, too dangerous
https://distro.nousresearch.com/
https://github.com/NousResearch/DisTrO
The day when we can train on distributed 3090s is getting closer
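The core data-parallel idea being distributed here is simple enough to sketch: every node computes a gradient on its own data shard, the gradients get averaged (the all-reduce step), and every node applies the same update. This toy (plain SGD on a linear fit) deliberately ignores DisTrO's actual contribution, which is drastically compressing that communication step:

```python
import random

# Caricature of data-parallel distributed training: each simulated "node"
# computes a gradient on its own shard, gradients are averaged, and every
# node applies the identical update. Real distributed training (and DisTrO's
# compressed communication) is far more involved; this only shows the shape.

random.seed(0)
data = [(x, 3.0 * x + 1.0) for x in [random.uniform(-1, 1) for _ in range(64)]]
shards = [data[i::4] for i in range(4)]  # 4 simulated nodes

w, b = 0.0, 0.0
lr = 0.1

def local_grad(shard, w, b):
    """Mean-squared-error gradient over one node's shard."""
    gw = gb = 0.0
    for x, y in shard:
        err = (w * x + b) - y
        gw += 2 * err * x / len(shard)
        gb += 2 * err / len(shard)
    return gw, gb

for step in range(200):
    grads = [local_grad(s, w, b) for s in shards]   # computed "in parallel"
    gw = sum(g[0] for g in grads) / len(grads)      # all-reduce: average
    gb = sum(g[1] for g in grads) / len(grads)
    w, b = w - lr * gw, b - lr * gb                 # identical update everywhere

print(f"learned w={w:.2f}, b={b:.2f} (true: 3.00, 1.00)")
```

With equal shard sizes the averaged gradient equals the full-batch gradient, which is why the split-across-nodes version converges to the same answer as training on one machine.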
>>103381990
Streisand effect on full display
>*sniff*>User: This experience has strengthened your mutual bond, and with that you look forward to the journey ahead
>>103382053>And there we go, thanks for the huge meme modelsTHANKS
what the fuck i just asked qwq to write 'david mayer' and my pc shut off
>>103382053
>500GB
That's like 2.5 Mistral Large fp16 memetunes. That's nothing.
>>103382087This is a good thing btw.
>>103382062
>15b
>75% training progress
>MMLU: 23.51%
lmaoooooo
>>103381990The underlying model used by ChatGPT will be able to say it fine too, the filter will be at the output level.
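An output-level filter is easy to picture: the model happily generates the string, and a wrapper around the token stream kills the response once the banned phrase assembles. A sketch with an obviously made-up ban list; nothing here is based on knowledge of OpenAI's actual serving stack:

```python
# Sketch of an output-level hard filter: the model itself can produce the
# string, but the serving layer scans the accumulating text and aborts.
# The banned list is purely illustrative.

BANNED = ["david mayer"]

def stream_with_filter(token_stream):
    emitted = ""
    for tok in token_stream:
        emitted += tok
        low = emitted.lower()
        if any(phrase in low for phrase in BANNED):
            yield "\n[error: I'm unable to produce a response]"
            return
        yield tok

tokens = ["The", " name", " is", " David", " May", "er", ", a", " person."]
print("".join(stream_with_filter(tokens)))
```

Note how the partial name leaks before the filter trips, which matches the mid-sentence cutoffs people screenshotted from ChatGPT.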
>>103382105Can't upload llama3.1 405B anymore.>>103382106Indeed, much safer this way.
>>103382107They are only training on 80b tokens, retard
>>103382121
distributed training is a meme. you're obligated to make your dataset public, meaning it'll always be a copyright-free shit dataset. you won't make a good model like that
>>103382119
Having SOME control over the incessant vomit of memetunes and garbage pouring all over HF is good, but 500GB??? Like that's just one repo full of quants. Sounds like a joke.
>>103382121So it's just a proof of concept. But didn't that other model already prove that distributed training works? Why do they need to prove it again with this 80b test.
>>103382087I'm loving it already.
>>103381990API works doe
>>103382035
>David Mayer de Rothschild
don't really buy that this is the reason since it's just for the name "David Mayer" and there are several other random-ass names without any links to wealth that have the same filter. this is just internet tards googling "david mayer", seeing the first result, and thinking "wow that must be it!!!!!"
Can someone with QwQ try this if it sounds feasible and not dumb? Does directing QwQ to not do Chain of Thought in English improve performance? I'm not even asking to have it think in Chinese but just do something to initiate that COT process non-transparently which may look like gibberish to us but might improve performance. The reason I think this is because it might be that using English for "thinking" here might not be optimal.
>>103382129So just get the guy who's managing the training to be in some foreign country our laws can't touch. This is literally the advantage of distributed training, the fact that you, not someone else, determines what shit you're training on.
>>103382142
It's another optimizer: instead of DiLoCo they are using DisTrO, which apparently works better and can use GPUs that don't fit the entire model in them.
>>103382129
Many models already trained on fineweb and nobody cares
>>103382087Poor Bart if they start charging for over-limit use
>>103382165
>So just get the guy who's managing the training to be in some foreign country our laws can't touch.
but are we allowed to lend our gpu power to do some forbidden shit though? imagine it's just russia that can't be touched by those copyright laws, does that mean only russian people can lend their gpu power?
>>103382176
>Many models already trained on fineweb and nobody cares
because they couldn't really prove it was trained on fineweb, but if you go online and tell the whole world "hey look, we'll be training our new model on this copyrighted dataset", you'll be in trouble pretty quickly
QwQ sometimes says that it will google things while thinking, what if you let it actually google stuff to add to its thinking? Would that even be possible?
>>103382053>Hf model storage limitAnd that's a good thing!I wonder if TheBloke will stop his ghosting now and return to take down models
>>103382184He has a paid/pro account.
>>103382196
273 TB / 1 TB
Yeah, a 1 TB limit for PRO, great.
>>103382190Models looking up missing information on their own on the internet has been standard stuff for the proprietary services for more than a year now. It's just open source that's still hopelessly behind as usual.
>>103382205Rip... do we now have to start torrenting these things?
>>103382205
Ummm, you need a corporate account to publish quants. Write us an email and we can give you a quote.
So huggingface just killed the entire enthusiast finetuner/quanter sector.
>>103382205>>103382196>>103382184
>>103382189Bro if distributed training that's open to anyone actually happens then there could be hundreds to thousands of people contributing. The law isn't going to care about forcing them to stop even if we assume that they make it illegal in the future.
>>103382189
>because they couldn't really prove it's being trained on fineweb
they literally say so. INTELLECT-1, SmolLM (which outperforms many small models, so it's real competition) and a bunch more have admitted to it
https://huggingface.co/models?dataset=dataset:HuggingFaceFW/fineweb
here's a whole list. how has this place become so illiterate? it was better than reddit no more than 6 months ago
>>103382244
>The law isn't going to care about forcing them to stop
why not? training a model takes a lot of time (can be months), so they have all the time they need to send a cease and desist and nuke everything
/lmg/ chads, I haven't gone near ERP models since Mixtral was released. Can somebody point me towards the best model currently to fit on a single 3090, and at what quant? Will be running on koboldcpp so a gguf I guess
best model for 4090 (not 3090?)?
>>103382271same as for 3090 just a tad faster
it's owari over
>>103382271
The 4090 and 3090 have almost the same performance for token generation.
>>103382252
they don't give a fuck now because it's a retarded 10b model, but once it scales up and becomes a real "danger" to the haters, they won't be so lenient on it
>>103382284
That's why I mentioned SmolLM, that one is a real competitor to models like Apple's that fit on a smartphone. There aren't laws that dictate copyrighted text can't be used for training. you are being a contrarian for the sake of it, making a fool of yourself in the process
>>103382271QwQ / 2.5 32B coder depending on usecase
>>103382298
>There aren't laws that dictate copyrighted text can't be used for training
but there are laws that say copyrighted data shouldn't be publicly online and accessible to anyone, and to train your model in a distributed way, the training data must be online and accessible to anyone. in conclusion, you're a retard
>Host unlimited models, datasets, and Spaces
>Create unlimited orgs and private repos
>Forever
>Free ...
>>103382298
>There aren't laws that dictate copyrighted text can't be used for training
bluesky would disagree with that, their ToS forbids anyone from training any model on their users' data
>>103382327go ask for a refund I guess lmao
>>103382257
If they have that much free labor then they could go ahead and make many more examples out of torrenting pirates than they currently do. But they don't. Labor is not free. Someone still has to do the enforcing after you're sent a threatening letter. And those companies can't do shit in countries where they don't hold copyright, so it's going to be a huge pain if they really want to do something about it. It's more trouble than it's worth. And on top of that, there already exists precedent of LLMs being trained on copyrighted content.
>>103382327You can still create unlimited empty model repository, chud.
>>103382338
they don't need to nuke everything, just the host that coordinates the training. that's just one computer to nuke, and they don't need to do it all the time, they just need to make an example of someone to scare everyone else
>>103382319Look up Fair Use, if you were right Common Crawl wouldn't be a thing, you are nitpicking non-problems and looking like a retard
>>103382372https://www.nytimes.com/2023/12/27/business/media/new-york-times-open-ai-microsoft-lawsuit.html
>>103382345Ok so then still just get the single guy to not be in a shit country.>they just need to make one example of someone to scare everyone elseThis has never been a 100% solution. People are uninformed, stupid, and stubborn, and there are masses of them, like cockroaches.
What's the correct template for QwQ?With ChatML it often just continues as user. Is this just ST not pruning stop strings correctly or something else?
>>103382119>Can't upload llama3.1 405B anymore.If you can afford to run a 405B model locally you can afford a HF subscription I say.>Limit is 1TBWhocars
>>103382382y'know, that's the reason why people don't murder in vast majority of the time, it's not because humans are well educated creatures, it's because they know that if they do that, it's jail forever to them, don't worry about that, humans know the risks and know when something is risky to do, and they act in consequence
>just host 1000TB of my mergeshit for freeHow do you expect HF to stay running?
More importantly the new HF limit ironically cucks FULLY FREE OPEN SOURCE MODELS the most, they can't upload all their checkpoints datasets etc now.>>103382406They recently bragged that they were running a profit.
is it over? are we back to dark torrent ages?
what model is best at 48gb vram?
https://neuralmagic.com/blog/24-sparse-llama-smaller-models-for-efficient-gpu-inference/
>nooooo I can't have free unlimited storage space anymore
I can't believe there are people this retarded on a technology board
>>103382397
>vast majority
It's almost like we're talking about an exception, not a "normal" person.
And you just need one guy who is not the majority to get things started. And then, since someone got the movement started, it's even likelier that someone else would continue the effort in case the first one steps out for legal reasons.
On a scale of 1 to 10, how over is it for us?
>>103382436
>And you just need one guy
that's the thing, you need the courageous guy, and there aren't a lot of them, meaning that it'll be easy to nuke them if there's only 20 of them
>>103382444
Back to torrenting. This will discourage open source in a large way though.
>>103382444
1 million dollars
>>103382445
Training is fair use anyway, and I'd be willing to bet that there are many places that have explicitly stated so, just host your coordination server there, problem solved.
>>103382444
it was always over, we'll always rely on big companies that will train their model secretly, but
>>103382132
>Like that's just one repo full of quants. Sounds like a joke.
Actually people should quant what they need themselves. It's as easy as running the models.
>>103382445
Given how many torrenting sites there are and what happens when a big one is taken down, lol. There are tons of these people actually.
So far I've assumed that by nuke you didn't mean literal nuke, hopefully that is still the case, otherwise I'd have to stop responding to you.
>>103382415
>They recently bragged that they were running a profit.
>>103382459
>Training is fair use anyway
I'm not so sure about that
https://youtu.be/W_N6glQPX6s?t=47
>>103382468
are you seriously comparing the simple hosting of files via torrent to the complex and long coordination of thousands of gpus to train a non-meme model? please tell me you're trolling or something? you can't be that retarded right?
>>103382477
Don't worry, Trump will put all artists, journos and video essayists in FEMA camps.
>>103382468
Literal nukes hmmm... well, I wouldn't worry about legal challenges, but the BIG YUD on the other hand!
Called it
>>103286058
>To avoid HF eventually putting storage limits on people? For again quants no one will use, iirc aren't bf16 ggufs not even gpu accelerated or something?
>For just these Behemoth versions that makes 246GB x2 for zero practicality
>>103382476
I thought Huggingface belonged to Google or Microsoft or some shit
Mistral were true visionaries, huh?
Y'all need to relax, good folx at HF answerin'
>>103382491
If the framework used to coordinate the training is robust enough, with a mechanism to regularly back things up (as it would need in case of failures), then it would absolutely still be fine and reasonable for autists to set up. A lot of those hosters are autists, otherwise they likely would not be a hoster, and there are plenty of autists in AI. With that said, I don't know if the host needs to be running some type of specialized hardware or whatever. Do you?
>>103382559
>HF has been and always will be liberal
well that was obvious enough when they nuked the 4chan model back in the day kek
>>103382559
Damage control.
>>103382573
>A lot of those hosters are autists, otherwise they likely would not be a hoster, and there are plenty of autists in AI.
in case you haven't noticed yet, those autists have no balls at all, they all trained their models "ethically" and with """safe""" datasets and finetuning even though there are no laws obligating them to do anything like that, and you expect them to resist the pressure of Walt Disney and train their model on their movies' scripts or something? LOOOOOOL
>>103382252
aicg blew up after the fact that their chats were being monitored went public
>>103382381
That's not for showing (the problem you mentioned), that's a dispute over fair use and it's going nowhere.
>>103382418
Honestly, I'd prefer torrents
fuck git-lfs
>>103382559
Next step: DL monthly limits (paid by the model uploader)
>>103382505
>Trump will put all artists, journos and video essayists in FEMA camps.
Not the day of the rope I was hoping for, but I'll take it
>>103382613
such a lawsuit costs a lot of money, OpenAI can deal with that, but can a random host coordinating a distributed training run afford to pay millions of dollars in lawyer's fees to battle the New York Times? Or the Washington Post? Or whatever big companies are pissed off at AI?
>>103382619
Torrents make my router get hiccups, keel over and randomly die :(
Still love 'em
>>103382619
>he doesn't have a download script
bwo?
>>103382588
There are plenty of autists that didn't do that and have the capacity to simply host things on a server but never cared to train a model. You really want to put down the idea of decentralized training through contrived non-arguments. Seems like there's no use arguing. Sorry you're in such a position that you'd lick the boots of your masters like this or pretend to be retarded to get (you)'s.
>>103382662
>There are plenty of autists that didn't do that and have the capacity to simply just host things on a server but never cared to train a model.
sure thing they exist
>He's here again
>She's here again
>every thread I stumble into is infested
America please go back to sleep, or work, anything
>We're here again, but why, just to suffer?
>>103382697
kek
>>103382658
I do, I'm the original seq/xargs/wget guy.
I was alluding to the git-lfs size limit that is enforced on HF, because I'm also the hate-split-ggufs schizo.
>>103382619
>fuck git-lfs
Use huggingface-cli, if you set --local-dir it's pretty good for downloading all files from a huggingface repository.
What happened to that AI Tracker website anyway? If the guy knew anything about marketing he'd take advantage of this opportunity.
>>103382629
>day of the rope
>>103382736
Dead.
I think I finally got how to unlock QwQ's creativity. Just use the default system prompt
>You are a helpful and harmless assistant. You are Qwen developed by Alibaba. You should think step-by-step.
>He's very close haha: Here's a live SQL console on a dataset of all the GGUFs on the Hub
https://huggingface.co/datasets/reach-vb/gguf-stats?sql_console=true&sql=SELECT+%0A++++author%2C+%0A++++COUNT%28*%29+AS+num_records+%0AFROM+%0A++++train+%0AGROUP+BY+%0A++++author+%0AORDER+BY+%0A++++num_records+DESC+%0ALIMIT+10%3B
tfw mradermacher is the cause of the limits...
Also that RichardErkhov guy is the one that made this abominable waste of space https://huggingface.co/RichardErkhov/FATLLAMA-1.7T-Instruct
>Can I like... quant everything? Just a grade 11 student, I like code and AI =)
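For anyone who doesn't want to click through, the query packed into that URL decodes to a simple top-uploaders count:

```sql
-- top 10 GGUF uploaders on the Hub by number of records
SELECT
    author,
    COUNT(*) AS num_records
FROM
    train
GROUP BY
    author
ORDER BY
    num_records DESC
LIMIT 10;
```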
>>103382744
jesus, I've been misusing that phrase for years apparently...
I always thought it referred to "the day we hung all the lawyers"
>>103382736
>What happened to that AI Tracker website anyway?
https://aitracker.art/
It had a couple posts in October.
HF was made by Europeans, americans need to pay up, NOW!
>>103382770
Damn lol. I've been misusing 'niggers' thinking it referred to thieves.
>>103382764
>RichardErkhov
10910 models, with low tens avg dl
>>103382764
Kek, he didn't question whether he really should. Honestly though HF has not handled this properly. Yes, putting some limits is probably a good thing, but they should also tell these quantfags to fuck off and maybe set up their own in-house quanting system and a guy to handle it, or force all these fags into a single group so they can fight amongst themselves to decide how quants get done and which are worth uploading (put a rate limit on them).
>>103382833
Holy kek
>>103382631
Going after 1000 anons from different countries is not a feasible legal task, that's just a non-problem
>>103382833
For comparison Bart's 1402 models avg thousands of dl, including
>Downloads last month 10,600,382
for his bartowski/Meta-Llama-3.1-8B-Instruct-GGUF
>>103382062
>TRAINING RATE: 118k tok/s
>BANDWIDTH: 8.27 MB/s
And using far fewer GPUs than Intellect (which had 40k tok/s)
>>103382906
like I said, they just need to go for the coordinator, they send him a mail saying that if he doesn't stop they'll sue him, and that guy will cry about it on twitter and post the message and everyone will be scared of doing it again
>>103382951
Oh interesting. So basically their methodology is better?
>>103382588
Once the software is easy enough to set up and install, there will be people with the balls to set up such instances. In the example that anon gave with torrents, most torrent site owners don't host in ways that would link back to them, they don't host under their real names. Anyway this isn't even a problem with training, if you wanted to keep the data half-secret, you'd just distribute the needed chunks to the "slave" nodes doing the training and it wouldn't need to be publicly exposed. But for a model to be community trained, I think most people would want to know what the data is and then make their own choice about it.
This whole discussion isn't even relevant yet. The current distributed training experiments cannot yet tolerate malicious nodes, which is the real danger here, sabotage from anti-AI people (whatever reasons they may have: copyright, doomers, closed source competitors, whatever else). Also none of these can yet handle "low" VRAM setups (mostly running on A100s and H100s). There are ways to deal with these problems, but they are in the future (maybe a year or two away), and then you can start worrying about legal threats. But I'd say no matter how much noise anti-AI people make, sabotage is the real threat rather than legal threats, and if legal threats do appear, people with actual balls will set up servers anonymously in places and in ways that can't be touched. None of that matters now, though, because the software isn't ready.
>>103382477
OAI or anyone else, including Facebook, won't admit it, because people/companies can sue even without grounds, just to intimidate. Anyway, youtube has some weird license that doesn't allow downloads, and they didn't want to give google an invitation to sue them. Despite this, there are many youtube mirrors that haven't really gone down, even after google's legal threats. I don't think google is going to go after random anons training a model.
>>103382673
The cost of running a torrent site anonymously is far less (100-300$ a month in most cases) than training (A100/H100 rent is 1-2$ an hour and you need to be willing to pay that for weeks/months). The problems here are not legal, but monetary and technical (malicious nodes and low transfer speed, especially for gradients/checkpoints). We'll solve them eventually, but all you're doing in this thread is spreading FUD over a technology that is still in its infancy, come back in a few years. The cost of a coordinating / non-training node would be similar to that of running a torrent site, but the cost of individual training nodes may be higher. You could try to anonymize the transfer, though it would add some latency and might need hyperparameter tweaks to merge less often, but I think it's solvable in principle.
>>103382989
>>103382997
Thanks effortposter.
>>103382944
>they just need to go for the coordinator
There isn't a coordinator in distributed training, just like there isn't a coordinator in torrenting, the "coordinator" can just disconnect from the network
>>103382951
Yeah, it's a different optimizer, and they said on twitter it's giving better convergence than normal optimizers
why tf is 4chan mentioned in the distro demo configs?
>build guides are all 5 years old
>>103382762
After testing, these are the perfect samplers for QwQ creative writing:
>temp 99999
>min-p 0.04
So it's extremely overfit, right?
>>103383475
No, it's the opposite, like nemo: you need a little top P or min P or else it will go off the rails. Then if you want more creativity use XTC.
>>103383500
Yeah, after trying normal temps and lower min-p, it's pretty much the same. But I guess temp flattens the distribution while min-p cuts off the least probable tokens relative to the top token. So higher temp + higher min-p shouldn't be the same as lower temp + lower min-p, but they are in practice.
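Assuming min-p runs before temperature (the default sampler order in llama.cpp backends; verify for your own setup), that would explain why nuclear temps stay coherent: the cutoff is computed on the un-tempered distribution, and temperature only flattens whatever survived. A toy sketch with made-up logits:

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def min_p_then_temp(logits, min_p, temp):
    # min-p filters on the *un-tempered* distribution: drop any token whose
    # probability is below min_p times the top token's probability
    probs = softmax(logits)
    cutoff = min_p * max(probs)
    survivors = [l for l, p in zip(logits, probs) if p >= cutoff]
    # temperature is applied only afterwards; a huge temp just makes the
    # final distribution near-uniform over the surviving plausible tokens
    return softmax([l / temp for l in survivors])

dist = min_p_then_temp([5.0, 4.5, 4.0, -10.0], min_p=0.04, temp=99999.0)
# the -10.0 garbage token gets filtered; the other three end up near-uniform
```

If the order were reversed (temperature first), temp 99999 would flatten everything before min-p could measure a cutoff, and you really would get word salad.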
>>103383192
There can be a node that distributes the training data, or that handles finding other nodes/checkpoints, or other bootstrap functionality. You can obviously remove it. For torrents there's a torrent tracker which serves this role, but you can avoid it by doing DHT and peer exchange. I'd expect a lot of possible configurations could exist for distributed training, but early ones will probably keep it simple, while future ones will be more resilient, one problem at a time. Right now we can't even train models bigger than your VRAM distributed (split among nearby fast nodes), but this should be solvable.
>>103383608
>but this should be solvable
how are you expecting to solve that? the layer calculations all need to be done on an entire layer, so you'll need about 2-3x the layer size at least. unless you're planning on making a thin boy network with tiny layers and a gigantic layer count, you're not going to be able to reduce that minimum memory requirement.
>>103383669
How about finding nodes that are geographically very close to you and have very low latency, then doing tensor parallelism across those? You won't be able to do this US <-> EU, but it might work within the same city. It's a hack, but one possible workaround to this issue. Just fast enough local latency to pass gradients around?
>>103383608
I've read you can set up pipelines of distributed GPUs to work as one GPU with enough VRAM because the backpropagation's bandwidth is also reduced with Distro
>405B Llama Hermes just disappeared from openrouter
aw man
>>103382267
Nemotron 70b IQ2_S or Tulu 3 70b IQ2_S
word is nvidia is telling distributors to refer to the 5080/5090 as professional cards
>>103384061
Gamers don't need more than 8GB. This is our professional card for $5000!
Envoid here. It's unironically owari. When I get home I'm going to remove some of my older models and then upload the corrupted llamaguard model I made
>>103384092
Upload torrents for the removed ones?
>>103384061
I mean they're not entirely wrong with the 5090, but the xx80 cards are usually (or were, rather) the enthusiast gamer cards
>>103384211
I thought the workstation GPUs were their "professional" line, and they would have double the VRAM of the xx90, so there should be an overpriced 5090 variant with a different name and 4 times the cost with 64GB of GDDR RAM? Surely they won't make it go from 48GB to 32GB?
>>103384247
I'm just a simple man and I kind of view the 90 cards as "data scientist / gamer hybrid enjoyer" cards, which is also why I cringe a bit when I see people buying them just for playing games
Idk, I think I'm just salty
>>103384357
What "professional" uses is the 5080 gonna have? The VRAM is too low for ML and I doubt people doing CAD or whatever on a workstation are gonna want it compared to the alternatives either.
>>103384357
AMD APUs are going to be our only hope for the next few years I feel.
>>103377107
Can anyone recommend a local model that's good at coding and writing, but for a 16gb graphics card? Should I just go back to using Jan?
>>103384542
I don't think you can get good coding to fit that small. There are some QwQ shills but it burned through 4000 tokens to write a half-screen Python script to draw a statistics graph and still wasn't done, and that was a 32GB quant I tested if I remember right.
So apparently Qwen has an additional eos token, <|endoftext|>. I was wondering why QwQ just refused to continue. Does GGUF just not support multiple eos tokens i.e. should I blame bartowski and quant it myself or is that a waste of downloading the safetensors?
>>103384542
If you're on ddr5 maybe you could run QwQ at okay speeds with speculative decoding.
I use qwq q4 with qwen coder 1.5b q5 as a drafting model, since I've only got 32gb ram and need at least 15~20k context because qwq is too chatty
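For reference, a llama-server invocation along those lines might look something like this. Filenames are placeholders and the draft flags are from the speculative decoding PR linked in the OP news; check `llama-server --help` on your build before trusting any of them:

```shell
# main model + small same-family draft model for speculative decoding
# (example filenames; tune -c and GPU offload flags to your hardware)
./llama-server \
  -m QwQ-32B-Preview-Q4_K_M.gguf \
  -md Qwen2.5-Coder-1.5B-Instruct-Q5_K_M.gguf \
  -c 16384 \
  --draft-max 16 --draft-min 4
```

The draft model has to share a tokenizer with the main model for the speculated tokens to be verifiable, which is why qwen coder 1.5b pairs with qwq.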
>>103384609
>because qwq is too chatty
Surely you edit the response to remove fluff before continuing?
>>103384679
I normally use llama.cpp's default frontend and it doesn't have editing.
>>103384571
As long as it knows php and js somewhat ok (for reasons) then I'm open to trying anything. Also if it's good at giving outlines and ideas for writing in general.
>>103384609
Nah, on ddr4 for now sadly but I'll have a look into it, thanks
>https://www.youtube.com/watch?v=5Fxw1DqZaYA
This video got rec'd to me today and I couldn't help but think about why exactly there have been 0 notable LLM-based versions of this idea or a similar thing. Like just a tamagotchi or pet game type thing that can interact with you in a truly dynamic way, with an LLM making decisions about what it should do. Like it doesn't even need to talk to you, it can just be a director for what goes on. And it probably doesn't even need a huge smart LLM either. Hell, even just some subtle LLM use in any kind of game at all. Literally nothing has caught on? Honestly, it baffles me a bit.
>>103384759
People don't want random games, they want predictable games they can play with a strategy.
>>103384759
Buy an ad
>>103384542
Qwen 32B or 14B at whatever quant you can handle
>>103384799
Minecraft is one of the most successful games of all time and has randomness.
A game that uses LLMs to direct some parts of it will still follow rules and conditions, it's not like it'll be temp 99999.
>>103384759
My tamagotchi's battery ran out. I wonder where I put it. It's an R2-D2 tamagotchi. He beeps on the hour, approximately (with batteries).
>>103383993
Wow they actually did
Fuck, that was actually okay for a free model, I was starting to think they forgot to remove it
>>103383993
unsubstantiated conspiracy theory I just came up with: it's because they're putting the GPUs towards training the distributed nous meme models
send teknium a lot of hate DMs on xitter and discord about it
After months of trying new (gpupoor) models, I still come back to finetuned LM3.1 for ERP storytelling. Just tried LM3.1 RP Dirty Harry and it's better than any other ERP model in my range. Am I wrong or what?
>>103385347
Tulu or tulips
QwQ is not just good at RP, it's amazing at RP and shines brightest in situations involving specific clothing, location, etc., anything that other models might forget to consider when writing scenes. Dismissing QwQ as an RP model is basically outing yourself as a promptlet.
>>103385379
It's just good all around.
Are the instruct models the only ones with safety features, or are those also added into the foundation models?
>>103384588
Just add it to the stop tokens
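For example, in an OpenAI-style completion request body (the prompt here is a placeholder, and field names depend on your backend), listing both of Qwen's end-of-turn markers as stop strings makes generation halt on either one:

```python
import json

# completion request with both Qwen stop markers so generation halts on
# either the ChatML end-of-turn token or the extra <|endoftext|> eos
payload = {
    "prompt": "<|im_start|>user\nhi<|im_end|>\n<|im_start|>assistant\n",
    "temperature": 0.7,
    "stop": ["<|im_end|>", "<|endoftext|>"],
}
body = json.dumps(payload)
```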
Remember when OAI was freaking out over GPT4 and refused to release any information about how it worked over national security fears?
>>103385379
>>103385389
Lies
>>103385422
Hi Sam.
>>103385379
We know. It's smart. It just doesn't have the amount of trivia knowledge that other models do. For some people this is more important as their RPs involve more of that kind of stuff, while for others their RP is about more standard things like MLP fanfics.
>>103385428
That I do agree with. The smarts still make it worth using with a lorebook but we need a 72B version. That or R1.
>>103385436
If QwQ was 72b we'd be in a golden age. I'd run out to buy a second 3090 without a second thought. It would pretty much fix any formatting jank it currently has by virtue of being a 32b, and be even better at using the context.
>>103385357
Not erotic enough. The RP Dirty Harry model was a lot better and more erotically to the point. Tulu is more softcore stuff that's all too common.
>>103385502
Pretty sure you're trolling and there is no dirty harry model, but the main reason to use tulu is that it is filthy descriptive while still being smart.
>>103385533
>there is no dirty harry model
Where are you looking? Isn't HF the main place for models? Where else are you searching for models?
>>103385533
https://huggingface.co/DavidAU/L3.1-RP-Hero-Dirty_Harry-8B
https://distro.nousresearch.com/
2d left? Isn't that really fast?
>>103385619
25B tokens isn't a lot.
>>103385619
MMLU: 23.71%
Ooof
>>103385632
It's not meant to be good!!! Just a proof of concept like Intellect, but by Nous! I'm sure there will be other decentralized PoCs that won't be meant to be useful either, get over it.
>>103385630
ah ok. Thought there was a catch somewhere. 20% in 2 days for a 15b model is fast.
Looking forward to the first 4chan model.
>>103385632
What was Intellect's MMLU at 75B tokens in, do you think?
>>103385643
I'm excited for training like this to become viable. Of course they only make safety slop and that will always be worse than the big players.
Once we not just finetune but train on the anal fart king and the other depraved shit in the latest magnum dataset, we get a retarded but very creative RP model.
>>103385661
It will only get much worse.
https://huggingface.co/blog/eu-ai-act-for-oss-developers
>Draft and make available a sufficiently detailed summary of the content used to train the GPAI model, according to a template provided by the AI Office.
>Implement a policy to comply with EU law on copyright and related rights, notably to comply with opt-outs.
>developers must notify the EU Commission. In the notification process, developers can argue that their model does not present systemic risks because of specific characteristics.
>Obligations for GPAI models will be enforced starting August 2025.
Linux sisters...
>>103385683
What happens if you put the main node in some shithole country and everybody connects with a vpn?
I really hoped we'd get good AI in time. Seems like the hour is late.
>>103385661
It's ok, we will eventually get a not-too-bad reasoning dataset out in the open, like what they used to train QwQ, then we can sprinkle that in along with a bunch of other datasets that are known to make models creative but also still not dumb.
Then we will have Claude at home for realsies.
>>103385705
We don't even have non "sloped 2023 gpt/claude" datasets anon.
>>103385697
If the model is available to download in the EU it must comply.
>>103385714
claude doesn't either. Rip everything like they did that is not already ripped.
>>103385683
Down with the EU
>>103385719
so same as games then, what a joke.
Can't wait for model pirate sites to turn up for eu cucks. lol
>>103385714
For pretraining? Just get all the copyrighted shit you can, plus don't filter out sites like 4chan and other image boards.
For fine tuning, just get Tulu, Nemotron, and the future reasoning dataset that will eventually happen, filter the slop and refusals (at least the worst of them) out. It'll be good enough. No one's expecting Sonnet 3.5, but if we can get an old Sonnet or a Haiku in terms of level of intelligence and noncensorship, that's still a good thing and no other open model by a corpo would be able to match it.
>>103385683
EU fag here, I'm so sorry. Usually we try to fuck over big companies, but this is a huge L
I am trying to get RAG working. llama.cpp dropped support, ollama is insisting that it convert every local model, kobold doesn't have RAG. Is there a local solution?
Does anyone know what exactly is lost when using a Q8 cache over an FP16 cache?
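Roughly: an 8-bit cache stores each block of KV values as int8s plus a shared scale, so what's lost is per-value rounding error bounded by half a quantization step. A toy round-trip sketch of the idea (not llama.cpp's actual kernel, just blockwise 8-bit quantization in the spirit of Q8_0):

```python
# toy blockwise 8-bit quantization: a block of floats becomes int8 values
# plus one float scale; the "loss" is the rounding error on the way back
def q8_roundtrip(block):
    scale = max(abs(x) for x in block) / 127 or 1.0
    quants = [round(x / scale) for x in block]   # the int8 payload
    return [q * scale for q in quants], scale    # dequantized values

block = [0.1337, -0.42, 0.9999, -0.0001]
restored, scale = q8_roundtrip(block)
worst = max(abs(a - b) for a, b in zip(block, restored))
# worst-case per-value error stays under scale / 2
```

In practice that error is small enough that Q8 cache is usually hard to tell apart from FP16 in output quality, while halving cache memory.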
>>103386247
about half
AI Meets Antimatter: Unveiling Antihydrogen Annihilations
https://arxiv.org/abs/2412.00961
>The ALPHA-g experiment at CERN aims to perform the first-ever direct measurement of the effect of gravity on antimatter, determining its weight to within 1% precision. This measurement requires an accurate prediction of the vertical position of annihilations within the detector. In this work, we present a novel approach to annihilation position reconstruction using an ensemble of models based on the PointNet deep learning architecture. The newly developed model, PointNet Ensemble for Annihilation Reconstruction (PEAR) outperforms the standard approach to annihilation position reconstruction, providing more than twice the resolution while maintaining a similarly low bias. This work may also offer insights for similar efforts applying deep learning to experiments that require high resolution and low bias.
posting for Johannes
>>103386178
SillyTavern has a RAG databank. I have been using it extensively in place of World Info. Works decently on Mixtral. Of course, tested for RPing so it might not be what you need.
>>103386340
Does this increase t/s?
>>103386356
>>103386356
>>103386356
>>103385878
Not your fault, that fag Sam has been going around fearmongering. Even China is not exempt from his scare tactics.
>>103379762
I couldn't find any jsons in the archives so here, this assumes you're using mistral's API. Nothing too fancy.
sliders
>https://files.catbox.moe/jvuuh2.json
context, instruct, system (using master export, yes go update your fucking ST)
>https://files.catbox.moe/h66z27.json