/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102524339 & >>102513868

►News
>(09/24) Llama-3.1-70B-instruct distilled to 51B: https://hf.co/nvidia/Llama-3_1-Nemotron-51B-Instruct
>(09/18) Qwen 2.5 released, trained on 18 trillion token dataset: https://qwenlm.github.io/blog/qwen2.5/
>(09/18) Llama 8B quantized to b1.58 through finetuning: https://hf.co/blog/1_58_llm_extreme_quantization
>(09/17) Mistral releases new 22B with 128k context and function calling: https://mistral.ai/news/september-24-release/
>(09/12) DataGemma with DataCommons retrieval: https://blog.google/technology/ai/google-datagemma-ai-llm

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>102535977
fuck your anime garbage
>inb4 anime site
kys
►Recent Highlights from the Previous Thread: >>102524339

--Papers:
>102527814
--Quantization turns floats into ints, smaller quant = faster and lower VRAM:
>102531358 >102531397 >102531399
--JavaScript code to linkify greentext quotes in threads:
>102525946 >102526315 >102527428 >102530573
--Discussion on the timeline, definitions, and challenges of AGI and ASI:
>102525653 >102525785 >102526071 >102526499 >102526932 >102527674
--Working on a userscript to treat different symbols as quotes:
>102525273 >102525299 >102525393 >102525519
--Custom Floating-Points for LLMs, but results may be model-specific and lack statistical significance:
>102531991 >102532241 >102532256
--Choosing Lora rank and alpha, understanding Loras, mathematicians vs. engineers:
>102532942 >102533148 >102533195 >102533307 >102533287 >102533302 >102533338 >102533423 >102533538 >102533570 >102533627 >102533559 >102533602 >102532973 >102532996 >102533100
--Anon shares concerns about cloud AI services logging and safety:
>102531752 >102531956 >102532242
--Nvidia's Llama-3_1-Nemotron-51B-Instruct model discussion:
>102524761 >102524862 >102525022 >102525041 >102525306
--Choosing between Aphrodite, vLLM, and llama.cpp based on hardware constraints:
>102529626 >102529816 >102529861 >102529888
--Request for de-slopped Llama3 405B tune for AMD MI300x:
>102525233
--Chromebook insufficient for serious AI inference:
>102530210 >102530784 >102530860 >102532235
--Anon praises Florence2 multimodal architecture:
>102532950 >102533001 >102533091 >102533288
--Miku (free space):
>102524999 >102525111 >102525197 >102525335 >102527437 >102532158 >102533585 >102534092

►Recent Highlight Posts from the Previous Thread: >>102524347
any abliterated Qwen2.5 72B?
it refuses to do certain wholesome and family friendly things.
>>102535991
I want to fuck the anime girl.
Happy autumn, /lmg/
Thanks but I'm still sticking with Command-R v01
>>102536036
Why are her eyes empty?
>>102535991
anime site
>>102535999
Can an AI create that image you attached to the post, or does a human still have to make that manually?
>>102535991
>fuck your anime garbage
OP needs an image to start the thread. What would you put there instead if you had your druthers?
Honest question, I'm curious.
(I'm assuming an actual answer and not a content-free negative eg "not anime")
Is Autumn the season of hibernation?
>>102536081
There's a body under the leaves and you are too close.
>>102536127
neither, just a normal script does that part
>>102536036
rake teto
>>102536036
Isn't it too early?... Oh, wait, it's almost October already!
>>102536036
tetoctober soon
>>102536179
>fall red eyes
I like this Miku
>>102536128
lecunny. dead body of sama. llama. graph from an llm paper. model card screenshot.
>12 channel DDR5-6400
>or 12 channel MRDIMM-8800
>PER SOCKET
epycbros I don't feel so good..................
>>102536272
>lecunny
Yes, because we should revolve the entire general around a person that spends all day being passive aggressive on social media.
>dead body of sama
blue board
>llama
Meta will eventually turn against open source. They all do.
>graph from an llm paper. model card screenshot.
Mostly empty image that no one will spend any time looking at. If you want to be a pseud and virtue signal your intelligence, you can fuck off back to r-ddit.
Anime. Website.
teto
>>102536237
It's a meme, bro, you won't even get half of the speed
>>102536277
You are just arguing for the sake of arguing. All of those options are more on topic than hatsune miku(male). And that makes all of them better.
>>102536349
Is this Academic bullying?
>>102536215
>dead body of sama
kino
i think I've seen all the others in past threads, though
Thanks for the honest answer. I personally like variety in the OP, especially when the image is riffing off of a previous thread's conversations/news. Vocaloids make good stand-ins for pretty well any scenario, so I think that's partly why they get used (along with the whole virtual idol/AI thing being appropriate to LLMs)
I am still curious as to why you hate anime images so much, though. I'm sure there's some other thing out there that would annoy me in the same way, but your reaction still puzzles me
>>102536237
Israel isn't a real country.
>>102536349
free energy not bullying
>>102536308
>It's a meme, bro, you won't even get half of the speed
like epyc? 30% cheaper and 50% slower (at least on any hard math, specifically atan2) kek
reminder anthracite spent all their money on failed finetunes and now can't even pay their shills
why aren't there ever Macross references in this thread? There are lots of other famous AI type shows out there
I know we get some Blade Runner images sometimes, but it feels like there's a lot of other fertile ground that's being ignored
>>102536215
>no anime
>model card screenshot
wut
>>102536334
>miku is off topic
>but let me sperg out about her and bump the thread
double wut
>>102535991
I’d just like to interject for a moment. What you’re referring to as Anime, is in fact, a Vocaloid, or as I’ve recently taken to calling it, ボーカロイド.
Reminder that you said you were trying to quit trolling, Evan.
>>102536355
I just hate the local troons that gather around that image. I actually like miku songs.
>>102536366
I like magnum-123B and will willingly shill it for free.
>>102536237
>epycbros
I've gotten enough good use out of my dual EPYC build for a relatively cheap build price that I've got no regrets
Tech marches on. Glad Intel is putting something better out there. Hopefully there's some way to get ahold of it for less than 6 figures.
Good luck to future cpumaxxers!
i am starting to feel like taking a blacked miku shit...
calm down cuda dev
>>102536407
>local troons
This isn't actually a thing is it? I haven't seen non-troll tranny references in years in this general
>>102536355
He is just butthurt that there are a few VRAM chads itt that like genning mikus
He's been spilling spaghetti all over this general for a while now, see >>102525042 >>102525071
>>102536460
>in years in this general
this general is only a year old retard
How much data do I need to make a finetune worth anything?
>>102536472
3
>>102536460
It is and one mikufaggot here was doxing people when he thought it was one of the anti-miku trolls.
>>102536416
it was a bit of a shock to come back to this site after a decent number of years and see all this obsession over trannies in any thread over the slightest thing. when no one used to talk about them at all.
>>102536482
Is there a unit to that, or is data a dimensionless quantity?
Bootstrap. Iterate. Bootstrap. Iterate. And one day we get ASI. Meanwhile Lecun basically admitted defeat saying it's too hard. Hope I can see his AGI cat in ten years.
>>102536494
>when no one used to talk about them at all
Because they were rightly ridiculed and called out for being mentally ill instead of having extra privileges.
>>102536511
yes
>>102536511
3
>>102536519
Do not care. Not your personal army. Fuck off.
>>102536472
as few as 1000 samples can get you a meaningful result, as per the lima paper. more is better though, ideally you should do as much as you can as long as your data is all good quality. high number of good quality samples > low number of good quality samples > high number of low quality samples
>>102536517
JEPA will R U I N you.
I can't wait to see your reaction.
>>102536578
JEPA is vaporware
>days have passed
>he's still upset that I accurately pointed out all mikuposters are pedophiles
seethe harder nonce
>>102535991
>inb4 anime site
Correct. Now go kill yourself, reddit troon
>>102535977
>70B-instruct distilled to 51B
What do I need to imagine here exactly, 70b TYPE quality in the form of 51b? I guess that would be more ideal at Q6-8 or whatever instead of 70b at Q4?
>>102536804
imagine lobotomized slop
>>102536816
Google's managed to do distillation surprisingly well.
>>102536804
It's a sort of calibrated type of pruning followed by knowledge distillation, so it should come pretty close, in theory at least, although probably not on every domain.
gpt voice is rolling out for real this time
>>102536823
Interesting idea to say the least, the question is how well it worked out for them.
>>102536834
I set up gemma2 with Whisper, a shell enabled dialog engine, and Festival and I can say without a doubt that it is the most frustrating, clumsy way to use a computer.
Fucking toggling in machine code on a switch panel would be more ergonomic.
Are the new CR's that bad?
>>102536885
Probably about the same as every other distillation attempt. Great on benchmarks, but retarded for any practical usage.
>>102536895
>Are the new CR's that bad?
They lost their only differentiating factor when they started chasing benches with slop datasets. They don't have the same soul that some could squeeze out of the first gen.
>>102536895
The CR+ refresh is a side grade at best
>>102536895
Cohere is dead. After what has happened to Mistral and the Chinese models, the only hope for local at this point is that Jamba 2 can actually carry its (literal) weight.
>>102536986
>Israel is our only hope
how many months has the Jamba 1 pull request been festering?
>>102536908
>great on benchmarks
what do I buy to invest in this brilliant model!?
>>102536535
You sound like a troon. Their main strategy is trying to make people ignore them as they take more and more power.
anyone complaining about anime here is a newfag who desperately needs to go back
it's specifically the mikuposters who need to face the wall
>>102537316
>complains about anime
>>102537289
>Not my personal army!? YOU SOUND LIKE A [buzzword]!
And you sound like a child. You need to be 18 to post here, also not your private army, faggot.
Slow day huh?
>>102537405
>>102536366
>102537390
miku is the vocaloid mascot, newfag. go back.
>>102537390
>replying to the resident schizo's schizobabble
>>102537405
slow month, slow year
not looking good for local models
it's actually over this time huh
good
>>102537625
it was over the moment troons took control of this place. current state of it was just a matter of time.
>102535999
Fix the fucking references.
the poster above me needs to go back
the poster above me needs to kill himself
In retrospect, what went wrong?
the poster above me is really cute :3
Someone is desperate to derail this thread. They REALLY don't want us speculating about Meta's big announcement tomorrow. I wonder why? Who benefits from this...?
>>102537670
Faggot. Drummer.
>102537647
>>102532154 >>102478518
>tldr can't have more than 9 mentions now, probably cause of the "ever wonder why" poster
>102537625
>102537635
>102537654
>102537662
>102537668
>102537670
what's up with these gay ass posts? embarrassing.
>>102537714
oh i get it, miku shitter is mad because OP pic is not miku this time, lol
What is the prompting secret sauce so that characters know they can't look you in the eye while turned away, and other such things?
>>102537742
not using Magnum 12B
>>102537742
About 100b promptameters
https://x.com/OpenAI/status/1838642444365369814
What's a good llm to run with Sillytavern and simulate an online chat with my imaginary waifu?
Right now I'm running the NemoMix-Unleashed-12B-Q6_K.gguf which is kinda alright but some messages are really weird
Terrible voices
>>102537792
largestral
>>102537711So you are saying he is ban evading and trying to skirt the rules? Shouldn't some janny take care of this?
>>102537792
Try these
Mistral Small
Mistral Large
Hermes-3-Llama-3.1-70B
Gemma 2 27b
Midnight Miqu 70B
older version of Command-R
Mixtral 8x7B (fast on CPU for its size)
>>102537792
>>102537750 >>102537764
What's the solution if I'm poor?
>>102537850
>So you are saying he is ban evading and trying to skirt the rules? Shouldn't some janny take care of this?
resolving uncomfortable or difficult people issues through ham-fisted technological means is a tried-and-true method used by lazy managers everywhere. Bonus points if it makes the world worse for everyone else.
>>102537862
Buy more ram
>>102537861
That's just average local turd experience.
>>102537776
People bullshitting on fossjeet here: https://x.com/reach_vb/status/1838645845652332955
>>102537862
>What's the solution if I'm poor?
Assuming "dont be poor" is too far out of reach for you, then "be more patient" is typically the fallback.
Alternatively, you could also sell your soul to online services
>>102537813 >>102537856
Thanks, I'll check them out
>>102537862
Use a different 12b.
>>102537856
don't forget the mixtral 8x22b Wizard LLM finetune
So many things depend on hardware specs tho
>Model scopes for Vector Storage will be enabled by default in the next release. Opt-in earlier by setting enableModelScopes to true in the config.yaml file. This will require to regenerate stored vectors.
i enabled it with a previously made db and it didn't seem to regenerate. is this normal or am i expected to purge old ones first? usually this kind of migration stuff is automatic
Best base text continuation model for 40gb VRAM + 64gb RAM?
>>102538092
Mixtral
>>102538102
don't listen to this retard, download Magnum 2.5 Kto
I'm going to eat 7 hitlerbars
>>102538102
Mixtral is worse than even nemo.
>>102537856
I'd recommend Hermes 2 over 3. 3.1 can be strange.
I like slop
>>102538124
think of the children
>>102538194
qwenbro...
>>102538205
Sure they'll also get some hitlerbars
>>102538194
I don't mind it if the model is doing great otherwise.
if that's the price of not reading how a girl gives me a blowjob while kissing my lips softly, I'll gladly take it.
>>102538194
Go away woman. Fuck some chad.
I do not know what slop even is.
>>102538194
Slop likes you too :)
>>102538250
thank you :)
>>102538258
Look into the mirror anon
>>102537862
Install russian online super-RAM
>>102538194
based
>>102538296
What kind of weird potato is that?
>>102536237
Time for a tetomaxxing guide
>>102538321
the fluffy kind
If the models have a rolling window, why do they still go schizo when nearing the limit of the allotted context? Mind you, I am using 32k context size. Am I misunderstanding what a rolling window means when it comes to LLMs?
24 hours from now, Llama, and thus local, will be saved.
>>102538446
They only released 3.1 a month ago. 4 isn't coming until next year.
>>102538446
two more years
>>102538428
What model? If it's something like nemo it's only good to 16k. So try setting it to that if you're using context shifting.
Is there a way to let an llm search the web on its own if it realizes that it doesn't have enough information about a topic?
Let's say the cut-off date is 2023 and I'm asking "Tell me what happened in the year 2024"; the LLM will then give the answer and reflect that this is wrong or useless information and will perform a web search instead.
>>102538194
they hated him because he was the same as them
>>102538498
function calling
wait, you guys have a local schizo too? i thought that was just /sdg/ and /ldg/
>>102538194
good slop:
>half-lidded eyes
>shivers
bad slop:
>ministrations
>don't think this means anything, i still
>>102538446
Tacked on multimodal won't save anything.
>>102538573
local schizoid general
>>102538498
>Is there a way to let an llm search the web on it's own
Yes. read up on function calling.
>if it realizes that it doesn't have enough information about a topic?
They have no introspection. They don't know what they know, for a very generous definition of knowledge.
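To expand on function calling since it keeps coming up: the host loop, not the model, does the searching. A minimal sketch of the control flow — the JSON call format and the web_search tool are made up for illustration, and fake_model stubs out the actual llama.cpp/API call:

```python
import json

# Stub tool: in a real setup web_search would hit an actual search API.
TOOLS = {
    "web_search": lambda query: f"[search results for: {query}]",
}

def fake_model(messages):
    # Stand-in for a real completion call. A real model decides on its own
    # whether to emit a tool call; here the branch is hardcoded to show flow.
    if not any(m["role"] == "tool" for m in messages):
        return json.dumps({"tool": "web_search",
                           "arguments": {"query": "events of 2024"}})
    return "Based on the search results, here is what happened in 2024: ..."

def run(prompt):
    messages = [{"role": "user", "content": prompt}]
    while True:
        reply = fake_model(messages)
        try:
            call = json.loads(reply)   # a tool call comes back as JSON
        except json.JSONDecodeError:
            return reply               # plain text = final answer
        result = TOOLS[call["tool"]](**call["arguments"])
        messages.append({"role": "tool", "content": result})
```

The "realizes it doesn't know" part is the weak link: the model only emits the tool call if its training/prompting pushes it to, exactly because of the no-introspection problem.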
>loss going down
>eval loss going down
>epoch 0.5
I'm feeling it! This time I will make the best model ever.
>>102538573
Every general has resident schizos, simply the way it be
>>102538596
i'll use your model if it's 12b or under
>>102538596
i'll use your model if it's 70b or over
>>102538333
Checked
>>102538583
>ministrations
I've never even heard the word "ministrations" irl
>>102538583
>shivers
Used too much but not necessarily bad.
Another one is "a mix between"
>>102538664
me neither, and i'm a pretentious pseudo-intellectual sesquipedalian scrabble player
>>102538664
It is a word invented specifically for harlequin romance aimed at women.
>>102538641
try sonnet, you'll never go back to localcucking
>>102538775
this is /lmg/ retard
>>102538775
>$0 / month
>Access to Claude 3.5 Sonnet
What's the catch?
>>102538799
They train on your logs, also you get rate limited
>>102538793
>retard thinks local only applies to language models
this is your brain on 8k context
>>102538854
yes, to a local models general
now fuck off to your proxy before it croaks and you have to cook up another piss drinking video, faggot
>>102538854
>Local MODELS general
Kys shill
>>102538883 >>102538885
mald more, you will never have local gpt-4o capable AI.
>>102538908
r u sure
>>102538883
you're an absolute retard. you can use non-local text models with other local models like image gen and they are a million times better
>>102538908
>you will never have local gpt-4o capable AI
>ClosedAI so afraid of local that they're banning people for trying to reverse engineer a prompt
Back to /aicg/ little pajeet
>>102538922
>image gen
????
not a general for this either? did your mother drink excessively during pregnancy or something?
>>102538922
Then go to a thread for image gen, or aicg, not one meant literally for llm's
But how will I know when I get there?
And how will I know when to leave?
>>102538922
>a general dedicated to the discussion and development of local language models.
>>102538927
>afraid
funny headcanon
>Miku, get the locust spray
>>102538922
>you can use non-local text models with other local models
how?
>>102538940 >>102538941 >>102538958
discussing how trash local models are in comparison is discussion. you can't possibly be this dumb
>>102538959
>No mention of GPT5, No mention of Sora, No mention of GPTo with voice enabled
>Months of work for a COT finetune
>Btfo'd by Qwen in coding
>Btfo'd by Sonnet in literally everything else
>Sama seething on twitter
Kek.
>>102538995
SillyTavern with local TTS connected to claude for example.
>>102539010
nice fanfic
>>102539000
No faggot, you interrupted an actual discussion about local by saying just use Sonnet
>>102539010
>No mention of GPTo with voice enabled
r you blind? >>102537776
>>102539032
My bad, they delivered on one of their promises after months, OpenAI is back and Sama def wasn't dilating on twitter
>>102539030
it's the best solution. if you can't handle discussion on 4chan, try reddit, you can downdoot facts that infuriate you.
>>102539032
I can see fine, I just can't hear, so how could I know that retard?
i have a 6700 XT 12 GB and an i7-10700k. is there anything i can run decently local or do I need a nvidia gpu?
>>102539051
Openai's tech actually works though.
>>102539093
Look into the rocm koboldcpp build.
You can run nemo-instruct at a decent quant with a good amount of context.
>>102539057
Best solution for the tards at /aicg/
This is local models retard
the only schizo in this general is the schizo who calls everyone a schizo
magnum shills have been real quiet ever since anthracite ran out of money huh
>>102539118
>This is local models retard
oh the horror
>>102539154
Good riddance
>>102539154
Rocinantesisters we won
>>102539154
money for what? I thought they got all their compute undeserved
>>102538664
I have, it is mainly used in religious contexts.
>tard squad finetunes a shitty base
>makes it marginally better
>/lmg/ opens their wallets
>tard squad finetunes larger models
>/lmg/ realizes the dataset sucks
>pretends they never liked tard squad
happens at least 4 times per year
>>102539154 >>102539173
How does one (1) guy alone absolutely BTFO anthracite so much?
>>102539250
hi drummer
>>102539256
Hi Sao. I am not Sao. You are Sao.
>>102539269
unironically Drummer
>>102539219
They can't scam me if I don't have money to begin with.
What if the final solution model never comes? What if it's a perpetual state of new, slightly smarter, differently slopped models you can kinda enjoy for 2-3 roleplays before you see everything they tend to repeat and you can't take it anymore? And you will have to keep 100 models around to swap between to get different styles?
>>102539312
Go back to Pyggy and see if things have improved or not
>>102539312
That sounds like a you problem.
I can't make sense of this thread at all and I consider myself pretty knowledgeable about open source LLMs. What the fuck are y'all talking about?
>>102539323
The cooming plateau is here.
Saars in their natural habitat are funny.
Can I run Qwen2.5 32B 4.65bpw or even 5.0bpw on a 3090?
>>102539682
check filesize
>>102539693
4.65bpw is 20GB and 5.0bpw is 21.68GB, but context also takes some space so I'm not sure.
Is there any way to speed up context loading? At the cost of extra ram perhaps?
>>102539729
You need to enable turbo mode.
>>102539729
You need to download more FLOPS
>>102534097
Flux D 1.0
can i run this stuff on amd cards? rx 7900.
I dont want to train anything, just want to play around with image and chatbot.
I dont mind compiling stuff and digging through forum posts, but I'm not sure if this is a complete fool's errand
>>102540000
>>102539117
>>102540000
ye
i have a fun time with just 8gb on my shitty 4060 so you'll probably have a blast with your 16-20gb (or whatever) even if it's stinky amd
small question, if i want to make a lora of a model, does it need to be the pure safetensors file or can i use a gguf?
>>102539803
The Dota 2 Turbo mode?
>>102540090
The turbo mode you enable on your pc case.
Best model for erotic RP? I'm not sure what's the latest stuff
>>102540176
this one's my current favorite
https://huggingface.co/mradermacher/Arcanum-12b-GGUF/tree/main
it's not leaps and bounds over other nemo merges or anything though.
>>102540130
I only have a turbo mode on my gamepad, but it's connected to a pc. Does that count?
>>102536215
Image genned with a local model, what's the problem anon?
>model card screenshot
Ah so what you really want is free real estate to shill, fuck right off
>>102540239
back to cage >>>/a/nimal
>>102540196
>combining TheDrummer/Rocinante-12B-v1.1 and MarinaraSpaghetti/NemoMix-Unleashed-12B using a novel merging technique.
>novel merging technique.
Without a proper cooming testing methodology why does this mean anything?
>>102540176
mistral nemo / mistral small / mistral large
Biggest you can fit.
>>102540266
Checked
>>102535977
>>102540324
kek
>huggingface.co/gghfez/SmartMaid-123b-exl2
New largestral slop dropped
no fp16 weights???
>>102539312
imo I just need these models to have better spatial reasoning/world models
>>102540969
>maid
Undislop?
>>102540969
Buy an ad
>>102540267
>why does this mean anything?
it doesn't.
Small but commendable performance improvement on code generation: https://arxiv.org/html/2309.02772v3
On this topic, do you guys know about anything else that could improve code generation?
>>102541017
No thanks, rabbi. I think instead I'll post whatever the fuck I want.
>>102539312
things are about to accelerate
>>102541159
If I had to guess, newer models trained on datasets from COT models will probably increase coding benchmarks significantly.
>>102541543
pls tell me I won't have to work ever again and can instead live my life doing things I actually enjoy
>>102541713
That's communism
>>102541722
I'll take it as long as it doesn't turn into authoritarian garbage.
What's the most intelligent, creative, soulful model for RP currently?
>>102540049 >>102540079
cool thanks anons
>>102538583
the only good slop is the one ood
Merging LoRAs like Playing LEGO: Pushing the Modularity of LoRA to Extremes Through Rank-Wise Clustering
https://arxiv.org/abs/2409.16167
>Low-Rank Adaptation (LoRA) has emerged as a popular technique for fine-tuning large language models (LLMs) to various domains due to its modular design and widespread availability on platforms like Huggingface. This modularity has sparked interest in combining multiple LoRAs to enhance LLM capabilities. However, existing methods for LoRA composition primarily focus on task-specific adaptations that require additional training, and current model merging techniques often fail to fully leverage LoRA's modular nature, leading to parameter interference and performance degradation. In this paper, we investigate the feasibility of disassembling and reassembling multiple LoRAs at a finer granularity, analogous to assembling LEGO blocks. We introduce the concept of Minimal Semantic Units (MSUs), where the parameters corresponding to each rank in LoRA function as independent units. These MSUs demonstrate permutation invariance and concatenation-summation equivalence properties, enabling flexible combinations to create new LoRAs. Building on these insights, we propose the LoRA-LEGO framework. This framework conducts rank-wise parameter clustering by grouping MSUs from different LoRAs into k clusters. The centroid of each cluster serves as a representative MSU, enabling the assembly of a merged LoRA with an adjusted rank of k. Additionally, we apply a dual reweighting strategy to optimize the scale of the merged LoRA. Experiments across various benchmarks demonstrate that our method outperforms existing approaches in LoRA merging.
might be cool
no code though so w/e
>>102541866
Same as "you reached context limit - enjoy OOM moment or extreme hallucinations".
>>102541824
Seconding this but it needs to fit onto 24 GB of VRAM without stepping below 8-bit quantization.
>>102541824 >>102542255
mythomax
>>102542255
No it needs to fit into 64G of ram
slop is soul and I'm tired of pretending it's not.
what's the best model for flirting with a venezuelan math teacher while I roleplay as a homeless black midget pretending to be a middle schooler?
>>102542298
Probably something by anthracite
>>102542290
>buckbroken
>>102542275
Thank you, Anon.
Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts
https://arxiv.org/abs/2409.16040
>Deep learning for time series forecasting has seen significant advancements over the past decades. However, despite the success of large-scale pre-training in language and vision domains, pre-trained time series models remain limited in scale and operate at a high cost, hindering the development of larger capable forecasting models in real-world applications. In response, we introduce Time-MoE, a scalable and unified architecture designed to pre-train larger, more capable forecasting foundation models while reducing inference costs. By leveraging a sparse mixture-of-experts (MoE) design, Time-MoE enhances computational efficiency by activating only a subset of networks for each prediction, reducing computational load while maintaining high model capacity. This allows Time-MoE to scale effectively without a corresponding increase in inference costs. Time-MoE comprises a family of decoder-only transformer models that operate in an auto-regressive manner and support flexible forecasting horizons with varying input context lengths. We pre-trained these models on our newly introduced large-scale data Time-300B, which spans over 9 domains and encompassing over 300 billion time points. For the first time, we scaled a time series foundation model up to 2.4 billion parameters, achieving significantly improved forecasting precision. Our results validate the applicability of scaling laws for training tokens and model size in the context of time series forecasting. Compared to dense models with the same number of activated parameters or equivalent computation budgets, our models consistently outperform them by large margin.
https://huggingface.co/Maple728
Only the smallest 50M model has been uploaded so far
https://github.com/Time-MoE/Time-MoE
300B timepoint dataset still to be released
>>102542314
>anthracite
the slop brigade? no thanks I don't want the model forgetting I'm a black midget every swipe.
Well now that's an interesting result. I was expecting a lobotomized model. It's certainly forgotten what an EOS token is, though.
L3.1-70B-Hanami seems good so far. 3.1 smarts but it seems to be breaking its dryness.
>>102542430
I seem to have created one of those man made horrors beyond your comprehension.
>>102542430 >>102542447
commaslop
>>102542430
Which model?
>>102542458
Some qlora I ran on Mistral-Small-Instruct
an experiment in using an extremely high dropout rate.
>>102542430 >>102542447
>She
>Her
>She
>Her
>She
>She
>Her
>She
You realize you're coming up on the 16k context limit. Do you:
1. Keep going, trusting that discarding the start of chat history will be fine
2. Switch to a lower quantization of your current model so you can increase the context without a big slowdown
3. Increase the context at the price of having to offload more to RAM, drastically slowing down
4. Summarize the chat and restart
5. Other (write your own)
I don't load my models with 16k context
>>102542492
lmao
Whisper in Medusa's Ear: Multi-head Efficient Decoding for Transformer-based ASR
https://arxiv.org/abs/2409.15869
>Large transformer-based models have significant potential for speech transcription and translation. Their self-attention mechanisms and parallel processing enable them to capture complex patterns and dependencies in audio sequences. However, this potential comes with challenges, as these large and computationally intensive models lead to slow inference speeds. Various optimization strategies have been proposed to improve performance, including efficient hardware utilization and algorithmic enhancements. In this paper, we introduce Whisper-Medusa, a novel approach designed to enhance processing speed with minimal impact on Word Error Rate (WER). The proposed model extends the OpenAI's Whisper architecture by predicting multiple tokens per iteration, resulting in a 50% reduction in latency. We showcase the effectiveness of Whisper-Medusa across different learning setups and datasets.
https://github.com/aiola-lab/whisper-medusa
kind of cool. the medusa-block is probably the one to use.
>>102542430 >>102542447
Also here's an old forgotten arxiv paper on it. https://arxiv.org/abs/2403.00946
In their experiment they used 90+% dropout, but that was back when finetuning was still done layer by layer I think.
I tried 90% at first but it was instant lobotomy so I dropped both the learning rate and the dropout to 75% and yeah... I think the results are interesting and worth exploring further.
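For anyone wanting to replicate: in a peft-based QLoRA setup the experimental knob is just lora_dropout, which normally sits around 0.05-0.1. The rank, alpha and target modules below are placeholder values, not the anon's actual config:

```python
from peft import LoraConfig

# High-dropout variant of an otherwise standard QLoRA config.
# 0.75 is the value that reportedly worked; 0.9 was "instant lobotomy".
# r / lora_alpha / target_modules are guesses for illustration only.
config = LoraConfig(
    r=64,
    lora_alpha=64,
    lora_dropout=0.75,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```

Remember to lower the learning rate alongside it, like the anon did.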
You realize the woman you're having sex with is actually a man. Do you:
1. Keep going, it's too late to take it back anyway
2. Switch to your hand so you don't get blue balls
3. Kill him and hope your state has gay panic laws
4. Bend over and give him a turn
5. Other (write your own)
>>102542586
Qwen2.5
>>102542492
Summarize and keep going until the model's stupidity drives me insane and I do something else for a few days.
sovl
Interesting. It seems to have undone Arthur's cook-in of the correct answer.
>>102542492
>all that to get 10% of a cloud model's power, at most
>>102542492
Keep going; I only summarize when the current scene has run its course.
>qwen 2 vl says retarded things
2.5 vl when?
>>102542740
A cloud model is useless because you're at a corporation's mercy. Nobody here is interested in your shilling.
>>102542492
2. Increased to 32k context and now I'm going at 2.5 tokens per second.
>>102542756
Fuck, I mean 3. Undo.
>>102542740
No shilling, just telling it as it is: you will never have anything usable with these toys.
>>102542724
localbros, how do we respond without sounding mad?
>>102542789
Keep making "ahh ahh mistress" one-message tests, I guess?
Would a Q2 Qwen 72B be better at programming than a Q5 Gemma 27B?
>>102542756
There is no undo. Face the consequences of your actions and take responsibility.
>>102542492
Due to the limitations of local, I tend to keep my roleplays episodic in nature while keeping the overarching themes intact via either RAG or lore book maintenance. Option 4 is perfect in that regard. For programming or more serious, "normie-friendly" projects where my own ideas or privacy don't matter, I always opt for cloud.
>>102542820
It's over. After 19730 tokens of context a switch flipped and the model repeated its last reply like a broken robot in a TV show. Zero regens until now. Zero edits until now. Is my run over? I guess I should have done >>102542636 >>102542830
>>102542851
Claude doesn't have that problem.
Gemma doesn't know how to make ASCII art.
What model can do decent ASCII art?
>>102538573
Don't forget /aids/, they have multiple.
I'm writing a small script (https://github.com/battleprogrammershirase/BUERgence) to quickly narrow down the best inference parameters for llama.cpp. Right now I'm only testing -t and -ngl, since these seem to have the biggest impact on performance. Are there any other parameters I'm missing out on, especially as a VRAMlet?
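Batch size is one obvious extra knob, and mmap behavior can matter on low-RAM boxes. A minimal sketch of how a grid sweep over those could be wired up; the -t / -ngl / -b flag names are assumed from llama.cpp's llama-bench tool, so verify them against your build before trusting the commands:

```python
import itertools
import shlex

def sweep_commands(model_path, threads, ngl_values, batch_sizes):
    """Build one llama-bench invocation per parameter combination.

    Flag names (-t, -ngl, -b) are assumed from llama.cpp's bench tool;
    adjust if your build differs. Run each command, parse the reported
    tokens/s, and keep the best combination.
    """
    cmds = []
    for t, ngl, b in itertools.product(threads, ngl_values, batch_sizes):
        cmds.append(f"./llama-bench -m {shlex.quote(model_path)} -t {t} -ngl {ngl} -b {b}")
    return cmds
```

You'd feed each command to subprocess.run and rank by the benchmark output rather than eyeballing runs by hand.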
>>102542830
I dislike lore books because they can't affect the first message where their keyword appears if the keyword was in an AI response. This problem isn't theoretical for me; it was an actual case in my last chat, using a Monster Girl Encyclopedia lore book I was trying to improve. While talking about something else, the model started going on about werewolves and weresheep, because they're things that reasonably could (and do) exist in the setting, and it wrote a bunch of stuff that contradicted MGE lore.
Not-great solution: when a new lore book entry would be triggered by the newest AI post, immediately regenerate it with the additional entry.
My solution: stop caring about MGE lore, because it's bland and many of the descriptions are the same thing.
>>102542492
5. Thank God that He created me with both the intelligence and the drive to not be a poorfag.
>>102540239
These Tetos are always interesting to admire.
>>102542851
Yeah, it's falling apart. The regen was fine, the next message questionable, and the one after that was back to the time loop. RIP. I guess the adventure is done. Even if I switch to a more powerful model, the writing style and ideas of how the story should work won't be the same. It will be like someone else took over all of a sudden. Maybe I can delete all the example messages to get another 1.2k of context to try to limp along to a conclusion, but around 19k tokens looks like the limit for Mistral Small.
>>102542513
Even if you set it longer, recall isn't as good past 16k for a lot of models.
I hate to say it, but I think I might really go back to Wizard or some 8x22B after all. Mistral Large is too slow for me, and Mistral Small and Nemo are too dumb. I haven't checked out Sorcerer yet; maybe I'll try it. Miqu 2 when?
>>102535999
>florence is amazing!
>not a single example in the thread
blegh
>>102543037
MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery
https://arxiv.org/abs/2409.05591
https://github.com/qhjqhj00/MemoRAG
>>102543295
it's happening?
>>102543295
Why does the graphic go from bottom to top?
>>102543206
Screenshot of the full log.
>>102542586
I would not have sex with someone I don't already know deeply and love.
>>102541866
How long is the limit?
>>102543472
45 minutes per day
>>102543285
>>102543519
That's not too bad, assuming they don't count silence, so you can actually use it like they advertise, as an always-on tool.
>>102543498
Cool, is there a comparison for Florence2?
>>102543562
https://desuarchive.org/g/thread/101749053/#q101750118
https://desuarchive.org/g/thread/101749053/#q101750162
https://desuarchive.org/g/thread/101749053/#q101750228
The guy who originally posted it said Florence was Florence-2-Large-ft.
>she Xs, her eyes Ying
>Tuesday is over
Bet. A new model will release today that is not from Meta.
>>102543519
No, silence counts too. A redditfag claimed he needed to take a phone call, so he muted himself in the ChatGPT app, but it still counted down the minutes. 45 min per day is still more than I thought; I was expecting something like 15 min per day.
The main problem is people are complaining everywhere about how they get the "my guidelines won't allow me to talk about that" response CONSTANTLY, even for work-related stuff. And apparently its output is being gimped even harder since the initial rollout, so even fewer imitations/effects. The funniest part is normies STILL get their own voice cloned or hear an unrelated 3rd voice. "Scary." lol
I hope somebody comes along who just doesn't give a shit. Just release it and let people figure it out. MS Paint would not have been released in 2024. First reactions: what if somebody drew child genitalia with it?!?! Totally normal behavior from these SF freaks.
>>102543667
catbox the uncensored version
>>102543677
Wow, I thought it'd be funny if I jinxed it, but that's crazy if true. Like, what the fuck, what a scam.
>>102543726
The pixiv link is right there, anonymo- Wait, that's a 4chanx feature, isn't it. Just install 4chanx, bro; you're going to save yourself a lot of trouble in the future.
>>102543230
>Miqu 2
Looking back on it, miqudev was the most based person to ever grace this general. We may never see his like again...
>>102539312
Improvements are constantly being made, though they are primarily refinements. I think the next big leap will be when they solve catastrophic forgetting. Once they do, it will be all about continuous learning, and years of refinement will be done on that. We have no need to rush; AI isn't going anywhere anytime soon.
>>102543726
Don't do what the other anon said; never install 4chanx if you can help it.
This might be a retarded idea, but why can't we add user feedback in ST for discarded gens and preferred gens, and use that as a dataset to train a custom little reward model that would later be used to prefilter the next generations? I think CAI did something like that before.
>>102543726
Install 4chanx but leave it disabled.
>>102544223
t. regularly gets filtered
>>102544261
>>102544249
>why can't we add user feedback in ST for discarded gens and preferred gens, and use that as a dataset to train a custom little reward model that would later be used to prefilter the next generations
You are asking why can't we make local models not local? I dunno, anon... But yeah, it is a great idea you could ask locusts to do. I am sure they can make an extension for that or something.
>>102544476
Are you retarded? Everything I said can be done locally.
>>102544450
updates... doko...
>>102544450
that face...
>>102544249
Why do you need feedback? Just delete any gens you don't like and the jsonl is now your dataset.
>>102544489
>Are you retarded?
Are you? It is incredible how you don't see a problem with this.
>>102543295
>https://github.com/qhjqhj00/MemoRAG
Oh fuck yeah. Thank you, chinks.
>>102544515
Do you even know what a reward model is?
>>102544516
Enlighten me then?
>>102543295
Wait... Is this the holy waifu grail? Are we finally gonna get waifus, with the final problem being not the Alzheimer's but their positivity bias and how they will talk to us about consent? Weird timeline.
>>102544450
Bad gen; her top is like an unfinished suggestion.
>>102544543
I know, but I like the top 80% of the picture a lot and didn't want to crop it.
good night /lmg/
>>102544540
>Weird timeline.
kys
>>102544540
>https://github.com/qhjqhj00/MemoRAG
Kinda want to try out their summarization module, but that might just be limited by the model being used at the end of the day.
>>102544553
Miku, it's 10 AM. Get out of bed.
>>102544526
NTA, but it's inspiring that you talk to retarded people like that; they're probably very lonely and crave the social connection. Anyway, yes, that's an interesting idea, but people like to switch models all the time, and you'd need the hardware capacity and time to do the actual finetuning every time you switched. It could be QoL-maxxed with some effort, though.
>>102544632
>>102544526
samefag
>>102544647
Retard
>>102544647
You got me...
>>102544661
>>102544687
The duality of anon
>>102543295
I just thought about next steps: what is gonna be the new Sally's-brothers thing for testing whether your waifu can remember things? Because you can bet everyone here is gonna be doing all those memory riddles instead of actually enjoying their LLM waifu.
>>102544848
>>102544848
>>102544848
>>102544540
It's just a better lorebook; that won't solve the long-term memory issue in a conversational setting.
Bump
sage sage sage