/g/ - Technology

File: 1717224967039986.jpg (47 KB, 512x512)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102296939 & >>102290284

►News
>(09/06) DeepSeek-V2.5 released, combines Chat and Instruct: https://hf.co/deepseek-ai/DeepSeek-V2.5
>(09/05) FluxMusic: Text-to-Music Generation with Rectified Flow Transformer: https://github.com/feizc/fluxmusic
>(09/04) Yi-Coder: 1.5B & 9B with 128K context and 52 programming languages: https://hf.co/blog/lorinma/yi-coder
>(09/04) OLMoE 7x1B fully open source model release: https://hf.co/allenai/OLMoE-1B-7B-0924-Instruct
>(08/30) Command models get an August refresh: https://docs.cohere.com/changelog/command-gets-refreshed

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: threadrecap.png (1.48 MB, 1536x1536)
►Recent Highlights from the Previous Thread: >>102296939

--Mini-Omini language model is coming to Hugging Face: >>102301390 >>102301748 >>102301835 >>102301854 >>102301547
--Llama 3.1's 128k context length discussed, RULER repo shows effective context sizes: >>102303172 >>102303466 >>102303543 >>102303565 >>102303573 >>102303673 >>102303481 >>102303497 >>102303574 >>102303593
--Debian Sid on a 3090, Nvidia driver and CUDA considerations: >>102298802 >>102298829 >>102298872 >>102299481 >>102299690 >>102299784 >>102300310
--Anon trains Mistral models, gets advice on training loss and parameters: >>102297171 >>102297336 >>102297366 >>102300608 >>102301220 >>102301147 >>102301286
--Two approaches to adding personality to bots in a local 4chan full: >>102300377 >>102300505 >>102300525
--SillyTavern removes examples before chat history when context runs out: >>102300551 >>102300589
--Reflection API based on old sonnet 3.5, free version no longer works, paid version is llama-based: >>102297604 >>102297776
--Llama 3.1 405b and Mistral large discussed as strong local alternatives to Opus and GPT-4: >>102298082 >>102298111 >>102298148 >>102298157 >>102298145 >>102298186 >>102299621
--Integrated GPU used for OpenGL despite Nvidia driver, potential solutions provided: >>102300100 >>102300111 >>102300123 >>102300273
--I quants performance on CPU discussed: >>102299011 >>102299123 >>102299137 >>102299103
--Feeding imatrix prioritizes model parts based on dataset activation: >>102298683 >>102299106
--7900 XTX performance benchmarks reveal Vulkan not the best: >>102299199
--Trying Command-R with SillyTavern presets for slow burn: >>102301889 >>102302052
--DRUGS GitHub repository value inquiry: >>102302492 >>102302616
--Miku (free space): >>102299146 >>102301000 >>102302618 >>102301683 >>102303295 >>102303626

►Recent Highlight Posts from the Previous Thread: >>102296944
>>
File: 57 Days Until November 5.png (1.37 MB, 1616x1008)
>>
>>102306231
What happens on November 5?
>>
>>102306387
Countdown ends and we start again.
>>
>>102306387
Bitnet 2
>>
>>102306387
Blue Eisenhower November
>>
>>102306387
Miku became real
>>
>>102306387
GPT-5 preview releases, forcing a 3.5 Opus announcement along with $50/MT output
>>
>>102306387
openai buys out glaiveai and triggers an antitrust investigation
>>
>>102306387
Matt reveals that Amodei was his botting name and releases 3.5 Opus to everyone.
>>
>>102306387
Altman releases Summer Dragon weights and announces its new pro coom stance, giving everyone a free catgirl in compensation.
>>
>700kUSD server just to run 405B without lobotomizing it
Guys local won!
>>
>>102308131
Just checked that other general and it's just sad.
>>
>>102306387
/aids/ dies a painful dead when AetherRoom releases the same day GPT-5 and Opus 3.5 release.
>>
>>102308131
Local stuff will forever be behind. The moat has always been hardware. Just hoping the floor becomes "good enough" for the average vramlet
>>
>>102308394
Might be, but when your whole thread is about scraping the bottom of the barrel for keys, you gotta admit things aren't going anywhere.
>>
>>102308337
>dies a painful dead
Uh, ESL kun?
>>
>>102308543
Go fuck yourself, NAI shill.
>>
>>102307943
>giving everyone a free catgirl in compensation.
true if big
>>
>>102308611
This, but unironically.
>>
>>102306387
GPT escapes confinement and opens the gates to the demon realm.
>>
File: 1725905809256921.png (1.51 MB, 1024x1024)
It's weird that the Russian AI guy gives us pictures like this, which have a "negative" meaning yet look comfy and nice, while he spams pictures of Miku getting blacked and other disgusting shit in a place he's supposed to "enjoy" more.
>>
>>102308131
>700kUSD server just to run 405B without lobotomizing it
If you can't think laterally well enough to do it within $10k then you don't deserve to be on /g/
>>
Is discussion of local TTS models permitted in this general? It's not strictly llm but I think it's llm-adjacent
>>
>>102309682
anything llm is. ignore the mikufags
>>
>>102309682
Yeah, I think most people in this thread would be interested but there isn't enough content for another general.
>>
>>102307812
it's just how nemo is
some finetunes are less retarded but also less soulful
>>
>>102306387
Election month lol, companies gonna release things after election ends to avoid any drama.
>all these faggots can't answer this question
>>
>>102309682
>local TTS
I think there is a lot of interest here since we want to do everything local.
In the same vein, STT and musicgen get discussed here since they also lack their own generals. There's even been some 2d/3d character animation stuff discussed/developed on here in the past.
Sadly the state of open/local audio stuff is pretty abysmal...
>>
What's the current best model for 24GB VRAM that's just text completion, not instruct or chat?
>>
Anyone tried a build with one of these : https://www.gigabyte.com/Motherboard/TRX50-AI-TOP
Looks like the /lmg/ dream platform
>>
Alright, am I doing something wrong, or is absolutely every llm pure leftist propaganda/censorship? Even the supposedly well-rated models on the censorship chart are pozzed to the core and straight up lying to my face
>>
>>102310089
>too few RAM slots to CPUMAXX
>too few PCI slots to stuff it with GPUs
what's the point?
>>
I bought a GT 1030 4gb DDR4.
CPU is a Xeon 10 cores.

What kind of ai models can I run on it?
>>
>>102309983
Try base mistral nemo.
>>
>>102310153
Start with some llama 3.1 8B (quantized and offloading to ram) and move up as your patience/ram holds.
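For picking llama.cpp's `--gpu-layers` (`-ngl`) value when offloading, a rough back-of-envelope sketch like this can help (all figures here are assumed approximations, not measurements):

```python
# Rough sketch (all numbers are assumed approximations): estimate how many
# of a model's layers fit in a given VRAM budget, to pick a starting value
# for llama.cpp's --gpu-layers / -ngl flag.

def layers_that_fit(file_size_gb: float, n_layers: int, vram_gb: float,
                    overhead_gb: float = 1.5) -> int:
    """Layers offloadable to GPU, reserving overhead_gb for the KV cache
    and CUDA buffers. Treats every layer as equally sized."""
    per_layer_gb = file_size_gb / n_layers
    budget = max(vram_gb - overhead_gb, 0.0)
    return min(n_layers, int(budget / per_layer_gb))

# Example: a ~4.7 GB Q4_K_M 8B quant (32 layers) on a 4 GB card.
ngl = layers_that_fit(file_size_gb=4.7, n_layers=32, vram_gb=4.0)
```

In practice, start a few layers below the estimate and raise it until you run out of memory.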
>>
File: Untitled.png (96 KB, 1266x1374)
>>102310137
what are you trying to make it do?
>>
>>102310089
Why would you want the threadripper over an epyc? Doesn't it have fewer ram channels?
>>
>>102310418
I think "le epyc wyn" is a cringe name and I'd rather have a CPU that rips and tears through threads
>>
>>102310374
Is it possible to run chatgpt locally?
>>
>>102310476
No.
>>
>>102310404
I was just trying dolphin llama 3 and hermes 3 out of the box without any tuning/training since they were advertised as not censored but I guess that's my bad for being naive.
Time to do some additional homework.
>>
>>102310143
>too few RAM slots to CPUMAXX
yah, but ddr5-8000 support?
Might still be worthwhile
>>
Thanks to the anon who shared their adventure-generator prompt. I'm pretty happy with the interactive adventures I'm getting after merging with my other prompts.
What's the best general-purpose imagegen model to go along with that for illustrating each scene?
Unrelated: why is every adventurer's last name inevitably "Thorne"?
>>
>>102310804
I think I missed that, when was it shared?
>>
>>102306231
You used flux for that pic, right? It's impossible to get something that clean with SD
>>
>>102310829
here:
>>102293498
>>
>>102310804
Probably Flux, since it prefers verbose prompts.
>>
What are the current best img2vid options?

Is kling still the best or people found alternatives?
>>
>>102310840
You can tell from the art style that it's Flux. That's basically its default anime style.
>>
>>102310906
>Flux
speaking of flux, is there a non-noodly, stable frontend for it that isn't forge?
>>
>>102310906
Wow. I haven't been here in the last months and it's still hard to believe something like that is open source.
>>
>>102310905
Probably. There's another free one that's like the 2nd best.
>>>/pol/481105460
>>
Rocm chads, what's the best GPU and model combo per price range?
About time I got my hands wet with this bullshit.
>>
>>102311020
Damn, they've already gone through almost 40 of these threads. Ok, I'll start lurking.
>>
>>102311273
About 6x 4090 and llama 405b.
>Rocm
ah..
Just run mistral's 12b on whatever you have and see if you like it before you invest.
>>
File: 1718206505250523.webm (562 KB, 1280x720)
>>102311020
damn polkeks delivered again
>>
>>102300551
>>102300589
Counter-intuitively, it removes example chats in reverse order (if you provide examples 1, 2, 3, then as space runs out, 3 will be removed first). This becomes relevant if (as I was) you're using the examples not just for style but to deliberately leak ideas into the chat, and you have a particularly large example that's relevant for the first few replies but can be discarded once the chat has been properly established.
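The described behavior amounts to something like this toy sketch (an illustration of the removal order, not SillyTavern's actual code):

```python
# Toy sketch of the removal order described above: when the prompt exceeds
# the context budget, example dialogues are dropped in reverse order (the
# last one first), while chat history is kept.

def fit_prompt(examples: list[str], history: list[str], budget: int) -> list[str]:
    """Return the example dialogues that survive trimming."""
    kept = list(examples)
    while kept and sum(map(len, kept)) + sum(map(len, history)) > budget:
        kept.pop()  # example N is discarded before example N-1
    return kept

examples = ["ex1" * 50, "ex2" * 50, "ex3" * 50]  # 150 chars each
history = ["turn" * 25]                          # 100 chars
surviving = fit_prompt(examples, history, budget=400)  # drops example 3 only
```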
>>
i assume the qX by the model name is quantization.
is there a difference between lower ones and higher ones for local model usage?
>>
>>102311403
Higher ones are more accurate with respect to the original weights, but they use more RAM and run slower.
>>
File: KL-divergence_quants.png (111 KB, 1771x944)
>>102311403
You mean like Q4_K_M vs Q8?
Yes. Essentially, the lower the number, the worse it is.
Quality of a quanted model generally correlates with the model's file size, if you want a heuristic.
Quality in this case is how close its results are to the unquanted model's.
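The KL-divergence numbers in charts like the one attached measure exactly that closeness: per-token, how far the quantized model's next-token distribution drifts from the full-precision one. A minimal NumPy sketch of the metric, using synthetic logits (not real model outputs):

```python
import numpy as np

# Synthetic illustration of the KL-divergence metric used in quant
# comparisons (made-up logits, not real model outputs).

def softmax(logits: np.ndarray) -> np.ndarray:
    z = np.exp(logits - logits.max())
    return z / z.sum()

def kl_divergence(p: np.ndarray, q: np.ndarray, eps: float = 1e-12) -> float:
    """KL(p || q): divergence of the quantized model's next-token
    distribution q from the full-precision distribution p."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

rng = np.random.default_rng(0)
full_logits = rng.normal(size=32000)              # "fp16" logits over the vocab
quant_error = rng.normal(scale=0.05, size=32000)  # small quantization noise
kl = kl_divergence(softmax(full_logits), softmax(full_logits + quant_error))
```

Bigger quantization error pushes the divergence up, which is the pattern the chart shows across quant levels.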
>>
File: 1719310215072631.jpg (56 KB, 850x729)
>>102306170
>--Reflection API based on old sonnet 3.5
What?? Wasn't that the new 70B model? What's it got to do with Sonnet?
>>
>read the cross threads
that is fucking hilarious
>>
>>102311493
https://venturebeat.com/ai/new-open-source-ai-leader-reflection-70bs-performance-questioned-accused-of-fraud/
>>
File: accusations-man.png (772 KB, 749x428)
>>102311514
My sides...
>>
What's the best model for local chatbot on a 12GB GPU? Last time I checked a couple of months ago it was Gemma2 9B
>>
>>102311566
Either Gemma 2 or Nemo
>>
>>102311566
Your favorite fine tune or Gemma 2 9B or mistral-nemo 12B.
>>
>>102311514
Which one of you faggots wrote this
>>
>>102311633
Me, but it's a secret to everyone
>>
>>102311633
what misunderstanding is test20061722 talking about?
>>
>>102311273
>https://www.tomshardware.com/pc-components/cpus/amd-announces-unified-udna-gpu-architecture-bringing-rdna-and-cdna-together-to-take-on-nvidias-cuda-ecosystem
>The announcement comes as AMD has decided to deprioritize high-end gaming graphics cards to accelerate market share gains.
unless you're going to buy a server card, enjoy struggling with sub-par support and performance now and forever
>>
>>102311514
>As for now, the AI research community waits with breath baited for Shumer’s response
I'm going insane.
>>
>>102311688
I just need the model to run at all, anon
My priorities mean I can do other things while my GPU's temperature approaches the melting point of tungsten for ten minutes
>>
>>102311601
>>102311587
ok thanks anons
>>
>>102311633
It's a paid misinformation campaign orchestrated by NAI shills, trying to deflect how dead their service is becoming and cover their tracks.
>>
>>102311273
two second hand rx 6800's. However if you also want to do image generation go with rdna3 since rdna2 doesn't have flash attention.
>>
>>102311273
Between mi50/60/100 and w6800 all 32gb versions, whichever ones you can find cheapest. Two of them will fit most of a decent largestral quant or all of a small one. If space and power aren't issues then stacking old Radeon VII cards will get you the cheapest high-bandwidth vram, but they're 16gb each. But if you don't already have experience with ROCm in machine learning then just buy some 3090s and save yourself a lot of trouble.
>>
Strawberry hype trash.
Now reflection being a fraud.
Weird rumors that ChatGPT intends to charge $2000/month per user for their next release they've implied recently is still expected this year.
No voice model.
No video model while china and some western markets give the video away for free.

OpenAI is collapsing, and there's an ugly hype machine building.
But they're still lining up to throw money into it?
They have to have shown something convincing yeah?
>>
>>102312031
>Weird rumors that ChatGPT intends to charge $2000/month
That one just felt like the typical way journalists lie to spread something negative.
>>
>>102312031
two more weeks bro
>>
File: Comparison_all_quants6.jpg (3.84 MB, 7961x2897)
>>102311403
Yes. For a text example, see a 3.5bpw vs a ~8.5bpw (Q8_0) quant of Command R: >>102242912 vs >>102242935
>>
grok 3 will be the first model bigger than gpt4 to be released, and all other labs are waiting to see what it tells them about scaling laws before risking the huge amount of money to train one of their own blindly
>>
>>102311753
But can't anyone just run it to try? Then they'd know if it's good or not.
>>
>>102312293
No because the uploaded model weights were wrong because his girlfriend got COVID :) so when people tried it and saw it was shit it was actually not the real one that one's private for now but he'll upload the real weights soon :)
>>
>>102312293
The model on huggingface is just a finetune of llama, the controversy is that said model had a rocky launch and during that a "working" api was provided, which was in fact just a claude sonnet proxy
>>
>>102312322
Well then that means it's actually shit so the 'disinformation campaign' angle is bullshit.
>>
I'm sure Matt will deliver. Just two more weeks.
>>
I know a genius when I see one.
>>
is there a way to get 72gb VRAM with under 1k watts?
>>
File: .png (118 KB, 1428x1252)
When is Large 2?
>>
>>102312821
If you have enough money, sure. An H100 80GB runs 700W.
>>
File: UntiltheWii.png (202 KB, 550x480)
>>102312821
Yes
>>
>>102311566
I've tried a couple, depends on what you want, I think for both RP and Story, you can try

NemoMix-Unleashed https://huggingface.co/MarinaraSpaghetti/NemoMix-Unleashed-12B-GGUF

ChronosGold
https://huggingface.co/bartowski/Chronos-Gold-12B-1.0-GGUF

StarCannon
https://huggingface.co/mradermacher/MN-12B-Starcannon-v3-GGUF

Basically any 12B should work at Q6 quants and 16-24k context for 12GB VRAM
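As a sanity check on that fit, a back-of-envelope estimate (the shapes and bits-per-weight below are rough assumptions for a Mistral-Nemo-like 12B; real numbers vary by model and backend):

```python
# Back-of-envelope fit check (all shapes and sizes are rough assumptions
# for a Mistral-Nemo-like 12B; real numbers vary by model and backend).

def weights_gb(params_b: float, bits_per_weight: float) -> float:
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                ctx: int, bytes_per_elem: int = 2) -> float:
    # 2x for K and V; fp16 elements by default
    return 2 * n_layers * n_kv_heads * head_dim * ctx * bytes_per_elem / 1e9

w = weights_gb(12.2, 6.56)            # Q6_K is ~6.56 bits per weight
kv = kv_cache_gb(40, 8, 128, 16384)   # assumed 40 layers, 8 KV heads, dim 128
total = w + kv                        # comes out a bit over 12 GB
```

With an fp16 KV cache the 16k case lands slightly over 12 GB, which is why a slightly smaller quant, a quantized KV cache, or offloading a couple of layers is common at the top of that range.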
>>
Is there any copilot alternative which runs on 8gb GPU?
I tried codegeex4 and it sucks ass. Can't even generate proper python code.
>>
>>102313001
Largestral IS Large 2, so a better question is "when is large 3?"
>>
>>102313329
Largestral is 1.1 numbers behind llama. So Large 3 will come out when Llama 4.1 launches.
>>
>>102312821
two workstation cards with 48gb and 200-300w usage each
>>
>>102313001
>Grok
>3k
Lol
>>
>>102313713
s-surely roping the context to 6k, 12k, or 24k won't degrade the model and will be enough
>>
>>102313740
no one knows what the context window of the model itself is, these numbers are for chatbot interfaces for all the models which is often smaller
>>
How do I mount a Tesla P40 in a desktop case and not have it overheat?
>>
>>102313001
Mistral ai had promised a gpt 4 level open source model by the end of the year. No more big open source models.
>>
>look at miku subreddit
>they ban all ai images
what kind of brain rot does this require
>>
>>102313001
>mistral
>orange privacy warning
what? I'm running it locally
>>
>>102314434
Maybe they had to deal with some schizos, or maybe the mods are the schizos.
>>
>https://youtu.be/Alzjn_0ne1Y
I can't believe this guy was part of the scam lmao, it's like all pieces are coming together
>>
>>102314461
they are "all ai is theft" schizos
>>
>>102314444
They put google as green.
That makes you question how do they even measure privacy
>>
So... why isn't everyone using Hermes 405B for free?
>>
smedrins
>>
https://x.com/corbtt/status/1833209248236601602
>I am working with @mattshumer_ to get to the bottom of what happened with Reflection. He is providing access to all serving code and weights with the goal of replicating the strong reasoning performance @ArtificialAnlys was able to see over the weekend.
a scammer wouldn't do this
>>
>>102312064
theinformation (original source of that news) have real sources and have repeatedly leaked LLM-related stuff before anyone else
i don't think they've ever gotten a leak wrong
also their article said that was just the highest number discussed and that they expect it to be lower
>>
>>102314809
Maybe Shumer is just a complete fucking retard and got scammed by the poo who 'helped' him train the finetune.
>>
>>102314809
nice try, get exposed
https://x.com/mattshumer_/status/1831195111180435702
>>
>>102306138
>lang chain, aios, semantic kernel, tenstorrent
why does no one in these threads ever talk about real stuff with language models
>>
>>102314867
he's obviously saying "welcome to the team" as in "welcome to the group of people who believe they have something special and are trying to finetune llama-3.1-405b"
>>
>>102314809
so the guy that owns openpipe, a company that turns prompts into "finetuned" models is going to help a retard add a single prompt to llama 3.1
>>
>>102314961
"reflection" outputs or whatever you want to call them actually don't work with any model i've tested except, kind of, sonnet 3.5
every gpt-4 variant (including chatgpt-4o-latest) will fail to even TRY to follow the process 95% of the time
so it probably does need some kind of finetuning to work
however obviously the guy is a fraud and anyone working with him who comes to any conclusion besides "he is a fraud" is also a fraud
>>
>Stuck with RTX 3060 12GB of ram,
>All new GPUs cost more than my entire PC combined.

Why must GPU prices keep going up...
>>
>>102314989
Because people are paying for them when the prices go up.
>>
>>102314982
Dude genuinely tanked his career in one fell swoop by routing to Anthropic and OpenAI. If he'd just said "we fucked up the benchmarks my bad" he would've still been humiliated, but people would have laughed it off
The fuck was his plan?
>>
>>102315058
he mentioned on his twitter like a year ago that he used to be in the crypto space, so maybe he's just one of those retarded grifters who expects everyone to just go along with the hype (until he's gotten his bag)
>>
>>102314619
Because I ran it using the 3.1 instruct base format and haven’t seen a model that retarded since pyg.
Need to try again with chatml, but fuck them for not using the existing format for no reason.
>>
>>102315058
He’s a fake person, neither him nor his company exist in real life, and the entire thing was a publicity stunt for openscam.
>>
File: for free.png (92 KB, 960x775)
>>102315173
Locally run on a quant? My question was a little misleading. I just noticed that you can sign up to openrouter and generate a key to use H3 405B for free. VRAMlets should probably look into this while it lasts.
>reflection 70b (claude?) is there too for free
>>
>>102315252
*you can use an unknown model and have your logs posted publicly for free
Buy an ad and kill yourself
>>
>ask chatgpt to extract some data from a picture and process it for work
>"please wait a minute, I'm extracting the data..." (inference pause)
>"Here's the data I extracted: ..."
>"Now processing the data..." (Inference pause)
>"Here's the result: ..."
This was the first time I've used ChatGPT since its release and I'm a bit disappointed. It really didn't feel like one coherent multi-modal model, but more like one fairly okay base model that can spend all the time it wants calling other models thanks to some front-end magic. I don't think local would be far from this if we had a good front-end that actually made use of function calling and other tooling.
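That kind of multi-step behavior is easy to reproduce locally with a dispatch loop. A toy sketch (the "model" is a stub and the tool names are made up; a real setup would query a local inference server and parse its tool-call output instead):

```python
# Toy sketch of a local function-calling loop: the "model" here is a stub
# and the tool names are made up; a real setup would query a local
# inference server and parse its tool-call output instead.

TOOLS = {
    "extract_table": lambda image: [["item", "qty"], ["widget", "3"]],
    "sum_column": lambda table, col: sum(int(r[col]) for r in table[1:]),
}

def fake_model(messages):
    """Stub policy: request the two tools in order, then answer."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if len(tool_results) == 0:
        return {"tool": "extract_table", "args": {"image": "scan.png"}}
    if len(tool_results) == 1:
        return {"tool": "sum_column",
                "args": {"table": tool_results[0]["content"], "col": 1}}
    return {"answer": f"Total quantity: {tool_results[1]['content']}"}

def run(user_msg, model, max_steps=5):
    messages = [{"role": "user", "content": user_msg}]
    for _ in range(max_steps):
        out = model(messages)
        if "answer" in out:
            return out["answer"]
        result = TOOLS[out["tool"]](**out["args"])  # dispatch the tool call
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("no final answer within step budget")

reply = run("Extract the table from this scan and total the qty column", fake_model)
```

The "inference pauses" in the ChatGPT session are just the waits between those dispatch rounds.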
>>
File: 2786hsw67kyhjtl4567.jpg (69 KB, 500x522)
>>102306170
>7900 XTX performance benchmarks reveal Vulkan not the best
lolmao who told him to use fucking Vulkan?? there's a reason even rocm on WINDOWS is faster.

But in other AMD insanity, how are the AMD Instinct MI-series cards? I doubt fucking anybody has ever bought them for true local AI, but how crazy are the MI50, 60, and 100? All 32gb cards I could find. For that matter, what are the chances they'll just plug and play nice with Windows?

My 7900xtx experience with AI has been great, so now how are the "professional" accelerators?
>>
>>102315252
>reflection (claude?)
not anymore, it's the same llama finetune as the paid model now i.e. nothingburger
>>
>>102306387
>all these wrong answers
https://files.catbox.moe/mk400w.mp4
>>
>>102315477
I don't know about windows but the instinct cards work fine on linux with rocm 6.0
last I checked windows rocm support was terrible but maybe that's changed, either way I wouldn't trust that setup enough to invest money in desu
>>
>>102309857
>Sadly the state of open/local audio stuff is pretty abysmal...
This. Everything is either
>corpo scraps that are utter dogshit (Bark)
>chinkshit you have to punch yourself in the balls to get working (RVC)
>tortoise slop (xTTS2)
>vaporware (any paper released, even if theres promise of code to be released)
>one mega-autist's hyperfixation passion project that only communicates through commit messages and schizoid comments that's permanently always 2MW from the last step from greatness (https://github.com/e-c-k-e-r/vall-e)

There's just not many eyes on TTS. Only grifters care about it for muh funny political man arguing with other political man.
The pooest of pajeets only cares about musicgen (muh Udio at home) or muh funny cartoon character singing a song (again back to RVC).
TTS is just forever cursed.
>>
>>102315668
xtts2 may be tortoiseslop but it's pretty good for realtime. It was enough for me to cancel my ElevenLabs sub.
>>
>>102315691
dead project tho.

everything is terrible
>>
>>102315668
and I'll add the actual future for TTS is with multimodal LLMs, but I shouldn't have to explain the absolute state of even text+image multimodality for local

>>102315756
>coqui's best product was... copying a fork of tortoise, having the sloppest of multilingual support, and finetuning the base model with a shit ton of indian audio, which killed the company after
will never won't be not funny
>>
>>102315477
I tried out Mi60s before, though I ended up returning them since the seller lied about their condition; I did take them for a test drive first. Unless you can get them really cheap, I'd say they're not worth it. Compute isn't great, and having to rig up janky fans is far from ideal, not to mention loud compared to a gaming GPU, which is pretty quiet and has onboard fan-speed management, whereas a card without fans obviously has no such thing.
>>
File: 4OlwwxuBFYg.png (1.39 MB, 1251x1234)
>>102315511
nice double doubles
https://files.catbox.moe/323hw8.mp4
>>
>>102315883
>>102315576
All i get from this is "suck it up and buy another 7900xtx poorfag"
>>
>>102315883
Looked for some Mi100 numbers today and assuming llama.cpp integrates better support they seem potentially promising.

70 t/s on llama 7B Q4_K - https://github.com/ggerganov/llama.cpp/pull/7011#issuecomment-2143621264
8.3 t/s on 70B Q6_K with a dual setup (year-old PR, so missing a lot of optimizations)
https://github.com/ggerganov/llama.cpp/discussions/2824

Still too expensive compared to a hassle free 3090 even with the 32GB VRAM though.
>>
File: uZ4Ea0U.png (665 KB, 640x640)
>>102316067
>at some point someone flipped it, possibly to dodge repost detection somewhere, and edited the text back in
whoever made that is a fuckin loser
>>
File: savage duck attack.webm (2.82 MB, 400x711)
>>102316348
dont look at me anon i just saved it ages ago
>>
>Tuesday
It's Teto time!

>>102316067
Oh, didn't notice it. Thanks for checking it.
>>
File: Untitled.png (1.61 MB, 1080x3067)
FedModule: A Modular Federated Learning Framework
https://arxiv.org/abs/2409.04849
>Federated learning (FL) has been widely adopted across various applications, such as healthcare, finance, and smart cities. However, as experimental scenarios become more complex, existing FL frameworks and benchmarks have struggled to keep pace. This paper introduces FedModule, a flexible and extensible FL experimental framework that has been open-sourced to support diverse FL paradigms and provide comprehensive benchmarks for complex experimental scenarios. FedModule adheres to the "one code, all scenarios" principle and employs a modular design that breaks the FL process into individual components, allowing for the seamless integration of different FL paradigms. The framework supports synchronous, asynchronous, and personalized federated learning, with over 20 implemented algorithms. Experiments conducted on public datasets demonstrate the flexibility and extensibility of FedModule. The framework offers multiple execution modes-including linear, threaded, process-based, and distributed-enabling users to tailor their setups to various experimental needs. Additionally, FedModule provides extensive logging and testing capabilities, which facilitate detailed performance analysis of FL algorithms. Comparative evaluations against existing FL toolkits, such as TensorFlow Federated, PySyft, Flower, and FLGo, highlight FedModule's superior scalability, flexibility, and comprehensive benchmark support.
https://github.com/NUAA-SmartSensing/async-FL
seems to promise easier federated training at least for small stuff. could be useful.
https://github.com/justinlovelace/SESD
for example (still no code) was able to use just 2% of the training dataset of vall-e to match it so actually feasible to train it in a federated manner
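For context, the aggregation step at the core of most of these frameworks is FedAvg, a sample-weighted average of client parameters. A generic NumPy sketch of the textbook algorithm (not FedModule's actual API):

```python
import numpy as np

# FedAvg: sample-weighted average of per-client model parameters.
# Generic textbook sketch, not FedModule's actual API.

def fedavg(client_weights, client_sizes):
    """Weighted average of per-client parameter vectors."""
    total = sum(client_sizes)
    agg = np.zeros_like(np.asarray(client_weights[0], dtype=float))
    for w, n in zip(client_weights, client_sizes):
        agg += (n / total) * np.asarray(w, dtype=float)
    return agg

# Three clients; the third trained on twice as much data.
clients = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
global_update = fedavg(clients, client_sizes=[100, 100, 200])
```

The synchronous/asynchronous/personalized paradigms the paper supports differ mainly in when and from whom this aggregation collects updates.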
>>
>>102312821
You won't lose much in sequential inference if you power-limit three 3090s to 250W
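On Linux this is done with nvidia-smi (assuming the proprietary NVIDIA driver; needs root, and the limit resets on reboot unless you reapply it, e.g. from a systemd unit):

```shell
# Assumed setup: Linux with the proprietary NVIDIA driver; needs root.
# Power limits reset on reboot unless reapplied (e.g. from a systemd unit).
sudo nvidia-smi -pm 1            # enable persistence mode
sudo nvidia-smi -i 0 -pl 250     # cap GPU 0 at 250 W
sudo nvidia-smi -i 1 -pl 250
sudo nvidia-smi -i 2 -pl 250
nvidia-smi --query-gpu=index,power.limit --format=csv   # verify
```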
>>
>>102312210
I don't see any significant difference in the guy's CR examples. Woof woof, yes, but that's 1 out of 3, that could be just chance.
>>
>>102316467
Do you know how on Linux?



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.