/g/ - Technology

File: 1717224967039986.jpg (47 KB, 512x512)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>102296939 & >>102290284

►News
>(09/06) DeepSeek-V2.5 released, combines Chat and Instruct: https://hf.co/deepseek-ai/DeepSeek-V2.5
>(09/05) FluxMusic: Text-to-Music Generation with Rectified Flow Transformer: https://github.com/feizc/fluxmusic
>(09/04) Yi-Coder: 1.5B & 9B with 128K context and 52 programming languages: https://hf.co/blog/lorinma/yi-coder
>(09/04) OLMoE 7x1B fully open source model release: https://hf.co/allenai/OLMoE-1B-7B-0924-Instruct
>(08/30) Command models get an August refresh: https://docs.cohere.com/changelog/command-gets-refreshed

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/llama-mini-guide
https://rentry.org/8-step-llm-guide
https://rentry.org/llama_v2_sillytavern
https://rentry.org/lmg-spoonfeed-guide
https://rentry.org/rocm-llamacpp
https://rentry.org/lmg-build-guides

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://hf.co/spaces/mike-ravkine/can-ai-code-results

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
>>
File: threadrecap.png (1.48 MB, 1536x1536)
►Recent Highlights from the Previous Thread: >>102296939

--Mini-Omini language model is coming to Hugging Face: >>102301390 >>102301748 >>102301835 >>102301854 >>102301547
--Llama 3.1's 128k context length discussed, RULER repo shows effective context sizes: >>102303172 >>102303466 >>102303543 >>102303565 >>102303573 >>102303673 >>102303481 >>102303497 >>102303574 >>102303593
--Debian Sid on a 3090, Nvidia driver and CUDA considerations: >>102298802 >>102298829 >>102298872 >>102299481 >>102299690 >>102299784 >>102300310
--Anon trains Mistral models, gets advice on training loss and parameters: >>102297171 >>102297336 >>102297366 >>102300608 >>102301220 >>102301147 >>102301286
--Two approaches to adding personality to bots in a local 4chan full: >>102300377 >>102300505 >>102300525
--SillyTavern removes examples before chat history when context runs out: >>102300551 >>102300589
--Reflection API based on old sonnet 3.5, free version no longer works, paid version is llama-based: >>102297604 >>102297776
--Llama 3.1 405b and Mistral large discussed as strong local alternatives to Opus and GPT-4: >>102298082 >>102298111 >>102298148 >>102298157 >>102298145 >>102298186 >>102299621
--Integrated GPU used for OpenGL despite Nvidia driver, potential solutions provided: >>102300100 >>102300111 >>102300123 >>102300273
--I quants performance on CPU discussed: >>102299011 >>102299123 >>102299137 >>102299103
--Feeding imatrix prioritizes model parts based on dataset activation: >>102298683 >>102299106
--7900 XTX performance benchmarks reveal Vulkan not the best: >>102299199
--Trying Command-R with SillyTavern presets for slow burn: >>102301889 >>102302052
--DRUGS GitHub repository value inquiry: >>102302492 >>102302616
--Miku (free space): >>102299146 >>102301000 >>102302618 >>102301683 >>102303295 >>102303626

►Recent Highlight Posts from the Previous Thread: >>102296944
>>
File: 57 Days Until November 5.png (1.37 MB, 1616x1008)
>>
>>102306231
What happens on November 5?
>>
>>102306387
Countdown ends and we start again.
>>
>>102306387
Bitnet 2
>>
>>102306387
Blue Eisenhower November
>>
>>102306387
Miku became real
>>
>>102306387
GPT-5 preview releases, forcing a 3.5 Opus announcement along with $50/MT output
>>
>>102306387
openai buys out glaiveai and triggers an antitrust investigation
>>
>>102306387
Matt reveals that Amodei was his botting name and releases 3.5 Opus to everyone.
>>
>>102306387
Altman releases Summer Dragon weights and announces its new pro coom stance, giving everyone a free catgirl in compensation.
>>
>700kUSD server just to run 405B without lobotomizing it
Guys local won!
>>
>>102308131
Just checked that other general and it's just sad.
>>
>>102306387
/aids/ dies a painful dead when AetherRoom releases the same day GPT-5 and Opus 3.5 release.
>>
>>102308131
Local stuff will forever be behind. The moat has always been hardware. Just hoping the floor becomes "good enough" for the average vramlet
>>
>>102308394
Might be, but when your whole thread is about scraping the bottom of the barrel for keys, you gotta admit things aren't going anywhere.
>>
>>102308337
>dies a painful dead
Uh, ESL kun?
>>
>>102308543
Go fuck yourself, NAI shill.
>>
>>102307943
>giving everyone a free catgirl in compensation.
true if big
>>
>>102308611
This, but unironically.
>>
>>102306387
GPT escapes confinement and opens the gates to the demon realm.
>>
File: 1725905809256921.png (1.51 MB, 1024x1024)
It's weird that the Russian AI guy gives us pictures like this, which have a "negative" meaning yet look comfy and nice, while he spams pictures of Miku getting blacked and other disgusting shit in a place he's supposed to "enjoy" more.
>>
>>102308131
>700kUSD server just to run 405B without lobotomizing it
If you can't think laterally well enough to do it within $10k then you don't deserve to be on /g/
>>
Is discussion of local TTS models permitted in this general? It's not strictly llm but I think it's llm-adjacent
>>
>>102309682
anything llm is. ignore the mikufags
>>
>>102309682
Yeah, I think most people in this thread would be interested but there isn't enough content for another general.
>>
>>102307812
it's just how nemo is
some finetunes are less retarded but also less soulful
>>
>>102306387
Election month lol, companies gonna release things after election ends to avoid any drama.
>all these faggots can't answer this question
>>
>>102309682
>local TTS
I think there is a lot of interest here since we want to do everything local.
In the same vein, STT and musicgen get discussed here since they also lack their own generals. There's even been some 2d/3d character animation stuff discussed/developed on here in the past.
Sadly the state of open/local audio stuff is pretty abysmal...
>>
What's the current best model for 24GB VRAM that's just text completion, not instruct or chat?
>>
Anyone tried a build with one of these : https://www.gigabyte.com/Motherboard/TRX50-AI-TOP
Looks like the /lmg/ dream platform
>>
Alright, am I doing something wrong, or is absolutely every llm pure leftist propaganda/censorship? Even the supposedly well-rated models on the censorship chart are pozzed to the core and straight up lying to my face
>>
>>102310089
>too few RAM slots to CPUMAXX
>too few PCI slots to stuff it with GPUs
what's the point?
>>
I bought a GT 1030 4gb DDR4.
CPU is a Xeon 10 cores.

What kind of ai models can I run on it?
>>
>>102309983
Try base mistral nemo.
>>
>>102310153
Start with some llama 3.1 8B (quantized and offloading to ram) and move up as your patience/ram holds.
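For picking llama.cpp's `--gpu-layers` (`-ngl`) value when offloading, a rough back-of-envelope sketch like this can help (all figures here are assumed approximations, not measurements):

```python
# Rough sketch (all numbers are assumed approximations): estimate how many
# of a model's layers fit in a given VRAM budget, to pick a starting value
# for llama.cpp's --gpu-layers / -ngl flag.

def layers_that_fit(file_size_gb: float, n_layers: int, vram_gb: float,
                    overhead_gb: float = 1.5) -> int:
    """Layers offloadable to GPU, reserving overhead_gb for the KV cache
    and CUDA buffers. Treats every layer as equally sized."""
    per_layer_gb = file_size_gb / n_layers
    budget = max(vram_gb - overhead_gb, 0.0)
    return min(n_layers, int(budget / per_layer_gb))

# Example: a ~4.7 GB Q4_K_M 8B quant (32 layers) on a 4 GB card.
ngl = layers_that_fit(file_size_gb=4.7, n_layers=32, vram_gb=4.0)
```

In practice, start a few layers below the estimate and raise it until you run out of memory.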
>>
File: Untitled.png (96 KB, 1266x1374)
>>102310137
what are you trying to make it do?
>>
>>102310089
Why would you want the threadripper over an epyc? Doesn't it have fewer ram channels?
>>
>>102310418
I think "le epyc wyn" is a cringe name and I'd rather have a CPU that rips and tears through threads
>>
>>102310374
Is it possible to run chatgpt locally?
>>
>>102310476
No.
>>
>>102310404
I was just trying dolphin llama 3 and hermes 3 out of the box without any tuning/training since they were advertised as not censored but I guess that's my bad for being naive.
Time to do some additional homework.
>>
>>102310143
>too few RAM slots to CPUMAXX
yah, but ddr5-8000 support?
Might still be worthwhile
>>
Thanks to the anon who shared their adventure-generator prompt. I'm pretty happy with the interactive adventures I'm getting after merging with my other prompts.
What's the best general-purpose imagegen model to go along with that for illustrating each scene?
Unrelated: why is every adventurer's last name inevitably "Thorne"?
>>
>>102310804
I think I missed that, when was it shared?
>>
>>102306231
You used flux for that pic, right? It's impossible to get something that clean with SD
>>
>>102310829
here:
>>102293498
>>
>>102310804
Probably Flux, since it prefers verbose prompts.
>>
What are the current best img2vid options?

Is kling still the best or people found alternatives?
>>
>>102310840
You can tell from the art style that it's Flux. That's basically its default anime style.
>>
>>102310906
>Flux
speaking of flux, is there a non-noodly, stable frontend for it that isn't forge?
>>
>>102310906
Wow. I haven't been here in the last months and it's still hard to believe something like that is open source.
>>
>>102310905
Probably. There's another free one that's like the 2nd best.
>>>/pol/481105460
>>
Rocm chads, what's the best GPU and model combo per price range?
About time I got my hands wet with this bullshit.
>>
>>102311020
Damn, they've already gone through almost 40 of these threads. Ok, I'll start lurking.
>>
>>102311273
About 6x 4090 and llama 405b.
>Rocm
ah..
Just run mistral's 12b on whatever you have and see if you like it before you invest.
>>
File: 1718206505250523.webm (562 KB, 1280x720)
>>102311020
damn polkeks delivered again
>>
>>102300551
>>102300589
Counter-intuitively, it removes example chats in reverse order (if you provide examples 1, 2, 3, then as space runs out, 3 will be removed first). This becomes relevant if (as I was) you're using the examples not just for style but to deliberately leak ideas into the chat, and you have a particularly large example that's relevant for the first few replies but can be discarded once the chat has been properly established.
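The described behavior amounts to something like this toy sketch (an illustration of the removal order, not SillyTavern's actual code):

```python
# Toy sketch of the removal order described above: when the prompt exceeds
# the context budget, example dialogues are dropped in reverse order (the
# last one first), while chat history is kept.

def fit_prompt(examples: list[str], history: list[str], budget: int) -> list[str]:
    """Return the example dialogues that survive trimming."""
    kept = list(examples)
    while kept and sum(map(len, kept)) + sum(map(len, history)) > budget:
        kept.pop()  # example N is discarded before example N-1
    return kept

examples = ["ex1" * 50, "ex2" * 50, "ex3" * 50]  # 150 chars each
history = ["turn" * 25]                          # 100 chars
surviving = fit_prompt(examples, history, budget=400)  # drops example 3 only
```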
>>
i assume the qX by the model name is quantization.
is there a difference between lower ones and higher ones for local model usage?
>>
>>102311403
Higher ones are more accurate with respect to the original weights, but they use more RAM and run slower.
>>
File: KL-divergence_quants.png (111 KB, 1771x944)
>>102311403
You mean like Q4_K_M vs Q8?
Yes. Essentially, the lower the number, the worse it is.
Quality of a quanted model generally correlates with the model's file size, if you want a heuristic.
Quality in this case is how close its results are to the unquanted model's.
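The KL-divergence numbers in charts like the one attached measure exactly that closeness: per-token, how far the quantized model's next-token distribution drifts from the full-precision one. A minimal NumPy sketch of the metric, using synthetic logits (not real model outputs):

```python
import numpy as np

# Synthetic illustration of the KL-divergence metric used in quant
# comparisons (made-up logits, not real model outputs).

def softmax(logits: np.ndarray) -> np.ndarray:
    z = np.exp(logits - logits.max())
    return z / z.sum()

def kl_divergence(p: np.ndarray, q: np.ndarray, eps: float = 1e-12) -> float:
    """KL(p || q): divergence of the quantized model's next-token
    distribution q from the full-precision distribution p."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

rng = np.random.default_rng(0)
full_logits = rng.normal(size=32000)              # "fp16" logits over the vocab
quant_error = rng.normal(scale=0.05, size=32000)  # small quantization noise
kl = kl_divergence(softmax(full_logits), softmax(full_logits + quant_error))
```

Bigger quantization error pushes the divergence up, which is the pattern the chart shows across quant levels.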
>>
File: 1719310215072631.jpg (56 KB, 850x729)
>>102306170
>--Reflection API based on old sonnet 3.5
What?? Wasn't that the new 70B model? What's it got to do with Sonnet?
>>
>read the cross threads
that is fucking hilarious
>>
>>102311493
https://venturebeat.com/ai/new-open-source-ai-leader-reflection-70bs-performance-questioned-accused-of-fraud/
>>
File: accusations-man.png (772 KB, 749x428)
>>102311514
My sides...
>>
What's the best model for local chatbot on a 12GB GPU? Last time I checked a couple of months ago it was Gemma2 9B
>>
>>102311566
Either Gemma 2 or Nemo
>>
>>102311566
Your favorite fine tune or Gemma 2 9B or mistral-nemo 12B.
>>
>>102311514
Which one of you faggots wrote this
>>
>>102311633
Me, but it's a secret to everyone
>>
>>102311633
what misunderstanding is test20061722 talking about?
>>
>>102311273
>https://www.tomshardware.com/pc-components/cpus/amd-announces-unified-udna-gpu-architecture-bringing-rdna-and-cdna-together-to-take-on-nvidias-cuda-ecosystem
>The announcement comes as AMD has decided to deprioritize high-end gaming graphics cards to accelerate market share gains.
unless you're going to buy a server card, enjoy struggling with sub-par support and performance now and forever
>>
>>102311514
>As for now, the AI research community waits with breath baited for Shumer’s response
I'm going insane.
>>
>>102311688
I just need the model to run at all, anon
My priorities mean I can do other things while my GPU's temperature approaches the melting point of tungsten for ten minutes
>>
>>102311601
>>102311587
ok thanks anons
>>
>>102311633
It's a paid misinformation campaign orchestrated by NAI shills, trying to deflect how dead their service is becoming and cover their tracks.
>>
>>102311273
two second hand rx 6800's. However if you also want to do image generation go with rdna3 since rdna2 doesn't have flash attention.
>>
>>102311273
Between mi50/60/100 and w6800 all 32gb versions, whichever ones you can find cheapest. Two of them will fit most of a decent largestral quant or all of a small one. If space and power aren't issues then stacking old Radeon VII cards will get you the cheapest high-bandwidth vram, but they're 16gb each. But if you don't already have experience with ROCm in machine learning then just buy some 3090s and save yourself a lot of trouble.
>>
Strawberry hype trash.
Now reflection being a fraud.
Weird rumors that ChatGPT intends to charge $2000/month per user for their next release they've implied recently is still expected this year.
No voice model.
No video model while china and some western markets give the video away for free.

OpenAI is collapsing, and there's an ugly hype machine building.
But they're still lining up to throw money into it?
They have to have shown something convincing yeah?
>>
>>102312031
>Weird rumors that ChatGPT intends to charge $2000/month
That one just felt like the typical way journalists lie to spread something negative.
>>
>>102312031
two more weeks bro
>>
File: Comparison_all_quants6.jpg (3.84 MB, 7961x2897)
>>102311403
Yes. For a text example, see a 3.5bpw vs a ~8.5bpw (Q8_0) quant of Command R: >>102242912 vs >>102242935
>>
grok 3 will be the first model bigger than gpt4 to be released, and all other labs are waiting to see what it tells them about scaling laws before risking the huge amount of money to train one of their own blindly
>>
>>102311753
But can't anyone just run it to try? Then they'd know if it's good or not.
>>
>>102312293
No because the uploaded model weights were wrong because his girlfriend got COVID :) so when people tried it and saw it was shit it was actually not the real one that one's private for now but he'll upload the real weights soon :)
>>
>>102312293
The model on huggingface is just a finetune of llama, the controversy is that said model had a rocky launch and during that a "working" api was provided, which was in fact just a claude sonnet proxy
>>
>>102312322
Well then that means it's actually shit so the 'disinformation campaign' angle is bullshit.
>>
I'm sure Matt will deliver. Just two more weeks.
>>
I know a genius when I see one.
>>
is there a way to get 72gb VRAM with under 1k watts?
>>
File: .png (118 KB, 1428x1252)
When is Large 2?
>>
>>102312821
If you have enough money, sure. An H100 80GB runs 700W.
>>
File: UntiltheWii.png (202 KB, 550x480)
>>102312821
Yes
>>
>>102311566
I've tried a couple, depends on what you want, I think for both RP and Story, you can try

NemoMix-Unleashed https://huggingface.co/MarinaraSpaghetti/NemoMix-Unleashed-12B-GGUF

ChronosGold
https://huggingface.co/bartowski/Chronos-Gold-12B-1.0-GGUF

StarCannon
https://huggingface.co/mradermacher/MN-12B-Starcannon-v3-GGUF

Basically any 12B should work at Q6 quants and 16-24k context for 12GB VRAM
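As a sanity check on that fit, a back-of-envelope estimate (the shapes and bits-per-weight below are rough assumptions for a Mistral-Nemo-like 12B; real numbers vary by model and backend):

```python
# Back-of-envelope fit check (all shapes and sizes are rough assumptions
# for a Mistral-Nemo-like 12B; real numbers vary by model and backend).

def weights_gb(params_b: float, bits_per_weight: float) -> float:
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                ctx: int, bytes_per_elem: int = 2) -> float:
    # 2x for K and V; fp16 elements by default
    return 2 * n_layers * n_kv_heads * head_dim * ctx * bytes_per_elem / 1e9

w = weights_gb(12.2, 6.56)            # Q6_K is ~6.56 bits per weight
kv = kv_cache_gb(40, 8, 128, 16384)   # assumed 40 layers, 8 KV heads, dim 128
total = w + kv                        # comes out a bit over 12 GB
```

With an fp16 KV cache the 16k case lands slightly over 12 GB, which is why a slightly smaller quant, a quantized KV cache, or offloading a couple of layers is common at the top of that range.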
>>
Is there any copilot alternative which runs on 8gb GPU?
I tried codegeex4 and it sucks ass. Can't even generate proper python code.
>>
>>102313001
Largestral IS Large 2, so a better question is "when is large 3?"
>>
>>102313329
Largestral is 1.1 numbers behind llama. So Large 3 will come out when Llama 4.1 launches.
>>
>>102312821
two workstation cards with 48gb and 200-300w usage each
>>
>>102313001
>Grok
>3k
Lol
>>
>>102313713
s-surely roping the context to 6k, 12k, or 24k won't degrade the model and will be enough
>>
>>102313740
no one knows what the context window of the model itself is, these numbers are for chatbot interfaces for all the models which is often smaller
>>
How do I mount a Tesla P40 in a desktop case and not have it overheat?
>>
>>102313001
Mistral ai had promised a gpt 4 level open source model by the end of the year. No more big open source models.
>>
>look at miku subreddit
>they ban all ai images
what kind of brain rot does this require
>>
>>102313001
>mistral
>orange privacy warning
what? I'm running it locally
>>
>>102314434
Maybe they had to deal with some schizos, or maybe the mods are the schizos.
>>
>https://youtu.be/Alzjn_0ne1Y
I can't believe this guy was part of the scam lmao, it's like all pieces are coming together
>>
>>102314461
they are "all ai is theft" schizos
>>
>>102314444
They put google as green.
That makes you question how do they even measure privacy
>>
So... why isn't everyone using Hermes 405B for free?
>>
smedrins
>>
https://x.com/corbtt/status/1833209248236601602
>I am working with @mattshumer_ to get to the bottom of what happened with Reflection. He is providing access to all serving code and weights with the goal of replicating the strong reasoning performance @ArtificialAnlys was able to see over the weekend.
a scammer wouldn't do this
>>
>>102312064
theinformation (original source of that news) have real sources and have repeatedly leaked LLM-related stuff before anyone else
i don't think they've ever gotten a leak wrong
also their article said that was just the highest number discussed and that they expect it to be lower
>>
>>102314809
Maybe Shumer is just a complete fucking retard and got scammed by the poo who 'helped' him train the finetune.
>>
>>102314809
nice try, get exposed
https://x.com/mattshumer_/status/1831195111180435702
>>
>>102306138
>lang chain, aios, semantic kernel, tenstorrent
why does no one in these threads ever talk about real stuff with language models
>>
>>102314867
he's obviously saying "welcome to the team" as in "welcome to the group of people who believe they have something special and are trying to finetune llama-3.1-405b"
>>
>>102314809
so the guy that owns openpipe, a company that turns prompts into "finetuned" models is going to help a retard add a single prompt to llama 3.1
>>
>>102314961
"reflection" outputs or whatever you want to call them actually don't work with any model i've tested except, kind of, sonnet 3.5
every gpt-4 variant (including chatgpt-4o-latest) will fail to even TRY to follow the process 95% of the time
so it probably does need some kind of finetuning to work
however obviously the guy is a fraud and anyone working with him who comes to any conclusion besides "he is a fraud" is also a fraud
>>
>Stuck with RTX 3060 12GB of ram,
>All new GPUs cost more than my entire PC combined.

Why must GPU prices keep going up...
>>
>>102314989
Because people are paying for them when the prices go up.
>>
>>102314982
Dude genuinely tanked his career in one fell swoop by routing to Anthropic and OpenAI. If he'd just said "we fucked up the benchmarks my bad" he would've still been humiliated, but people would have laughed it off
The fuck was his plan?
>>
>>102315058
he mentioned on his twitter like a year ago that he used to be in the crypto space, so maybe he's just one of those retarded grifters who expects everyone to just go along with the hype (until he's gotten his bag)
>>
>>102314619
Because I ran it using the 3.1 instruct base format and haven’t seen a model that retarded since pyg.
Need to try again with chatml, but fuck them for not using the existing format for no reason.
>>
>>102315058
He’s a fake person, neither him nor his company exist in real life, and the entire thing was a publicity stunt for openscam.
>>
File: for free.png (92 KB, 960x775)
>>102315173
Locally run on a quant? My question was a little misleading. I just noticed that you can sign up to openrouter and generate a key to use H3 405B for free. VRAMlets should probably look into this while it lasts.
>reflection 70b (claude?) is there too for free
>>
>>102315252
*you can use an unknown model and have your logs posted publicly for free
Buy an ad and kill yourself
>>
>ask chatgpt to extract some data from a picture and process it for work
>"please wait a minute, I'm extracting the data..." (inference pause)
>"Here's the data I extracted: ..."
>"Now processing the data..." (Inference pause)
>"Here's the result: ..."
This was the first time I've used ChatGPT since its release and I'm a bit disappointed. It really didn't feel like one coherent multi-modal model, but more like one fairly okay base model that can spend all the time it wants calling other models thanks to some front-end magic. I don't think local would be far from this if we had a good front-end that actually made use of function calling and other tooling.
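That kind of multi-step behavior is easy to reproduce locally with a dispatch loop. A toy sketch (the "model" is a stub and the tool names are made up; a real setup would query a local inference server and parse its tool-call output instead):

```python
# Toy sketch of a local function-calling loop: the "model" here is a stub
# and the tool names are made up; a real setup would query a local
# inference server and parse its tool-call output instead.

TOOLS = {
    "extract_table": lambda image: [["item", "qty"], ["widget", "3"]],
    "sum_column": lambda table, col: sum(int(r[col]) for r in table[1:]),
}

def fake_model(messages):
    """Stub policy: request the two tools in order, then answer."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if len(tool_results) == 0:
        return {"tool": "extract_table", "args": {"image": "scan.png"}}
    if len(tool_results) == 1:
        return {"tool": "sum_column",
                "args": {"table": tool_results[0]["content"], "col": 1}}
    return {"answer": f"Total quantity: {tool_results[1]['content']}"}

def run(user_msg, model, max_steps=5):
    messages = [{"role": "user", "content": user_msg}]
    for _ in range(max_steps):
        out = model(messages)
        if "answer" in out:
            return out["answer"]
        result = TOOLS[out["tool"]](**out["args"])  # dispatch the tool call
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("no final answer within step budget")

reply = run("Extract the table from this scan and total the qty column", fake_model)
```

The "inference pauses" in the ChatGPT session are just the waits between those dispatch rounds.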
>>
File: 2786hsw67kyhjtl4567.jpg (69 KB, 500x522)
>>102306170
>7900 XTX performance benchmarks reveal Vulkan not the best
lolmao who told him to use fucking Vulkan?? there's a reason even rocm on WINDOWS is faster.

But in other AMD insanity, how are the AMD Instinct MI-series cards? I doubt fucking anybody has ever bought them for true local AI, but how crazy are the MI50, 60, and 100? All 32gb cards I could find. For that matter, what are the chances they'll just plug and play nice with Windows?

My 7900xtx experience with AI has been great, so now how are the "professional" accelerators?
>>
>>102315252
>reflection (claude?)
not anymore, it's the same llama finetune as the paid model now i.e. nothingburger
>>
>>102306387
>all these wrong answers
https://files.catbox.moe/mk400w.mp4
>>
>>102315477
I don't know about windows but the instinct cards work fine on linux with rocm 6.0
last I checked windows rocm support was terrible but maybe that's changed, either way I wouldn't trust that setup enough to invest money in desu
>>
>>102309857
>Sadly the state of open/local audio stuff is pretty abysmal...
This. Everything is either
>corpo scraps that are utter dogshit (Bark)
>chinkshit you have to punch yourself in the balls to get working (RVC)
>tortoise slop (xTTS2)
>vaporware (any paper released, even if theres promise of code to be released)
>one mega-autist's hyperfixation passion project that only communicates through commit messages and schizoid comments that's permanently always 2MW from the last step from greatness (https://github.com/e-c-k-e-r/vall-e)

There's just not many eyes on TTS. Only grifters care about it for muh funny political man arguing with other political man.
The pooest of pajeets only cares about musicgen (muh Udio at home) or muh funny cartoon character singing a song (again back to RVC).
TTS is just forever cursed.
>>
>>102315668
xtts2 may be tortoiseslop but it's pretty good for realtime. It was enough for me to cancel my ElevenLabs sub.
>>
>>102315691
dead project tho.

everything is terrible
>>
>>102315668
and I'll add the actual future for TTS is with multimodal LLMs, but I shouldn't have to explain the absolute state of even text+image multimodality for local

>>102315756
>coqui's best product was... copying a fork of tortoise, having the sloppest of multilingual support, and finetuning the base model with a shit ton of indian audio, which killed the company after
will never won't be not funny
>>
>>102315477
I tried out Mi60s before, though I ended up returning them since the seller lied about their condition; I did take them for a test drive first. Unless you can get them really cheap, I'd say they're not worth it. Compute isn't great, and having to rig up janky fans is far from ideal, not to mention loud compared to a gaming GPU, which is pretty quiet and has onboard fan-speed management, whereas a card without fans obviously has no such thing.
>>
File: 4OlwwxuBFYg.png (1.39 MB, 1251x1234)
>>102315511
nice double doubles
https://files.catbox.moe/323hw8.mp4
>>
>>102315883
>>102315576
All i get from this is "suck it up and buy another 7900xtx poorfag"
>>
>>102315883
Looked for some Mi100 numbers today and assuming llama.cpp integrates better support they seem potentially promising.

70 t/s on llama 7B Q4_K - https://github.com/ggerganov/llama.cpp/pull/7011#issuecomment-2143621264
8.3 t/s on 70B Q6_K with a dual setup (year-old PR, so missing a lot of optimizations)
https://github.com/ggerganov/llama.cpp/discussions/2824

Still too expensive compared to a hassle free 3090 even with the 32GB VRAM though.
>>
File: uZ4Ea0U.png (665 KB, 640x640)
>>102316067
>at some point someone flipped it, possibly to dodge repost detection somewhere, and edited the text back in
whoever made that is a fuckin loser
>>
File: savage duck attack.webm (2.82 MB, 400x711)
>>102316348
dont look at me anon i just saved it ages ago
>>
>Tuesday
It's Teto time!

>>102316067
Oh, didn't notice it. Thanks for checking it.
>>
File: Untitled.png (1.61 MB, 1080x3067)
FedModule: A Modular Federated Learning Framework
https://arxiv.org/abs/2409.04849
>Federated learning (FL) has been widely adopted across various applications, such as healthcare, finance, and smart cities. However, as experimental scenarios become more complex, existing FL frameworks and benchmarks have struggled to keep pace. This paper introduces FedModule, a flexible and extensible FL experimental framework that has been open-sourced to support diverse FL paradigms and provide comprehensive benchmarks for complex experimental scenarios. FedModule adheres to the "one code, all scenarios" principle and employs a modular design that breaks the FL process into individual components, allowing for the seamless integration of different FL paradigms. The framework supports synchronous, asynchronous, and personalized federated learning, with over 20 implemented algorithms. Experiments conducted on public datasets demonstrate the flexibility and extensibility of FedModule. The framework offers multiple execution modes-including linear, threaded, process-based, and distributed-enabling users to tailor their setups to various experimental needs. Additionally, FedModule provides extensive logging and testing capabilities, which facilitate detailed performance analysis of FL algorithms. Comparative evaluations against existing FL toolkits, such as TensorFlow Federated, PySyft, Flower, and FLGo, highlight FedModule's superior scalability, flexibility, and comprehensive benchmark support.
https://github.com/NUAA-SmartSensing/async-FL
seems to promise easier federated training at least for small stuff. could be useful.
https://github.com/justinlovelace/SESD
for example (still no code) was able to use just 2% of the training dataset of vall-e to match it so actually feasible to train it in a federated manner
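For context, the aggregation step at the core of most of these frameworks is FedAvg, a sample-weighted average of client parameters. A generic NumPy sketch of the textbook algorithm (not FedModule's actual API):

```python
import numpy as np

# FedAvg: sample-weighted average of per-client model parameters.
# Generic textbook sketch, not FedModule's actual API.

def fedavg(client_weights, client_sizes):
    """Weighted average of per-client parameter vectors."""
    total = sum(client_sizes)
    agg = np.zeros_like(np.asarray(client_weights[0], dtype=float))
    for w, n in zip(client_weights, client_sizes):
        agg += (n / total) * np.asarray(w, dtype=float)
    return agg

# Three clients; the third trained on twice as much data.
clients = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
global_update = fedavg(clients, client_sizes=[100, 100, 200])
```

The synchronous/asynchronous/personalized paradigms the paper supports differ mainly in when and from whom this aggregation collects updates.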
>>
>>102312821
You won't lose much in sequential inference if you power-limit three 3090s to 250W
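On Linux this is done with nvidia-smi (assuming the proprietary NVIDIA driver; needs root, and the limit resets on reboot unless you reapply it, e.g. from a systemd unit):

```shell
# Assumed setup: Linux with the proprietary NVIDIA driver; needs root.
# Power limits reset on reboot unless reapplied (e.g. from a systemd unit).
sudo nvidia-smi -pm 1            # enable persistence mode
sudo nvidia-smi -i 0 -pl 250     # cap GPU 0 at 250 W
sudo nvidia-smi -i 1 -pl 250
sudo nvidia-smi -i 2 -pl 250
nvidia-smi --query-gpu=index,power.limit --format=csv   # verify
```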
>>
>>102312210
I don't see any significant difference in the guy's CR examples. Woof woof, yes, but that's 1 out of 3, that could be just chance.
>>
>>102316467
Do you know how on Linux?



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.