/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103164575 & >>103153308

►News
>(11/12) Qwen2.5-Coder series released: https://qwenlm.github.io/blog/qwen2.5-coder-family/
>(11/08) Sarashina2-8x70B, a Japan-trained LLM: https://hf.co/sbintuitions/sarashina2-8x70b
>(11/05) Hunyuan-Large released with 389B total and 52B active parameters: https://hf.co/tencent/Tencent-Hunyuan-Large
>(10/31) QTIP: Quantization with Trellises and Incoherence Processing: https://github.com/Cornell-RelaxML/qtip
>(10/31) Fish Agent V0.1 3B: Voice-to-Voice and TTS model: https://hf.co/fishaudio/fish-agent-v0.1-3b

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
Chatbot Arena: https://chat.lmsys.org/?leaderboard
Censorship: https://hf.co/spaces/DontPlanToEnd/UGI-Leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Programming: https://livecodebench.github.io/leaderboard.html

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/lmg-anon/mikupad
https://github.com/turboderp/exui
https://github.com/ggerganov/llama.cpp
Get busy manufacturing your LLM made bumps miku baker faggot. Time to get /lmg/ going again. Dance monkey dance.
>>103188791
Fake bake.
>>103188894
>Fake bake
Made up term.
kurisusex
If context stops being a problem, she will be my wife. It's fitting that, in character, she is already an AI program.
And she was always a perfect /lmg/ mascot. Much better than that green haired whore without any personality. Amadeus Kurisu was a perfect example why you need a local model. And that is because the nigger that is running a cloud server is looking at everything you are doing and he is doing that to make your life miserable in the long run.Death to all mikuniggers.
>>103188976
>And she was always a perfect /lmg/ mascot.
Then why did nobody want her but you? And why do you endlessly seethe that nobody wanted your forced mascot, to the point that you spam your BBC collection in the thread?
Is there any model these days that's better at voice transfer than RVC2? Or has that entire area just stagnated for the past year?
I think Skeeter from Doug should be the mascot
>>103188976
>>103188936
Any good Amadeus cards?
>>103189280
the slave ship? sounds like a fun idea to make a slave trading sim
>>103189098>Then why did nobody want her but you?Because you are a faggot and a retard who didn't play the game obviously. Kill yourself.
>>103189299
no, kurisu (version de la amadeus) from the hit visual novel series steins gate (version dos not the uno version)
>>103189304
rofl i was thinking of the Amistad
>>103188780
https://rentry.org/lmg-spoonfeed-guide
>Edit: 12 Dec 2023 00:10 UTC
Is the guide going to be updated? It's been almost a year.
>>103189327
No, we don't update shit. We just make sure Miku is in the OP, and that is it.
>>103189328
>>103189328
>>103189328
Next thread
>>103189327
>download kobold, nemo model and ST
done
>>103189347
little early there
>>103189355
it is ok. he is a little dumb.
>>103189363He's a vocaloid fag. He's Indian.
>>103189410
>everyone I don't like is one person
>>103189342
false
https://rentry.org/LocalModelsLinks
>lmg links rentry created may 2023, updated 2 weeks ago
>ml roadmap rentry created may 2023, updated 1 week ago
>lmg news rentry updates regularly
>datasets rentry created april 2023, updated october 2024
too lazy to check more, but many of the lmg rentries are regularly updated
the spoonfeed guide should at least be updated for 2024
>>103189327
Make a proposal for an update. If it's good enough, we swap.
Why is the other thread full of retarded drama?
>>103189581
Also >>103189350 has a good point. For a spoonfed quickstart, I'd just point people to koboldcpp's wiki.
>>103189590
Just ignore it. The stupid thread splitting is a recurring thing because people can't help themselves.
>>103189590>other threadMeanwhile this thread>Get busy manufacturing your LLM made bumps miku baker faggot. Time to get /lmg/ going again. Dance monkey dance.
i am mildly annoyed that there isn't an arliai rpmax 1.3 12b
>>103189743
Whether fine-tunes do anything worthwhile aside, you should probably know that v1, v2, and v3 numbering is a total scam. There is zero guarantee that a bigger number is better. It is completely random.
>>103188780
>>103189779
True. The best Rocinante is v1.1, for example. It doesn't make the model incredibly stupid, and it steers the prose in a way that's different from the official instruct, which I feel is more natural by default and in general.
>>103189743
For me it's 22B.
>>103189884
agreed
Anon, are you okay?! Noooo! They got him.
Is local AI voice gen something that's feasible with a 12GB VRAM card? I looked up whether somebody had made a voice clone of the narrator in The Dead Flag Blues (https://www.youtube.com/watch?v=XVekJTmtwqM) and I found one on voicedub.ai, but it's pay2generate, and I can't even hear a test sample for free to see if it sounds good or not.
Jesus what a nigger that other OP is.
svelk
>>103189806
Buy a fucking ad.
>>103191256
>https://github.com/RVC-Boss/GPT-SoVITS
Should run just fine on your GPU. It uses like 2GB even when running on CPU.
Posting in the real /lmg/ thread. Fuck the splitter retard.
>>103192676
how much vram does petra have?
Anyone use Letta (formerly MemGPT)? I'm trying it out with Llama 3.2 Vision 11B.
this thread is unsafe
>>103192688
It seems pretty interesting, but it's absurdly slow. Feels like it's not keeping the model in memory or something, because my tokens/s is pretty usable but responses are taking multiple minutes. I guess it's because it's swapping embeddings? I'm such a noob that I've got no idea what that entails.
>>103192687
>>94536113
>I only have 2 Gb of VRAM
>I truthfully would love to find a list of which books, websites etc the model's entrainment data actually contains, if anyone has that info.
https://desuarchive.org/g/search/text/entrainment/
>>103192998And your dick has 0mm cause you chopped it off troon.
What is VRAM?
>>103192870
I figured it out. ollama was using 22GB of memory, and swapping to do so. Of course I only noticed after >1TB was written to my SSD. Switched to Mistral 7B, and if I use Safari instead of Firefox it doesn't swap. Still very slow, doing whatever the embedding stuff is doing. Looking forward to playing with it more.
>>103193339
Virtual RAM
Does llama-server have a bitnet implementation yet?
>>103193589
The biggest bitnet model I've seen is 3.9B. There may be a 7B if I'm not mistaken. Do you really want to run that?
>>103193589
What are you gonna do with it? Current bitnets aren't actually worth running.
>>103194788
Find out for myself whether they are worth running or not? There's not much point without server integration.
>>103194798
7B is not worth running. Just get a Ministral or something and quant it.
Zuck! I kneel!
who is the king in the 8-20B range?
>>103193339
the ram of your mac mini
>>103194868
Nemo or Mistral Small.
>>103193435
How do I buy that?
>I have a decent gaming rig from ~2 years ago, trying local LLMs out
>each answer takes 3 minutes on average for Nemo 12B Q4
>OP has only software, nothing on hardware
Do you guys run the LLMs on your PCs, or do you give them their own servers? I think I'm gonna do the latter. How expensive would a rig have to be to reach ~5 sec latency for a 12B model?
https://xcancel.com/AlterKyon/status/1857304963330027925
>>103195019
VRAM speeds everything up; the more of the model you can fit in VRAM, the faster it goes. If you can't get more VRAM, then fast RAM is the next best substitute.
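To put rough numbers on that, here's a back-of-envelope sketch in Python. Every figure is an assumption, not a measurement: generation is roughly memory-bandwidth bound, since each token has to stream the full set of weights once, so bandwidth divided by model size gives a crude ceiling.
[code]
# Back-of-envelope: token generation is roughly memory-bandwidth bound,
# because every generated token streams all model weights once.
# All numbers below are rough assumptions, not benchmarks.

def t_s_upper_bound(model_gb: float, bandwidth_gb_s: float) -> float:
    """Crude upper bound on generation speed: bandwidth / model size."""
    return bandwidth_gb_s / model_gb

model_gb = 7.0    # ~12B params at Q4, very rough
vram_bw = 576.0   # RX 6950 XT spec-sheet bandwidth, GB/s
ram_bw = 50.0     # typical dual-channel DDR4, GB/s, approximate

print(f"all in VRAM: <= ~{t_s_upper_bound(model_gb, vram_bw):.0f} T/s")
print(f"all in RAM:  <= ~{t_s_upper_bound(model_gb, ram_bw):.0f} T/s")
[/code]
Real throughput lands well under the ceiling, but the order-of-magnitude gap between those two lines is why everyone tells you to fit the whole model in VRAM.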
>>103195019
>3 minutes on average
Useless number. Speak in tokens per second, and post your specs. Even an 8GB GPU should do fine for a 12B. If that's what you have, and if you're actually running on the GPU, that's as good as it's gonna be.
The bar for "decent" is much higher around here.
>>103195019wait for the RTX 5090
>>103195093
>Useless number. Speak in tokens per second.
21.50 T/s
>Even an 8GB gpu should do fine for 12b. If that's what you have,
I have a Radeon 6950 XT with 16GB VRAM.
>and if you're actually running on gpu, that's as good as it's gonna be.
So it's possible I may have fucked something up. Thanks, I'll double check.
>>103195163
That's token generation, I assume. At 21.5 T/s, a 3-minute response is ~3870 tokens, so I think that's about as well as you can do on AMD. Just make sure you're offloading all the layers.
>https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference
No benchmarks for 12B or AMD cards, but it'll give you a point of reference. AMD (HIP or Vulkan) doesn't run as fast as CUDA. Maybe there are other benchmarks for AMD.
You can set up streaming if you're using llama.cpp or koboldcpp (I don't know about other inference programs). It'll show the response as it's generated. It won't be any faster, but it'll give you something to do in the meantime.
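If you want to sanity-check the offload yourself, here's a minimal sketch using llama-cpp-python (my pick for illustration, not something the thread prescribes; assumes a GPU-enabled build such as HIP/ROCm or Vulkan on AMD, and the model filename is a placeholder). verbose=True makes the startup log report how many layers actually landed on the GPU, and streaming lets you time T/s directly:
[code]
# Minimal sketch: force full offload and measure T/s with llama-cpp-python.
import time
from llama_cpp import Llama

llm = Llama(
    model_path="nemo-12b-q4_k_m.gguf",  # placeholder filename
    n_gpu_layers=-1,   # -1 = offload every layer; check the startup log
    n_ctx=8192,
    verbose=True,      # log reports how many layers were assigned to the GPU
)

start = time.perf_counter()
n_tokens = 0
for chunk in llm("Describe a quiet lab in Akihabara.",
                 max_tokens=256, stream=True):
    # stream tokens as they're generated
    print(chunk["choices"][0]["text"], end="", flush=True)
    n_tokens += 1
elapsed = time.perf_counter() - start
print(f"\n~{n_tokens / elapsed:.1f} T/s")
[/code]
If the log shows fewer layers on the GPU than the model has, that's where the three minutes are going.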
>>103195163
I was gonna say what the other anon said, but food for thought about the streaming thing: average human reading speed is ~4 to 7 tk/s.
>>103195019
I use 4x24GB GPUs. You can set that up locally with a separate PC.
A dead general DOESN'T need two threads.
>>103195404
Tell that to the other OP, who makes a new thread when there is one already.
Stupid thread. Stupid thread-splitting schizo
Mistral Small 22B Q8 or Nemo 12B FP16?
Why, and what 'tune?
>>103196799
lurk more
>>103196822
>>103196822
>>103196822
New Thread
>>103196799
>fp16
>>103196831
filthy spammer.
>>103195268
And humans only see at 24fps, but most of us skim 90% of the gen, not stare intently at every token.
Maybe for RP stuff it's good enough, I guess.
Why ask any questions when you can do it yourself? Why are you afraid of wasting 5 minutes? These threads should stop being made.
>>103188780
Sexrisu
a thread died for this
>>103198769
So true, >>103196822 killed a thread.
best <22b model for erp?