/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103286673 & >>103278810

►News
>(11/22) LTX-Video: Real-time video generation on a single 4090: https://github.com/Lightricks/LTX-Video
>(11/21) Tülu3: Instruct finetunes on top of Llama 3.1 base: https://hf.co/collections/allenai/tulu-3-models-673b8e0dc3512e30e7dc54f5
>(11/20) LLaMA-Mesh weights released: https://hf.co/Zhengyi/LLaMA-Mesh
>(11/18) Mistral and Pixtral Large Instruct 2411 released: https://mistral.ai/news/pixtral-large
>(11/12) Qwen2.5-Coder series released: https://qwenlm.github.io/blog/qwen2.5-coder-family

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/hsiehjackson/RULER
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
►Recent Highlights from the Previous Thread: >>103286673

--Paper: Simplifying CLIP: Unleashing the Power of Large-Scale Models on Consumer-level Computers:
>103297845 >103297860 >103297890
--Papers:
>103296533 >103297733 >103297788 >103297918
--Testing and comparing LLMs, abliteration, and imatrix:
>103293087 >103293117 >103293133 >103293276 >103293314 >103293388 >103294589 >103294729 >103294759 >103294852
--Anon seeks to replace Claude 2 with a local model:
>103297270 >103297321 >103297441 >103297490 >103297520 >103297615
--Anon releases unofficial SMT implementation with PEFT version:
>103296930 >103297151 >103297268
--Merging safetensors files and quantization discussion:
>103288602 >103288654 >103291928 >103292005 >103292180
--LTX-Video model discussion and testing:
>103288336 >103288358 >103293709 >103293808 >103293832 >103293833 >103293979 >103294017 >103294054 >103294101
--Best practices for creating character definitions in koboldAI:
>103296813 >103297047 >103297081 >103297216
--Anons share non-coom uses for the model, including art and programming:
>103286774 >103286788 >103288316 >103286822 >103286831 >103286978 >103286998 >103287002 >103287586
--Card formatting debate and character writing discussion:
>103295022 >103295128 >103295138 >103295179 >103295271 >103295277 >103295250 >103295290 >103295338 >103295472
--Kernel update has no effect on CPU inference performance:
>103287570
--Athene-V2-Chat-72B open model performance and implications:
>103293224 >103293513 >103294469 >103294670
--Anon shares comic-translate app for automatic comic translations:
>103290835
--Anon shares AmoralQA-v2 dataset and discusses its usage in models:
>103287899 >103288215 >103288314
--Miku (free space):
>103286754 >103287503 >103289721 >103290110 >103292155 >103292194 >103292482 >103292570 >103294256 >103294336 >103295577 >103296695

►Recent Highlight Posts from the Previous Thread: >>103286678

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>103298523
nigga what
>>103298447
Careful, you're gonna set off the MagnumV4 72B schizo. He won't countenance any praise for a larger model.
So with the new Behemoth, what the fuck am I supposed to do with the instruct format? I didn't quite understand the prompt-format instructions Drummer left.
Zzzz
:-)
ahhhh
:o
>>103298520
>>103298523
>>103298712
>>103298713
>>103298717
>>103298723
Large language models?
>>103298738
You're in the wrong hood
Which LLM is good if I want a buddy for programming? And which LLM is good if I need someone to review and improve my report writing? I tend to write a lot in the passive voice, which is annoying.
>>103298742
these losers just masturbate.
>>103298770
Anon, none of these questions are genuine.
>>103298742
The latest Qwen 32B
Meow! :3
>>103298742
Qwen2.5 Coder 32B Instruct
>24GB VRAM
>Q4_K_L
>16000 context length
It's surprisingly good. Not as good as the SOTA models, but better than literally anything else local. Starts to derp out a bit at longer contexts, like all other models. I just use it in ST for now but plan on hooking it up to aider once I get ollama running.
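A rough sanity check on why those numbers fit in 24GB. Caveat: the architecture figures below (parameter count, layer count, GQA head config for Qwen2.5-32B) and the effective bits-per-weight for Q4_K_L are my assumptions, not from the post:

```python
# Back-of-envelope VRAM estimate for a ~32B model at a Q4 K-quant with 16k context.
# All constants below are assumed values for illustration, not measured.

params = 32.8e9      # assumed total parameter count for Qwen2.5-32B
bpw = 4.9            # assumed effective bits per weight for Q4_K_L
weights_gb = params * bpw / 8 / 1e9

layers, kv_heads, head_dim = 64, 8, 128   # assumed Qwen2.5-32B GQA config
ctx = 16000
# KV cache per token: 2 tensors (K and V) * layers * kv_heads * head_dim * 2 bytes (fp16)
kv_bytes_per_token = 2 * layers * kv_heads * head_dim * 2
kv_gb = kv_bytes_per_token * ctx / 1e9

total_gb = weights_gb + kv_gb
print(f"weights ~{weights_gb:.1f} GB + KV cache ~{kv_gb:.1f} GB = ~{total_gb:.1f} GB")
```

Under these assumptions the total lands right around 24GB, which is why the fit is tight and why a longer context (or the compute buffer) can push you into offloading a few layers.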
>>103299050
>>103299126
For both programming and writing feedback? Ideally it'll be well-trained enough that I can keep everything isolated on my machine.
>>103299126
>aider whenever I get ollama running
the cringe... it hurts...