/g/ - Technology

File: 005579.png (1.29 MB, 896x1152)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>103286673 & >>103278810

►News
>(11/22) LTX-Video: Real-time video generation on a single 4090: https://github.com/Lightricks/LTX-Video
>(11/21) Tülu3: Instruct finetunes on top of Llama 3.1 base: https://hf.co/collections/allenai/tulu-3-models-673b8e0dc3512e30e7dc54f5
>(11/20) LLaMA-Mesh weights released: https://hf.co/Zhengyi/LLaMA-Mesh
>(11/18) Mistral and Pixtral Large Instruct 2411 released: https://mistral.ai/news/pixtral-large
>(11/12) Qwen2.5-Coder series released: https://qwenlm.github.io/blog/qwen2.5-coder-family

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/hsiehjackson/RULER
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: ComfyUI_00093_.png (2.35 MB, 1536x1536)
►Recent Highlights from the Previous Thread: >>103286673

--Paper: Simplifying CLIP: Unleashing the Power of Large-Scale Models on Consumer-level Computers:
>103297845 >103297860 >103297890
--Papers:
>103296533 >103297733 >103297788 >103297918
--Testing and comparing LLMs, abliteration, and imatrix:
>103293087 >103293117 >103293133 >103293276 >103293314 >103293388 >103294589 >103294729 >103294759 >103294852
--Anon seeks to replace Claude 2 with a local model:
>103297270 >103297321 >103297441 >103297490 >103297520 >103297615
--Anon releases unofficial SMT implementation with PEFT version:
>103296930 >103297151 >103297268
--Merging safetensors files and quantization discussion:
>103288602 >103288654 >103291928 >103292005 >103292180
--LTX-Video model discussion and testing:
>103288336 >103288358 >103293709 >103293808 >103293832 >103293833 >103293979 >103294017 >103294054 >103294101
--Best practices for creating character definitions in koboldAI:
>103296813 >103297047 >103297081 >103297216
--Anons share non-coom uses for the model, including art and programming:
>103286774 >103286788 >103288316 >103286822 >103286831 >103286978 >103286998 >103287002 >103287586
--Card formatting debate and character writing discussion:
>103295022 >103295128 >103295138 >103295179 >103295271 >103295277 >103295250 >103295290 >103295338 >103295472
--Kernel update has no effect on CPU inference performance:
>103287570
--Athene-V2-Chat-72B open model performance and implications:
>103293224 >103293513 >103294469 >103294670
--Anon shares comic-translate app for automatic comic translations:
>103290835
--Anon shares AmoralQA-v2 dataset and discusses its usage in models:
>103287899 >103288215 >103288314
--Miku (free space):
>103286754 >103287503 >103289721 >103290110 >103292155 >103292194 >103292482 >103292570 >103294256 >103294336 >103295577 >103296695

►Recent Highlight Posts from the Previous Thread: >>103286678

Why?: 9 reply limit >>102478518
Fix: https://rentry.org/lmg-recap-script
>>
>>103298523
nigga what
>>
>>103298447
Careful, you're gonna set off the MagnumV4 72B schizo. He won't countenance any praise for a larger model.
>>
So with the new Behemoth, what the fuck am I supposed to do with the instruct format? I didn't quite understand the prompt-format instructions Drummer left.
>>
Zzzz
>>
File: 005585.png (1.34 MB, 896x1152)
:-)
>>
ahhhh
>>
File: 005586.png (1.16 MB, 896x1152)
:o
>>
>>103298520
>>103298523
>>103298712
>>103298713
>>103298717
>>103298723
Large language models?
>>
>>103298738
You're in the wrong hood
>>
Which LLM is good if I want a buddy for programming? And which is good if I need something to reference for improving my report writing? I tend to write a lot in the passive voice, which is annoying.
>>
>>103298742
these losers just masturbate.
>>
>>103298770
Anon, none of these questions are genuine.
>>
>>103298742
The last qwen-32B
>>
Meow! :3
>>
>>103298742
Qwen2.5 Coder 32b instruct
>24gb VRAM
>Q4_K_L
>16000 context length

It's surprisingly good. Not as good as the SOTA models, but better than literally anything else local. It starts to derp out a bit at longer contexts, like all other models. I just use it in ST for now, but I plan on hooking it up to aider whenever I get ollama running.
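For anons wondering whether that >24gb VRAM / Q4_K_L / 16000 context combo actually fits, here's a back-of-the-envelope check, similar to what the GGUF VRAM Calculator in the OP does. The architecture numbers (64 layers, 8 KV heads via GQA, head dim 128) are from the published Qwen2.5-32B config, and the ~20 GB weight figure is a rough Q4_K_L file size; treat both as assumptions, not gospel.

```python
# Rough VRAM estimate for Qwen2.5 Coder 32B at Q4_K_L with 16k context.
# Architecture numbers assumed from the Qwen2.5-32B config:
# 64 layers, 8 KV heads (GQA), head dim 128.

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=2):
    """FP16 K and V tensors for every layer at full context."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

kv = kv_cache_bytes(n_layers=64, n_kv_heads=8, head_dim=128, ctx_len=16000)
weights = 20e9  # rough Q4_K_L 32B GGUF file size in bytes (assumption)

total_gib = (kv + weights) / 2**30
print(f"KV cache: {kv / 2**30:.1f} GiB")  # ~3.9 GiB
print(f"Total:    {total_gib:.1f} GiB")   # ~22.5 GiB, tight on a 24 GB card
```

So weights plus KV cache land just under 24 GB, which is why 16k is about where a single 24 GB card tops out before you have to quantize the cache or offload.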
>>
>>103299050
>>103299126
For both programming and writing feedback? Ideally it'll be well-trained enough for that, and I can keep all the information isolated on my machine.
>>
>>103299126
>aider whenever I get ollama running.
the cringe... it hurts...
>>
Is llama.cpp still being developed? Anyone know what to use to run models that aren't hopelessly out of date?
>>
>>103299238
which fucking repo are you on goddamn
>>
>>103299176
Brother I know. I primarily use ooba and have been resisting ollama. Unfortunately aider doesn't work well with ooba.
>>
>>103299167
Just coding.
>>
>>103299238
No, it's over.
>>
>>103298447
No, it's largely the same model. The meta is still Qwen2.5, especially if you have to run Large at 2 bits.
>>
>>103299393
Fuck :(
>>
>>103299441
After a great 16k-token interactive learning session with the new Largestral, I couldn't disagree more. Saying it's just smarter doesn't really capture how it's improved... If it were a person, I'd say it's much better "put together" and on the ball than the previous one. It also doesn't suffer from anywhere near the repetition problem Qwen2.5 has, which is the main thing keeping me from using that model full time. I really wish Qwen were better, because I love the extra speed.
>>
>>103299601
Can you show me? I have 96 GB of VRAM to run it at 4 bits, and it doesn't even justify the disk space it occupies on my computer.
>>
>>103299616
Lying faggot nigger. And no, posting someone else's nvidia-smi screenshot from the archives won't mean anything either.
>>
>>103298557
lol here he is >>103299441
>>
File: 1606384321251.jpg (103 KB, 800x800)
Can I power a Tesla with a spare CPU cable, or do I have to use picrel? It fits, the pinout is the same, the internet says it's the same connector, and redditors say they use it, but I want to hear it from an anon...
>>
>>103299659
Maybe next time try releasing a model that's trained on at least 18 trillion tokens like Qwen, Arthur.
You didn't even show any evals for the new model. You don't believe in your own creation, and you have to resort to spamming 4chan with shills...
Is this the best your company can do?
>>
>>103299616
The logs aren't in English, and I don't really want to post them anyway.
I'm running it at Q8, but I don't know if that makes a big enough difference in quality to make or break it vs Qwen.
It's just my gut feel, really, but I thought I'd put my experience out there.
>>
>>103298520
Is there a simple way to make Tavern cards from a CAI character?


