/g/ - Technology


File: migu.png (1.57 MB, 1432x808)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107668478 & >>107660171

►News
>(12/26) MiniMax-M2.1 released: https://www.minimax.io/news/minimax-m21
>(12/22) GLM-4.7: Advancing the Coding Capability: https://z.ai/blog/glm-4.7
>(12/17) Introducing Meta Segment Anything Model Audio: https://ai.meta.com/samaudio
>(12/16) MiMo-V2-Flash 309B-A15B released: https://mimo.xiaomi.com/blog/mimo-v2-flash
>(12/16) GLM4V vision encoder support merged: https://github.com/ggml-org/llama.cpp/pull/18042
>(12/15) llama.cpp automation for memory allocation: https://github.com/ggml-org/llama.cpp/discussions/18049

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide
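The samplers guide linked above covers temperature and nucleus (top-p) sampling; a minimal sketch of those two steps in plain Python (list-based and purely illustrative; real engines operate on tensors):

```python
import math
import random

def sample_top_p(logits, temperature=0.8, top_p=0.9, rng=random):
    """Temperature + nucleus (top-p) sampling over a list of raw logits.

    Returns the chosen token index. A toy sketch of the samplers covered
    in the guide above, not any engine's actual implementation.
    """
    # Apply temperature, then a numerically stable softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the smallest set of top tokens whose cumulative mass >= top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Renormalize over the nucleus and draw one token from it.
    nucleus_mass = sum(probs[i] for i in kept)
    r = rng.random() * nucleus_mass
    for i in kept:
        r -= probs[i]
        if r <= 0:
            return i
    return kept[-1]
```

Lowering top_p shrinks the nucleus; with a very small top_p only the argmax token survives, which is why low-top-p settings read as deterministic.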

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
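The VRAM calculator above accounts for model-specific detail; as a back-of-the-envelope version, required memory is roughly quantized weights plus KV cache plus runtime overhead. A hedged sketch (the formula is a common approximation, and the overhead constant is an illustrative assumption):

```python
def gguf_vram_gib(params_b, bits_per_weight, ctx, n_layers, n_kv_heads,
                  head_dim, kv_bits=16, overhead_gib=1.0):
    """Rough VRAM estimate in GiB: quantized weights + KV cache + overhead.

    Back-of-the-envelope only; the linked calculator handles real GGUF
    metadata. overhead_gib is a guessed allowance for buffers/compute.
    """
    # Weights: parameter count times bits per weight.
    weights = params_b * 1e9 * bits_per_weight / 8
    # KV cache: 2 (K and V) * layers * kv_heads * head_dim * context * bytes.
    kv = 2 * n_layers * n_kv_heads * head_dim * ctx * kv_bits / 8
    return (weights + kv) / 2**30 + overhead_gib
```

For example, an 8B model at ~4.5 bpw with 8K context, 32 layers, 8 KV heads, and head dim 128 lands a little over 6 GiB, which is why such models fit on an 8 GB card.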

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: svg2.png (56 KB, 400x500)
►Recent Highlights from the Previous Thread: >>107668478

--MiniMax M2.1 open-source release and performance comparisons:
>107670819 >107670883 >107670920 >107670942 >107670949 >107672201 >107672262 >107672284 >107672312 >107672331 >107672351 >107672350 >107671849 >107671941 >107672070 >107672185
--Critique of AI-generated code in open-source projects:
>107677585 >107677617 >107677642 >107677665 >107677632 >107677647 >107678712 >107679099 >107677898 >107678528 >107678559 >107678656 >107677639
--LLM coding assistance and GPU performance debates in software development:
>107672336 >107672434 >107672518 >107675192 >107675246 >107675253 >107675587 >107676351 >107679607 >107674915 >107675005
--Miku SVG design challenges and feedback loop experiments:
>107672556 >107672566 >107672755 >107672790 >107672831 >107672847 >107672860 >107672914 >107673073 >107672996 >107674108 >107674478 >107675763
--Quantized model performance comparison debate: GLM4.7 vs 4.5 Air under standardized metrics:
>107677093 >107677281 >107677372 >107677412 >107677455 >107677473 >107677524 >107677649 >107677685
--Quantization and precision conversion challenges:
>107668841 >107677216 >107677285 >107678095 >107678235 >107678390
--MTP feature underperforms in speeding up token generation despite theoretical benefits:
>107670614 >107671516 >107671534 >107671544 >107671557 >107671707 >107671749
--Benchmarking llama.cpp vs ik_llama performance:
>107674698 >107675709 >107675728 >107676008
--TTS model recommendations and usability challenges:
>107669227 >107669313
--A/B test comparison showing improved response quality:
>107668821
--m2.1 model improvements and roleplay experience:
>107677496
--GreedyNalaTests update:
>107675251
--Miku (free space):
>107668993 >107669827 >107670524 >107671006 >107671016 >107672556 >107672566 >107672755 >107672831 >107672914 >107678060 >107678129

►Recent Highlight Posts from the Previous Thread: >>107668479

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>>107679741
I do not like this miqu, its bald patina mocks me.
>>
when will there be resources for local models on mobile?
>le mobile shitter
why yes, i'd rather keep my digital sovereignty while on the go.
>le connect to your home setup with tailscale
i could be off grid or be somewhere with shitty/nonexistent internet access
>>
>>107679803
>i could be off grid or be somewhere with shitty/nonexistent internet access
imagine being such a limp-wristed, pencil-necked cuck that you can't lug a ~10 kilo server in your backpack desu. you don't deserve ai then
>>
File: covered.png (39 KB, 400x500)
>>107679761
>>
>>107679761
>>
>>107679803
If I was off grid then my main concern wouldn't be not having access to LLMs
>>
So now that ZAI has fallen, what is there even left to look forward to in this hobby? The latest assistantslopped qwen...?
>>
So guys, after using 4.7 some more, I am sad to say I think I understand what happened. 4.7 is an affront to the coomers. GLM sex is probably worse because the model basically repeats the same phrases a lot. In comparison, I could use 4.6 for basically two months without a hint of boredom. And when I used 4.7 for truly degenerate shit that is SFW roleplay... it is much better. They really improved the model for roleplay, just not for sex.

Not cool zai.
>>
>>107679821
>10kilo
All you need is a Raspberry Pi, an RTX 6000, and a PSU
>>
Also, with those crazy 300W power banks, we can eventually go fully mobile with our servers and stop relying on 3kW outlets
>>
>>107679870
>his top priority in the middle of the woods isn't jerking it to miku erp slop
smells like faggot in here...
>>
Forgive the retarded question, but I've been out of the game for a while and want to make some decisions.
I need to replace some parts that died, and given the cost of RAM, all I could source was a 32 GB kit of CL30 DDR5.
Historically my "ai box" has had a 2x48 GB kit with slightly worse timings, and the main hardware in there is two 3090s. My question is: if I took the 96 GB kit for my main build and gave the AI box the 32 GB kit, how much would that hurt things? I know most stuff goes in VRAM, but as far as recent models go, are there better quants or anything such that I won't notice the missing RAM?
>>
>>107680048
r1/kimi2 or bust, and that's what, like 7 RTX cards at least? that takes too much electricity, which would force you to carry 80+ kg of solar panels, batteries, etc., which actually becomes a problem
>>
>>107680494
We don't know. The current meta is stacking hundreds of gigabytes of RAM and running MoE models on it, but with current prices it's not sustainable. Let's wait for a miracle like zai to happen in the llm space
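The RAM-stacking meta works because single-stream decode is roughly memory-bandwidth-bound: each generated token has to read every *active* weight once, so tokens/sec is about bandwidth divided by active-weight bytes, which is why MoE models with few active parameters stay usable on CPU RAM. A rough sketch under that assumption (illustrative numbers; ignores KV-cache reads and compute overhead):

```python
def decode_tps_estimate(active_params_b, bits_per_weight, bandwidth_gbs):
    """Rule-of-thumb decode speed for a dense or MoE model.

    Assumes decode is memory-bandwidth-bound: every token reads each
    active weight once, so t/s ~= bandwidth / active-weight bytes.
    Real throughput comes in lower (KV cache, routing, overhead).
    """
    bytes_per_token = active_params_b * 1e9 * bits_per_weight / 8
    return bandwidth_gbs * 1e9 / bytes_per_token
```

For a 309B-A15B model at 4 bpw on ~80 GB/s dual-channel DDR5, this predicts on the order of 10 tokens/sec, since only the ~15B active parameters are read per token; a dense 309B on the same RAM would be well under 1 t/s.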
>>
>>107680557
You only need mistral nemo though
>>


