/g/ - Technology

File: thinketo.png (504 KB, 768x1024)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107700909 & >>107686942

►News
>(12/29) WeDLM-8B-Instruct diffusion language model released: https://hf.co/tencent/WeDLM-8B-Instruct
>(12/29) Llama-3.3-8B-Instruct weights leaked: https://hf.co/allura-forge/Llama-3.3-8B-Instruct
>(12/26) MiniMax-M2.1 released: https://minimax.io/news/minimax-m21
>(12/22) GLM-4.7: Advancing the Coding Capability: https://z.ai/blog/glm-4.7
>(12/17) Introducing Meta Segment Anything Model Audio: https://ai.meta.com/samaudio

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: tetcueball.png (1.57 MB, 1536x1536)
►Recent Highlights from the Previous Thread: >>107700909

--Modern setup strategies for real-time knowledge access beyond static model training:
>107707804 >107707936 >107707959 >107707983 >107707985 >107707990 >107708011 >107708020 >107708035 >107708037 >107708708 >107709078 >107708058
--Multi-character story challenges with Mistral 24B models:
>107707485 >107707507 >107707600 >107707883 >107707948 >107707670 >107707718 >107707771
--Quantization challenges for running GLM 4.6 on limited VRAM:
>107705394 >107705411 >107705425 >107705450 >107705516
--Evaluating GLM 4.7's artistic adherence and natural dialogue vs 4.6:
>107705364 >107706320 >107708117 >107708121
--FunAudio-Chat Technical Report:
>107708791 >107709016 >107709079
--Resolving assistant response prefill incompatibility with enable_thinking:
>107702566 >107702587 >107702629
--Google's early 2000s chatbot experiment with knowledge reuse:
>107705377 >107705409 >107705424
--Updating software version fixed launch error for GLM-4.5-Air-UD-Q2_K_XL:
>107702400 >107702426 >107702428 >107702530
--Critique of model thinking processes and their impact on response quality:
>107703015 >107703056 >107703071 >107703094 >107703119 >107703268
--Exploring local voice cloning alternatives to SoVits:
>107704130 >107704193 >107704277 >107704319 >107704453 >107704482 >107704829
--Mixed performance and limitations with Minimax at IQ2_M quantization:
>107702412 >107703627 >107703661 >107703732
--Z AI's IPO implications for the AI-native LLM market and competing models:
>107708784 >107709044
--WeDLM-8B-Instruct release and comparison to Qwen3-8B:
>107709163
--Miku (free space):
>107701017 >107701268 >107701433 >107701631 >107701715 >107704951 >107707361 >107708317 >107708548

►Recent Highlight Posts from the Previous Thread: >>107700912

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>>107709264
>>107709248
kek
>>
The joke went too far
>>
>>107709282
I think you are confused or very much clueless. These are just regex filters you mongoloid.
>>
File: 1752787901655130.jpg (9 KB, 319x46)
Is llama.cpp broken again? Gemma's more retarded than usual.
>>
>>107707382
thank you for bringing this to my attention
>>
So... when are we getting something?
>>
Mistral my beloved
>>
>>107709593
https://huggingface.co/bartowski/allura-forge_Llama-3.3-8B-Instruct-GGUF
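If anyone wants to actually run it, here's a minimal sketch of grabbing one quant and serving it with llama.cpp; the GGUF filename below is a guess at bartowski's usual naming scheme, so check the repo's file list first:
[code]
# Minimal sketch: pull one quant from the repo above and serve it with llama.cpp.
# Assumes `pip install huggingface_hub` and a built llama-server binary on PATH;
# the filename is assumed from bartowski's usual naming, verify it in the repo.
import subprocess
from huggingface_hub import hf_hub_download

gguf_path = hf_hub_download(
    repo_id="bartowski/allura-forge_Llama-3.3-8B-Instruct-GGUF",
    filename="allura-forge_Llama-3.3-8B-Instruct-Q8_0.gguf",  # assumed filename
)

# -ngl 99 offloads all layers to the GPU, -c sets the context size.
subprocess.run(["llama-server", "-m", gguf_path, "-c", "8192", "-ngl", "99", "--port", "8080"])
[/code]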
>>
>>107709613
The blandest and most mid LLMs on the market, only worth using because of their lack of strict guardrails.
>>
>>107709653
The Honda of LLMs. Nothing fancy but gets the job done.
>>
>107709657
can this faggot get out of my thread?
>>
>>107709248
I don't get this new meme
>>
>>107709664
Gone.
>>
>>107709670
are you a janny? if so, thank you.
>>
>>107709672
No. But we all can do things we're not supposed to advertise.
>>
>>107709666
A few threads back, a few people tried generating miku using glm or whatever, and almost every time she looked bald because she only had twintails or the hair was drawn too low, so her head was poking out.
>>
File: file.png (274 KB, 628x628)
>>107709672
>>
>>107709679
right. i also may or may not have done that thing. never seen such a fast response time
>>
>>107709683
come on now. some of them work really hard. i even pay their salaries!
>>
Well. That was quick.
>>107709685
Yeah. And the funny guy that just joined got donned.
>>
teto my baldloved
>>
>>107709692
>>
how do i stop destroying my keyboard while waiting for my ai's responses
>>
>>107709264
>>107709248
>>107709259
wew lad
thread theme: https://www.youtube.com/watch?v=423Nmfpo828
>>
>>107709628
It was released on the Meta API in April, but I bet it was trained at about the same time as Llama 3.3 70B; who knows why they didn't release a smaller model back then. So it's probably a year old at this point.
>>
>>107709691
you haven't been paying long enough, paypiggie

GLM AIR WHEN
GEMMY 4 WHEN??
>>
>schizoids go rampant
I blame bald migu
>>
>>107709725
Aim away from the keyboard.
>>
llama 3.3 cockbench where?
>>
File: cockbench.png (223 KB, 1626x983)
>>107709781
dunno about the samplers but i grabbed the cockbench paragraph from https://desuarchive.org/g/thread/105354556/#105354924
q8
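for anyone who wants to rerun it, a rough sketch of the same check against a local llama-server: paste the archived paragraph in as the prompt and ask for the next-token candidates with greedy settings. the /completion endpoint and n_probs option are llama.cpp's, but the response schema shifts between server versions, so treat this as a sketch rather than the canonical harness:
[code]
# Rough sketch: ask a local llama-server (llama.cpp) for the next-token
# distribution after the test paragraph. Assumes the server is already
# running on port 8080 with the model loaded; paste the paragraph from the
# archived post into PROMPT yourself.
import json
import requests

PROMPT = "..."  # the paragraph from the desuarchive link above

resp = requests.post(
    "http://127.0.0.1:8080/completion",
    json={
        "prompt": PROMPT,
        "n_predict": 1,    # only the very next token matters here
        "temperature": 0,  # greedy, per the usual convention for this test
        "n_probs": 20,     # return the top candidate tokens with probabilities
    },
).json()

# Field names vary a bit between llama.cpp versions, so just dump and eyeball.
print(resp.get("content"))
print(json.dumps(resp.get("completion_probabilities"), indent=2))
[/code]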
>>
File: file.png (174 KB, 912x984)
ahahahaha llama 3.3 7b? more like ollama deepseek-r1
>>
>>107709259
The recap missed the most interesting conversation from the last thread.
>>
>>107709813
>dunno about the samplers
Always greedy.
>>
>>107709743
I think they hate us. In the last round they only released large models that only run on enthusiast systems. Meta are true prog believers, but zuck is a fickle suckup to whoever is in power.
Tuning a model that people liked on arena and then uploading cuckmaxxed weights is absolutely something else.
Remember how they gimped their omni model despite there being way better image gen and text already out there. Who the fuck even does that?
>>
File: file.png (218 KB, 1422x893)
>>107709880
lmao it's MMLUmaxxed
>>
https://huggingface.co/upstage/Solar-Open-100B 2mwh
>>
>>107709902
so it's gonna be a shitty glm air clone basically?
>>
>>107709894
>mom is in another room entirely
>We're caught.

>they just had lunch
>Dinner's ready!

Where is the anon claiming that dense models have better understanding?
>>
>>107709908
it's gonna be fimbulvetr sexo but air intelligence
reminder they released solar-10.7b and that was the go-to SEX model
we are going to be so back.
>>107709913
it's 8b anon...
>>
>>107709913
ah yes a year old butchered 8b is representative of all dense models
>>
>>107709919
oh was that them? i was wondering what these random 10.7b moe abominations were.
https://huggingface.co/tensorblock/SOLARC-MOE-10.7Bx6-GGUF
>>
>>107709919
>>107709922
Llama 3.3 only exists as 70B.
>>
File: file.png (92 KB, 1563x882)
>>107709934
https://huggingface.co/allura-forge/Llama-3.3-8B-Instruct
>>
>>107709934
>reading comprehension of a moe...
>>
File: unbald.mp4 (253 KB, 768x1184)
Got tired of seeing her bald head
>>
>>107709978
Hair is stored in the ears, prove me wrong!



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.