/g/ - Technology

File: 1758010686171721.jpg (1.53 MB, 1298x2116)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107847320 & >>107838898

►News
>(01/08) Jamba2 3B and Mini (52B-A12B) released: https://ai21.com/blog/introducing-jamba2
>(01/05) OpenPangu-R-72B-2512 (74B-A15B) released: https://hf.co/FreedomIntelligence/openPangu-R-72B-2512
>(01/05) Nemotron Speech ASR released: https://hf.co/blog/nvidia/nemotron-speech-asr-scaling-voice-agents
>(01/04) llama.cpp merged backend sampling support (#17004): https://github.com/ggml-org/llama.cpp/pull/17004
>(12/31) HyperCLOVA X SEED 8B Omni released: https://hf.co/naver-hyperclovax/HyperCLOVAX-SEED-Omni-8B

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
►Recent Highlights from the Previous Thread: >>107847320

--AI companion project updates and TTS technology debates:
>107850660 >107850672 >107850928 >107850947 >107853430 >107853443 >107853483 >107853554 >107853511 >107853575 >107853593 >107853616 >107853690 >107854122 >107854331 >107850983 >107851185 >107851216 >107851258 >107853585
--Ouro-2.6B with claimed 12B performance and Scaling Latent Reasoning via Looped Language Models:
>107850350 >107850413 >107850458 >107850486 >107850501 >107850519
--Optimizing client-side beam search with logprobs and batching challenges:
>107848617 >107850562 >107853006 >107853060 >107853196 >107853678
--Tools and methods for converting EPUB books to audiobooks:
>107848869 >107848873 >107848915 >107849013 >107849252
--LLM roleplaying tool UI comparisons: RisuAI vs Talemate tradeoffs:
>107847698 >107847879
--Model-GPU compatibility and quantization tradeoffs for consumer hardware:
>107855431 >107855475 >107855507 >107855527 >107855540 >107855562 >107855603 >107855650 >107855676 >107855692
--Conflicting mmap behaviors in model loading across Windows/Linux platforms:
>107849640 >107849759 >107849779 >107849831 >107849870 >107849925 >107849966 >107849976 >107849992
--Multi-model debate system feasibility in chat environments:
>107850478 >107851241 >107851246 >107851397 >107851424 >107851506
--Llama.cpp integration via three-vrm example code:
>107851325
--GPU options for LLMs: Ampere+ requirements vs budget constraints:
>107851370 >107851398 >107851413 >107854581 >107851423 >107851429 >107851444 >107851450 >107851463 >107851469 >107851817
--P40 limitations for modern LLM frameworks despite llama.cpp support:
>107851776 >107851795 >107851803 >107852059
--GLM-Image release:
>107856290
--Miku and Teto (free space):
>107847978 >107849177 >107849273 >107850396 >107851664 >107852635 >107852843 >107853561

►Recent Highlight Posts from the Previous Thread: >>107847322

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
S-so it's really here?
autoregressive imagegen for local? I'm not dreaming?
>>
>>107856463
Worse than zit despite being newer and bigger, basically garbage
https://huggingface.co/zai-org/GLM-Image/discussions/1
>>
File: Capture.png (6 KB, 968x64)
>>107848869
>>107849013
>>107849252
ended up with a kokoro-tts docker image and a chatgpt script to automate the process. werks well, sounds good.
im running the model on runpod since i have some credit on there. rtx 4000 and 5080 both seem to give around 3.5 sec/chunk. gonna try it locally on my 3060.
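(not the anon's exact setup, just a rough sketch of the idea: it assumes ebooklib + BeautifulSoup for text extraction and that the kokoro container exposes an OpenAI-style /v1/audio/speech endpoint on port 8880 — the port, voice/model names and payload fields below are guesses, check whatever your image actually serves)
[code]
# sketch: epub -> text chunks -> kokoro tts container -> numbered mp3 files
import requests
import ebooklib
from ebooklib import epub
from bs4 import BeautifulSoup

TTS_URL = "http://localhost:8880/v1/audio/speech"  # assumed endpoint
VOICE = "af_heart"                                  # assumed voice name

def epub_to_text(path: str) -> str:
    # pull the plain text out of every (x)html document in the epub
    book = epub.read_epub(path)
    parts = []
    for item in book.get_items_of_type(ebooklib.ITEM_DOCUMENT):
        soup = BeautifulSoup(item.get_content(), "html.parser")
        parts.append(soup.get_text(separator=" ", strip=True))
    return "\n".join(parts)

def chunk_text(text: str, max_chars: int = 1500):
    # crude sentence-ish splitting so each request stays reasonably small
    chunk, size = [], 0
    for sentence in text.replace("\n", " ").split(". "):
        chunk.append(sentence)
        size += len(sentence)
        if size >= max_chars:
            yield ". ".join(chunk)
            chunk, size = [], 0
    if chunk:
        yield ". ".join(chunk)

def main():
    text = epub_to_text("book.epub")
    for i, piece in enumerate(chunk_text(text)):
        resp = requests.post(TTS_URL, json={
            "model": "kokoro",        # assumed model name in the container
            "voice": VOICE,
            "input": piece,
            "response_format": "mp3",
        }, timeout=300)
        resp.raise_for_status()
        with open(f"out_{i:05d}.mp3", "wb") as f:
            f.write(resp.content)

if __name__ == "__main__":
    main()
[/code]
then glue the numbered chunks together with ffmpeg's concat demuxer or whatever you prefer.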
>>
>>107856493
lmao at the shills seething in the comments
novelai must be planning to adopt that model in the future with how they're panicking about getting exposed
>>
>>107856717
You're the *** shill. How would they already have had plans to adopt an unknown model? Or are you suggesting Z.ai bought them out?
>>
Hey guys! Is this the place where I ask about NSFW Roleplay AI or do I need to move to another general? Looking for a place to like, learn the AI goober things really...
>>
>>107856809
Buy an ad.
>>
>>107856809
this place is as good as any other.
>>
>ouro-2.6B
verdict?
>>
>>107856424
Newfag here. How good are local models now? If I had a mountain of GPUs, how close to ChatGPT could I get?
>>
File: 1767551307991286.png (1.37 MB, 2175x1234)
>>107856717
From a practical standpoint it should be smaller or better; there's no point in making it bigger and worse. That is embarrassing. But if you think about it for a second, you'll understand why it's still a good thing to try something new and release it even if it turns out to be shit. That's how progress is made.
>>
>>107856886
>1.4MB
>>
File: file.png (365 KB, 2053x1322)
>copy the instructions from the readme
>it doesn't work
>use the docker container
>it doesn't work
Thanks for nothing.
>>
>>107856809
this general is technical discussion and asshats
/aicg/ is casual roleplay discussion and mouthbreathers
take your pick! oh, and I almost forgot that a general still exists on /vg/. that one is where a lot of this started, but last I checked it was kind of a weird place to hang out.
in this general, actual groundbreaking SOTA methods like CoT have been explored by an anon who was later cited in a paper, I think kalomaze was his name but some anon can correct me. that was the principle behind reasoning, which became a big part of 2025 LLM releases. and we have a resident llama.cpp dev, and he's a pretty cool guy.
but if you're less interested in running models than you are in roleplaying with them, /aicg/ might be more your speed. i hope you have a fun and warm 4chan experience newfriend :)
>>
>>107857025
Model?
>>
>>107857025
>CoT
That was /aids/ or /aidg/, back when it was GPT-3. Kalomaze made min p.
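(for the newfags: min p just drops tokens whose probability is below some fraction p of the top token's probability, then renormalizes and samples from what's left. a rough numpy sketch of the idea, not anyone's actual implementation:)
[code]
# minimal min-p sampling sketch: keep tokens with prob >= p * max_prob
import numpy as np

def min_p_sample(logits: np.ndarray, p: float = 0.1, rng=None) -> int:
    rng = rng or np.random.default_rng()
    # softmax over the logits
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # dynamic cutoff: scales with how confident the top token is
    keep = probs >= p * probs.max()
    filtered = np.where(keep, probs, 0.0)
    filtered /= filtered.sum()
    return int(rng.choice(len(filtered), p=filtered))

# with a confident distribution only the head survives the cutoff
print(min_p_sample(np.array([5.0, 4.5, 1.0, -2.0]), p=0.1))
[/code]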
>>
>>107856983
>>
File: brrrrrrrrrrrrr.jpg (390 KB, 1200x898)
Anyone tried these cheap Mi50 blower fan kits from Aliexpress? Tempted to replace my jank solution with something more aesthetic.
>>
>>107857227
>big corporations bought up entire hardware supplies to build AI god, causing shortages
>chinks sell 3D-printed customization kits for literal e-waste so you can run your local AI waifus
We're already living in a cyberpunk dystopia
>>
>>107857227
Are those printed? Is it really a good idea to be sticking thermoplastic directly onto your big ass gpu heatsink?
>>
I gaze upon this thread from the pit of despair. RAM and GPU prices won’t drop for decades. All hope is lost.
>>
>>107857483
All we need is a zit moment for LLMs. Z image turbo and LTX-2 gave me so much hope, I believe the future isn't grim anymore
>>
>>107857475
Yeah man just do it. Don't be a pussy, give your GPU a nice new coat to keep it warm this winter.
>>
>>107857494
Popping zits?
>>
>>107857494
Can the z image stuff do good nsfw? How about video stuff?
>>
File: 1749774314960475.gif (209 KB, 900x650)
>>107857483
Honestly a modern, well-trained 12b model could probably BTFO just about everything short of SOTA 300b+ models in creative and RP if it were trained on a quality, curated dataset with minimal synthetic slop. This is why Nemo is still so highly regarded today; imagine if we got a model with the relative smarts of gemma 12b without safetycuckery and less gemini slop.
Throwing more hardware at the problem is a bandaid solution for a competency issue, coupled with companies being too scared or gay to give people what they want.
>>
>>107857528
>Throwing more hardware at the problem is a bandaid solution for a competency issue, coupled with companies being too scared or gay to give people what they want.
truth nuke
>>
>>107857494
>Z image turbo and LTX-2 gave me so much hope
Z-image turbo yes, it's a 6b model, but LTX2 is a huge boi (19b), one of the biggest local video models, not the best example of efficiency imo
>>
>>107857528
the problem is that it's difficult for the model during training to differentiate between intelligence and style in the dataset. if you feed the model every single high-quality non-synthetic token in existence, it will still be dumb due to the sheer lack of non-synthetic data in the world. ai is used everywhere at every level now and has been for at least a year and a half. you cannot train a model that is aware of the current state of the world without using synthetic data.


