/g/ - Technology

File: congration.jpg (228 KB, 1024x1024)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>107864105 & >>107856424

►News
>(01/15) TranslateGemma released: https://hf.co/collections/google/translategemma
>(01/14) LongCat-Flash-Thinking-2601 released: https://hf.co/meituan-longcat/LongCat-HeavyMode-Summary
>(01/08) Jamba2 3B and Mini (52B-A12B) released: https://ai21.com/blog/introducing-jamba2
>(01/05) OpenPangu-R-72B-2512 (74B-A15B) released: https://hf.co/FreedomIntelligence/openPangu-R-72B-2512
>(01/05) Nemotron Speech ASR released: https://hf.co/blog/nvidia/nemotron-speech-asr-scaling-voice-agents

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: no particular reason.jpg (306 KB, 1536x1536)
►Recent Highlights from the Previous Thread: >>107864105

--Paper: Ministral 3:
>107866902 >107866921 >107867400 >107867439 >107867521
--Papers:
>107867298
--TTS tool performance comparisons and resource challenges:
>107867729 >107867935 >107867969 >107867995 >107868067 >107868843 >107868864 >107868926 >107868937 >107868806 >107868159 >107868215 >107868260 >107868325 >107868357 >107868430 >107868353 >107868939 >107869085
--Google Gemma models update with TranslateGemma performance analysis:
>107869136 >107869860 >107870012 >107870672 >107869282 >107869353
--Multi-NVIDIA GPU compatibility and driver workaround challenges:
>107866329 >107866347 >107866370 >107866376 >107866468 >107866398 >107866426 >107866495 >107866827 >107866841 >107866903 >107867162
--Nvidia consumer GPU market shifts:
>107870399 >107870476 >107870488 >107870546 >107870591 >107870782
--ExaOne MoE cockbench results and content moderation challenges:
>107864456 >107864593 >107864594
--Comparing Google's TranslateGemma models with Gemma 3 for translation quality:
>107871784 >107871900 >107873103 >107873199 >107873290
--China's H200 GPU import ban and its implications for global VRAM availability:
>107868589 >107868600 >107868669 >107868718 >107868762
--adaptive-p sampler implementation in llama.cpp:
>107870871
--Struggles with uncensored AI model restrictions and creative bypass attempts:
>107871012 >107871097 >107871161 >107871720 >107871761 >107871781
--Falcon-H1-Tiny model collection on Hugging Face:
>107869739
--2003 hardware insufficient for modern AGI training due to computational limits:
>107871954 >107872024 >107872045 >107872086 >107872258 >107872078 >107872081 >107872433
--Language model fails to generate proper 3D NPC model:
>107867453 >107867473 >107867482 >107868022
--Rin-chan (free space):
>107865561 >107866509 >107867855 >107871514 >107872458

►Recent Highlight Posts from the Previous Thread: >>107864106

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>>107873752
>>107873764
slop
>>
File: mikudayo.jpg (153 KB, 640x1536)
>>107873752
>>107873764
top tier onahole
>>
File: ss.png (48 KB, 128x232)
>>107873799
It's full of snatshis drawing chewbaccasticas.
>>
so why haven't companies tried making good models yet?
>>
>>107873764
post catbox for the last rin?
>>
What if I have a 9070XT
>>
>>107873842
What about it?
>>
>>107873828
Anon last thread was basically complaining that they were too good already.
>>107871761
>>107871782
>>
>>107873842
Condolences
>>
>>107873842
you throw it in the trash where it belongs. wait no. recycle it. protect the environment.
>>
>>107873849
but that's false. current models suck.
>>
>>107873846
>>107873854
>>107873857
Guess I can't LM then after all
>>
>>107873868
They're just salty trolls, you can do the LLM just fine.
>>
>>107873868
Of course you can.
>>
>>107873868
Build with Vulkan so you don't have to fuck around with ROCm/HIP, download some nemo or mistral small and have fun anon. I get to have fun on much less. You're fine.
>>
Haven't been here in forever, so I have a simple question: Anything interesting happen model wise for RP retardation on 24GB VRAM? What about multimodal bullshit (image recognition)?
>>
>>107873908
nope, nemo is still sota at that range
>>
>>107873908
no
>>
>>107873908
>Anything interesting happen model wise for RP
Not really, Nemo/Mistral Small still the go-to.
>image recognition
Qwen3-VL is pretty accurate with minimal censorship. Much better than Gemma/Mistral's vision.
>>
>>107873835
I am neither the baker nor the guy who made the image but you're not getting the catbox because his stuff includes a bunch of inpainting and manual edits. The text is obviously added manually for example.
>>
>>107873914
>>107873916
>We're still stuck with Nemo/Mistral
Shocking stuff. Nothing ever really happens, I guess.
>>107873931
>Qwen3-VL is pretty accurate with minimal censorship.
Thanks, will look into it!
>>
File: comparison.png (468 KB, 1888x3863)
Opus 4.5: Human-like response, curiosity.
GLM 4.6: I have no feelings, I'm a machine.
Deepseek 3.1: Retarded, thinks he's me.
GPT 5.2: No feelings either, _obviously_.
Kimi K2: Middle point. "I feel useful". "No strong emotions".
Gemini 3 Pro: Similar response to Opus. Human-like.

This is consistent in my experience. Gemini is the only other modern model that feels like it's trying to sound like a human rather than a robot. And I don't want to cope by adding a system prompt that tells it to pretend it does, that feels like cheating (and I'm not even sure it would actually work).
>>
>>107873937
don't think he cares to steal the prompts or whatever, likely just wants pron
>>
>>107873937
Didn't even notice the text the first time
>>
>>107873540
Forgot to quote >>107873957
>>
>>107873957
Reading that image makes me sick to my stomach. Greasy-ass slop oozing through any flimsy attempt at adhering to sounding "human".
>>
The bigger the parameters the better the art? Can a Pi Hat 2 ever be good at drawing fapworthy waifus?
>>
>>107873987
I stopped caring about the slop, like when you fall in love with a fat or ugly woman you stop caring about her physical appearance.
>>
>>107874010
Okay, but this is more like when someone attractive you're with has a terrible, boring personality, specifically at the point where even hearing her speak is grating on you and makes you want to puke.
>>
File: 1738738663273246.jpg (643 KB, 850x2934)
https://github.com/GetfroggyHoe/universal-immersion-engine
>>
>>107874055
>mobile mobile mobile
ew
>>
File: fbfan.png (32 KB, 1003x131)
>>107874050
This all started when he tried to train a model on the "philosophy" of his favorite pedo. You can't expect much of him.
>>
>>107874055
Neat.
I'll steal some ideas from that to help refine my frontend.
>>
File: fishboy.png (101 KB, 1517x238)
Good anons don't let anons forget.
>>
>>107874106
you're obsessed!
>>
>>107874106
who cares
>>
File: trust.jpg (155 KB, 1024x1024)
>>
>>107874106
It's crazy that you could be in a thread where so much knowledge about models is both available and necessary and still fall into the AI psychosis bullshit.
>>
>>107874055
I've been working on something similar, but standalone and more freeform.
>>
>>107874055
Can't wait for this to be abandoned almost instantly like waidrin. Also I'm assuming you made it, seeing as it was uploaded an hour ago? That's cool if so. If you got it from reddit or something though eat shit.
>>
File: icecube.png (354 KB, 1303x728)
>>107874106
Damn, looks like I already have a couple haters, now I only need to get myself some groupies and I can say I made it.
>>
llama webui is going to get mcp support
https://github.com/ggml-org/llama.cpp/pull/18655
>>
*ahem* kimi sex
>>
>>107874141
Yeah. Reminds me of that schizo that kept claiming sillytavern was somehow injecting leftist propaganda into his model during inference. Hope that retard never comes back.
>>
>>107874441
welcome to 2024, llama.cpp
>>
>>107874544
Given how convoluted sillytavern settings are and how cards can override any of your settings I wouldn't be surprised if that was true.
>>
>>107874544
LMAO
It's crazy because if you use kobold or llamacpp you can see exactly the prompt silly sends.
>>
>>107874564
silly itself literally has its prompt tool thingy that can tell you exactly what the backend received for the last genned message
>>
>>107874579
Yeah but if you're claiming ST injects propaganda I'm sure they wouldn't show it in the inspection window.
>>
SillyTavern is garbage. You can unironically replicate 90% of its features in ~600 lines of code. Seriously.

That includes a system prompt, character cards, conversation history/context utilization, saved chats, lorebooks, UI/UX, etc. Don't implement sampling. Just use llama.cpp's sampling flags on the backend. It's literally so easy.

Idk how it even became such a bloated piece of shit in the first place.
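For what it's worth, the skeleton of that claim is easy to sketch. Below is a minimal sketch in Python, assuming a llama.cpp-style server exposing a `/completion` endpoint that accepts `{"prompt", "n_predict"}` JSON and returns `{"content"}`; the URL, card format, and speaker names are illustrative assumptions, not any particular frontend's conventions. As the post suggests, sampling is left entirely to the backend's flags.

```python
import json
import urllib.request

LLAMA_URL = "http://127.0.0.1:8080/completion"  # assumption: llama-server on its default port

def build_prompt(system, card, history, user_msg, user="User", char="Assistant"):
    """Assemble the full context string: system prompt, character card, then the chat log."""
    lines = [system, card]
    for speaker, text in history:
        lines.append(f"{speaker}: {text}")
    lines.append(f"{user}: {user_msg}")
    lines.append(f"{char}:")  # leave the reply open for the model to complete
    return "\n".join(lines)

def generate(prompt, n_predict=256):
    """POST the raw prompt to the backend; all sampling settings live in the server's flags."""
    body = json.dumps({"prompt": prompt, "n_predict": n_predict}).encode()
    req = urllib.request.Request(LLAMA_URL, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]
```

Persisting chats and lorebook lookups are just file I/O and substring matching on top of `build_prompt`, which is roughly the point being made.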
>>
>>107874587
yeah but llama.cpp has "--verbose" that shows you the exact prompt it's being given and other inference backends have similar options
so unless literally all of them are in on the conspiracy, it'd be easy to catch if ST did something like this
>>
>>107874564
I'm not quite sure about this and don't feel like going back to look, so take this with a grain of salt, but I'm pretty sure that schizo was always moving the goalposts to justify the belief that his models were somehow compromised by leftist propaganda and censorship despite not actually having proof. It's infuriating to see people spiral into AI psychosis and genuinely believe their own delusions.

It's one thing to acknowledge that models have inherent biases inherited from pre- and post-training, but to claim sillytavern was injecting subversive material after moving the goalposts so many times is just legitimate schizophrenic behavior.

>>107874587
You can view exactly what tokens get sent to the backend on pretty much every inference engine out there. If there are extra injected tokens you would be able to see it.
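To make that inspection step concrete, here is a hedged few-line sketch in Python: given a request body captured from the backend log or the browser's network tab, it pulls out the prompt text so you can diff it against what you actually typed. The field names follow llama.cpp's `/completion` payload and the common OpenAI-style chat payload, but verify them against your own backend.

```python
import json

def extract_prompt(raw_body: bytes) -> str:
    """Pull the prompt text out of a captured completion request body.
    Handles llama.cpp-style {"prompt": ...} and OpenAI-style
    {"messages": [{"role": ..., "content": ...}, ...]} payloads."""
    payload = json.loads(raw_body)
    if "prompt" in payload:
        return payload["prompt"]
    if "messages" in payload:
        return "\n".join(f'{m["role"]}: {m["content"]}' for m in payload["messages"])
    raise ValueError("no prompt or messages field in request body")
```

If something were being injected, it would show up verbatim in this output, which is the whole argument against the conspiracy.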
>>
>>107874643
>reddit spacing
post ignored
>>
>>107874544
The fuck? You can literally inspect the raw request that ST sends out using your browser. I'm not a programmer and even I know how to do it. And then there's the llama.cpp console window obviously, if you are using llama.cpp.
>>
>>107874544
Considering how much of it is already in the training data from scrapes of reddit and twitter alone, why would anyone ever need to inject more?
>>
>>107874643
ok then do it, if you are right then surely it should be easy to make a better solution and gain popularity.
>>
>>107874648
Pretty much. The "ST is messing with your prompts" schizo doesn't realize that every backend has an option to view the incoming prompts to see what's actually going on
ST mangles your text completions, but it isn't injecting whatever schizo babble shit that gets talked about here
This faggot also nonstop talks about pol, his conspiracy theories, targets virtually anything from flashattention to anyone who makes a single finetune and thinks you need to run nemo at fp64 for actual usage
>>
>>107874722
I literally have. That's why I'm saying this.
>>
>>107874564
>>107874648
Is there a flag so I can see progress and the prompt that goes in, like kobold? I tried --verbose and --verbose-prompt but it just vomits everything (not useful) into the terminal
>>
>>107874699
We tend to think about hallucinations in the current year as an LLM thing when humans are just as capable of having them.
>>107874733
I haven't seen him post in a long time so hopefully he never comes back.
>>
>>107874754
no you haven't.
>>
>>107873380
Why not just make the images yourself? Seems like a waste of time.
>>
>>107874733
If you're talking about me (the "AI psychosis" guy from the screenshot), I'm not the /pol/ guy or the anti-flash-attention guy, I'm literally trying to re-implement flash attention in C.
The closest thing to the truth from what you've said is insulting some finetuners, for which I deeply apologize.
>>
>>107875092
I don't care about people being so mentally weak they get mindbroken by ai, so that was not what I was referring to
I specifically dislike the one ubiquitous shitposter that shits on anything that could involve any form of conversation, be it new model releases, some retard's finetune, or whatever new dumb shit he invents to be disingenuous about
>>
>ego death
>me
>me
>me
>>
>>107875174
I already told you I'm not the ego death guy!!! I fully admit to having a big ego and being a huge attention whore.
>>
>>107874000
>>
>>107875387
Bump for question
>>
>>107874000
no
>>
File: image.jpg (284 KB, 832x1248)
>>107875408
Grok on my smartphone gives a eulogy in regards
>>
For art, probably better off asking in the other adjacent threads. But from my experience dabbling in slop art, it's all the same even if you go for a bigger model
>>
>>107873799
no way this wasn't edited to hell and back, actually coherent AND dark scene
>>
>>107875479
And they try to claim genning takes no skills.
>>
>>107875479
no signs of inpainting.
>>
>>107873799
model?


