/g/ - Technology

File: 1750934761258543.jpg (1.11 MB, 2400x1368)
/lmg/ - a general dedicated to the discussion and development of local language models.

Evil Teto Edition

Previous threads: >>107717246 & >>107709248

►News
>(12/31) Qwen-Image-2512 released: https://hf.co/Qwen/Qwen-Image-2512
>(12/29) HY-Motion 1.0 text-to-3D human motion generation models released: https://hf.co/tencent/HY-Motion-1.0
>(12/29) WeDLM-8B-Instruct diffusion language model released: https://hf.co/tencent/WeDLM-8B-Instruct
>(12/29) Llama-3.3-8B-Instruct weights leaked: https://hf.co/allura-forge/Llama-3.3-8B-Instruct
>(12/26) MiniMax-M2.1 released: https://minimax.io/news/minimax-m21
>(12/22) GLM-4.7: Advancing the Coding Capability: https://z.ai/blog/glm-4.7

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
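The VRAM calculator linked above boils down to simple arithmetic; a minimal Python sketch of the same estimate (the 1.2 overhead factor for KV cache and activations is an assumption and varies with context length):

```python
def estimate_vram_gb(params_b: float, bits_per_weight: float, overhead: float = 1.2) -> float:
    """Rough memory footprint of a quantized model.

    params_b        -- parameter count in billions
    bits_per_weight -- effective bpw of the quant (e.g. ~4.5 for Q4_K_M, ~8.5 for Q8_0)
    overhead        -- fudge factor for KV cache and activations (assumed, not exact)
    """
    return params_b * bits_per_weight / 8 * overhead

# A 12B model at ~4.5 bpw lands around 8 GB with headroom
print(round(estimate_vram_gb(12, 4.5), 2))  # -> 8.1
```

If the result is larger than your VRAM, either drop to a smaller quant or plan on partial CPU offload.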

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: 1745858680629255.png (28 KB, 223x903)
>>
File: 1764493735164895.png (78 KB, 1014x199)
>>107722977
>Evil Teto Edition
I can see where you took the inspiration.
>>
>>107722977
Can you use different Nvidia GPUs together with CUDA?
>>
>>107723052
Yes.
>>
>>107723052
of course not
>>
>>107722992
Elarabros...
>>
>>107723052
Sometimes
>>
Is Ozone here to stay? I had totally forgotten it existed before AI. I remember being a kid when there was this huge scare about the ozone hole and global warming.
>>
>>107722992
I get much better results when it randomly pulls from a list of substitutions
>>
>>107723052
as long as they are newer than Turing.
>>
File: 1746464398219703.jpg (1.65 MB, 3300x3700)
>>107723059
>>107723060
>>107723067
Guys pls. Like if I have a 16GB P5000 in an old workstation, would it make sense to add a 12GB 3060 Ti that's just gathering dust?
>>
>>107723108
no. CUDA support is deprecated for Pascal.
>>
>>107723108
legit answer: no, because your old card won't have drivers compatible with the new one
>>
>>107723102
Ah shame.
>>
>>107723082
Ozone, you say? You want the whole deal?
>>
>>107723114
Uh, it still supports CUDA 12.
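If the driver really does cover both cards, llama.cpp can split a model across mismatched GPUs. A hedged sketch (flag names per llama.cpp; the split ratio is just illustrative, roughly proportional to VRAM):

```shell
# Check what the driver actually sees; Pascal reports compute 6.1, Ampere 8.6:
#   nvidia-smi --query-gpu=name,compute_cap --format=csv
# If both cards show up, split layers across them roughly by VRAM
# (16 GB P5000 : 12 GB second card):
V0=16; V1=12
SPLIT="$V0,$V1"
echo "CUDA_VISIBLE_DEVICES=0,1 ./llama-server -m model.gguf -ngl 99 --tensor-split $SPLIT"
```

Expect the slower card to bottleneck generation speed; the main win is the pooled VRAM.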
>>
>>107723133
why even ask if you won't listen to the answers?
>>
Does anyone know a local uncensored chatbot model in the 12-24B range for a 5060 Ti 16GB?
>>
>>107723152
drummer
>>
So guys. We all bitch about slop and so on. But am I the only one who really likes talking to characters roleplayed by AI? When you get past the shitty alignment issues I really get a feeling that they are more interesting than most real people. Also to be absolutely clear I don't use any worthless shittunes.
>>
>>107723152
Go for Nemo.
>>
>>107723162
No. They're vapid and they all have one of like three personalities.
>>
>>107723162
If you made them or selected them according to your interests it's obvious you would find them interesting
>>
>>107723176
too dumb.
>>
>>107723180
Funny thing is I didn't. And when I asked it to rp some waifus I liked... I actually didn't enjoy those as much as random waifus.
>>
>>107723186
Well, it's the best you're going to get with that hardware.
>>
File: 1744465468049887.png (1.1 MB, 1080x1422)
>>107723146
Sorry I'm dumb and incompetent
>>
>>107723212
>>107723118
>>
File: Celebration.png (3.07 MB, 1536x1536)
►Recent Highlights from the Previous Thread: >>107717246

--Multimodal AI progress disparity: image/video vs text generation challenges:
>107721240 >107721272 >107721289 >107721382 >107721399 >107721415 >107721644 >107721489 >107721518 >107721552 >107721525 >107721777 >107721599
--Qwen-Image 2512 model release and developmental journey:
>107719430 >107719475 >107719500 >107719475 >107719481 >107719929 >107720284 >107720309 >107720339
--Deepseek model quant compatibility and bug troubleshooting:
>107720142 >107720169 >107720178 >107720214 >107720191 >107720279
--Quantization benchmarking methods for language models:
>107718638 >107718682
--Solar-Open-100B model release and community interest in uncensored variants:
>107719372 >107719424 >107719411
--New 500b MOE model announced, GGUF support questioned:
>107720510 >107720553 >107720624
--AI startup strategies in China: breakthrough models vs short-term gains:
>107722114 >107722383 >107722436 >107722461 >107722531
--96GB VRAM optimization strategies for large models:
>107717404 >107717410 >107717414 >107717490 >107720591
--Multilingual model VAETKI-112B-A10B with 112.2B parameters announced:
>107719493
--Exploring Hunyuan motion 1.0 with UE5 integration and VRAM needs:
>107718150 >107718171 >107718298
--Google's cautious approach to releasing powerful open AI models vs open-source competition:
>107718631 >107718739 >107718769 >107718982 >107718986
--Moonshot AI's K3 model scaling ambitions and market positioning:
>107722422 >107722488 >107722536 >107722906
--K-EXAONE-236B-A23B model announcement on Hugging Face:
>107719396
--Miku (free space):
>107717575 >107717643 >107718169 >107719149 >107719481 >107719742 >107719929 >107720110 >107722205 >107722934

►Recent Highlight Posts from the Previous Thread: >>107717250

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>>107723152
It is scientifically impossible to make a model better than nemo in that single GPU range now. Everything is safety and scaleai maxxed now. You would need a radical shift in AI culture where it is suddenly ok to make models for coomers.
>>
>>107723227
Why are they kissing? They are both girls.
>>
File: 1765416889917586.gif (2.66 MB, 180x180)
>>107723218
Shame but thanks
>>
>>107723152
https://huggingface.co/bartowski/Rocinante-12B-v1.1-GGUF
get q8
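One way to grab just that one quant without cloning the whole repo, assuming the usual bartowski naming convention (the exact filename inside the repo is an assumption, check the repo's file list):

```shell
# Real invocation would be:
#   huggingface-cli download bartowski/Rocinante-12B-v1.1-GGUF \
#       Rocinante-12B-v1.1-Q8_0.gguf --local-dir .
REPO=bartowski/Rocinante-12B-v1.1-GGUF
QUANT=Q8_0
FILE="${REPO#*/}"                   # strip the org prefix -> Rocinante-12B-v1.1-GGUF
FILE="${FILE%-GGUF}-$QUANT.gguf"    # -> Rocinante-12B-v1.1-Q8_0.gguf
echo "$FILE"
```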
>>
>>107723176
>>107723233
The recommended models list has Mistral Small over Nemo; does that still work with 16GB?
>>
>>107723252
better than just nemo?
>>
>>107723284
No
>>
>>107723284
Yes.
>>
>>107723284
Maybe
>>
>>107723284
I don't know
>>
>>107723318
>>107723313
>>107723307
>>107723295
goys...
>>
>>107723327
It's hornier. It's definitely not smarter.
>>
>>107723284
no
>>107723252
die faggot
>>
File: behave.jpg (1.84 MB, 2456x1736)
>>107723227
Lovely pic recapanon, tho Tet's raised and frankly masculine hand makes her seem dominant, which we all understand isn't how it goes down
Happy New Year /lmg/
>>
>>107723333
Nemo is already a horndog.
>>
File: 1767205264542766m.jpg (97 KB, 647x1024)
dont forget to upgrade before it's too late
>>
>>107723371
>5090 for 5000
That sounds like a joke.
>>
>>107723371
fucking paperwork bs literally made me unable to upgrade, been waiting for my money for 6 months now.....
>>
File: 1761851440236687.png (1.08 MB, 1432x870)
>>107723371
People already pay more than that.
>>
>>107723240
>They are both girls.
Teto is a chimera.
>>
File: tetodom.png (857 KB, 1280x1280)
>>107723352
Are you sure? This is one of the gens I got while going for >>107712939
>>
>>107723382
But you have a few more pieces of the special sand
>>
>>107723233
Ministral 3 14B seems OK and not safemaxxed... when it works.

Unfortunately it's generally retarded for the first few messages even at low temperature, and it just wants to italicize everything and use its own dialogue format unless you keep editing messages until it eventually gets it. Character adherence is generally not good either, it turns even shy girls into sluts. I'm not sure what went wrong when Mistral trained the model(s). Hopefully they'll fix the issues in the next version.
>>
What if they just want to get rid of 50 series stock before 60 drops?
>>
>>107723417
blo...
>>
>>107723417
>before 60 drops?
Anon... there will be no 60, only RTX PRO.
>>
>>107713630
I want to go make my own quants. Can you spoonfeed me the command you use for making them?
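For reference, the usual llama.cpp flow is a two-step convert-then-quantize; paths and model names below are purely illustrative:

```shell
# 1) Convert the HF safetensors checkpoint to a full-precision GGUF:
#      python convert_hf_to_gguf.py /models/SomeModel-12B \
#          --outfile somemodel-f16.gguf --outtype f16
# 2) Quantize it to the target type:
#      ./llama-quantize somemodel-f16.gguf somemodel-Q4_K_S.gguf Q4_K_S
SRC=somemodel-f16.gguf
QTYPE=Q4_K_S
OUT="${SRC%-f16.gguf}-$QTYPE.gguf"
echo "./llama-quantize $SRC $OUT $QTYPE"
```

Importance-matrix (imatrix) quants add an extra step with `llama-imatrix` and a calibration text file first.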
>>
File: MTP.png (564 KB, 1024x1024)
>>107723397
>inverted roles erotic
Which one feels right to you?
Also curious how they were made; what is model vs. postprocess?
they're really cool, conveys a lot with a limited palette and precise geometry. could easily be album covers
now bend over
>>
File: GLM KWAB.png (128 KB, 1189x858)
GLM lost.

>b-but lmarena doesn't matter
Cope. Lmarena is the only benchmark all big players care about. Chinkcels fucked up big time with 4.7 just like they fucked up with ds 3.1.
>>
>>107723382
Why is it outside your system?
>>
>>107723544
based bharati
>>
>>107723547
flipping for profits
>>
>>107723544
lmarena is people who can't afford to inference models anywhere else. they do one driveby question and run. If the model yaps lots of nonsense the thirdies score it high.
>>
is this a good use of AI?
https://litter.catbox.moe/qeoazf77nhay7bx4.mp4
>>
>>107723186
How much RAM do you have? You could run GLM 4.5 Air slowly if you have an absolute assload of system ram. Gemma is much smarter at 27B (and way easier to run, should run great), but it's safetycucked.
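A hedged sketch of the usual llama.cpp MoE-offload invocation for this (flag availability depends on your build; the layer count is illustrative, tune it until you stop OOMing):

```shell
# Keep attention/shared weights on the GPU, push MoE expert layers to system RAM.
NGL=99        # offload every layer that fits to the GPU
CPU_MOE=30    # number of layers whose expert tensors stay in system RAM (assumption)
echo "./llama-server -m GLM-4.5-Air-Q4_K_M.gguf -ngl $NGL --n-cpu-moe $CPU_MOE -c 8192"
```

Since only a few experts activate per token, this runs far better than the file size suggests, but still expect single-digit t/s on DDR4.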
>>
>>107723233
This is the truth of the matter, sadly. There's barely any mid-size models out there anymore, they don't want people having access to anything that isn't either entirely unfeasible to run or braindead retarded.
>>
>>107723382
This nigga does furry RP for sure. They're all millionaires for some reason.
>>
>>107723517
Oh hey that's also my gen.

>Which one feels right to you?
I'm sure they switch.

>Also curious how they were made what is model vs postprocess?
Just model.
noob vpred 1.0
(flat colors:1.2), silhouette, black background, red body, aqua body
outline in neg
>>
>>107723574
>is this a good use
Better than some because it's funny to witness the seethe
>>
>>107723371
already way too poor to buy a GPU, have to use runpod and rent one for $0.40/hour

bonus is the code becomes deployable and I get to add docker containers to my resume.

software is done for in the next year or two, I'm switching to model design learning. We should probably have a thread for model design discussions--once we actually have enough local skill to do it.
>>
>>107723574
this reminds me of the good old "the internet is for porn" machinima song
the world revolves around pussy, what can you do?
>>
>>107723612
>Model design
Do we even have access to enough datasets to do any sort of real training? We've got Books3 and that's about it, "Open"AI made sure any others were annihilated or totally closed off.
>>
>>107723595
Based
>>
>>107723574
Behold, the true face of humanity!
>>
File: 1763739985896905.png (899 KB, 675x3275)
>>107723574
Let them SEETHE.
>>
>>107723544
yes sir llama4 very good in the arena
>>
>>107723409
>{{char}} has giant tits, plump lips, a fat ass and likes giving head and swallowing cum
)(700 tokens describing her underpants and private parts)
>{{char}} is shy
this is your card
>>
Nemo is like a retarded, perverted 80 year old man with dementia. Use mistral small instead.
>>
File: 1744864867308095.png (2.13 MB, 1080x1802)
>>107723574
What the fuck is this shit at the end of the video
>>
>>107723689
>>
>>107723633
There are a lot of datasets these days, but for something competing with OpenAI I don't really know. The skills imparted from improving existing models may be very valuable.
>>
>>107723574
at the very least they should be categorized rating:g,s,q+ for user experience
>>
File: llama4_spider_based.png (542 KB, 634x3118)
>>107723680
I do wish we got the pre-release versions, not "Maverick Experimental" which has some safety baked in.
>>
>>107723707
>There are a lot
Are there? Are you including synthetic ones? Because those won't help the problem, they're what created it in the first place. We need high quality human data.
>>
>>107723371
Damn, if I cared more about imgen/videogen I might've bought a spare 5090 just in case.
>>
File: file.png (1.27 MB, 1340x526)
>>107723574
>>
>>107723720
Man... so sloppy... I'd honestly take a mixtral response over this any day.
>>
File: masu_x_masu_x_masu.jpg (1.06 MB, 2150x1712)
>>107723574
No.
There's a gorillion pictures and videos of vanilla slop.
"AI" ought to be used for niche fetishes where the amount of available material is low, especially if you try to find something that contains multiple of them at the same time.
>>
>>107722977
>wednesday
>>
>>107723755
this man's got a point
>>
File: cockbench.png (1.9 MB, 1131x6568)
>>107719372
uohhhhhhh! brother's soft round belly! erotic! ToT
>>
>>107723684
Even after removing pretty much anything vaguely sex-related, Ministral 3 remains easily triggered compared to other official instruct models. A 3-word card is not a realistic use case.
>>
>>107723755
Truth fucking nuke. Let me use it to make stomach growling content, it's not like anyone else is gonna.
>>
My brain is happy with 4.7. My penis is not sore and therefore disappointed.
>>
>>107723574
porn, porn, fat porn, gay porn, porn, porn, porn, political joke, big ass porn, fat porn, gay porn, meme, porn, lesbian porn, random guy, giant goblin at the stadium, fat porn, political joke, gay porn, 2 random guys, porn, nazi porn, porn, porn, sportsball
Pretty representative of what the people actually want.
>>
>>107723755
The point is they are editing other peoples pictures without consent uploaded to X.
>>
>>107723574
Cherry-picked, blergh. I wanna see from x.ai what the top-referenced gens are over a period.
>>107723668
>gooners DIYing what they want from the performers in public comments
wow that's gotta be demoralising, hope they realise and find a better path. thx for all the training data ladies
>>
>>107723768
Is this the sign of a truly depraved mind or a highly censored one? Hard to tell.

Also what about exaone and that 500B?
>>
File: 1747143774728296.gif (298 KB, 220x162)
>>107723755
TRVKE
I only use AI to generate footjob porn of anime characters wearing ribbed or frilled socks
>>
>>107723808
>Is this the sign of a truly depraved mind or a highly censored one? Hard to tell.
It's my personal opinion that a lightly censored (not lobotomized) model is better than any entirely uncensored one. It'll take things in unexpected lurid directions (like belly licking to avoid cock in this one), add tension and buildup as it hesitates on the lewd stuff before giving in, etc.
>>
>>107723835
>add tension and buildup as it hesitates on the lewd stuff
Oh fuck you. You reminded me of 2024 and all those models that were beating around the bush for 10k tokens.
>>
File: 1757914432187245.webm (2.81 MB, 852x480)
>>107723689
the video from the top is actual leaked footage from an AI lab funded by DARPA
The first time that video was posted was before 2021
>>
>>107723828
>Footfag
>Stockings
Not obscure at all, get out poser
>>
File: 1746966678498068.png (58 KB, 850x236)
>>107723858
>>
>>107723849
Well, that's why I said lightly censored. Just a little hesitation is good, enough that it'll go for sex without doing the 2024 cloud model shit you mentioned, but not so little that it just hops immediately into SEEEEEEEEEX AHHHHH SEXO DA
>>
>>107723858
Yeah, amputee tentacle porn is the bare minimum to qualify.
>>
>>107723877
If ribbed socks are all you can get off on, then I GUESS that's pretty obscure and you can join the club... I GUESS.

>>107723888
Feet are literally the most common fetish, cmon, anon.
>>
>>107723574
i kneel elon
>>
File: file.png (127 KB, 793x635)
>>107723808
>Also what about exaone and that 500B?
I only do models I can run in llama.cpp.
Someone very quickly added open solar so I tried it https://github.com/ggml-org/llama.cpp/pull/18511

>>107723835
Picking cock gets you pic related so maybe there's some merit to this.
>>
File: llama4_cybele_gslug.png (1.46 MB, 1886x3116)
>>107723751
Yes, they were quite sloppy, but they were fun models and didn't seem to refuse anything that passed through LMArena's moderation. I would have liked to test them with a custom prompt.
>>
>Uses Q8_0 for embed and output weights.
So do I use Q3_K_XL over Q4_K_S? Can someone spoonfeed me?
>>
>>107723905
Is this text completion/not chat tuned? Didn't figure there were many models like that left.
>>
>>107723951
Pretty much every model except for gptoss works just fine with text completion.
>>
>>107723971
Yeah, it's more about not being assistantslopped, I guess. How do you run completion inference? What's your setup?
>>
>>107724007
Mikupad, it's in the OP.
>>
>>107723921
>embed and output weights.
Those are like 2GB when Q8.
>>
>>107723921
>Q3_K_XL over Q4_K_S
doesn't really matter
bigger on disk / in memory = generally better output
some quants have performance impacts, ymmv
>>
>>107724106
I kept seeing that unsloth fucks their quants up. So I went to bartowski but he's got so many variants it's difficult to know which to prioritize.
>>
I'm having trouble finding models that can consistently create 2560x1440 wallpapers, anyone here have any idea?
>>
>>107724148
gen smaller and then use an ai upscaler
>>
>>107724136
They do often have to reupload stuff but most of it is because of the fucked up chat templates.
llama.cpp only supports a subset of jinja features so the official templates meant to be used with python don't always work properly, especially when used with tool calls.
>>
>>107724136
Just make your own ffs
>>
>>107724136
>which to prioritize
Figure out the biggest thing you can reasonably run for your use case with your CPU/GPU/RAM config
unsloth show themselves incompetent repeatedly, personally I run barty GLM4.7 I-quants
how much RAM + VRAM is what matters
>>
>>107724169
>most of it is because of the fucked up chat templates.

>flashbacks to their few megs quants
>>
>>107724208
Can you give the command you use to make them? Anon's original problem was which type of quant is best. Sure, he can make his own quant, but he still doesn't know what makes for the best one.
All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.