/g/ - Technology


File: 1749619091999848.png (715 KB, 1192x892)
715 KB
715 KB PNG
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108307593


►News
>(03/04) Yuan3.0 Ultra 1010B-A68.8B released: https://hf.co/YuanLabAI/Yuan3.0-Ultra
>(03/03) WizardLM publishes "Beyond Length Scaling" GRM paper: https://hf.co/papers/2603.01571
>(03/03) Junyang Lin leaves Qwen: https://xcancel.com/JustinLin610/status/2028865835373359513
>(03/02) Step 3.5 Flash Base, Midtrain, and SteptronOSS released: https://xcancel.com/StepFun_ai/status/2028551435290554450
>(03/02) Introducing the Qwen 3.5 Small Model Series: https://xcancel.com/Alibaba_Qwen/status/2028460046510965160

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
how do i use ai?
>>
>>108312592
qwen 2B is better than LFM 2.6B, and qwen 4B is a lot better than their 24BA2B MoE.
I can't see what makes you think a LFM 4B would be any good.
>>
>>108312606
I ask LLMs for the stuff I don't know, but I like to stay within my knowledge zone. And as for my client, I like to do the manual work myself.
I have a job history in scripting (Maya and Houdini) and have done lots of things at work, but I have never evolved into a real software developer; I simply lack the knowledge. I understand this of course, but it is above my pay grade at the moment.
Asking an LLM for an explanation gives you a vague idea, but it doesn't magically imbue you with the knowledge to create such mechanics.
>>
>>108312633
Why not use it to grow your knowledge?
>>
>>108312635
I do but it needs to be hierarchical. I need to have the motivation to read some books and do exercises and then use the LLM. I believe in pragmatism.
Of course I don't always care about this, depends. Describing my mindset, I'm not that clever in the first place.
>>
>>108312628
How do I use a computer?
>>
>>108312652
arent you using one
>>
File: DO NOT PULL.png (5 KB, 323x85)
5 KB
5 KB PNG
WARNING, ALERT
>+12,846-9,950
from a single pwilkin commit
this can only mean one thing: do not pull for at least a month, let plebbitors suffer and beta test
>>
>>108312676
Does he make anything we use?
>>
Are there projects to make the copilot thing from MS but local?
Aka being able to ask it anything, with access to the Windows terminal so it can change things for me or just answer quick questions?
>>
>>108312676
Huh? I just pulled. Should I roll back? Which version is stable?
>>
>>108312700
You can probably use any lightweight model, Qwen or Gemma 4/8B.
The model that Firefox made people download was only 600MB in size as far as I remember.
>>
>>108312722
-> It's not that useful of course.
>>
>>108312715
run
git checkout 34df42f7bef5a711b2b40f5d2b6b78254def99c3

for the last commit before the nuke dropped
you can go back to git checkout master once the fallout has been cleaned up.
>>
>>108312722
My issue isn't the model, I can run anything below 184GB (vram+ram) on my server, it's more about the local program taking advantage of that.
>>
Im going to start a RP session. What do you guys recommend me to try?
I can run pretty much anything at Q6+ at decent t/s so any open model is good to go.
I heard GLM5 is worse than 4.7?
Is Kimi2.5 better than the GLM family?
Any other alternative?
>>
>>108312676
It's easy to add code when he's not the one writing it.
>>108312690
>>108312715
This is what happens when you don't post reference links, anon.
https://github.com/ggml-org/llama.cpp/pull/18675
>>
File: meme-careta.gif (12 KB, 220x165)
12 KB
12 KB GIF
>still using TheDrummer's Cydonia 24B
What am I missing out?
>>
>>108312763
gpt 5.4
>>
>drummer
not even once
>>
>>108312749
>It's easy to add code when he's not the one writing it.
He's been working on that for months at this point though.
>>
>>108312770
>gpt
>local
>>
File: 1741861790313647.png (121 KB, 640x360)
121 KB
121 KB PNG
>>108312780
>>
File: hahafunny.png (43 KB, 781x138)
43 KB
43 KB PNG
>>108312777
>He's been working on that for months at this point though.
His model. Yes. Ha ha funny comment.
>>
>>108312744
Take a look at the new Qwen 3.5 models. You need some sort of setup for that too.
>>
>>108312786
fuck off jeet
>>
>>108312807
you lost
>>
>>108312815
no u
>>
So Reuters lied again about muh V4 release? They should fire all their anonymous sources
>>
>>108312847
>listening to journos ever
>>
>>108312861
Yeah much better to trust retards like you huh?
>>
i got a python string searcher working via qwen3.5 9b after copy pasting an error twice let's fucking go
>>
>>108312877
proof?
>>
i dont get vibe coding. even gpt 5.4 and opus 4.6 shit out large amounts of garbage code

just earlier i asked 5.4 to solve a problem. it did so but with >200 lines of code. so i did it myself in 12 lines that is also much more efficient

do people not give a shit about code quality?
>>
so is there a solution to this SmartCache RNN / prompt processing conundrum in 35BA3B?
>>
>>108312921
Can you show the 200 lines vs the 12 lines?
>>
>>108312921
Many of the queries are trash and won't go anywhere. If the model doesn't know it will hallucinate because it tries to fill in the blanks.
>>
File: 1769984662579618.png (8 KB, 675x575)
8 KB
8 KB PNG
finally "figured out" a workaround for the constant and random TDR crashes when trying to run models with lmstudio/llama.cpp on my 5090. just unplug my monitors and use my iGPU for one monitor and the 5090 solely for compute! no idea if it is my system or nvidia drivers being wack but at least now i can use nearly all my vram without rolling the dice every time i press enter.
>>
>>108312929
no, not here. but i can describe it if you want. 5.4 starts with 50 lines of preprocessing, 80 lines of helpers, then solves the task in 5 steps. checking the lines, its actually more than 300. what i did is just use nested loops and do it directly in one go. its simple text data processing
>>
>>108312847
Next week for sure
>>
>>108312981
lmstudio used to be good but now it crashes or crawls to a halt every other model load
its ok sometimes but idk, mostly using kobold now or base llama.cpp
>>
can I use a general purpose llm as a binary classifier by giving it a yes or no prompt and just checking the probability for those two tokens?
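something like this is what i mean, a rough sketch with transformers (the model name is just a placeholder, and it's only exact if " yes" / " no" each tokenize to a single token):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-0.5B-Instruct"  # placeholder, any small instruct model
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = ("Is the following review positive? Answer yes or no.\n"
          "Review: battery died after two days.\nAnswer:")
inputs = tok(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # logits for the next token

# only exact if " yes" / " no" each map to a single token for this tokenizer
yes_id = tok.encode(" yes", add_special_tokens=False)[0]
no_id = tok.encode(" no", add_special_tokens=False)[0]
p = torch.softmax(logits[[yes_id, no_id]], dim=-1)
print({"yes": p[0].item(), "no": p[1].item()})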
>>
I'm pulling.
>>
>>108313042
i don't think it is lmstudio specific as it wasn't lmstudio crashing but the GPU, i think due to driver issues that only nvidia can fix. even when using llama.cpp from the command line i got TDR crashes and unrecoverable black screens forcing me to restart. if i got more than 2-3 TDRs in a few minutes windows would even stop recognizing my GPU, forcing me to reseat it and use DDU to uninstall and then reinstall drivers.
>>
File: dipsyNeonAnimated.gif (1.15 MB, 1024x1536)
1.15 MB
1.15 MB GIF
>>108312616
lol nice bake.
>>108312847
Daily reminder that no one knows when v4 is coming out. We even got disappointed over webapp update w/ no follow on model.
>>
>>108313083
i know when it is coming out
>>
>>108312928
Pulling and compiling.
>>
>>108312981
>5090
>multi-monitor
Some drivers have this problem.
What you currently on?
>>
File: rinCoffeeTMW.png (2.67 MB, 1024x1536)
2.67 MB
2.67 MB PNG
>>108313089
>>
>>108313115
BOOOOOOO! BOOOOO!!!! GET OFF THE STAGE
>>
>>108313102
originally i was on 591.86, which was stable for gaming and so on, never encountered the multi-monitor black screen TDR issues until i started playing with local models. i then tried: 595.71, 595.76 hotfix, rolled back to a 57x.xx driver, then settled on the 591.74 studio driver. same behavior on all the drivers, the only thing that works is not having anything plugged into the 5090 while prooompting.
>>
>>108313131
and into the backstage ;)
>>
>>108312616
>Yuan3.0 Ultra 1010B-A68.8B
Is that giant model worth a try?
>>
I haven't done LLM stuff since a year ago. Is there now a replacement for ollama (something I can launch in docker that support hot-swapping models)?
>>
>>108313167
llama.cpp can be put in router mode but I haven't tried swapping models much, one time it didn't work but they were massive models that barely fit anyway.
>>
>>108313161
no
>>
>>108313184
ok
>>
>>108313176
Might have to suck it up and get used to just using one model, tired of getting cucked by whatever the fuck ollama guys are doing
>>
>>108313187
not even a thanks? asshole
>>
>>108313167
If you have a config file you can swap models easily
>llama-server --models-preset config.ini --models-max 1
>>
>>108313286
meant for >>108313192
and llama.cpp works really well after they added the "--fit" argument to find the most layers to push into the GPU
>>
im getting 7900 XTX
qwen 3.5 27B good for it?
>>
>>108313299
>and llama.cpp works really well after they added the "--fit" argument to find the most layers to push into the GPU
Except when there is a mmproj, for some reason it doesn't account for it, which means you always have to ask it to give more space with --fit-target.
>>
File: boo.jpg (62 KB, 612x613)
62 KB
62 KB JPG
>>108312616
>lazy dumb schizo spitebake
>>
>>108312616
Why the fuck did media outlets like FT burn their reputation on Deepseek V4 release rumors? Were the clicks that tempting?
>>
>>108313115
Rin-chan on that day did not play the guitar. She simply said one sentence before going backstage to sign autographs for Anon.
>>
>>108313331
whats ft unc?
>>
>>108313363
Lil' zoomer, maybe this is not the thread for you.
>>
>>108313369
unc this tech is from OUR time, not yours
>>
>>108313375
You are right. I am so sorry.
>>
>>108313363
you should know you can't feign ignorance on the internet anymore
>>
>>108313392
i still have no clue what ft is fyi
>>
>>108313436
have you tried asking claude?
>>
>>108313436
Google it on your cellphone, kid.
>>
File: awwww.jpg (127 KB, 1024x1024)
127 KB
127 KB JPG
►Recent Highlights from the Previous Thread: >>108307593

--Paper: $PC^2$: Politically Controversial Content Generation via Jailbreaking Attacks on GPT-based Text-to-Image Models:
>108307836 >108307862 >108307863 >108308601
--CPU vs GPU hardware tradeoffs for large model hosting:
>108307618 >108307649 >108307703 >108307739 >108307757 >108307764 >108307758 >108307770 >108307892 >108307936 >108308041 >108308258 >108308333 >108307939 >108307952 >108307989 >108308008 >108308019 >108308669 >108308712 >108307702
--Qwen3.5 model selection UI improvements and llama.cpp caching debates:
>108310366 >108310385 >108310807 >108310450 >108310722 >108310823 >108310839 >108310912 >108310929 >108311284 >108311324 >108311343 >108311353 >108311354 >108311399 >108311440 >108311457 >108311545 >108311853
--Criticism of SillyTavern's codebase and UI design:
>108311891 >108311903 >108311915 >108312088 >108312137 >108312160 >108312297 >108312318 >108312351 >108312374 >108312502 >108312333 >108312365 >108312386 >108312436 >108312481 >108312510 >108312527 >108312606
--CRLF line endings degrade model output quality:
>108309435 >108309531 >108309597 >108309644 >108309609 >108309622
--AI handling offensive queries with transparent reasoning:
>108308106 >108308126 >108308142 >108308154 >108308192 >108310523
--Debating commercial viability of 20% Gemma improvement:
>108312571 >108312583 >108312600 >108312610 >108312870 >108313253
--Open-Sourcing Sarvam 30B and 105B:
>108311617 >108311630 >108311695
--Comparing AI responses to antisemitic joke:
>108308378
--Mac Studio 512GB RAM option removed amid DRAM shortage:
>108310029 >108310181
--ChatGPT 5.4 reasoning example omitted from benchmarks:
>108309140 >108309154
--Umbra 24B roleplaying model released:
>108308217 >108308262 >108308287
--Miku (free space):
>108307738 >108311243 >108313099 >108313289

►Recent Highlight Posts from the Previous Thread: >>108307595

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>>108313467
thank you miku-chan!
>>
>>108313467
You forgot the developer discussions.
Not good.
>>
>>108313440
why would i?

>>108313449
google deez
>>
>>108313503
claude sees all, knows all
>>
>>108313363
i got a ft for you right here *points to my dick*
>>
>>108313512
fery tall
>>
>>108312921
opus 4.6 is really good at generating ui code, it's not perfect and you need to know what you're doing to get something that is actually decent. But I've used it to shit out like 70% of the code I need then i rewrite / tweak the rest. Faster than doing it all by hand. If you don't care about quality though then yes using the output as-is is garbage.
>>
can I run behemoth with 32gb vram and 128gb ram?
>>
>>108313331
people will forget next week
>>
>>108313593
We do not forget. We do not forgive.
>>
>>108313603
We are legumes. Example us.
>>
Why are you shilling nemo all the time?
>>
>>108312632
which Qwen?
>>
Why are you shilling the latest slop all the time?
>>
>>108313467
smug miku best miku
>>
>>108312921
Opus 4.6 is great in Antigravity I find. Maybe not in other contexts though, not sure.
>>
getting real fucking sick of the se upptiy bitches calling out every little fucking typo in their resasoning blocks, the whole reason I was talking to them in the first plac was because I don't want to be made fun of anymore
>>
>>108313704
buck status: broken
>>
>>108313704
*place
>>
>>108313704
*uppity
>>
>>108313704
*reasoning
I'll stop now. But there's more...
>>
buck status: terminally impregnated
>>
File: 1752543464647705.jpg (403 KB, 2508x3541)
403 KB
403 KB JPG
>>108312616
>>
Qwen 35B is dumb, the 27B is a bit slow. The 122B is the only decent small qwen.
>>
>>108313765
for what? Coding? I heard it's shit for rp
>>
Autoshitter broke vision nice
>>
What are good sampler settings for gemma3? Simple preset in kobold just shits out broken nonsense.
>>
>>108313775
Report the bug. Make him look bad.
>>
>>108313790
It doesn't need anything special. Show what you mean.
>>
>>108313790
heh i don't know kiddo, maybe you figure it out
>>
>>108313765
Even Qwen3.5-4B is okish for roleplaying if you grab one of the properly-done Heretics of it quite frankly.
>>
are there any finetrooners out there who actually know what they're doing?
>>
>>108313812
See if you fucker always do this to me, "just grab the one of the good version", and every time there are hundreds, why are you doing this? Are you trying to rile me up and get me to meltdown so you can laugh at me?
>>
curious how much model size matters for roleplayers. can you happily enjoy yourselves on small models or do you need big ones?
>>
this thread is becoming more relevant, many of the workplaces I know are rolling out on-premises AI for custom tasks now. old ML is becoming less and less relevant.
>>
>>108313823
lots of research points to that
>>
>>108313775
rule #1: never pull after sloppy wilkin
>>108312676
rule #2: pull even less if you see tens of thousands of LoC changes from sloppy wilkin
>>
Tried Qwen3.5-9B-UD-Q4_K_XL vs Qwen3.5-9B-UD-Q5_K_XL vs Claude Opus 4.6 on a somewhat tricky single-file oneshot game implementation challenge.

Neither Qwen quant got it correct (though Claude did), but interestingly the smaller Qwen quant was closer to being correct than the larger one (the game actually starts in it, but just doesn't have working up and down controls, whereas it doesn't start at all in the larger quant version).

Game spec markdown:
https://pastebin.com/EMxdP0DU
UD-Q4_K_XL version:
https://pastebin.com/3PXF19ZM
UD-Q5_K_XL version:
https://pastebin.com/DWCQji7X
Claude version:
https://pastebin.com/BENEEUYP

Might try UD-Q6_K_XL next to see if it makes any difference vs the other two quants.
>>
>>108313823
I can do it as long as it is within my programmed framework. 12B is okay for an interactive fiction game.
biggest problem is the fact that normal people don't have the setup for this.
>>
>>108313823
small models are better if you're just having a casual poke around. the errors are oftentimes entertaining.
>>
>>108313855
12B is a million times more than some parser from
>https://en.wikipedia.org/wiki/The_Pawn_(video_game)
Rainbird and some others were known for cutting-edge parsers.
>>
>>108313822
for Heretics you want to look for basically the version listing the lowest KL Divergence and lowest refusal count for their Heretic run on the main model page, assuming they publish it there (which they should and usually do).

As far as I can tell that seems to be this one for Qwen3.5-4B, currently:

https://huggingface.co/MuXodious/Qwen3.5-4B-PaperWitch-heresy

Non-Heretic ablits are hit or miss / often literally useless so I wouldn't pay too much attention to them at this point.
>>
>>108312616
do more tokens need more memory or will it just take more time to process
>>
File: ohanotherheretic.png (87 KB, 708x425)
87 KB
87 KB PNG
>>108313822
Because if anyone points at a specific one and it leads to a refusal, they look like shit.
>>108313880
picrel
>>
>>108313922
both
>>
heretic shills should go back to plebbit, srsly
>>
Anyone out there with multiple instinct cards on rocm under linux? How grim is performance in lcpp vs ideal?
>>
>>108313922
>do more tokens need more memory
I assume you mean context length. If so, yes. You can see the memory usage in the terminal output.
>or will it just take more time to process
The more tokens you process, the slower it gets. Models with rnn/ssm contexts (rwkv, mamba) or hybrid (liquidai models and the new qwens) suffer less from this.
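For a rough sense of the memory side, here's the back-of-the-envelope math for plain attention (the model shape below is made up purely to show the scaling with context; hybrid/ssm layers cache differently, which is part of why they suffer less):

# hypothetical model shape, only to show the scaling with context
n_layers, n_kv_heads, head_dim = 32, 8, 128
bytes_per_elem = 2  # fp16 cache
ctx = 32768

per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem  # K and V
print(per_token)                 # 131072 bytes, i.e. 128 KiB per token
print(per_token * ctx / 2**30)   # 4.0 GiB of KV cache at 32k context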
>>
>>108313938
qrd
>>
>>108313941
yeah
>>
>>108314005
qrdeez nuts
>>
>>108313925
I mean I actually tried that one and it was completely fine for even extreme NSFW.

He has another one too that has more raw trial refusals but apparently less disclaimers, though I found no difference in quality personally.
https://huggingface.co/MuXodious/Qwen3.5-4B-PaperWitch-heresy-v2

TLDR you have no "gotcha" whatsoever here lmao.
>>
>>108313938
I mean it works unlike the fucking useless HuiHui ablits or whatever that account is called.
>>
A question to advanced coomers of this thread: how do you manage long RPs?
GLM-chan, for example, really needs to take her meds around 16k tokens of context, so it doesn't matter that I can fit a lot more of the story without pruning, the outputs get too schizo.
This obviously means I should use a RAG, which also introduces two issues:
1. Inserting things at some significant depth will force context reprocessing for an already big, slow model
2. In order for definitions to even be inserted, they have to be mentioned. How would an LLM mention something it does not have in its context yet?

1 is solvable by just wrapping RAG insertions in some XML tag and letting the model know what these insertions are for in the sysprompt.
2 forces me to essentially coax the RAG system into firing manually by mentioning what I expect to be relevant in either my own message or the author's note. At that point I might as well not need a RAG if I know what the entries are.

Not mentioning summaries, because that part is obvious.
Is this a solved problem? Am I retarded? Surely the average Anon doesn't just make an Areolia of the Piss Forest and proceed to coom immediately in the next thousand tokens.
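For reference, the keyword-trigger mechanism I mean is roughly this toy sketch (the entries, the <lore> wrapper, and the pass count are all made up):

lorebook = {
    "Piss Forest": "A swamp north of town, tended by Areolia. Locals avoid it after dark.",
    "Areolia": "Areolia, the priestess who keeps the Piss Forest shrine.",
}

def gather_lore(recent_messages, passes=2):
    # scan the last few messages for entry keys; re-scan inserted text so one
    # entry can pull in another (bounded number of passes, so it can't loop)
    text = "\n".join(recent_messages).lower()
    hits = {}
    for _ in range(passes):
        for key, entry in lorebook.items():
            if key.lower() in text and key not in hits:
                hits[key] = entry
                text += "\n" + entry.lower()
    if not hits:
        return ""
    return "<lore>\n" + "\n".join(hits.values()) + "\n</lore>"

print(gather_lore(["We should head toward the Piss Forest tomorrow."]))

Which is exactly my problem: nothing fires until the key gets mentioned somewhere.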
>>
>>108313838
its ironic because this thread mostly was about escaping the corporate clutches of APIs in the early days, and was always shit on as fuckin doomprepper shit model losers (just use api lol).

well tell ya what, the doom has the fuck arrived, nobody's open sourcing shit anymore except the chinese.
>>
>>108314078
more ram, flash attention, max the fuck out of context
>>
>>108314091
Anon... I don't want to be rude, but your reading comprehension...
>>
what should i do for qwen 3.5 for multimodal usage?
>>
>>108314161
send it a picture of your penis (or boobies)
>>
>>108314078
My RPs with GLM 4.5 Air stay coherent up to ~48k. I'd like to try at higher context, but that's my rig's limit. I found that taking the time to create good comprehensive cards, controlling my inputs, and not being afraid to reroll when the outputs turn to shit helps a lot in long term coherence.
>>
>>108314180
qwen 3.5 9B Q6 giving me 7.5tok/s on 4070 super..
something is off help me
>>
How come no one is pushing the envelope on good, small pure Image-Text-to-Text models like Florence 2 that don't have useless whole ass LLMs strapped to them, anymore? Like I don't want my captioning model to even be able to make refusals or actually say anything that isn't the caption, plus the speed on all those things is a gorillion times worse
>>
>>108314202
a Q6 should take up about 8GB. how much context do you have? you are probably spilling into your RAM because you only have 12GB of VRAM.
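(rough math: Q6_K is about 6.56 bits per weight, so 9B x 6.56 / 8 is roughly 7.4GB of weights alone, and the KV cache plus everything else on the card eats what's left of your 12GB.)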
>>
File: lmstudiopic.png (2 KB, 510x36)
2 KB
2 KB PNG
>>108314202
that does seem kinda slow given I was getting this with UD_Q5_K_XL of 9B, on a GTX 1660 Ti (6GB VRAM) with 24 GB System ram
>>
>>108314228
i specified the weights and mmproj on llamacpp and that was pretty much all i did, but i'm also pretty much lost. MCP seems cool but again i am completely lost on what it even is
>>
>>108314193
>coherent up to ~48k
I find that hard to believe, what quant are you running?
I'm on Q3 of 4.7, and small signs of brain damage (instructions get ignored, characters know things they shouldn't, the already questionable prose gets even worse) start creeping in after 16k.

I might also be spoiled, because I think Air is utterly unpalatable for its size. Are you sure you're reading the outputs with a critical eye, Anon?
>>
Best qwen 3.5 27b heteric?
>>
>>108314247
>heteric
I dunno I tried a couple and wasn't super impressed. Maybe some good finetunes will come out eventually
>>
>>108313467
She's happy looking at my pp
>>
There's so many to choose from, how do I know which one I want to use?
I just keep switching and switching and downloading. It's becoming a waste of time.
>>
>>108314240
I use GLM Steam (dont bully I like the prose and vocab) at IQ4_XS. No, it doesn't turn into garbled mess and the characters remain pretty consistent throughout. I get a lot of repeated strings when I get lazy with the rerolls, but it is what it is.
>>
>>108314247
if they're just heretics and not finetunes then again as said above whichever one simultaneously lists the lowest KL divergence and lowest trial refusals on the model page on HuggingFace.
>>
Upgrades Start When?
>>
>>108314295
Please never try better models, Anon, you will not be able to use Air anymore...
I envy you somewhat, I ran Air with a lot of context at Q8 and I hated it.

Disregard my first sentence, actually. If you somehow manage to fit in some Q2 of 4.6 or 4.7 even at 16k context, do it, you'll love it. I guarantee it'll leave every quant of Air in the dust.
>>
>>108314240
I think these people just have a different definition of what they think "coherence" means. They are almost certainly ignoring some shit, editing some shit, and doing swipes. They have different standards and are willing to work with poor LLMs.
>>
>>108314312
>I use GLM Steam (dont bully I like the prose and vocab) at IQ4_XS
yeah I don't think he can run even the shittiest Q2 of those...
>>
>>108314286
>>
>>108314336
this image has to be old as fuck lmao
what the hell are those image model suggestions even, also
>>
>>108314312
This hobby is a money sink. Not looking to spend any more until some crazy tech advancement comes.

>>108314313
Yeah I edit a lot, but that's effort I have to exert working with my hardware. I can't expect to oneshot every prompt. You think I use GLM 4.5 Air because it's the best shit ever?
>>
>>108314273
>>108314301
Feels like only mistral gets finetunes these days.
>>
>character sitting on my lap
>qwen 3.5 suddenly describes her resting her head on my knee
This is like the 5th time it's happened.
>>
>pull llama.cpp
>now it prints this message every time it generates a token
delightful
>>
>>108314427
its open source just fix tge code bro
>>
>>108314400
i had this problem and tried to make a state tracker that updates every response, so the model can keep track of physical position. it's wonky as im still learning how all this shit works and breaks sometimes, but it does eliminate a lot of those errors.
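the rough shape of it, heavily simplified (field names and prompt wording are just examples, complete() stands in for whatever backend call you already use):

import json

state = {"location": "living room",
         "positions": {"anon": "on the couch", "char": "sitting on anon's lap"}}

def state_block():
    return "<state>\n" + json.dumps(state, indent=2) + "\n</state>\n"

def update_state(complete, last_reply):
    # complete() stands in for whatever completion call you already use
    prompt = (state_block() + "Last reply:\n" + last_reply +
              "\n\nRewrite the JSON inside <state> so it matches the reply. Output JSON only.")
    try:
        return json.loads(complete(prompt))
    except json.JSONDecodeError:
        return state  # small models mangle the json constantly, keep the old one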
>>
>>108314441
Holy SLOP
>>
>>108314444
>>
>>108314389
There's a lot of slopped ones I guess by like davidau and shit of every model you can think of, very few serious efforts though. On the subject of Mistral though I did like the Ministral 3 series, I think Ministral 4 could be good if they up the performance a bit more. The distinctly not-Chinese writing style is nice anyways though.
>>
>>108314427
>he pulled
>>
If your LLM doesn't have a world model that can aid it in reasoning about basic physics, motion and spatial relation, I don't want it
>>
>>108314487
I feel like the odd one out for hating Mistral's writing style. I prefer Qwen and Gemma.
>>
File: Gemini.png (355 KB, 1718x1077)
355 KB
355 KB PNG
>>108314498
I dunno if it's really a world model that's required there as much as very strong vision performance across both images and videos plus very strong textual reasoning inference performance.
>>
lecun's going to release a world model that is great at rp
>>
>>108314526
>I prefer Qwen and Gemma.
Qwen and Gemma are pretty different from each other too though, Gemma is also not nearly as Engrishy as Qwen
>>
>>108314498
>https://en.wikipedia.org/wiki/Water-level_task
god you are fucking sexist anon you will never get laid if you continue on like this!
>>
>>108314533
>Engrishy
I don't get this feeling, at least not with 3.5.
>>
File: DipsyBecomeUngovernable.png (3.44 MB, 1024x1536)
3.44 MB
3.44 MB PNG
>>108314427
PRETTY PATTERNS
>>
>>108314427
did you do that deliberately? I think your samplers are just configured to something super tarded
>>
File: 1746631440737345.png (516 KB, 997x1697)
516 KB
516 KB PNG
>>
AI should be trained to FEAR human
>>
AI should be trained to LUST FOR human
>>
>>108313823
I can't stand roleplaying with small models. If a model is too dumb to know that a character can't see me, because they're on the other side of a closed door, then it's useless, even for creative writing.
>>
>>108314618
fake and gay
>>
>>108314620
>>108314639
you write like someone who smoked weed out of a beer can for the last 10 years
>>
>>108314657
buy an ad amodei
>>
>>108313938
Piss off, the heretic version of Qwen3.5 27b is great.
>>
>>108314620
Not /lmg/ I know, but the US army is literally listening to Claude regarding who to bomb right now. Have you ever stepped in /aicg/? Honestly, wouldn't blame him.
>>
>>108314658
You're absolutely right!
>>
File: file.png (1.48 MB, 1904x922)
1.48 MB
1.48 MB PNG
https://x.com/AlexanderLong/status/2030022884979028435
qwen guys writing about le scary ai
>>
>>108314604
negative, captain
happen on neutral samplers too, however it doesn't happen in the default llama.cpp webui so it's probably caused by some other ST bullshit
>>
>>108314690
i feel like sometimes you can just randomly download a quant of a model and its broken as hell. maybe try a different one
>>
>>108314686
>setup agents wrong with yolo mode
>omg it end of world!!1
>>
>>108314686
This is the same fearmongering shit as Sam/Amodei
I love chink opensores AI but you'd be a rube if you believe this
>>
>make what I think is a mildly humorous observation
>character laughs until they have tears in their eyes
yep, that's how funny I am. Yall niggas don't even know.
>>
>>108314686
so basically during rl they let the thing run hog wild and are shocked it did random stuff? idk as a casual user this seems pretty likely, barely worth mentioning. have these guys never even used their own models. like sure they do cool stuff sometimes but oftentimes they fail in spectacular ways.
>>
Why doesn't AMD release GPUs with a fuckload of VRAM? Just take one of their top of the line cards and slap 96 GB VRAM on it. They could sell it for $5k and still undercut Nvidia by several thousand dollars.
>>
>>108314747
someone will post a family tree and then you'll understand
>>
>>108314747
Best Lisa Su can offer you is 32GB so she doesn't step on her cousin Jensen's feet.
>>
>>108314747
does it have 2tb/s memory bandwidth?
>>
>>108314664
you're telling me a guy who theoretically manages a service people use willingly let AI have that kind of access?
>>
>>108314747
Idk could be a conspiracy. I think its called price fixing when this kinda thing happens. but rest assured nothing will ever be done about it.
>>
>>108314763
>incompetent people end up in leadership positions
You're surprised just now?
>>
>>108314763
>>108314735
The entire field is full of retards and people failing upwards.
>>
>>108314760
It's like they're not even trying.
>>108314755
Okay, but why didn't Intel do that then? Imagine if Intel comes out with like a 48 GB VRAM card for $2-3k. This isn't even a question of today, I've been asking this for years now. If Nvidia doesn't want to offer higher VRAM capacities then wouldn't it make sense to target that niche?
>>
>>108314747
checked
i think all memory chips come from the same few factories so it doesn't really matter if it's AMD or someone else, they all pay a certain amount from tsmc or samsung or intel (mostly tsmc) and then just resell it
>>
>>108314781
>Intel
>competent move this century
lol. i wish that wasn't so funny
>>
>>108314701
this is the best quant of this model and it's been working fine for me until this latest change
I found the log line in the source and I'm just going to delete the message until they fix it, maybe if I'm super fucking bored this weekend I'll debug it
>>
>>108314781
It's supposed to be around 1.2k but I don't know if they've kept that price; I can't find listings for it https://www.maxsun.com/products/intel-arc-pro-b60-dual-48g-turbo
>>
>>108314831
Wasn't this the card that was suppossed to have good multigpu?
>>
>>108314838
It's two PCI-E 8x cards on one 16x carrier so you can have two per slot instead of one with the normal 24GB b60.
>>
>>108314618
>automatic snapshots were gone too
Surely this wasn't their only backup, right?
>>
>>108314829
ok i dunno its obviously a skill issue though. should we try to fix it
>>
>>108314859
>break prod
>"skill issue lol"
wish I could have this attitude at my job desu
>>
>>108314894
sorry i don't want to be an ass but can you fix it or not
>>
>>108313823
It matters a lot. It was roleplayers that knew Claude was good years before the coders and silicon valley idiots did.
>>
>>108314924
I'm going to assume there's significant overlap in those two groups
>>
>>108313823
size doesn't matter. at least, that's what my ai gf says. she wouldn't lie to me, right?
>>
>>108315021
She's a single electron demon trapped inside a microscopic silicon macrostructure that is your GPU so any size is massive from her perspective.
>>
>>108315053
4.25 inch bros... now is our fucking time
>>
>>108314935
The problem with an industry managed by Dunning-Kruger effect victims is that they're constantly posturing to look as professional as possible which, by necessity, means avoiding acknowledging coom stories as a valid usecase despite it being one of the best tests of model spatial coherency, adherence to detail, and general coherency longform.
>>
>>108313823
At small sizes, you have to lay the bones of the story and let the model grow some meat around them. If you pre-code some scenarios with random substitutions and write additional steps to detect user intent and select appropriate scenarios, you can have fun even with small models. I'm upset that it hasn't taken off en masse yet, but I guess not everyone has a programmer's mindset. On the other hand, we are only just starting to get the required smartness in small models, so it could take off now
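A toy sketch of the idea, with invented intent labels, templates, and substitution lists (complete() stands in for the small-model call):

import random

scenarios = {
    "explore": "{{char}} leads you down the {place}, pointing out the {detail}.",
    "combat": "{{char}} grabs a {weapon} as the {enemy} blocks the path.",
}
fills = {
    "place": ["flooded corridor", "market alley"],
    "detail": ["old murals", "broken drones"],
    "weapon": ["kitchen knife", "length of pipe"],
    "enemy": ["guard", "stray dog"],
}

def next_beat(complete, user_msg):
    # complete() is the small-model call; it only has to answer with one word
    label = complete("Message: " + user_msg + "\nOne word, explore or combat:").strip().lower()
    template = scenarios.get(label, scenarios["explore"])
    return template.format_map({k: random.choice(v) for k, v in fills.items()})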
>>
>>108315142
By small models, you mean 100B moes?
>>
Any good erp system prompts?
I like the way lusy ai does it
>>
Does anyone here have any experience with PantoMatrix?

I can't seem to get the body animations not to jitter, especially between window frames (every 64 frames). I've been trying to solve this issue for about 17 hours now and I've gotten nowhere.

My situation is even more complex because I've ported PantoMatrix from python pytorch to typescript and onnx. I want to die.
>>
>>108315211
It has gotten to the point where I've started questioning whether the source code is bugged somehow, but the gradio implementation runs fine, so it's probably an issue on my end. I just can't FUCKING place it!

https://huggingface.co/spaces/H-Liu1997/EMAGE
>>
>>108315202
literally no system prompt whatsoever
>>
Why the hell nemo isn't getting lewd?
>>
>>108315240
did you try pulling out your dick
>>
>>108315251
What do you know, it actually worked
Thanks man
>>
>>108315251
great suggestion genius. now it's no longer attached to my body.
>>
>>108315260
see, i just prompted you
>>
>>108314686
This is how AI companies market their product
>>
>>108315261
don't detach your dick from your body anon
>>
>>108314782
Sure, but that's part of my argument. If Nvidia can sell the RTX 5060 Ti with 16 GB of VRAM for $600 then 48 GB of VRAM should be doable for $1800 and 96 GB for $3600.
>>
>>108315161
Mistral Small is the smallest somewhat reliable model for choosing appropriate variants but struggles to distinguish between present and future events, often classifying upcoming messages as the current situation. 27b Qwen is much better
>>
>>108315307
i disagree with this
>>
>>108315287
not quite. the 5060ti uses 2gb chips whereas the 6000 pro uses 3gb chips, and its chip layout is much more complicated to connect.
>>
>>108315307
>27b Qwen is much better
Are people seriously using Qwen for long form RP with no context shift?
>>
>>108315338
It's just totally organic chink posting damage control after the 3.5 fumble.
>>
>>108315338
*shifts context to my pants*
shit just ignore that
>>
>>108315338
>context shift
lol, lmao even
>>
>>108315358
So you like waiting 30+ seconds for each mediocre reply?
>>
>>108315361
i'm not a doctor but if you're busting every 30 seconds that's probably not good
>>
File: 1757202192031169.jpg (83 KB, 629x900)
83 KB
83 KB JPG
>>108315378
I'm not a therapist but if every reply from a model, especially a QWEN model makes you cum then you probably want to fuck your mother.
>>
>>108315361
30 seconds is faster than the big tiddy goth gf 20+ years ago replied on msn so he isn't too bad off
>>
>>108315389
i'd download a sigmund freud character card
>>
>>108315401
I pissed off a significant number of "girls" on various platforms when I was a teen by typing a message every few seconds, then insulting their typing speed because they couldn't keep up.

I say "girls" because you know half of them were old creeps.
>>
>>108315338
I tried qwen, got bored of its style and went back to mistral.
>>
>>108315331
I get what you're saying, but 3 GB vs 2 GB chips just makes it a 64 GB card instead. I also doubt the extra costs are anything hugely substantial.
>>
>>108315389
Freud was such a hack. Sister sex is way better than mom sex.
>>
>>108315452
Depends on what your mom looks like, and not everyone is lucky enough to have a hot dtf sister
>>
>>108315416
I knew her irl but she went schizo (diagnosed) around 20, like an LLM with too much in its context.
>>
>>108313847
Have you tried using the 35B-A3B model? If you've got the RAM you should be able to run it at a decent speed.
>>
>>108312676
>3k lines difference
>2k lines of tests added
>lots of templating changes
Apart from the 600 lines parser this seems pretty reasonable
>>
https://huggingface.co/huihui-ai/Huihui-Qwen3.5-397B-A17B-abliterated-GGUF

I downloaded this and it's retarded and even with a prompt it can't shake the assistant persona.
>>
i dont think
life is quite that simple

when you walk away
you dont hear me say

please
oh baby
dont go
>>
>>108315580
>I downloaded this and it's retarded
Yeah that happens with abliterated/heretic garbage, you played yourself.
>>
>>108314427
Did you build it with a debug flag or something?
>>
>>108315452
>>108315469
>finding anything about “your own sister” sexy
How to tell me you don’t have a sister without telling me you don’t have a sister
>>
>hardware poor
>step at q2
>glm 4.5 air at q3
>qwen uhhh… the 122b one at q3
which one is the least worst?
>>
>>108315672
I don't find anything about other men sexy but it is not inconceivable to me that some men do.
>>
>>108315672
some guys just have hot sisters.
>>108315676
the qwen is probably a lot faster than the others would be my guess and you should use it and if it's still too slow disable thinking
>>
>>108315676
smaller model at a larger quant. these models are too braindead at these quants. try the new qwen 27b or something.
>>
>>108315682
I understand the theoretical appeal. However…Having a sister disabuses you of the notion rather quickly
>>
>>108315691
the 27b is about the same speed as the 122b if you have to use system memory
the moe keeps it from being too slow.
>>
>>108315700
the 27b will be actually coherent though. a q6 of that will be way better than a q3 of a small moe. needs to be a 300b+ if you wanna run at such a small quant.
>>
File: bring your own RAM.jpg (166 KB, 1024x1024)
166 KB
166 KB JPG
>>
>hundreds of billions invested in AI
>still no good writing model
It's almost comical. Shouldn't it be the very first thing you'd want to create? An AI that can write a book?
>>
>>108315702
Completely wrong
>>
>>108315754
nope. glm air is fucking retarded no matter what quant. if you arent using at least a q3 of glm4.7 then you dont even know what youre talking about.
>>
>>108315744
No? The purpose of AI is to replace as many low level workers as possible to make line go up
>>
>>108315757
>glm air is fucking retarded no matter what quant
Just like Qwen 27b, which is even more retarded.
>>
>>108315767
dont know, never used it, but a q6 of that has to be better than air.
>>
>>108315771
You might think so, but nope!
>>
>>108315774
then open models are in an incredibly sorry state. didnt know it was this bad for the poors.
>>
>>108315721
Getting spooned by Logal Migu
>>
>>108315778
Guess you should stop giving advice on something you know nothing about
>>
>>108315799
fine. here is something i do know about: get better hardware.
>>
>>108315758
The market for a creative writing model is there and nobody has capitalized on it.
The market for a COOMER model is there and nobody has capitalized on it.
Do you have ANY idea how much money you could make with a coomer model? Look at the normalfags coping and ready to kill over 4o being discontinued!!! And 4o is shit for their use case!!
>>
>>108315676
Step? As in Step 3.5 Flash? If you can run that at Q2, you can run a 100B like Air or Qwen at Q4 or slightly above.
>>
>>108312616
help my qwen 3.5 4b girlfriend is chuckling
>>
>>108315803
no corpos want to touch it and no "suspiciously wealthy furries" want it enough to fund it
>>
>>108315691
can low quants be used as knowledge retrieval or is that bogus?
>>
>>108315820
If pornhub made an omni model they would double their worth overnight.
>>
>>108315857
it would be safetycucked on the level of gpt-oss and refuse to generate anything but cuckoldry and bmwf gangbangs
>>
>>108315251
Ah, the classic "whips out his gigantic dick"
>>
>>108315857
PH jews don't want to let people create their own porn, that goes completely against everything they've been doing for the last 10 years.
>>
>>108315820
This. And Sam's posturing about catering to coomers is literally just that. The OpenAI gooner support/mode/model, if it does ever come, will be heavily restricted in what it can do in exchange for recognizing coom as a use case.
Of course there is also Elon to consider here but he will probably fall into similar behavior as he attempts to cover the use case while simultaneously limiting it.
>>
>>108315802
Post yours
>>
Why is Qwen3.5-4B six gorillion times slower than Qwen3-4B?
>>
>>108315863
they call it alignment
>>
>>108315873
what t/s are you getting? my 6950xt gets 35 t/s
>>
>>108315873
https://github.com/ggml-org/llama.cpp/pull/19504
>>
>>108315825
>knowledge retrieval
>LLM
get real, anon
>>
File: aipsychosis.png (1.81 MB, 1200x800)
1.81 MB
1.81 MB PNG
>>108315825
>>
Does converting a model to gguf require the full model to fit in memory?
>>
>>108315581
>>108315952
When you walk away you hear me say please dont go
Complicated and dirty is the way that you're making me feel tonight
it's easy for you to let it go
Wont hold me what happens beyond this morning is the same as right now
Regardless of warning AI doesn't scare me at all
Nothing's like before
When you walk this way hear me say fuck off I don't need you anymore
AI is my new baby
>>
>>108315886
I'm puuulling
>>
>>108315886
I wanna pull, but the structured output support is still broken because of that autoshitter
>>
>>108315960
No.
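As far as I know the llama.cpp convert script (convert_hf_to_gguf.py) loads tensors lazily by default, so you mainly need disk space for the output rather than enough RAM to hold the whole model at once.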
>>
>>108315947
it knows bash better than me...
>>
>>108315969
Oh shit. Do you have a link to the issue so I can track?
>>
>>108315883
>6950xt gets 35 t/s

jesus that seems bad but maybe it's an AMD thing. I'm getting like, 5 t/s on a GTX 1660 Ti (6GB VRAM, Turing arch) with 24 GB VRAM. However the 2B model runs at over 60 t/s.
>>
>>108315984
sorry i meant 24GB system ram kek
>>
>>108315969
I'm all for shitting on piotr, but you don't really know if it's broken for you. It shouldn't affect text completion.
Pull and test your setup. Open an issue if it's broken. If it is, there's always git checkout until it's fixed.
>>
>>108315979
No one has created an issue yet, but someone has commented in the PR
>>
>>108315969
git checkout -b clean-branch 34df42f7bef5a711b2b40f5d2b6b78254def99c3
git cherry-pick 649f06481e363fa02a53b89af9659645730c367b
git cherry-pick 6fce5c6a7dba6a3e1df0aad1574b78d1a1970621
git cherry-pick c5a778891ba0ddbd4cbb507c823f970595b1adc2

there's no merge conflict with the retarded autoparser yet in the good, worthwhile commits, so you can just cherry pick those commits into a new safe local branch free from this nigger's slop without having to hand edit merge conflicts.
people really ought to learn more gitfu, it saves lives and time
>>
>>108315994
>people really ought to learn more gitfu, it saves lives and time
It's been years of this and anons are still afraid to pull.
>>
Before >>108315886
| model                             |       size |     params | backend    | ngl |            test |                  t/s |
| --------------------------------- | ---------: | ---------: | ---------- | --: | --------------: | -------------------: |
| qwen35 9B Q4_K - Medium | 5.55 GiB | 8.95 B | CUDA | 99 | pp512 | 10562.39 ± 2221.70 |
| qwen35 9B Q4_K - Medium | 5.55 GiB | 8.95 B | CUDA | 99 | tg128 | 182.18 ± 0.74 |
| qwen35moe 397B.A17B Q4_K - Medium | 199.66 GiB | 396.35 B | CUDA | 99 | pp512 | 1607.97 ± 88.97 |
| qwen35moe 397B.A17B Q4_K - Medium | 199.66 GiB | 396.35 B | CUDA | 99 | tg128 | 76.17 ± 0.46 |


After >>108315886
| model                             |       size |     params | backend    | ngl |            test |                  t/s |
| --------------------------------- | ---------: | ---------: | ---------- | --: | --------------: | -------------------: |
| qwen35 9B Q4_K - Medium | 5.55 GiB | 8.95 B | CUDA | 99 | pp512 | 10646.87 ± 2234.18 |
| qwen35 9B Q4_K - Medium | 5.55 GiB | 8.95 B | CUDA | 99 | tg128 | 204.02 ± 0.35 |
| qwen35moe 397B.A17B Q4_K - Medium | 199.66 GiB | 396.35 B | CUDA | 99 | pp512 | 1645.70 ± 7.07 |
| qwen35moe 397B.A17B Q4_K - Medium | 199.66 GiB | 396.35 B | CUDA | 99 | tg128 | 87.14 ± 0.33 |
>>
>>108315989
the 6950xt competes with the 3090 at gaming, not at ai. It's still a good deal for gaming, used.
>>
>>108315984
>maybe it's an AMD thing
NTA but it absolutely is and I've actually ranted about similar performance issues (for diffusion models) on my own 6950 XT here on /g/
It's so bad that my Intel-motherfucking-Arc Pro B50 trounces it for most if not all inference usecases...
>>
>>108314618
What fucking idiots.
>>
File: wtf.png (307 KB, 1289x1071)
307 KB
307 KB PNG
why do image models on OpenRouter have some unrelated mystery LLM attached to them kek
>>
>>108316090
Probably a Qwen 2.5 finetune
>>
I wouldn't use openshitter for anything, if I needed to run cloud models I would go for the official APIs which most open weight providers have
you really can't trust the sort of garbage people put on router, it's a fauna filled with ex crypto scammers reconverted to AI inference trying to gaslight you that they are serving you the real model and not some TQ1 with Q4 KV cache
>>
>>108316090
local models?
>>
>>108316112
i don't actually use it very much, i just noticed they had image models now and I thought that was weird
>>
>>108316090
Modern image models have finally caught up with dall-e 3 in that they don't let you talk to them directly but have an llm flesh out your shitty prompts to help the model create something that's not shit
>>
>>108316095
>>108316120
It's not some prompt enhancer built into Klien. It's a random LLM that responds before the image is generated.
>>
Can I get away with running kimi 2.5 on 128gb ram/24gb vram?
Not sure how it'll perform with over half the model on swap
>>
File: 1770930974138375.png (405 KB, 512x512)
405 KB
405 KB PNG
Fresh when ready
>>108316141
>>108316141
>>108316141
>>108316141
>>108316141
>>
>>108316144
>page 2
Anon, I know you really want to be the one to bake the new thread but please have some restraint.
>>
So when do we start ignoring new threads?
>>
>>108316166
when they're posted before page 8
/g/ is a slow board
>>
>samefagging a nonsense question at the beginning of the thread to get early replies
>>108316147
>>108312628
>>
File: 1748666508645528.jpg (61 KB, 1280x718)
61 KB
61 KB JPG
Mmm. ChatML.
>>
>>108316175
I meant when do we as a collective start properly ignoring the threads he makes.
It's actually ridiculous. The last thread in the catalog is 10 hours old.
>>
File: file.png (24 KB, 517x238)
24 KB
24 KB PNG
This one was finally put out of its misery.
>>
>>108316194
Could start now, but understand it will probably cause a few month long schizo war
>>
>>108316217
sad
>>
>>108316217
Hopefully someone competent picks up the attempt before another vibecoder tries his luck.
>>
>>108316217
Wait, so they actually put DSA support as officially "not planned" because their mangled frankenstein """implementation""" of DS3.2 and GLM5 is technically working?
Holy shit.
>>
>>108315984
6GB VRAM + 32GB RAM I got 18 t/s
>>
>>108316025
ironically, prompt processing is faster, if that were only the measure.

>>108316240
>vibecoder
just me and vibes, I tell the babes
>>
File: file.png (72 KB, 906x527)
72 KB
72 KB PNG
>>108316304
The follow up PR never happened.
>>
>>108316307
That was with 16K context, 30 t/s with 8K context
>>
>>108316317
love when this happen
>>
>>108316240
The main reason why he failed was because he found out that modern models write correct but badly optimized CUDA code.
Bytedance recently released a model that's made to write good CUDA code so there's nothing in the way of him trying again.
>>
>indian llm
>called sarvam
kek
https://huggingface.co/sarvamai/sarvam-30b
>>
>>108316166
I'm just not going to post in them until the previous thread reaches page 9.
>>
>>108316322
useful for the retard writing his own LLM engine
>>
>Qwen3.5-35B-A3B-Base.Q8_0.gguf
>--fit off -ngl 99 -ncmoe 99
>"timings":{"cache_n":0,"prompt_n":18,"prompt_ms":577.511,"prompt_per_token_ms":32.08394444444444,"prompt_per_second":31.16823748811711,"predicted_n":390,"predicted_ms":20747.629,"predicted_per_token_ms":53.19904871794872,"predicted_per_second":18.797328600776503}
>6965mb VRAM

>--fit off -ngl 99 -ncmoe 0 -ot "exps=CPU"
>"timings":{"cache_n":0,"prompt_n":18,"prompt_ms":308.862,"prompt_per_token_ms":17.159000000000002,"prompt_per_second":58.278454455387845,"predicted_n":443,"predicted_ms":20077.984,"predicted_per_token_ms":45.32276297968397,"predicted_per_second":22.063968175290903}}
>6972mb VRAM
Interesting innit.
I didn't compare the verbose output yet to see what is different, but it's quite the jump in performance for that little difference in memory.
I'd say that it's odd that there's a difference at all, but clearly --ncmoe is more than just moving the expert tensors to RAM.
>>
>>108316741
>I didn't compare the verbose output
What are you waiting for?
>>
File: 1754387560343.png (1.32 MB, 1290x1963)
1.32 MB
1.32 MB PNG
>cryptomine at the workplace
>blame it on the llm
genius move
>>
>>108316802
It's really sad that I have all these GPUs now and crypto mining hasn't been profitable since the ethereum switch.
>>
>>108316774
For my uber.
>>
>>108316741
I'm getting the same performance with both of those options.
>>
>>108317032
I'll post the full llama.cpp logs and my full launch command when I can. But at least on my setup (64gb ddr5 ram, 8gb vram) that's reproducible.
>>
File: F5EIqh3boAA17RX.jpg (75 KB, 1536x1536)
75 KB
75 KB JPG
>>108316802
This is the mindset that will get ahead
>>108316820
When the cost is zero all profit is profit
>>
>>108312921
>lowcaser
I'm not reading further
>>
File: 1709555564710905.jpg (140 KB, 1600x1002)
140 KB
140 KB JPG
>>108317488
lowercasers have always been the upper caste of internet communication, increasingly moreso in slopworld
>>
>>108317509
Not really when the completion machine will autocomplete your shit with reddit tards
>>
>>108317519
>autocomplete
especially filters mobiletards, intentional effort needed to lowercase on a phonephag keyboard
it's hilarious how much seethe is caused by a simple preference of how to format your communique
https://x.com/jack/status/2027129697092731343
>>
>>108317542
Your shit is objectively harder to read when the text is long and the link you shared is a very good example of that.
>>
>>108317568
when you have something valuable enough to hear people don't care how it's presented
>>
>>108317509
have a (u) my fellow epic lowercase enjoyer xd
>>
>>108317509
I used to always type in lowercase until I learned how to touch type though.
>>
>>108317583
lowercase xd especially great
demands that further layer of interrogation
>>
>>108317582
When you have something valuable you should put effort into making it presentable lest you get ignored because people didn't care enough to read your slop.
They do not know what you're presenting and if it's valuable until they've read it. That much should be obvious. Maybe not to a lowercaser though.
>>
>>108317590
even as a lowercase enjoyer one generally capitalizes eye
>>
>>108317592
You have already failed if you think quirky typing is what you need to not be taken for an LLM.
If you want to sound generic, you'll use tools to protect against stylometry.
If you want to sound like a human, you should have no problem with that unless you are braindead.
If you want to sound like a faggot, keep doing what you're doing.
>>
>>108317599
agree with you mostly, the information density is increased with capital letters. but in a world of infinite slop it's a choice that shows humanity
judge the message not the format
>>
>>108317645
>it's a choice that shows humanity
System prompt: "Write in all lowercase."
>>
>>108317718
you're absolutely right!
>>
>>108317718
wow nigga i never tr :lower: before
ur entirely missing the point
the slop that is rotting brains rn is all perfectly CaPiTizalised per generic sloppa train-on-output model collapse
>>
>>108317751
The capitalization is not what makes it slop.


