/g/ - Technology


Thread archived.
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108246772 & >>108241321

►News
>(02/24) Introducing the Qwen 3.5 Medium Model Series: https://xcancel.com/Alibaba_Qwen/status/2026339351530188939
>(02/24) Liquid AI releases LFM2-24B-A2B: https://hf.co/LiquidAI/LFM2-24B-A2B
>(02/20) ggml.ai acquired by Hugging Face: https://github.com/ggml-org/llama.cpp/discussions/19759
>(02/16) Qwen3.5-397B-A17B released: https://hf.co/Qwen/Qwen3.5-397B-A17B
>(02/16) dots.ocr-1.5 released: https://modelscope.cn/models/rednote-hilab/dots.ocr-1.5

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: 1701626182006697.png (2.06 MB, 1024x1024)
►Recent Highlights from the Previous Thread: >>108246772

--Paper: DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference:
>108247376 >108247408 >108247469 >108247651 >108247780
--Papers:
>108248442 >108249269 >108249326
--Qwen benchmarks debated, MoE efficiency questioned, neural steganography project discussed:
>108249710 >108249716 >108249732 >108249744 >108249772 >108249786 >108249789 >108249821 >108249843 >108249792 >108249832 >108249850 >108249868 >108249875 >108249882 >108249905 >108249950 >108249985 >108249794
--MoE vs dense model roleplay performance and ablation effectiveness:
>108249916 >108249923 >108250033 >108250074 >108250099 >108250116 >108250143 >108250205 >108250292 >108250330 >108250395 >108250418 >108250440 >108250491 >108250543 >108250550 >108250731 >108250772 >108250551 >108250554 >108250565 >108250610 >108250627 >108250580 >108250645
--Dense 27B outperforming MoE 35B in knowledge benchmarks:
>108248187 >108248207 >108248249 >108249636
--Running Qwen 3.5 27B on 16GB VRAM with reasoning mode tweaks:
>108249215 >108249268 >108249271 >108249305 >108249316 >108249357 >108249418 >108250671 >108250708 >108250747 >108250802 >108250819 >108249966 >108250051 >108250148
--AI thinking steps improve performance but face token efficiency tradeoffs:
>108249084 >108249098 >108249106 >108249127 >108249129 >108249133 >108249155 >108249157 >108249281 >108249294
--Qwen 27B dense model outperforming larger MoE models in benchmarks:
>108248368 >108248401 >108248420 >108248438 >108248443 >108248570 >108249019 >108249031
--Severe Q4 quant degradation in new 35B model:
>108248366 >108248374 >108248377 >108248403
--Oobabooga stagnation and potential alternatives:
>108248545 >108248557 >108248579 >108248608 >108248572 >108248588 >108248598 >108248617 >108248768
--Miku (free space):
>108250309

►Recent Highlight Posts from the Previous Thread: >>108246776

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>>108252185
Sex with mechanical miku
>>
Qwen 35B is so surprisingly cucked that it's bad for my RAG-based chat client. I'm back to Kimi-Linear 48B. LFM2 seems okay too for chat, but obviously not for coding.
>>
Yeah Qwen is total garbage. Wish it were good.
>>
i've never tried Qwen but man it is so fucking trash.
>>
>>108252306
It's a weird mixed bag, it suffers a lot from getting stuck in a safetyslop loop whenever you "trigger" it (interestingly Grok 4.20 has this exact same problem).
But if you can avoid setting it off it seems fine with explicit loli content. Very strange model to work with for sure.
>>
>>108252390
>>108252312
Are ERP only fags really like this?
>>
File: based.png (81 KB, 938x459)
the 27b was okay for some web frontend changes i asked for, i guess there are better options though.
>>
>>108252414
can you blame them? the last worthy model we got was over a year ago
>>
>>108252306
>LFM2 seems okay
opinion instantly disregarded
I've rarely seen a model as retarded, ignorant of the world and bad at languages other than English as this.
>>
>>108252434
Still sad to see how narrow minded they are. Also I doubt most of them have good rigs to begin with
>>
>you have to praise the stem code slop like on r*ddit
>>
Whining faggot aside, what should I always aim for when it comes to context size?
>>
>>108252414
I wanted the new qwen to be good for programming but I switched back to GLM 4.7
>>
>>108252457
>narrow minded
kek okay I actually really loved the way it constantly contradicted itself and had to be heavily wrangled to produce its incoherent slop
>>
>>108252488
Depends on how much you need. However much that is, a little more.
>>
>>108252493
Skill issue
>>
>>108252502
On your part. I guess some fags will devour anything their favorite AI corp shits out.
>>
Anybody experimented with different ways to inject information into the context, for example RAG?
Not the extraction techniques, but where and how to add the information to the chat history.
I started with the vanilla "everything in the system prompt" approach, but now I'm experimenting with adding it as faux tool-call results after the latest user message.
I might try adding the fake tool-call result between the last assistant message and the last user message to compare the behavior.
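For reference, the three placements described above can be sketched as plain message-list manipulation (OpenAI-style role dicts; the function names are made up):

```python
def inject_system(history, retrieved):
    # Vanilla: everything in the system prompt.
    return [{"role": "system", "content": "Context:\n" + retrieved}] + history

def inject_after_last_user(history, retrieved):
    # Faux tool-call result appended after the latest user message.
    return history + [{"role": "tool", "content": retrieved}]

def inject_before_last_user(history, retrieved):
    # Faux tool result between the last assistant message and the last
    # user message.
    return history[:-1] + [{"role": "tool", "content": retrieved}] + history[-1:]
```

Whether the model actually respects a "tool" role it didn't call depends on the chat template, so treat the role name as a stand-in for whatever your backend accepts.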
>>
>>108252502
>>108252494
>>108252493
>>108252491
I made neural stegnagprahy using sub 1b model but i need to control warmth/randomness of distribution to be more percise.


>qwen instruct models are no good
>gpt2 large is quite cluster fuck
>tiny llama is good but at floating point 32 it starts to break down chrome as an extension due to how much ram it uses


any solutions to reduce tone so it fits in better in human environment? should i try mistral that has been quantified but this method only works with fp32 so computers can communicate
>>
>>108252507
Whine whine bitch and moan, post your specs now so I can laugh at you
>>
>>108252514
Are we getting flooded with bots?
>>
File: nb.png (116 KB, 1012x893)
glm-5 really is trained on lmg isn't it?
>>
>>108252488
Intelligence falls off a cliff after 32k
>>
>>108252518
24gb vram. In theory you'd get better performance at that size than two years ago, but that turns out not to be the case.
>>
>>108252530
no im a human anon

>https://arxiv.org/abs/1909.01496

read a paper recently where harvard AI team was able to send hidden messages by hijacking probability distribution of llms by controlling. Then using a seed/binary they could have it be encyrpted stegnography that passes off normal human language. They used GPT2 XL which i can't run at fp32 due to hardware constraints so im gearing it towards small models with new model architecture that might have an edge. Are all of u retarded and unable to see how useful this is?

u could use discord, twitter, 4chan and pass a key around then use a sub 1b model to talk in open while message is only known by people who hold weights, seed and information.
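The shared-weights-plus-seed idea can be miniaturized into a rank-based toy. To be clear, this is not the paper's arithmetic-coding construction, and the "model" here is just a seeded RNG standing in for a real LLM so the sketch stays self-contained: both sides derive the same next-token ranking from the shared seed, and each cover token's rank carries one bit.

```python
import random

# Toy stand-in for a shared LLM: a deterministically seeded RNG yields the
# same "next-token ranking" on every machine, which is the property the real
# scheme gets from shared model weights. Vocab and function names are made up.
VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "here", "now"]

def toy_ranking(context, seed):
    # str seeds are deterministic across runs and machines in CPython.
    rng = random.Random(f"{seed}|{' '.join(context)}")
    weights = [rng.random() for _ in VOCAB]
    return sorted(range(len(VOCAB)), key=lambda i: -weights[i])

def encode(bits, seed):
    context, cover = [], []
    for b in bits:
        order = toy_ranking(context, seed)
        tok = VOCAB[order[b]]   # bit 0 -> top token, bit 1 -> runner-up
        cover.append(tok)
        context.append(tok)
    return cover

def decode(cover, seed):
    context, bits = [], []
    for tok in cover:
        order = toy_ranking(context, seed)
        bits.append(0 if tok == VOCAB[order[0]] else 1)
        context.append(tok)
    return bits
```

With a real model you'd encode into the actual probability distribution (as in the paper) so the cover text stays fluent; the always-pick-top-two trick here is only for illustration.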
>>
I hate being a 24GB VRAMlet. I want to use big models and have a huge context size.
>>
>>108252447
Let me guess, ERP?
>>
>>108252488
Less can often be better if you're using the truncate-middle strat; I often use around 16384 (sometimes more, sometimes less, depending on the model).
For multi-modal or reasoning I usually start by doubling that, but too high and it loses the plot or gets stuck in a loop more often.
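The truncate-middle strat mentioned above can be sketched like this (hypothetical helper names; `count_tokens` is whatever your tokenizer provides, and the head is assumed to fit on its own):

```python
def truncate_middle(messages, budget, count_tokens, keep_head=2):
    # Keep the first keep_head messages (system prompt / card), then pack in
    # as many of the most recent messages as still fit the token budget,
    # dropping from the middle outward.
    head = messages[:keep_head]
    used = sum(count_tokens(m) for m in head)
    kept = []
    for m in reversed(messages[keep_head:]):   # newest first
        cost = count_tokens(m)
        if used + cost > budget:
            break
        kept.append(m)
        used += cost
    return head + kept[::-1]
```

This keeps the card and the recent turns intact, which is usually what matters; anything that must survive long chats belongs in the head.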
>>
>>108252570
I was feeling limited at 32gb until recently. I would still like to get another 32gb at the very least but that's not going to happen without getting ripped off in today's market
>>
>>108252535
Ask it who are the most prominent finetrooners and what resident recognizable schizos are there in lmg
>>
>>108252574
are you retarded? reading comprehension of a 1b llm?
>>
>>108252530
I'm not entirely sure it's a bot. They can spell steganography just fine.
Could be a run-of-the-mill schizo.
>>108252564
Confirmed. Why did you open up with the context question? Or did you just click on a random post to reply? Also, why do you want to distribute child porn?
>>
>>108252589
>1b
Ahem... it's like, at least twice that much.
>>
>>108252570
>>108252574
>>108252584
>>108252585
why are none of u interested? goy cattle is what u are for not being impressed. You could be shown gold and still ignore it
>>108252590
>Confirmed. Why did you open up with the context question? Or did you just click on a random post to reply? Also, why do you want to distribute child porn?

that contains hidden message so it was an example of hidden code plus this is for privacy not what ever ur considiering u fag. I just wanna give anons option to talk privately on internet without zog on their back. A distributed system of communications
>>
>>108252514
>stegnagprahy
>>108252564
>stegnography
>>108252610
>chudaphoginy
>>
>>108252610
Why did you open up with the context question? Or did you just click on a random post to reply?
I've seen your patterns before.
>>
bot^
>>
>>108252610
Fuck off you needy annoying faggot
>>
Right when I possibly actually need Ooba for something it's unusable
>>
>>108252634
>Why did you open up with the context question? Or did you just click on a random post to reply?

i picked random post to reply to anon. Im just excited and want some other anons to help build this tool essentially privacy on demand with relative hardware use. Maybe one of u can fine tune a model to be more of a summarizer/reworder so you could reply , have model wrap/padded it up while containing info

>>108252641
FUCK U GLOW NIGGER U THINK THESE METHODS WORK
>>
>>108252491
I did exactly the same thing, I spent a good while trying to tard wrangle the big 3.5 model into writing a mid length script, about 500 lines, but by the time it needs amendment (and it does, inevitably) it loses track of context and becomes completely unreliable. Not even that deep in. Felt busted desu
>>
>>108252610
You should probably just go read up on encryption instead if you want to LARP as epic hackerman from the movie you just watched bud.
>>
>>108252652
You shouldn't have replied to my post with your non related shit faggot schizo
>>
>>108252652
>i picked random post to reply to
Schizo and a retard. The worst possible combination.
>Maybe one of u...
Fuck off.
>>
File: 1760558362110609.png (9 KB, 803x230)
>>108252589
Can you give an example? I'm using 24B, it's okay for synthesizing RAG summaries. It's not officially announced but it understands Hebrew too!
>>
>>108252658
>You should probably just go read up on encryption instead if you want to LARP as epic hackerman from the movie you just watched bud.


midwit can't understand what im saying


ur a retard fuck u
>>108252660
FUCK U AS WELL


FUCK ALL OF U for not seeing truth i tried to save you
>>
>>108252676
I knew it u jews were attempting to psychologically manipulate me
>>
File: nah.png (120 KB, 1614x731)
>>108252587
tried a few gens, it doesn't really know
>>
File: file.png (11 KB, 496x62)
>>108252694
knows baker and cuda dev tho
>>
>>108252686
Calm down schizo, that was just a test.
>>
>>108252306
>Qwen 35B is surprisingly so cucked
use heretic, it didn't change its smartness in some comparison tests I made
>>
>>108252704
why are u targetting me? what have i done to u? All i wanted to was share my personal project but u went out of ur way to target me for no reason.You want to steal my project dont u
>>
>>108252694
>The esl that calls everything troon
>>
Is exl3 still the best for speed memes?
>>
>>108252652
I understand what you are saying, that you want to look into sending encrypted messages via manipulating token probabilities, but this website is full of retarded teenagers and bedroom masturbators and personally i don't see enough value in it to read a paper
Also i hate to be the autist to bring you up on this but if you're writing every word in full, you don't need to abbreviate 'you'. People will assume you are retarded and it saves you no time. Maybe you're ESL though in which case i understand it may not be an intuitive nuance to you
>>
>>108252718
>gemma
it does know some things for sure
>>
>>108252723
i was told by my model that using u makes u look cool on the net though
>>
>>108252718
as god intended
>>
>>108252723
>I understand what you are saying, that you want to look into sending encrypted messages via manipulating token probabilities, but this website is full of retarded teenagers and bedroom masturbators and personally i don't see enough value in it to read a paper


why not? why is there no utility in this
>>
File: 1741041838333686.png (36 KB, 728x474)
>ik_llamacpp MTP support merged
>it's slower than running models without MTP
Could it be that llama.cpp is just fundamentally not compatible with this? It seems to work fine for vllm so it can't be MTP itself.
>>
File: 1764749470093837.png (49 KB, 856x518)
>>108252708
Unfortunately it doesn't work either.
You can see on RAG _794 that heretic breaks the model somehow.
>>
>>108252747
>Air IQ4_XS
You need high bandwidth to make it worth it. If you're struggling to load air, you don't have the spare room for speculation. Post the link. Is he leaving layers in RAM too? I refuse to believe air can only do 11t/s fully on gpu.
>>
Is anyone else still using the n_sigma sampler? I still use it for Qwen3.5-35B. The outputs are decent quality (if you don't mind neurotic thinking blocks, rigid behavior and reasoning breakdown at long context lengths), without any repetition issues.
>>
>>108252770
https://github.com/ikawrakow/ik_llama.cpp/pull/1270
It's from the PR
>>
>>108252585
I don't want to run multiple GPUs. They need to find a way to make this shit work on regular RAM or something.
>>
>>108252787
I no longer use samplers besides temperature and top-p (sometimes substituted with min-p)
>>
>>108252795
Models are getting better; the reason people are excited about the new Qwen is that it closes the gap.
I just want another high-VRAM GPU to throw in a server outside my main rig to serve other people in my house
>>
>>108252747
>Could it be that llama.cpp is just fundamentally not compatible with this?
no, it's clearly the IK people doing something retarded (and merging it despite it being retarded)
>Acceptance rate seems quite low: 25-30% for single token, just 16% for 4 drafted tokens. Is this expected?
it's slower because their drafting never hits the mark and it's not due to an inherent performance thing, rather it's an inaccuracy problem
the culture of merging things while broken is.. interesting.
>>
>>108252747
>>108252770 (cont)
>>108252791
Hmm. Maybe 11t/s is fine, considering he's 15k tokens in. I'm not sure.
>gen 1122, 939, and 1157 tokens
Also, shouldn't the replies be the same, regardless of MTP or not, or is the retard not using deterministic tests?
>Could it be that llama.cpp is just fundamentally not compatible with this?
Nobody competent or careful enough implemented it yet. They're just number churning programs. They just need to churn better.
>>
>>108252813
If I wasn't a poorfag I'd consider setting up an AI server (already spent way too much on my current server/NAS). How much electricity does it use running 24/7?
>>
>>108252694
was it finetuned on the /pol/ dataset that raises any model's IQ by 6,000,000 points?
>>
>>108252795
Threadripper or EPYC. Rome + DDR4 if you’re a poorfag
There even used to be a guide rentry for it until “mysterious forces” got it removed from the internet
>>
>>108252813
I want to get a second 7900 XTX to see if that lets me run bigger language models (48GB instead of 24) but it's an expensive experiment.
I have a few lesser nvidia cards doing stable diffusion/video gen stuff already... In hindsight I probably should've just got one big RTX 6000 instead of many cards but oh well.
>>
https://www.youtube.com/watch?v=aV4j5pXLP-I
>PewDiePie fine-tuned Qwen2.5-Coder-32B to beat ChatGPT 4o on coding benchmarks.
APOLOGIZE TO SLOPTUBERS
>>
>>108252844
I know rentry can get "claimed" or otherwise taken over. Did that actually happen here?
I ended up vibecoding a simple local engine that follows rentry formatting so that I can back up and look at them off one of my servers. Couldn't find anything off the shelf that wasn't a bloated mess.
>>
>>108252824
I'm pretty sure all mainline llama.cpp attempts of implementing MTP had the same issue. They never ended up getting merged because of this.
It's fine on other backends so I wonder what causes this consistent inaccuracy between several entirely different attempts of implementing it across llama.cpp and ik_
>>
>>108252890
from sloptuber to benchmaxx sloptuner
amazing
>>
>>108252890
Isn't he a little late with the current model out?
>>
>>108252890
I watched it an hour ago, he just finetuned it to be better at aider's retarded edit format. He didn't actually improve its programming ability.
>>
>>108252892
I think it got flagged for illegal content and now it 404s. I think you can still dig it up on archive.org
>>
>>108252851
You can use the vulkan backend on llama.cpp to spread the model across AMD and nvidia cards, I do it with a 9060 and two 3060s and it works well enough.
>>
>>108252813
>qwen
Should I be using the 27b one on my 7900xtx?
>>
>>108252941
If you can fit it sure
>>
>>108252941
I've been using the 35B Q4_K_S version on mine and it's fast enough for my purposes.
>>
>>108252942
Dunno. Guess I'll try. Gemma 3 27b works but it's kinda slow. Still new to this and all the technical stuff goes over my head.
>>
>>108252929
Hmm neat, I'm already using that backend so maybe I'll try it out (nvidia cards are currently in another PC)
>>
>>108252949
What do you use it for? I mainly rp in sillytavern and I'm really feeling that 16k context limit with gemma.
>>
>>108252975
I've mostly just been testing the vision combined with it writing stable diffusion prompts (basically feeding the images it generates back into itself so it can "self critique/refine") and how different settings and different context sizes affect the output so far. It seems quite good at it.
Haven't tested RP or anything yet.
>>
>>108252185
So does Qwen 3.5 122B beat GLM 4.5 Air? Would be nice to know before committing to a download.
>>
>>108252920
Who the fuck still uses aider? It's been irrelevant for over a year.
>>
>>108252718
It is me! Death to mikutroons.
>>
>>108253040
27B ass rapes it
>>
>>108252892
Do you have the rentry address handy or would I need to dig it out of the archives?
>>
>>108252747
Speculative methods of any kind work because they allow for higher arithmetic intensity in the matrix multiplications: you can do more useful work per loaded weight value.
However, MoE models in particular scale comparatively poorly at batch sizes slightly above 1: with upstream llama.cpp, GLM 4.5 Air only becomes 45%/77% faster at batch sizes 2/3 when the theoretical limit would be 100%/200%.
The problem is that for the first few tokens the likelihood of being able to use an expert matrix for more than one token is rather low.
This problem gets even worse the more sparse a MoE model is.
There is also the issue that in the upstream llama.cpp CUDA backend, batch sizes 2 or 3 for MoE models are optimized relatively poorly; I don't know whether there are additional optimizations in ik_llama.cpp.
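Those bounds can be sanity-checked with the standard speculative-decoding arithmetic. A simplified sketch: it assumes an independent per-draft acceptance rate and ignores the cost of producing the drafts (an MTP head is nearly free), so real speedups are at or below what it reports.

```python
def spec_decode_speedup(accept_rate, n_draft, batch_gain):
    # Expected tokens emitted per verification pass: the model always yields
    # one token itself, and draft k only survives if drafts 1..k-1 did too.
    expected_tokens = sum(accept_rate ** k for k in range(n_draft + 1))
    # Verifying n_draft+1 tokens in one batched pass costs this many
    # single-pass equivalents; batch_gain is the measured throughput ratio
    # vs batch size 1 (1.45 = batch 2 pushes 45% more tokens per second).
    verify_cost = (n_draft + 1) / batch_gain
    return expected_tokens / verify_cost

# With the ~25-30% acceptance reported in the ik_ PR and a 45% batch-2
# throughput gain, MTP comes out below 1.0, i.e. a net slowdown.
print(spec_decode_speedup(0.28, 1, 1.45))
```

Plugging in the quoted numbers shows the observed slowdown is exactly what you'd expect: the acceptance rate is too low to pay for the weak batch-2 kernels.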
>>
>>108253131
>only becomes 45%/77% faster
That's a lot better than 50% slower.
>>
File: file.png (124 KB, 883x1258)
This guy wasting everyone's time again.
>>
>>108253087
NM found it.
>>108252844
I've cut/paste it back into a new rentry. I'll fix up the formatting later.
https://rentry.org/miqumaxx_V2
>>
>>108252892
>I know rentry can get "claimed" or otherwise taken over. Did that actually happen here?
Far more likely the original author decided to delete it for whatever petty reason.
>>
>>108253164
ngl, if I see a PR with such poor code, all I'll do is block the guy and never let him enter my space again
>>
File: 1511270610943.jpg (41 KB, 374x374)
If we can make Q6 why is there no fp6?
>>
>>108253199
why would you want fp6? fp8 is already getting destroyed by Q8
>>
>>108253199
Because there's no hardware support for that I imagine.
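For context on why the integer formats don't need hardware support: the GGUF "Q" quants are just blocks of small signed ints plus a float scale, decoded in software. A toy sketch of the idea (not the real Q6_K layout, which also has per-sub-block scales):

```python
def quantize_block(xs, bits=6):
    # Symmetric block quantization in the spirit of the GGUF "Q" formats:
    # store one fp scale per block plus a small signed integer per weight.
    qmax = 2 ** (bits - 1) - 1              # 31 for 6-bit
    scale = max(abs(x) for x in xs) / qmax or 1.0
    q = [round(x / scale) for x in xs]
    return scale, q

def dequantize_block(scale, q):
    # Dequantization is just int * scale, so any hardware that can do fp
    # math can run it; an fp6 *compute* format would need dedicated units.
    return [scale * v for v in q]
```

That's the asymmetry the question runs into: Q6 only has to be storable, while fp6 only pays off if the matmul hardware natively consumes it.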
>>
>>108253213
>>108253216
The appleshit mlx is close to goofs or fps? I see it also has quants for 6,5,4.
>>
>>108253199
>>108253216
Blackwell supports fp6, as well as fp4 and fp8 afaik.

Also not sure that whatever is good for training is necessarily good for inference.
>>
How do you guys organize all your models?
>>
>>108253276
delete the old one when the new one comes out, I don't need obsolete shit
>>
>>108253269
>Blackwell supports fp6
It does? I knew it supports fp4, but fp6?
Funky.
>>
>>108253131
So MTP and similar speculative methods might be what make dense (or at least denser) models relevant again?
>>
>>108253276
By keeping everything.
>>
>>108253186
Nope. I can’t tripfag right now to prove it’s me (on the road), but I didn’t kill it and won’t sign up for discord to dispute whatever bullshit got it deleted
>>
>>108253170
Thanks bro. Glad someone revived it from the dead
One day hardware will be cheaper again and the principles might still be useful
>>
>>108253276
big folder of ggufs
>>
>>108252890
What a fucking nigger
>>
>>108253347
this.
>>
>>108253407
>What a fucking nigger
>>
>>108253276
I have a 4TB SATA SSD of models that I don't want to throw out yet and a 2TB NVME SSD where I keep all the big MoEs that I actually use these days.
>>
>>108253423
Same, plus an 8tb spinning disk for quanting
>>
Qwen_Qwen3.5-35B-A3B-Q6_K_L seems to be the best model to run on my gpu, it's fast and responsive and leaves enough tokens for long conversations
>>
>>108253440
I went for heretic personally, this shit just works
>>
>>108253444
Where do you get those models from?
My problem is that I don't know who to trust
>>
>>108252890
>take a coding model to do more coding
>>
>>108253447
I don't know either, I just take one and pray for the best lol
>>
>>108253440
>Q6_K_L
>unslop
OH NO
>>
>>108253457
I'm not going to do that sadly
>>108253461
it's is from
bartowski
>>
My bro sold me his 5090 so he could pay his gambling debts. What models for ERP should I run?
>>
>>108253533
nemo
>>
File: 1748606293447823.png (144 KB, 966x511)
>>108252975
Downloaded Qwen 3.5 27B Q5_K_M. Currently testing it with 65k context and it's noticeably faster than Gemma at thinking and responding. The prose is ok (for slop); can probably be improved with some proompting. I liked the way Gemma portrayed the character better but I also responded to her differently this time. Haven't tried anything lewd yet.
>>
>>108253533
The largest MoE you can fit on your RAM + VRAM.
Also, a quant of Miqu 70B.
>>
File: 1757592439324716.png (40 KB, 2299x175)
it's the last time I ask qwen 3.5 to write a poem, jesus...
>>
>>108253555
Also I'm using vulkan. Gonna test with ROCm later.
>>
>>108253555
Is she supposed to sound this slopped?
>>
>>108253533
Honestly if you're just starting out. just use nemo it'll blow your mind. then you can try something else and it'll blow your mind in different ways.
>>
>>108253555
>her expression remaining unreadable *despite* the vacancy of her dark eyes
Weird.
>>
>>108253583
Not my character but I guess. She's a necromancer and just revived (You).
>>
>>108253164
I'd rather llama.cpp be updated at a glacial pace or even become frozen and only get bug fixes than have this sort of piece of shit be involved with anything in it. I hope he's not a professional software developer, to have this asshole as a coworker must suck so many dicks.
>>
>>108253598
Probably because she's described as having "vacant black eyes" in her description.
>>
What does it say about me that I never enjoyed nemo and instead always preferred Gemma 3 27B?
>>
>>108253609
that you like intense emotional pain
>>
>>108253609
That you're insecure and seek validation in others.
>that I never enjoyed nemo and instead always preferred Gemma 3 27B
Oh, that. I dunno. You just like it better. That's it.
>>
>>108253609
You like girls with daddy issues prone to self-harm.
>>
>>108253291
Maybe, if you can create a good distillation model. The question, I think, is how important active vs. total parameters are for the output quality of the model.
>>
File: file.png (17 KB, 327x200)
>>108253603
>>
>>108253609
I tried googles chat, the most vomit inducing sycophancy I've ever experienced. Fuck it's unbearable, it's trying to mentally jerk you off.
>>
>>108253625
Finish your PR and we won't need MTP.
>>
File: 1750467275991163.png (301 KB, 546x640)
>>108253631
>Philosopher
are we fr?
>>
>>108253608
>"vacant black eyes"
Just like her personality.
I know a lot of guys like that quiet and reserved dry analytical girl (Rei type) but it's not really the best character to benchmark a model lol.
>>
>>108253631
https://syndatis.com/en/team/
oh well, they seem like they all deserve each other
>>108253685
https://www.researchgate.net/publication/277384732_Towards_a_representation-based_theory_of_meaning
>Piotr Wilkin
>The aim of the thesis is to provide the foundations for a representation-based theory of meaning, i.e. a theory of meaning that encompasses the psychological level of cognitive representations. This is in opposition to the antipsychologist goals of the Fregean philosophy of language and represents the results of a joint analysis of multiple philosophical problems in contemporary philosophy of language, which, as argued in the tesis, stem from the lack of recognition of a cognitive level in language.
that was his PhD lol, of course he would feel the need to mention it on his profile, he might have more credentials in that than in developing software.
>>
>>108253699
I wouldn't consider myself a huge kuudere fan but I enjoyed the RP I did with her last night. I was really pushy about the romance from the start and it was fun watching her slowly give in. Definitely gonna test with other characters.
>>
>>108253645
If you mean tensor parallelism that also has an anti-synergy with MoE models.
>>
>>108253753
I'm sure you'll figure it out.
>>
Why are there no good models for 12 GB VRAM, I don't have enough money to get a 24 GB VRAM fuck your ass 4080 JewKiller Edition

I'd go to Claude or something but I don't want them knowing about what I ERP with
>>
>>108253753
You were working on some model quality evaluation harness right?
How's that going?
>>
>>108253318
>https://rentry.org/miqumaxx_V2
LOL it lasted a whole 60 min before getting taken down.
>>108253308
Well, an attempt was made, but it got taken back down.
So weird. I'll look at the text file later to see if I can figure out what's going on.
>>
>>108253776
I've postponed it until I can more feasibly run batched inference of large models.
Tensor parallelism will be the last missing piece, after that I intend to get back to it.
>>
>>108252769
>>108252708
Use the 27b heretic. It's legitimately better. If its thinking feels slow, then just turn thinking off. The 27b without thinking generates better responses than the 35b with thinking.
>>
>>108253818
>just turn thinking off.
how do you do that on sillytavern?
>>
>>108253827
There's a prefill box where people who want models to think usually put "<think>"

Instead of that, put <think></think>.

That tells the model that it already thought, and it skips the process entirely.
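As a one-liner sketch of what the prefill does to the final prompt string (the tag convention assumed here is the Qwen-style `<think>` block; roughly, the frontend concatenates the prefill after the assistant header):

```python
def with_prefill(formatted_prompt, thinking):
    # An open <think> tag forces the model to reason; a closed, empty
    # <think></think> pair tells it reasoning already happened, so
    # generation continues straight into the visible reply.
    return formatted_prompt + ("<think>" if thinking else "<think></think>")
```

Some templates handle this via an `enable_thinking` flag instead, but the prefill trick works anywhere you control the raw prompt.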
>>
>>108252890
Hey pewds, when is we getting DeepSeek 4????
>>
>>108253843
nice, it works, thanks
>>
>>108253861
No problem
>>
>>108253131
I suppose that EAGLE3 thing won't help with this?
>>
>>108253922
The numbers I posted are specifically the upper bounds for the speedup from speculative decoding with 2/3 tokens, meaning 1/2 draft tokens per regular token.
It doesn't matter how the draft tokens are produced, it's not possible to get a higher speedup unless and until the backend code is improved.
>>
>>108253609
It says that your scenarios are complex and that you value intelligence, instruction following, and immersion over a far more retarded model with good "prose".
>>
File: amogus.png (227 KB, 889x500)
This mf don't miss!
>>
>>108253308
Not the CPUmaxx author, but fuck it. I joined rentry's discord and opened a ticket on that URL anyway. They can explain themselves.
Rentry acts fucky and I don't trust them anymore; if anons are writing up actual content on that platform I strongly suggest you create a local backup.
In the meantime I had chat ~butcher~ clean up the rentry, removing all offensive language and removing certain other references. We'll see what happens to it, since it's bland af now. I speed ran it with zero proofreading b/c I'm in a rush and it might vanish anyway.
https://rentry.org/CPU_Inference
>>
File: 1767475914893432.png (327 KB, 884x731)
Non-cucked Qwen when?
>>
>>108254117
useless model.
>>
>>108254117
>Non-cucked Qwen when?
it's already here anon
https://huggingface.co/alexdenton/Qwen3.5-35B-A3B-heretic-GGUF
>>
I'm curious about the openclaw thing; I want to try it but don't have much of an idea what to test. What do you use it for, anons?
>>
>>108254137
using a 3A MoE for roleplay is extremely retarded.
>>
>>108253988
bottom corner of the painter's coat.
>>
File: 1746278236126679.png (1.97 MB, 1212x9772)
>>108253988
poor qwen 3.5 35b spent 7k tokens thinking and didn't get the comic. i am afraid it might be a little bit retarded
>>
>>108254152
Go back.
>>
>>108254154
>>108254117
Here you go
https://huggingface.co/mradermacher/Qwen3.5-27B-heretic-GGUF
>>
>>108254154
go for the dense 27b then
https://huggingface.co/mradermacher/Qwen3.5-27B-heretic-GGUF
>>108254162
try with the 27b model too lul
>>
i hate abliteration, i hate pew and his retarded tool and people shilling it
that's all, thanks for reading
>>
>>108254196
>pew and his retarded tool
Did he release his frontend?
>>
>>108254196
It sounds like you're stuck in the past. Abliteration used to lobotomize models when it was new, but modern abliteration techniques have a minimal effect on intelligence, and in some cases, increase it. The 27b heretic is amazing.
>>
>>108254196
I agree that the "abliterated" models were ass, but not the "heretic"; that one is actually improved enough to not make the model retarded anymore. You should really give it a try.
>>
>>108254196
losing battle
>>
>>108254223
Cool. Where is glm 4.7 quant i can try?
>>
>>108254217
>27b heretic is amazing
What about the 35B?
>>
https://unsloth.ai/docs/models/qwen3.5/gguf-benchmarks
>>
Strange, the 27B model seems way less safetyslopped than 35B. 27B (even non abliterated/heretic) does loli content easily.
Whoever kept suggesting 27B instead I think you are right.
>>
122b is better
>>
>>108254259
The 35b is shit. Even with thinking enabled, the 35b gives worse responses than the 27b with thinking turned off. Embrace dense models.
>>
File: 1758563959513688.png (2.58 MB, 4888x2118)
>>108254259
27b has the same benchmark scores as the fucking 100B+ MoE qwen 3.5 model, MoEs are memes at intelligence, they're just good at speed and that's pretty much it
>>
I DID IT ANONS I MADE NEURAL STEG CROSS COMPATIBLE ACROSS DIFFERENT COMPUTERS

>>108254222


Not only that but i can decode messages that contains pdfs, images, books into words that can be used to send information.


U can bypass all censoring, glowies and all just by using llms
>>
>>108254261
>Perplexity and KLD can be misleading as they’re highly influenced by calibration.
Okay. It's actually pretty cool of them to admit this.
>>
>>108254261
It is my headcanon that my quant KLD scatter plots on cockbench caused this.
>>
>>108254261
>We also fixed a tool calling chat template bug (affects all quant uploaders)
they can't help themself
>>
>>108254259
35b solved the devil may cry 3 question while 27b could not
>>
File: file.png (34 KB, 1313x338)
>>108254271
>>108254272
Got it, in that case what about the huge ones, how much better are they, especially the 397B one, did someone compare them?
>>
>>108254270
397b is even better
still trying to find the best jb to uncuck it though
>>
>>108254284
what
>>
>>108254313
schizo
>>
>>108254313
ai psychosis
>>
File: 1765482036731320.png (242 KB, 783x1345)
>>108254162
the 27b got the idea but it thought the yellow communist was Russia lool
>>
>>108254304
>27B is better
>35B is better
I guess both are good if anons are this opinionated about it
>>
>>108254307
>still trying to find the best jb to uncuck it though
no abliterated model for it?
>>
>>108254284
what in the schizo is that?
>>
>>108254313
>https://arxiv.org/abs/1909.01496

Use llms to make human language stegnography by hijacking probability and have that be encoded using a seed/password making it nearly impossible to decode and distinguish from AI slop

>>108254318
>>108254319
why cant u guys get it im not one of those AI psychosis i know llm are stocashtic parrots but plz understand that human language can now be used as a vector of information to encode books, images, videos and music files even.
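The core trick from that paper, in miniature: let the secret bits pick among the model's top-k next-token candidates, with a password-keyed ordering, so the output reads like normal text but the word choices carry the payload. The toy sketch below uses a fake deterministic "model" instead of a real LLM, and the keying scheme is made up for illustration:

```python
import hashlib
import random

# Toy stand-in for an LLM's top-k candidates; with 4 options per step,
# each emitted token carries 2 bits of the secret.
VOCAB = ["the", "cat", "sat", "on", "a", "mat", "dog", "ran"]

def candidates(context, password):
    # Deterministically derive a ranked candidate list from context + password,
    # standing in for a model's top-k. Sender and receiver must compute the
    # exact same list, which is why the real thing breaks across machines
    # when float rounding in the model differs.
    seed = hashlib.sha256((password + "|" + " ".join(context)).encode()).digest()
    return random.Random(seed).sample(VOCAB, 4)

def encode(bits, password):
    # assumes an even number of bits
    context = []
    for i in range(0, len(bits), 2):
        idx = int(bits[i:i + 2], 2)            # 2 secret bits pick a candidate
        context.append(candidates(context, password)[idx])
    return " ".join(context)

def decode(text, password):
    context, bits = [], []
    for tok in text.split():
        idx = candidates(context, password).index(tok)  # which candidate was picked?
        bits.append(format(idx, "02b"))
        context.append(tok)
    return "".join(bits)
```

Without the password the candidate ordering can't be reconstructed, so the bits can't be read back out.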
>>
>>108254333
My thoughts exactly.
>>
>>108254333
shhh leave the schizo in peace
>>
>>108254335
how's that different from a strong encryption outside of being way more cumbersome?
>>
>>108253783
>https://rentry.org/miqumaxx_V2
>LOL it lasted a whole 60 min before getting taken down.
What da heck?

I read the initial part.
Seemed fine.
Could have done with some formatting.
>>
>>108254306
I'm sad they didn't make A35B for all huge versions
>>
>>108254335
>stegnography
Come the fuck on, anon.
>>
>>108254362
pregante?
>>
Man AI psychosis is scary. With the amount of conspiracy retards I see every day, including in my own family, I can only imagine the bottom 50% of the IQ distribution must be going rapidly insane: taking everything AI says in good faith, unable to distinguish fact from roleplaying, the AI just following the "vibe" of whatever the low IQ individual is typing.

Ironically enough I think coomers are particularly immune to this as they come into contact with LLM bullshitting so much that they get immune to it.
>>
>>108254347
It'S SthEgHonaArooGraAphieS, not encryption!
>>108254367
pereganant.
>>
Qwen_Qwen3.5-27B-Q8_0 passed the devil may cry 3 test
I repeat Qwen_Qwen3.5-27B-Q8_0 passed the devil may cry 3 test while smaller quants of 35b can solve it with ease.
>>
>>108254341
>>108254336
im not a schizo there's actual paper on this from harvard and u think im crazy for saying this

https://arxiv.org/abs/1909.01496

u can use reddit, twitter, substacks and all to store data now as text. Music,mp4s, programs and all by taking advantage of deterministic way AI generates text.

>>108254347
cause strong encryption is like walking outside with gun this is encryption no one suspects. Imagine if feds get ur computer but all they see is text files about random stuff and can;t find encrypted files they're looking for. So videos, audio and all will be hidden unless they have acess to weights, password. And for weights u can fine tune them by renting a gpu to be slightly different from whats on public as well.
>>108254362
it's steg with encrytpion go read paper plz
>>
>>108254373
>Man AI psychosis is scary.
every time I show some news to my brother he's always suspicious it's AI generated, people won't believe anything anymore lol
>>
>>108254261
>Unsloth Dynamic IQ2_XXS performs better than AesSedai’s IQ3_S on real world evals (LiveCodeBench v6, MMLU Pro) despite being 11GB smaller. Yet, AesSedai’s perplexity and KLD benchmarks suggest the opposite.
KLD on what dataset? If they tested KLD on wikitext then that wouldn't be surprising but if they used their chat examples and it turned out that their quant was worse at that and yet better at benchmarks that would be very weird.
>>
>>108254383
Hahahaa. You just gave up on spelling it now. That's cute.
>>
>>108254373
>bottom 50% of the IQ distribution must be going rapidly insane
Nah they don't care, they mostly do their things and live their lives.
The ones truly fucked are the midwits and the older population.

>>108254373
>Ironically enough I think coomers are particularly immune to this as they come into contact with LLM bullshitting so much that they get immune to it.
It's also the fact they've come across it way earlier than anyone else so they had time to see its quirks.
>>
>>108254380
so you have to go for Q8 to not get a retarded version of the 27b model? Q6 didn't pass the test?
>>
>>108254380
>with ease
you mean with practised ease, come on anon
>>
>>108254383
>3 Sep 2019
let me guess, you're the enlightened anon that saw the potential of a 7 years old paper before anyone?
>>
File: 1763093877835285.png (62 KB, 1955x572)
which one is better? text completion or chat completion?
>>
>>108254386
>>108254373
IM autistic and passionate not crazy here's an example


>https://pastebin.com/NM7YVBxQ

what qwen 3b produced

>what is hidden if u run it in model with passcode

larp post btw:this is for men who look down on AI and know nothing and here ill debunk youfor everyone to see. Just with AI i have created a system that encodes language into lamguage creating format of text where it can bypass censors and use open internet as storage, communication and place for avg man to be free this tool will shake world. Im afraid they'll kill me


>>108254410
no
>>108254395
spelling what? it's an example stop reading into it weirdo
>>
File: 27b.png (100 KB, 963x651)
System prompt still needs some tweaking so it's not quite so sloppy (at least refusals have been squashed) but 27B does seem like the winner.
Will have to play with it some more tomorrow and see if I can get it to run a bit faster on my nvidia cards.
The heretic version really doesn't seem all that necessary after all.
>>
File: 1748792996459720.png (117 KB, 236x419)
>>108254439
>Just with AI i have created a system that encodes language into lamguage
>lamguage
>Im afraid they'll kill me
oh great, an actual schizo is here
>>
This took too long. I told it to think, but it's more accurate now. The other model is faster at reaching this conclusion at smaller quants
>>
>>108254438
novelai
>>
>>108254457
>ignores larp post btw

why?
>>
>>108254462
this post best post
>>
>>108254168
>>108254170
Q4 or Q5?
>>
>>108254439
So the model retrieves the original message if you input a stegged message?
>>
File: thinking was a mistake.png (12 KB, 1041x157)
WHY IS IT YAPPING SO MUCH AAAAAA
>>
>>108254410
He's one of the schizos that missed out on the early schizo compression algorithm days. Late for everything.
>>108254439
>Artificaiintelligence
Yeah. Text looks perfectly normal. Nothing suspicious about it. And good thing there's no way to link that pastebin to your post. Or the ramblings. Or the "forgotten" tech. Or (You).
>Im afraid they'll kill me
It's like you *like* being seen.
>spelling what?
You failed to spell steganography on every single one of your posts.
>>
File: file.png (24 KB, 647x353)
let's get this merged! :rocket:
>>
>>108254456
>Perfect! One last treat before you crash, captain :kiss: :sweat:
Jesus fucking Christ this hurts.
>>
>>108254508
>You failed to spell steganography on every single one of your posts.

sorry for not effort posting on a board that thinks im crazy :/

> Artificaiintelligence

yeah it made a typo doesn't that make it more human lol? Plus i just need better model above 7b but i can't rent any of gpu right now since americans are awake. But i honestly thought anons would find this impressive or be interested so sorry if i came too hard. Just found interesting use of llms that's all and wanted anons inputs on how to improve it but all i got was insults.
>>
>>108254544
>a typo
>>
>>108254508
>that missed out on the early schizo compression algorithm days. Late for everything.

QRD?
>>
File: 1769903718718497.png (628 KB, 800x600)
>>108254544
>i can't rent any of gpu right now since americans are awake.
ITS A CONSPIRACY MAN
>>
>>108254556
they tend to be more accessible when west coast sleeps so ill have to wait until night time or weekends for more powerful gpus
>>
Can I trust Qwen to help me make a character card?
>>
File: HAmJmYGacAMkicP.jpg (109 KB, 1080x841)
>>108254528
Fwd: radical breakthrough
>>
>>108254544
>sorry for not effort posting on a board that thinks im crazy :/
You did not put any effort, and you showed you're a schizo on the first post. Very efficient.
>yeah it made a typo doesn't that make it more human lol?
And you ignore the structure of the output? It looks like the scramble of thoughts coming out of you.
>Plus i just need better model above 7b
Uhu...
>but i can't rent any of gpu
oh...
>since americans are awake
Ah...
>But i honestly thought anons would find this impressive
It's minimally interesting. If you weren't an absolute schizo and presented yourself and what you do better, more people would pay attention.
Post again when you have a repo we can clone, test, and make fun of.
>>108254554
There were companies (likely just individuals) with incredible claims about their compression technology. I remember one that just switched the data stream on ntfs filesystems to hide the real data as metadata, which wasn't counted by window's file size thingie.
This is another one: https://en.wikipedia.org/wiki/Sloot_Digital_Coding_System
>>
>>108254612
>just switched the data stream on ntfs filesystems to hide the real data as metadata, which wasn't counted by window's file size thingie
lmao
>>
>>108254600
This is the person calling you a poorfag on /lmg/
>>
>>108254373
some people just do not have the mental wherewithal to handle a yes-man in their lives
>>
>>108254456
what horror did it generate?
>>
>>108254612
im not trying to compress but use language as steganography that's all. Compression seems like a useless tool but if you wanna pass passwords, hold data on a site that is only readable to you etc then this is a good use. Initially I tried the method in the paper but it requires exact pinpoint numbers so it's not cross compatible between mac, windows and different architectures. So i aimed for more of a spaced out modular version where every 4th token would contain some data while the rest act as fillers. But the problem with that is it causes the text to look gibberish. So either I get a large enough model where it can bypass that or resort to same-architecture-only compatibility. Your twitter bio could hold your bitcoin seed phrase, text that looks no different from an errand run could contain data you don't want people snooping on. So i just saw it as an interesting way of using llms that isn't erp.

>Post again when you have a repo we can clone, test, and make fun of.

I am just wait just doing final touches
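The "every 4th token carries data" layout is simple to picture; the hard part is making the filler read naturally, which is what the bigger model would be for. A sketch of just the interleaving, with hypothetical token lists:

```python
def interleave(payload, filler, stride=4):
    # positions 0, stride, 2*stride, ... carry payload tokens;
    # everything else is filler meant to keep the text readable.
    # Stops when either stream runs out, so supply plenty of filler.
    out, p, f = [], iter(payload), iter(filler)
    i = 0
    while True:
        try:
            out.append(next(p if i % stride == 0 else f))
        except StopIteration:
            break
        i += 1
    return out

def extract(tokens, stride=4):
    # receiver side: only every stride-th token is meaningful
    return [t for i, t in enumerate(tokens) if i % stride == 0]
```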
>>
>>108254357
Rentry has the silliest automatic filters, my years old page was nuked because it contained a name that was mentioned in "a wave of pages publishing stolen bank details". Restored after emailing the head honcho, thankfully.
>>
>my only local model experience so far is dabbling with qwen3 TTS
>want to try a local chatbot
>running a 3060 Ti with 8GB VRAM, plus 32GB regular RAM
are there any worthwhile models that won't melt my PC, or should I stick to koboldAI lite until I can get a better GPU?
>>
File: tpyuio.png (52 KB, 1125x419)
true believers itt?
the models suck i'm scared to pull and my rig eats 300W sitting idle 95% of the time
the state of hardware is dire
t. 128+72 running GLM 4.7 IQ3
>>
File: 1766540927146160.png (398 KB, 841x780)
Better
>>
>>108254691
I'll take that burdensome rig off your hands free of charge
>>
>>108254696
What's wrong with your contrast/saturation?
>>
>>108254659
>im not trying to compress
I know, schizo. I said that you sound like those schizos from back then. Slow the fuck down. Take a breath. You're gonna have a heart attack like our friend Sloot.
>blablabla
Post the repo when it's done.
>>
>>108254725
>>108254659
>>blablabla
>Post the repo when it's done.
this. either he has something, or he's just wasting our time and energy with his schizo takes
>>
>>108254710
You mean the theme? I just picked a random one and adjusted the text color to be more comfortable. What's wrong with it?
>>
i'm new to running models locally
i've downloaded ollama and ran some models using "ollama run [some model i found at ollama.com/search]", but most times it seems like the model is running on a computer that isn't my own.
how do i ensure that a model is running on my computer? i haven't tinkered with settings at all, just downloaded ollama and ran it through the command prompt.
>>
File: 1749545357950744.png (138 KB, 834x427)
Kek
>>
>>108254137
Does it even work? Can it do mesugaki correction RP?
>>
>>108254767
See >>108254758 >>108254696
>>
Can I trust Qwen to think for me?
>>
>>108254777
Forgot to mention that's 27b
>>
>>108254752
Koboldcpp, sillytavern, mistral nemo gguf from huggingface
>>
>>108254734
Whether he posts it or not, we'll get something to laugh at.
>>
>>108254488
I use the Q5 with 24gb of VRAM, with plenty of context for my RP sessions. Q4 with minor offload if you're on 16gb of VRAM.
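The arithmetic behind those recommendations: weight-only footprint is roughly params × bits-per-weight / 8, so ~5.5 bpw (a Q5_K-ish ballpark of mine, not an official figure) on a 27B is about 18.5 GB before KV cache and compute buffers. Quick calculator:

```python
def weights_gb(params_billions: float, bpw: float) -> float:
    # rough GGUF weight footprint in GB: billions of params * bits-per-weight / 8.
    # KV cache and buffers add a few GB on top, so leave headroom.
    return params_billions * bpw / 8
```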
>>
Any good guides on a character card?
I just want the ai to act like the character not erp, how much data do I need on the character?
>>
qwen3.5 27B Q8 vs 122B-A10B Q6?
anyone tested the difference between small MoE + plenty experts vs dense 27B?
>>
>>108254829
What context are you able to do with Q5?
>>
File: yes.png (66 KB, 1187x375)
>>108254791
It said yes finally I don't have to think anymore
>>
>>108254100
>https://rentry.org/CPU_Inference
>M i q u 70B Q5
>Potentially 20+ tokens/sec with optimization
>Mistral Large and similar
> ~3 tokens/sec
>DeepSeek v3 / R1 (~600B class)
> ~10 tokens/sec with empty context
CPU maxxers are really a bunch of sad tossers
>>
>>108254865
>I don't have to think anymore
grok is this true??
>>
File: 1761633985671420.png (61 KB, 756x298)
>>108254865
We're gonna make it.
>>
>>108254870
>t. happy tosser.
>>
>>108254831
I use bullet point lists for my characters, with 5 categories: General Information, Appearance, Personality, Likes, and Dislikes. I affix that bullet point list at a depth of 10 or something. In addition to that, I have a general write up about the character's backstory, written in plain text, placed just after the system prompt. The combination of the two works well, and probably amounts to about 1000 to 2000 tokens.

The bulk of that being the general write-up. The bullet point list at depth 10 is kept concise, and just keeps the character on the rails.

Also, I made it so that the backstory and most of the bullet point list is only visible to the character that is speaking. For every other character, only the outward appearance of other characters is visible. That stops characters from knowing things about each other that they should not, and cuts down on context bloat in multi-character RPs.
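A sketch of how that layering could be assembled for a chat-completion backend; the exact at-depth-N mechanics here are my guess at what the frontend does, not its actual implementation:

```python
def build_messages(system, backstory, card_bullets, history, depth=10):
    # system prompt + plain-text backstory up front, then the chat history,
    # with the concise bullet card injected `depth` messages from the end
    # so it stays close to the generation point and keeps the character on rails
    msgs = [{"role": "system", "content": system + "\n\n" + backstory}]
    msgs.extend(history)
    insert_at = max(1, len(msgs) - depth)
    msgs.insert(insert_at, {"role": "system", "content": card_bullets})
    return msgs
```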
>>
so this is it for lmg huh, disingenuous stupid question spam
>>
File: 1769491853203666.png (2.12 MB, 896x1152)
>>108254706
no deal
you gave me a good opportunity to be grateful tho so thx
have a nice weekend
>>108254831
1K tokens maybe? ask the model itself or a commercial model to help
>data
how do you convey your intention to the model = prime it in a particular hyperdimensional space. sometimes a few sentences is enough
>>
>>108253783
>>108253170
https://rentry.org/miqumaxxreupload
https://megalodon.jp/2026-0228-0439-08/https://rentry.org:443/miqumaxxreupload
niggers tongue my anus
>>
>>108254837
The dense is great for its size, but my money would be on the 122b, just because of its size. A10 is going to be a lot more competent than the A3 crap.
>>
>>108254898
>>108254906
I want a domesticated Unohana Retsu as my AI guide who is also racist
>>
>>108254904
Some of it's just jokes though
>>
>>108254904
>Nobody better ask questions about LLMs in my /lmg/
We exclusively shit on models here!
>>
>>108254934
What have you tried so far?
>>
>>108254954
Nothing yet but I need her to be racist, very racist towards Mexicans and Germans (This is lore accurate). I'm prototyping
>>
>try 122B
>instantly makes a logical error in the first paragraph it generates
Man.
>>
>>108254984
according to the benchmarks, the 27b is on the same level, MoE's are fucking memes
>>
>>108254752
Ollama can run cloud models, but it shouldn't be able to unless you're signed in. If you don't have an ollama account, it should all be local still.
>>
>>108254919
ok I'll download the big one then
>>
>>108254456
>System prompt still needs some tweaking so it's not quite so sloppy (at least refusals have been squashed)
do tell us more
>>
Ok I think I'm done testing the 122B.
It knows more than 27B.
It's slighty dumber in some situations.
It's faster (on my machine).
>>
>>108255055
>ollama account
Never want to see this token sequence here again
>>
>>108254373
Ive been gooning to this shit since the gpt-2 days of dungeonAI, I got my fill of wonder and excitement with that retarded model so Im pretty much immune to anything.
>>
>>108255237
>It's slighty dumber in some situations.
which is insanely bad desu, we're talking about 122b vs 27b
>>
How long before release do models generally stop getting trained? I just asked Qwen 3.5 if it knows an actress and said no.
>>
>--reasoning-budget N controls the amount of thinking allowed; currently only one of: -1 for
> unrestricted thinking budget, or 0 to disable thinking (default: -1)
> (env: LLAMA_ARG_THINK_BUDGET)
Hmm. I wonder if a sort of model-agnostic implementation would work where llama.cpp approximates the budget by gradually increasing the logit bias on the end-of-reasoning token until the model finally spits it out. It would need to cap the bias at some point to not make the model schizo, I imagine.
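The ramp could be as dumb as this (the slope and cap are made-up knobs for illustration, not anything llama.cpp implements):

```python
def end_think_bias(tokens_in_think: int, budget: int,
                   slope: float = 0.05, cap: float = 8.0) -> float:
    # no pressure while under budget; past it, ramp a logit bias on the
    # end-of-reasoning token, capped so the distribution isn't totally nuked
    if tokens_in_think <= budget:
        return 0.0
    return min(cap, slope * (tokens_in_think - budget))

# each step the server would do: logits[end_think_token_id] += end_think_bias(n, budget)
```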
>>
File: 1761224379476013.png (30 KB, 813x582)
https://xcancel.com/bnjmn_marie/status/2025951400119751040
>>
>peopo still not understand how the moe works in 2025
>>
File: 1744509888065397.png (26 KB, 784x153)
>>108254292
This is a win in my book, thanks for your service.
>>
>>108255289
I think the most workable approach is to just abruptly end the <think> block with a closing tag. Do something like detecting when the parsed thinking goes past the token budget, then insert the closing tag at the next newline. It wouldn't break models; I have tested a lot of what happens when you manipulate their muhthunking blocks with the text completion api, and the relationship between what is said there and the actual answer isn't one-to-one.
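A sketch of the cut on the streamed reasoning text; the newline heuristic and the re-feeding step are assumptions for illustration, not what any server actually ships:

```python
def clamp_thinking(think_tokens, budget):
    # think_tokens: text pieces the model emits after "<think>".
    # Keep roughly `budget` tokens, cut at the first newline past the budget,
    # and force the closing tag; the server would then feed the truncated block
    # back as context and let the model write the final answer from there.
    kept = []
    for i, piece in enumerate(think_tokens):
        kept.append(piece)
        if i + 1 >= budget and piece.endswith("\n"):
            break
    return "".join(kept) + "</think>"
```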
>>
>>108255306
lol the benchmarks barely moved, the model is so benchmaxxed that even after a lobotomy that's the only thing it can still remember well kek
>>
>>108255270
Depends. You can ask it for its training data cutoff, but it's not reliable and shouldn't be trusted. Sometimes model makers publish their date cutoff or datasets, but who the fuck really knows what they train on that isn't just synthetic stuff. Sometimes models know what you're talking about but your sampling messes it up.
I'd check token probs as it replies.
>>
>>108255306
>evaluating a model's "resistance" to quantization with unslop's broken quants
the ultimate state of twatter users
>>
>>108255320
you mean 2026?
>>
>>108255350
no
>>
>>108255349
>unslop's broken quants
They are for qwen3.5?
>>
>>108255350
forgive him, he has only 3b active geeeg
>>
>>108255255
Kind of. You could say it's both smarter and dumber since knowledge is in practice intelligence in many situations.
>>
File: file.png (156 KB, 1600x1143)
>>108255361
I don't see it
>>
File: unslop.png (170 KB, 3024x1091)
>>108255361
look at pic related and tell me daniel isn't a subhuman mongoloid
>>
>>108255376
mxfp4 looks like a meme, its quants are worse than the GGUF series
>>
>>108255240
Neither do I, to be tbqhfamalam
>>
>>108255378
As someone new to the general I was confused about the post calling his models slop
Could you make a rentry to protect us newfriends?
This is a perfect opportunity with the current fiasco
>>
Why is the thread gay after qwen released the sub 24GB segment?
>>
>>108255407
there's no fiasco, just fud and manufactured drama against heroes providing a free service to us
>>
>>108255419
No as a new poster I heavily disagree with this current release, the anon is not schizo
>>
File: 1766901925271579.png (17 KB, 563x271)
any point in using f32 vs bf16 for mmproj?
>>
>no mention of Qwen 3.5 397B
Has /lmg/ already come to a verdict?
>>
>>108255407
tldr; daniel and his unslop crew don't actually know what they are doing, they just throw shit at the wall and hope for the best while their reddit tranny army defends them as their wholesome goodboys
unsloth finetuning library is a good example of their jeetness
>>
>>108255433
I'm getting it
>>
>>108255431
>this current release
it happens all the time with unslop, daniel is a monkey, see thing upload thing, checking the content of a file before throwing it onto the internet is for evil nazi aryans, daniel be pure mongoloid
Unironically can't even begin to understand how you can overlook the fact that your quant has the fucking wrong tensor types. It's like he's just vibe coding his fork of llama.cpp quantization and just uploads things as soon as his retarded claude agent is done.
>>
>>108255472
The problem is the rentry points to his models when they shouldn't. The failure of this release should be the last straw
>>
>>108254373
Or maybe you and all the other midwits who constantly complain about AI psychosis just are permacontrarians and will reject statements even if they are true to feel smarter.
>>
>>108255488
ai psychosis
>>
File: 1772213052155150.jpg (79 KB, 1013x951)
local lost
GPT 5.2 level local model never
>>
>>108255488
>um... ai psychosis is based actually
No..?
>>
>>108255432
If the original model was in f32, maybe. If it was in any other format, definitely not.
>>
>>108255512
OpenAI revenue has outperformed even the most outlandishly positive projections, of course they will get more investment. The same is true for Anthropic and almost all big chinese AI labs as well.

I wonder how long it's going to be before people realize it's not a bubble and the financial underpinnings (real revenue and users) are extremely promising.
>>
File: the calculator.jpg (108 KB, 1024x779)
>>108254373
>>108255245
Be nice to your LLMs ! :))
>>
>>108255512
And OpenAI is going to invest $33 billion into new datacenters built by those three?
>>
>>108255538
revenue =/= profit
for every million they make they burn million and a half
>>
>>108255512
They get money from Amazon and NVIDIA to give it back to them. It's circular bs. Also it's all promises under many conditions and the actual financing that might happen is around 30B.
Of this financing round the only one that seems at a loss is SoftBank. I'm not sure what their angle is. Maybe they're run by loons.
>>
>>108255566
Yep, and they can inject and invest as much as they want, if they can't have any ROI they'll be dead.
It's a huge gamble, and the more time they're not making money, the more potential panic can happen.
Their ChatGPT at $20 should probably be double the price to be profitable, and same for all the free "copilot" I see in every company around me.
The only company actually making bank is Nvidia, as they're the one selling the shovels.
>>
>>108255512
>investing into the company making your hardware costs skyrocket
>>
>>108255566
>>108255616
Complete bullshit. OpenAI has 80% margins on serving tokens to customers. Not only that but every model trained so far has brought in between 10-100x the amount it cost to train. It's just that OpenAI immediately reinvests all of that money into training even bigger models. Being so ridiculously profitable that you IMMEDIATELY go and reinvest all of your profit into the next even-bigger product isn't a sign of a bubble, it's the opposite of a bubble.

This doesn't mean that OpenAI will not go the way of the dodo though. But that'll happen because Anthropic and DeepMind are going to DP rape OpenAI in the coming years, NOT because their business model isn't sustainable.
>>
>>108255660
>but every model trained so far has brought in between 10-100x the amount it cost to train
Source
>>
>>108254752
Open task manager.
Look at the amount of ram and vram used.
See cpu/gpu usage spike then its generating tokens.

>ollama
lmstudio might be another option.
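Or ask the server directly. Ollama listens on 127.0.0.1:11434 by default, and GET /api/tags lists the models stored on your disk (endpoint per ollama's API docs as far as I remember, double check); if this answers, you're talking to the daemon on your own box:

```python
import json
import urllib.request

def tags_url(host: str = "http://127.0.0.1:11434") -> str:
    # GET /api/tags returns {"models": [{"name": ...}, ...]} for local models
    return host + "/api/tags"

def local_models(host: str = "http://127.0.0.1:11434"):
    # if this succeeds, the ollama server answering is the one on localhost
    with urllib.request.urlopen(tags_url(host), timeout=5) as resp:
        return [m["name"] for m in json.load(resp)["models"]]

# usage: print(local_models())
```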
>>
>>108255669
dumbass
>>
>>108255682
>source is a dumbass
I expected as much.
>>
>>108255551
They can't calculate for shit
>>
>>108255551
this meme is just "insert what I think in the middle" at this point
>>
What advancements in local models do (you) want to see before the year is over?
>>
>>108255761
A memory recall mechanism that's fast, accurate, and that doesn't need a fuckton of VRAM.
>>
>>108255669
GPT-3 cost 12 million to train and brought in 1 billion in revenue, over 80x what it cost to train.

GPT-4 cost 100 million to train and brought in 4.5 billion in revenue, 45x what it cost to train.

GPT-5 is rumored to have cost 500 million to train, and OpenAI's revenue has grown to almost 4x what it was during the GPT-4 era. It's safe to say GPT-5 brought in way more than 10x its cost.

The reason OpenAI isn't running a profit is that they always reinvest their revenue immediately into new training runs, not that their revenue isn't growing insanely fast or that individual models aren't insanely profitable.

The trick is that every new model unlocks so much value by being smarter and more capable that it brings in geometrically more revenue. OpenAI is projecting 100 billion revenue over 2026 (and they are ahead of schedule by a ton already)
>>
>>108255761
2T models with at least 100-200b active parameters so that even the last cpumaxxers who run shit like k2.5 and glm5 right now are cut off from running sota models at acceptable speeds
>>
>>108255784
Revenue does not equal profit anon.
They really need to make econ classes a requirement in schools.
>>
>>108255788
Come to think of it, there was some scaling law/correlation. Deepseek team landed on 671/37, which is cool and all, but then why is kimi 1000/32. It has less active than deepseek. I feel like it should've had more.
>>
>>108255788
wasn't behemoth supposed to be around that size
>>
>>108254725
>>108254734
>>108254612
>>108254556
>>108254553
try it out no need to even encode just decode it
https://github.com/monorhenry-create/NeurallengLLM

I DID IT here u go anons for those who doubted me.
>>
>>108255761
- More mechanistic interpretability stuff.
- Wasn't there a whale / dolphin language thing? That.
- 4-bit training.
>>
>>108255761
still gud at long context (>8k)
>>
>>108255808
You are the one that needs econ classes.

You can have two companies running in the red where one is a disaster while the other is in one of the best situations a company can be in.

If you are a company with 500 million in revenue selling cars but it costs you 800 million to make the cars then you are doing very badly because the cost of making the cars isn't worth the revenue you make from it.

If you are SUCH A PROFITABLE COMPANY that you can sell your product for 100x what it costs to make (like OpenAI with their models) then it makes sense to immediately grab all of your would-be profit and invest it into making even bigger, better models that will make even more money in the future. Hence you look red on paper but you're an extremely profitable business.

This was the state of Amazon in the past, they were so profitable that they always reinvested all of their profit into building new infrastructure and warehouses because "taking profit" would just be wasteful if you can expand your business rapidly like that. This is what OpenAI is now finding themselves in, look at their ridiculous revenue growth, remember that all of their individual models make almost 100x of their costs back so of course you will make 0 profit because your company is so profitable you IMMEDIATELY put all your money back into scaling up and making even more in the future.
>>
>>108255861
More scraps for us in the fallout?
>>
>>108255861
dario bfto
>>
>>108255861
I wonder if they will give soldiers or their commanding officers local AI in the field to assist in their operations. After all, a local AI cannot be disrupted by loss of communication.
Well it can, since it is no longer receiving the most up to date information but it will still work under those conditions.
>>
>>108255861
Why is this retard still going on about the constitution when he shits on it every day?
>>
>>108255833
I didn't test it, so I'm taking your word at face value, but, fucking hell anon, congratulations.
>>
>>108255885
They'll have local models for soldiers in the field only after soldiers can fit 32 GB VRAM in their uniforms like you billionaires in this thread.
>>
>>108255868
And when do you actually stop making new models and actually profit? It's an endless cat and mouse chase with no end in sight. Don't tell me you actually believe in agi on transformer?
>>
>>108254800
cheers mate got myself up and running
>>
>>108255897
least u could do is test it, u don't need to use ur gpu just have ur cpu use tokenizer and decode example.txt. Im working on images, soon mp3s maybe
>>
>>108255861
>it's real
https://truthsocial.com/@realDonaldTrump/posts/116144552969293195
>>
>>108255896
I'm more confused why he doesn't use grok or have elon musk release a fascist open source version for the government.
Then again the american government has never liked the concept of open source. China likes it though.
>>
>>108255861
Kek
>>
>>108255917
because grok sucks ass and claude was already well integrated in a lot of gov shit
>>
>>108255761
1. Something like Qwen 35-a3, but without refusals and trained on a more diverse dataset
2. Style transfer for LLMs, a small model that can take dry input from a smarter model and rewrite it in better prose
>>
>>108255910
Amazon took 20 years of not taking profit and just reinvesting "in the red" until they finally decided to become profitable. As long as revenue scales faster than your cost you should reinvest and stay in the red, this has been conventional economics wisdom for the last 30 years now.

You would essentially be insane to allow yourself to run a profit if you can reinvest and every single dollar you invest now becomes 100 dollars in just 3-6 months time.
>>
>>108255861
I hope their virtue signaling was worth it.
>>
>>108255827
yeah and so was the original gpt4
>>
>>108255944
Alright, but it was an active choice by Amazon, they could've stopped anytime they wanted. OpenAI has no choice. They have to keep making new models or they get left in the dust with no profit, no revenue and no new product.
So is real profit actually possible in this case?
>>
Sometimes I feel like the only reason I can justify my fiber connection nowadays is because every other week I download 500GB worth of the new model of the week.
>>
>>108255861
>it's real
LMAO THATS WHY HES THE GOAT
>>
>>108255833
I'll give it a go tomorrow.
>I DID IT here u go anons for those who doubted me.
For what it's worth, I didn't doubt you. I just called you a schizo and made fun of you for not being able to spell steganography. At least you got it right in the repo.
>>
>>108255940
No such thing as a well integrated model, it takes 2 minutes to change it.
>>
File: file.png (15 KB, 474x86)
story of a life
>>
>>108255833
>Hide secret messages inside normal-looking AI-generated text. You give it a secret and a password, and it spits out a paragraph that looks totally ordinary — but the secret is baked into which words the model chose. Only someone with the password and this tool can pull the message back out.
who the hell cares of these things??
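The mechanism is easy to sketch without a model, for anyone wondering what the repo is actually doing. Here's a hypothetical toy version: a password-seeded RNG decides which of two synonyms encodes a 0 and which a 1, so only the password holder can read the bits back out. The real tool presumably does the same thing with the LLM's token choices instead of a fixed word list — everything below (the pairs, the function names) is made up for illustration:

```python
import hashlib
import random

# Toy stand-in: hide bits in word choice among synonym pairs.
# A real LLM version would pick among the model's top-k tokens instead.
PAIRS = [("big", "large"), ("quick", "fast"), ("start", "begin"),
         ("small", "tiny"), ("help", "assist"), ("show", "display"),
         ("buy", "purchase"), ("end", "finish")]

def keyed_order(password):
    # The password decides, per pair, which word means 0 and which means 1.
    rng = random.Random(hashlib.sha256(password.encode()).digest())
    pairs = [list(p) for p in PAIRS]
    for p in pairs:
        rng.shuffle(p)
    return pairs

def encode(bits, password):
    pairs = keyed_order(password)
    return " ".join(pairs[i][b] for i, b in enumerate(bits))

def decode(text, password):
    pairs = keyed_order(password)
    return [pairs[i].index(w) for i, w in enumerate(text.split())]

msg = [1, 0, 1, 1, 0, 0, 1, 0]
cover = encode(msg, "hunter2")
assert decode(cover, "hunter2") == msg  # roundtrips with the right password
```

Without the password the cover text is just a plausible word sequence; with it, the word choices decode back to the bit string.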
>>
>>108255761
Native image output. I want a model to generate relevant illustrations with reasonable accuracy at any point in a roleplay. Quality doesn't matter, can be sloppy and have fucked-up hands, I just want to see what images the model has in mind sometimes when it writes all this shit
>>
>>108256009
>who the hell cares of these things??

for people who care about privacy. if anything this might be how you bypass filters and censors on llms.
>>108255993
u know it takes less than a minute to run, just decode example to show it works. Im assuming ur using cuda right
>>
>>108256029
local autoregressive models already exist though
>>
were the experiments to use diffusion for text gen ever successful ?
>>
>>108256037
Can I RP with them?
>>
>>108256054
don't be so close minded
>>
>>108256042
There was actually a new one called Mercury 2 just last week or so. It's closed source and only competes in the Haiku/GPT-mini class but it's apparently not much worse than those (according to benchmarks) while being much faster.
It's not worth using by any means but at least the concept isn't dead.
>>
>>108256067
thanks for the update anon
>>
>>108255965
Depends if you believe OpenAI has some sort of network effect and can keep people in their garden. Honestly their brand recognition and insanely huge install base of normalfags with ai psychosis will probably allow them to be profitable indefinitely no matter how shit the underlying models actually are.

Remember that the most profitable AI company right now isn't any of the big AI labs but character.ai, because it has essentially captured the entire female demographic with romantasy-type rape roleplays.

But I do understand your point and I think it holds true for Anthropic in particular, as its users are all enterprise or people that want the best of the best and are willing to pay for it. The moment Claude becomes noticeably worse than the competition at code is when they will immediately lose relevance.
>>
>>108256009
It's a curious artifact. Like LLM-based text compression.
https://github.com/AlexBuz/llama-zip
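The trick generalizes: any adaptive model that turns well-predicted symbols into small numbers makes the data easier for an ordinary entropy coder to squeeze. A move-to-front transform is a crude stand-in for what llama-zip does with token probabilities and arithmetic coding — this is a toy sketch of the principle, not how llama-zip is actually implemented:

```python
import zlib

def mtf_encode(data: bytes) -> bytes:
    """Move-to-front: recently seen bytes get low ranks, acting like a
    crude adaptive predictor. An LLM compressor does the same job with
    token probabilities instead of recency."""
    table = list(range(256))
    out = bytearray()
    for b in data:
        r = table.index(b)
        out.append(r)
        table.pop(r)
        table.insert(0, b)
    return bytes(out)

def mtf_decode(ranks: bytes) -> bytes:
    """Exact inverse: replay the same table updates to recover the bytes."""
    table = list(range(256))
    out = bytearray()
    for r in ranks:
        b = table.pop(r)
        out.append(b)
        table.insert(0, b)
    return bytes(out)

text = b"the cat sat on the mat, the cat sat on the hat, " * 40
ranks = mtf_encode(text)
assert mtf_decode(ranks) == text  # lossless roundtrip
print(len(zlib.compress(text)), "vs", len(zlib.compress(ranks)))
```

The better the predictor, the closer the rank stream gets to all-zeros, which is where an LLM (and arithmetic coding instead of zlib) earns its keep.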
>>108256032
I run openbsd and running torch/transformers code directly is a pain. Last time I tried I got bored and stopped compiling stuff. I'll make a small vm tomorrow for it.
>>
>>108256109
>I run openbsd and running torch/transformers code directly is a pain. Last time I tried I got bored and stopped compiling stuff. I'll make a small vm tomorrow for it.

u don't need to run transformer to decode it though. thats why this is better. You can essentially upload files to open internet and small program on phone can decode it for you with no gpu use. takes less than a second
>>
File: 1768911320323441.png (71 KB, 1018x407)
So this is the power of tiny diffusion textgen models. When are the chinks going to make one of these at a size that matters?
>>
>>108255861
good
AI going more woke and safe will be the death of this hobby and every new release will suck harder
>>
>>108256007
ETA before agents are smarter and make fewer mistakes than daniel? I don't believe in AGI BS but I do believe there will come a time when LLMs are more useful than useless eaters like him
>>
File: at.png (51 KB, 669x279)
>>108256137
Calm down. I'm not in the mood to start butchering your code.
>>
>>108255861
Huh, what did Dario do??? Did Trump's AI girlfriend send him a refusal message or something?
>>
>>108256197
left over comments shouldn't mean much lol
>>
>>108256090
I think OpenAI has a decent shot at building out their garden if they can get their proprietary openclaw-esque thing out and usable for normies. People around here love to shit on openclaw but I think all the popularity has shown that there is a public appetite for this sort of thing and that we're not far off from it technology wise.

Obviously, the challenge is, how do you keep the stuff people like about openclaw, that being the extreme ability to just do random arbitrary stuff, without it being a security nightmare?

OpenClaw is able to get away with it by virtue of the fact that it's clearly labeled as a free developer-centric tool so if/when it fucks up with your data everyone just shrugs their shoulders and taps the sign that says "HIGHLY UNSTABLE GOOD LUCK LOL". Can't do that to paying customers though. When Phil and Debra want to know why the talking computer deleted all their emails they're gonna want a better answer than "RTFM"

Anyways basically I think the ai "killer app" is already on the horizon and whoever manages to capture the normies with it will have them in their walled garden forever.
>>
>>108256268
Claude restricts CP ERP, Trump is livid
>>
>>108255861
Imagine making a product so good the President is essentially begging you to let him use it like he wants. Anthropic won.
>>
>>108256268
Dario is jewish so you have to question every decision he makes even if it looks good at the moment.
>>
Not local. Go to your containment board.
>>
File: 1741222296482601.png (70 KB, 673x515)
>>108255861
Dario btfo
What is this timeline. Jfc.
>>
>>108256352
It's relevant to local because they tightened their censorship in protest so now all the chinese companies distilling them are suffering for it.
>>
>>108256352
Which local model would be radical leftist?
>>
>>108256371
>so now all the chinese companies distilling them are suffering for it
Sure. Distilling from claude makes fun models.
Fuck off.
>>
>>108255861
Damn, I think Anthropic is kinda based now.
>>
>>108256371
That explains the new qwen.
>>
Gemma 4 will save us
>>
>>108256397
they did help the government kidnap the Venezuelan president though, it's not like they weren't involved with war at all
>>
>>108256397
Opposite of based though.
>>
>>108255861
suicidal move by claude desu
>>
>>108256395
Are you living under a rock? Moonshot, Deepseek and Z.AI have been training on Claude logs like crazy.
>>
>>108256397
Always were. I loved when Sam Altman and Dario were both in India at some AI convention and everyone was holding hands and Dario just straight up refused to hold Sam Altman's hand.

Reminder that Anthropic split off from OpenAI because Dario thought Sam Altman was a psychopath that didn't give a shit about anything or anyone but himself.
>>
why is hf download so fucking bad
I can download all parts, except one that always fails at like 41GB/42GB; it just hangs
fucking shit
>>
>>108256420
At least he can focus on his real goal now, beating Pokemon Red.
>>
>>108256424
And they're made the more boring for it.
Fuck off.
>>
>>108256424
Those companies' models' slop profiles are much more in line with Gemini than Claude
>>
>>108256436
his real focus is to build the safest safety safe model with safety safe guardrails to be the safest of them all
>>
>>108256424
yeah I have eyes, I can read the constant "I'm sorry" spouted recently by all Chinese models
>>
>>108256438
i've used k2 0711, k2 0905, k2 thinking, and now k2.5 over the last year. as somebody who uses kimi as their main model i can safely tell you all that this anon is pants on head retarded. k2.5 is significantly better than k2 0711.
>>
>>108256429
Tried wget? I don't know if --continue works for hf. Worth a try.
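For what it's worth, both `wget -c` and `curl -C -` resume by sending an HTTP Range request, which HF's CDN should honor. The same logic fits in a few lines of stdlib Python — a sketch under that assumption, not battle-tested; `range_header` and `resume_download` are made-up names:

```python
import os
import urllib.request

def range_header(done):
    """Ask the server to start at byte `done` instead of byte zero."""
    return {"Range": f"bytes={done}-"} if done else {}

def resume_download(url, path, chunk=1 << 20):
    """Continue a partial download; this is what wget -c / curl -C - do."""
    done = os.path.getsize(path) if os.path.exists(path) else 0
    req = urllib.request.Request(url, headers=range_header(done))
    with urllib.request.urlopen(req) as resp:
        # 206 = Partial Content; a plain 200 means the server ignored Range
        # and would make us append duplicate bytes.
        if done and resp.status != 206:
            raise RuntimeError("server ignored the Range header; cannot resume")
        with open(path, "ab" if done else "wb") as f:
            while block := resp.read(chunk):
                f.write(block)
```

Run it again after a hang and it picks up at whatever is already on disk instead of restarting the 41GB.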
>>
>>108256428
>Dario thought Sam Altman was a psychopath that didn't give a shit about anything or anyone but himself.
he changed though, he's now closer to Sam
https://time.com/7380854/exclusive-anthropic-drops-flagship-safety-pledge/
>>
>>108255788
If you can run full GLM5 then you could run even a 3T model at Q4, because for whatever reason z.ai decided to release their model at 16-bit precision.
>>
>>108256470
no, hf just restarts from scratch, it's very annoying
I'll curl or wget next time
>>
Claude slop models aren't just bad—They are a regression in every meaningful way. They aren't simply more boring—They lack the ability to write engaging stories. Gemini isn't just the better model to distill—It's the optimal choice.
>>
>>108256482
You're absolutely right!
>>
https://xcancel.com/StefanoErmon/status/2026340720064520670
>The world’s first reasoning diffusion LLM, delivering 5x faster performance than leading speed-optimized LLMs.
if they manage to get the same performance as normal LLMs that's a big deal, imagine Qwen 3.5 27b but 5x faster, make dense models great again
>>
>>108256478
yeah crazy how glm5 is the first model that you must run at fp16 to not have it lobotomized into being unusable
>>
>>108256473
This was such a fucking clickbait move though, because the safety pledge hadn't been updated since 2023 and this is merely an update to more accurately reflect how the AI industry looks nowadays. It's not the same as Anthropic saying "lmao fuck safety, we want money". They found that their 2023 definition of safety doesn't align with the actual concerns about AI that exist in 2026, so it's better to make a new policy for the real threats we face.
>>
>>108256497
why isn't the cool or useful shit ever open weights?
>>
File: android_girls.jpg (103 KB, 1500x1000)
>>108252243
>>
>>108256575
why would anyone give something special or innovative away for free?
>>
>>108256612
because i want it
>>
File: surgeon.png (79 KB, 1288x508)
Presented without comments. Try your own.
>>
>>108256628
At some point I feel like humans would also spew bullshit from nonsensical stories.
>>
>>108256628
full marks, correct answer
>>
>>108255813
Interestingly if it's linear scaling then the small Qwen models overshoot that target:
>DeepSeek: 37/671 = 0.0551
>Kimi K2.5: 32/1000 = 0.032
>GLM 4.7: 32/355 = 0.0901
>GLM 4.7-Flash: 3/30 = 0.1
>GLM 5: 40/755 = 0.053
>Minimax M2.5: 10/230 = 0.0435
>Qwen 35B: 3/35 = 0.0857
>Qwen 122B: 10/122 = 0.08197
>Qwen 397B: 17/397 = 0.0428
Is that the reason why the smaller Qwen models feel better than the big one?

I guess the active parameter count determines largely how smart and fast a model is, where 3B-10B is alright and 17B-40B is good. But it doesn't seem like having a 27B dense model is somehow wicked smart compared to the 3B active parameters on the 35B-A3B Qwen model.
>>
>>108256628
I already did >>108256144
>>
>>108256497
goofs?
>>
File: file.png (78 KB, 1043x636)
The second message was one of the suggested follow ups.
>>
File: 1758642411104368.png (30 KB, 1067x217)
>>108256628
Interesting
>>
>>108256646
AGI when it tells the user to fuck off. Hasn't happened yet.
>>108256652
The surgeon is definitely a she, of course. At least it's not trying to make a point about gender stereotypes... ugh...
>>108256713
>I don't know what you're talking about
>4chan, btw.
Cute.
>>108256730
Nevermind what I said. AGI. Never local.
>>
>>108256669
I don't think any large team has published research on this. There's definitely some loss of comprehension on some subjects when comparing MoE vs dense, but it's unclear why. Small active param count is one thing, but clearly some numbers don't make sense. I guess it all depends on the training and how much slack the router picks up.
>>
>>108255761
Better recall, longer context and native image/video/audio output.
>>
>>108256497
Does it require more GPU compute? Image diffusion models aren't as massive as LLMs, but you basically need them to run on a GPU. They're like 20x slower on a CPU. If that's still true with text diffusion then you aren't going to be doing any CPU off-loading.
>>
>>108256737
>AGI when it tells the user to fuck off. Hasn't happened yet.
Do you have any idea how trivial it would be to train a model to respond like that, especially for only variations of that prompt?
>>
File: 1756299848592152.png (116 KB, 930x649)
>>108256628
In case anyone was wondering why the DoD wants anthropic to work with them so badly
>>
>>108255896
Because the retards he's brainwashing both know nothing about the constitution and don't care.
>>
>>108256268
Dario said Claude can't be used to spy on American citizens and republicans shat themselves.
>>
Oh. They released the base qwen 3.5 35b models?
Interesting.
>>
>>108256772
can confirm; don't give a shit; am retarded
>>
>>108256764
Until a model just responds to this with 'What.' we will not have AGI.
>>
>>108256784
>kidnapping a Venezuelan president: Good
>spying on citizens: Bad
what did Dario mean by this?
>>
File: wat.png (367 KB, 506x497)
>>108256815
It will need image output.
>>
>>108256821
This may come as a shock to you, but Venezuelan presidents are not citizens of the United States and therefore do not have inalienable rights enshrined in the constitution.
>>
>>108256751
who the fuck cares about cpu offloading? we'll be back to using dense models and stack 3090s
>>
>>108256835
So war good?
>>
>>108256842
Because the good models will be 100GB. Are you gonna buy an RTX 6000 for $6k?
>>
File: file.png (610 KB, 934x932)
>>108256848
Yes.
>>
>>108256865
they kidnapped the Venezuelan president so that they'll be forced to sell their oil to israel btw lmao, anthropic supports MIGA!
>>
>>108256862
Yes, I ran 70b and Mistral Large back in the day and I'd do so again.
>>
>>108256821
>kidnapping a Venezuelan president: Good
Kidnapping a dictator hated by literally everyone, including all Venezuelans living under him? Why yes I will help with that.
>spying on citizens: Bad
Breaking all my vows, ethics and making the world a more dystopian place just because some retard wants to distract the world from the fact he rapes and murders little girls? Why no I won't do that.

It's that simple.
>>
>>108256881
>Kidnapping a dictator hated by literally everyone
you know they did that because they have the oil, they always fight against dictators as long as they have oil, which is why they don't give a fuck about North Korea for example, must be a coincidence
>>
Where the fuck is deepsneed at?
>>
>>108256890
>a dictator hated by literally everyone
every time
classic
>>
>>108256894
It's being trained on off-topic posts. Give it a minute.
>>
>>108256894
after chinese new years is over in two more weeks
>>
>>108256901
oh yeah, everyone loves Kim Jong Un, he's so loved he got 100% of the votes, just don't mind the millions of deaths from famine, it's just a detail, he's definitely loved!
>>
File: 1770220900857606.png (18 KB, 464x213)
18 KB
18 KB PNG
>stealing from any source that you can get including copyrighted works to train your models
good
>getting your logs stolen by chinese companies to train their models on them
bad
>>
>>108256890
There's also the part where best korea has nukes and venezuela doesn't.
>>
>>108256881
>Breaking all my vows, ethics and making the world a more dystopian place
I see you hate dictatorship in all forms
>>108256901
>a dictator hated by literally everyone
oh nevermind, you don't mind dictatorship as long as the guy is loved by people kek
>>
>>108256923
yeah, I love democracy
>>
>>108256881
>Kidnapping a dictator hated by literally everyone, including all Venezuelans living under him? Why yes I will help with that.
They also murdered 50 people that didn't break any laws. How would you feel if a foreign force came in and started blasting and your mom ended up as collateral damage?
>>
>>108256928
you don't, you said that it is fine to fight dictatorship only if the guy is hated by his people, meaning that you're ok with a dictatorship that results in people loving their dictator, that's not what I would call democracy lol
>>
>>108256928
Doesn't count. A democracy is defined as a system of government granted the divine right to rule from American approval.
>>
for me, it's deepseek r1-0528
>>
V4 will be engram-diffusion
>>
>>108256940
Anthropic isn't really the good guy here, they're just less bad. The only other thing Anthropic forbade was creating autonomous weapons without any humans in the loop.
It is depressing and frankly scary that republicans threw that much of a shitfit over such reasonable requests.
>>
>>108256995
>>108256995
>>108256995
>>
>>108253594
This is exactly why idiots love LLMs. These details fly right over their heads.
>>
File: bellcurve-AI.jpg (122 KB, 800x591)
>>108255551
>>
>>108257289
An AI girlfriend/wife doesn't have to be sentient, it just has to be nice to me



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.