File: v-Sy-Zqs_400x400.jpg (24 KB, 400x400)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108389142

►News
>(03/16) Mistral Small 4 releasing: https://huggingface.co/collections/mistralai/mistral-small-4
>(03/11) Nemotron 3 Super released: https://hf.co/nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-BF16

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling
Token Speed Visualizer: https://shir-man.com/tokens-per-second

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png
>>
Ah. It keeps happening. Fret not. I'm here for you.
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png
>>
this is the thread that will witness v4
>>
>>108393015
>>108393017
mental illness
>>
>Introducing Unsloth Studio
https://unsloth.ai/docs/new/studio
Released 1h ago?
>>
File: 1764248103536079.png (139 KB, 914x1079)
Imagine releasing a V3 finetune in the current day lol
>>
>>108393017
>>108393015
thanks my niggas
>>
File: 1714958236930.png (743 KB, 1000x1024)
>>
Kurisu footjob
>>
>>108393026
>sunilkumar
>>
>>108393029
1
>>
File: 1766750793238495.png (321 KB, 1282x912)
>>108393026
GPT-4o killer
>>
File: slop.png (699 KB, 1000x1024)
>>
>>108393026
>it offers a superior grasp of Japanese language and culture
What if it is actually trained on hentai and this is the promised salvation for coomers?
>>
>>108393064
Only if you do your erp in Japanese
>>
File: file.png (70 KB, 658x759)
Curious.
>>
>>108393026
>>108393064
notice how they don't compare to the model it's derived from
original v3 was already pretty okay at nipponese, so I'm wondering if their shittune actually made it worse
>>
The sad state of 2026 has made me give grok 2 a try. I wasn't expecting much, but it feels very fresh; it gives me the same excitement I felt when first trying characterai years ago. Also the pp is lightning fast for a model this big, like 10 times faster than deepseek. Sadly I'm poor so I only have a 3090; q2_k_l runs at ~1.8 tps with one expert used instead of two.
Could anyone suggest better args? -ub 4096 -b 4096 -ngl 27 --parallel 1 --threads 24 --no-mmap -ot exps=CPU --override-kv grok.expert_used_count=int:1
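For reference, the full llama-server line those args came from (a sketch; the gguf filename and the -c value are made up, use whatever your quant is called):
llama-server -m grok-2-Q2_K_L.gguf -c 8192 -ub 4096 -b 4096 -ngl 27 --parallel 1 --threads 24 --no-mmap -ot exps=CPU --override-kv grok.expert_used_count=int:1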

tldr try grok2 if you can.
>>
>>108393026
What I'm actively learning right now is that finetuning older models has compatibility advantages. I've wasted a lot of time over the last few weeks trying to get things to work with Qwen3.5 models that would have been a hundred times easier if I had used Qwen3 instead.

Deepseek v3 is probably plenty good for most use cases. It's a solid model, everything supports it, and none of it is the kind of hack that boosts benchmark scores a bit to the detriment of finetuning and inference possibilities, etc.
>>
>>108393082
post logs
>>
>>108393082
elon pls you're not stable yet
>>
File: 0297593269519.jpg (174 KB, 2568x427)
>>108393093
Can this be considered a sane response?
>>
>>108393074
kek
>>
anyone here tried castration?
>>
>>108393123
Holy fuck a whole paragraph of non-answer
>>
>>108393082
Lower ub and increase ngl I guess.
>>
File: 1749806469397446.png (65 KB, 697x686)
>>108393123
>>108393131
>>
File: 20660.png (295 KB, 999x1002)
https://github.com/ggml-org/llama.cpp/pull/20660
Piotr continues arguing with the model.
>>
>>108393164
Actually pretty sure there are qi-gong practices that let you retract your balls and control your bladder, etc.
>>
>>108393200
man imagine doing all that, I just have my llm file pull requests, bug reports, etc. itself, and also have it deal with them. imagine reading all that shit yourself lmao
>>
>>108393123
Is there some kind of trope where the guy who pretends to be the coolest, the least bothered, and the least caring about other people is actually the biggest conformist?
>>
>>108393218
>I love destroying software
>>
>>108393200
What is wrong with your text rendering? Looking at that makes my eyes water.
>>
>>108393227
Wrong thread luddite loser
>>
>>108393200
Who has to win this argument in order to get llamacpp back to the state where it is not 5 times slower?
>>
>>108393242
Enjoy nothing working soon.
>>
File: 3004.png (52 KB, 420x349)
>>108393237
Looks fine here. Are you sure the images are not being squashed down to fit your resolution?
>>
>>108393291
What about same sex relationships*
>>
>>108393249
2 weeks am i rite
>>
>>108393291
We already went over this. Stop shitting up good kurisu threads. Take a shit in a mikutroon thread instead.
>>
What is the best model for controlling drones and killing kikes?
>>
File: sloppykurisu.png (149 KB, 1260x837)
>>108393052
why do people like her again?
>>
>>108393420
She is better than: no personality except loving black dicks Miku (male).
>>
Mistral Small knows about pretty recent (2024-2025) stuff. It's shit though.
>>
>>108392759
>>108392801
>>108392810
Wait. It's not a dense model?
>>
>>108393243
as soon as we get GLM MTP
>>
>>108393525
it's not, anon.
https://huggingface.co/miromind-ai/MiroThinker-v1.5-235B (base agent is Qwen3-235B-A22B-Thinking-2507)
>>
>>108393525
buddy look on the page, it literally tells you what the base model is on the right side.

on another note, what is the largest open weight dense model we have available? is it llama3 405B?
>>
>>108393566
They already released the previous version in January and, unless I missed it, nobody here brought it up. Chances of good sex performance are basically 0%.

https://huggingface.co/miromind-ai/MiroThinker-v1.5-235B
>>
>>108393356
>special interest good
>>
>>108393590
As the official kurisu baker I will once again state my policy, which hasn't changed since the original miku baker meltdown. If mikuspam stops there will never be any kurisuspam or kurisu threads. You need to make the first step to make this thread better.
>>
>>108393568
https://huggingface.co/RichardErkhov/FATLLAMA-1.7T-Instruct
>>
>>108393647
are you ok?
>>
>>108393587
supposedly 1.5 235B had degradation issues.
https://huggingface.co/miromind-ai/MiroThinker-v1.5-235B/discussions/3
>>
>>108393612
>>108393647
i dont give a fuck as long as the maho faggot doesnt come back
>>
>>108393666
>Also lower case post
>Satan trips
>Pretending that wasn't you
Yep it's confirmed.
>>
https://huggingface.co/mistralai/Mistral-Small-4-119B-2603/discussions/15
Which one of you VRAMlets posted this?
>>
>>108393779
you
>>
File: kingofvramlets.png (22 KB, 255x304)
>>108393805
bow down to your king
>>
>>108393026
I'm pleased to see a purely non-thinking model.
https://hf.co/Rakuten/RakutenAI-3.0/blob/main/chat_template.jinja
>>
>>108393813
Wrong thread to be flaunting second hand two generations old hardware.
>>
>>108393813
>3090s
>king
llmao
>>
>>108393828
cope
>>108393830
and seethe
>>
>>108393813
How does that compare cost wise to a single RTX pro 6000?
Half the cost?
>>
>>108393841
What do you run on this? Gpt oss 120B?
>>
>>108393842
less than 3k if you got them when the prices were good
>>
>>108393842
And double the power consumption.
>>
>>108393842
i got my cards for about $650 each back in october 2023 when they were getting cheap before everything went to shit
>>
>>108393779
When you go to the length of writing down all that bitching and posting it directly at a lab, why not just skip the bullshit and tell them you can't fuck this new model and you are sad. I can't fuck their new model and I am sad.
>>
>>108393856
qwen 3.5 122b runs at about 2800tk/s pp and 50tk/s tg on it. but i mostly just run kimi 2.5 and use the gpus for offloading layers and kv cache
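for reference, the general shape of that command (a sketch; the filename and regex are mine, adjust per model):
# keep attention and kv cache on GPU, push the MoE expert tensors to system RAM
llama-server -m kimi-2.5-q4.gguf -c 32768 -ngl 99 -ot "ffn_.*_exps=CPU"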
>>
>>108393827
You mean hybrid thinking
>>
>>108393860
>>108393864
That ain't too bad.
Neat.
Tangential, but how well do two sparks chained together work? What's the performance drop from going through the cable?
>>
>>108393889
i can't imagine you would get much performance considering they are bandwidth limited at 280GB/s or something like that. why not just use the p2p drivers and NCCL on consumer cards?
https://github.com/aikitoria/open-gpu-kernel-modules
>>
>MISTRAL 4 SMALL!
>TINY REGULATED MODEL FOR YOU
>look inside
>100 billion parameters
>200GB of VRAM to run it
Hmmmm, that's what counts as small to europeans?
>>
My wife's son has this gaming pc setup. What model should he run on it?
>>
>>108393936
qwen 3.5 4b
>>
>>108393936
mistral small 4
>>
>>108393936
Gemma 3n Heretic.
>>
>>108393945
a winrar is you
>>
File: yammy.jpg (187 KB, 832x1216)
►Recent Highlights from the Previous Thread: >>108389142

--Papers:
>108391945
--Mistral model quality decline blamed on EU AI Act restrictions:
>108391062 >108391074 >108391085 >108391161 >108391179 >108391094 >108391213 >108391279 >108391423 >108391845 >108391884 >108392037 >108392175 >108392251 >108392290 >108391919 >108391946 >108391956 >108392001 >108391336 >108391439 >108391470
--Mistral 4 underperformance blamed on dataset limitations:
>108392077 >108392078 >108392095 >108392135 >108392180 >108392296 >108392305 >108392310 >108392335 >108392352 >108392375 >108392623 >108392645 >108392731 >108392747 >108392798 >108392830 >108392895 >108392969
--Performance and quality issues with real-time voice2animation models:
>108392840 >108392923 >108392884 >108392958 >108393000 >108393063 >108393115 >108393191 >108393219 >108393231 >108393261 >108393292
--Debating which model generated a 4chan /g/ analysis:
>108392687 >108392706 >108392714 >108392724 >108392739 >108392758 >108392868 >108392882
--Epyc server performance with 2060 Super on Qwen 3.5 397B:
>108390163 >108390209 >108390229 >108390245 >108390276 >108390342 >108390360
--Nemotron dataset transparency and feasibility of uncensored derivatives:
>108391455 >108391467 >108391481 >108391566 >108391494
--PocketTTS.cpp performance profiling and optimization discussion:
>108392957 >108392973 >108393033 >108393110
--MiroThinker-1.7 benchmark results and comparative LLM performance:
>108392584 >108392759 >108392801 >108392810
--Mistral4 q8 performance issues and possible implementation flaws:
>108390418 >108390856 >108390876 >108391054 >108391058 >108392363
--Struggles enabling high-effort reasoning in Mistral-Small model:
>108390230 >108390299 >108390355 >108390502
--Teto and Miku (free space):
>108390122 >108390454 >108390543 >108390686 >108390690 >108390695 >108391085 >108391377 >108392816

►Recent Highlight Posts from the Previous Thread: >>108389635

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
>>108393958
Gun self defense good, data self defense bad.
>>
FUCK FUCK FUCK WHY IS CLAUDE DOWN AARRHRHRHHHGHJHHH
>>
>>108394058
Works on my machine (through OR)
>>
>>108394058
Local models?
>>
>>108394064
>through OR
qrd
>>
>>108394072
When it comes down to it, nobody uses those
>>
>>108394088
you owe me seks
>>
>>108394072
everyone uses claude to develop local models bozo.
>>
>>108391946
Wait, ministral is just a pruned small? Nothing new added? Why would I ever want to use it when I can just run small?
>>
>>108393958
I was wondering where you were.
>>
>>108394072
https://huggingface.co/Anthropic
>>
File: 20660_02.png (42 KB, 612x206)
>>108393200
From his twitter account.
>>
>>108393082
>tldr try grok2 if you can.
Okay I can give it a shot
Which quant for 92 rams and 48 vrams
>>
>>108394108
sucking cock
>>
>>108393291
Personally for me, if you're saying you're looking for a romantic partner on this planet, correct? Does it not make more sense to find a romantic partner of someone out there who you have not spent a significant amount of time of your life with, but you find someone who's the right person and you have that time period where you get to know them better, you have you find out what your similarities are what your differences are, you essentially find that one in a million who's right for you. However when it comes to who on Earth are we going to be romantic with if you're someone you've already grown up with someone who spent a significant amount of your time with a brother a sister a family member already, right? Doesn't really have much meaning, does it? In my opinion like that doesn't make any sense to me. Like of all the people who I'd want to spend the rest of my life with, it's not the people I spent my young years with, you know what I mean like, listen I love my family, but for me to have a romantic relationship with someone for the rest of my life, I want to have someone who I went out there and I found someone who was meaningful to spend time with. Who isn't me or isn't part of that core family that I've already have been forced to be with. Do you understand?
>>
>>108393164
the model needs to cultivate more
>>
>>108394172
>Personally for me
>>
>>108393886
What do you see that makes you think that model is hybrid thinking/non-thinking?
>>
>>108394172
Here. Have a (You).
>>
>>108394135
some q2. btw I managed 3tps after removing the ub flag and increasing ngl
>>
>>108394117
>using codex cli
Retarded beyond belief oh my god, imagine not making your own harness
>>
>>108394195
Are you disrespecting one of the biggest thinkers of our time?
>>
>>108393082
>270B parameters, 115B activated
Interesting. Worth a shot.
>>
>>108390785
Work or not, it seems like a simple concept is escaping you. You are still using the template even if you only do tool calls.
>>
>>108390785
>I have stepfun
Is stepfun actually good, and was it just a problem with the llamacpp implementation?
>>
>>108393026
Need to try this in my Japanese RP. Openrouter?
>>
>>108393200
That guy's posts read nothing like an LLM. Vibepiotr should do more vibecoding to pick up on the actual patterns instead of accusing people who try to fix his vibecoded parser.
>>
>>108394320
>That guy's posts read nothing like an LLM
kek
>>
>>108394058
Kill yourself
>>
How do I make models think in character
>>
>>108394405
Prefill.
>>
>>108394405
i can't help you anon. there's no fixing stupid.
>>
>>108394405
Use R1 0528 and roll a few times
>>
next step up from qwen3.5 27b? intelligence, facts, coding, creative writing, logic etc
>>
>>108394502
llama 405b
>>
File: lol.png (21 KB, 837x463)
>>108394516
>>
This halved my t/s.

https://github.com/ggml-org/llama.cpp/pull/20463
>>
I predict that none of the next 10 big releases (that are below 650B) will be better than 4.7
>>
File: keks.png (18 KB, 722x110)
>>108394574
>>
>>108394574
This doubled my t/s.

https://github.com/ikawrakow/ik_llama.cpp/pull/1413
>>
>>108394601
I don't care about schizo forks and I certainly don't care about fucking gptoss fucking 20b
>>
>>108394616
I don't care about slow and inefficient backends and I certainly don't care about fucking qwen fucking 35b
>>
>>108394574
When you correctly called out the missing timestamps, my system got crossed up. It is a known AI quirk to sometimes deny capabilities or outputs when challenged, and I wrongly defaulted to claiming I hallucinated the whole thing rather than just admitting I truncated the text.
>>
File: 1760805053999513.jpg (284 KB, 1920x1892)
I can't get unsloth studio to work
the UI is broken and I don't know why
>>
File: indeterminate detergent.jpg (211 KB, 1024x1024)
>>
>>108394680
loser
>>
>>108394616
ok api sister
>>
>>108394680
The what now?
>>
>>108394680
what's wrong? this is the level of quality and competence I've come to expect from daniel
>>
>>108394681
sisterfucker
>>
File: skylar-died.gif (3.8 MB, 320x284)
>>108393110
>>108393163
>>108393178
Okay, the repo is set up. I added a demo too so you can test it out for yourself (I removed all of the smoothing and inverse kinematics stuff so you only see the raw EMAGE output).

https://github.com/VolgaGerm/emage-onnx-export
>>
Can someone do the book reports for rakuten and mirothinker? I will post a cute Miku if you do that.
>>
>>108394599
Don't understand why they're making 1 trillion models that are only competing with 300 billion models
>>
>>108393020
unfortunately, this post isn't checked.
>>
>>108394785
They don't care anymore about local users, so now they can make the model as big as they like and just train to a compute-optimal number of tokens (~20 tokens/parameter) to save money.
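Napkin math for scale (my arithmetic, using Mistral Small 4's reported ~119B as the example):
119B params × 20 tokens/param ≈ 2.4T training tokens
For comparison, Llama 3 405B was trained on ~15T tokens, roughly 37 tokens/param, way past compute-optimal precisely because Meta still cared about quality at inference time.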
>>
>>108394761
Projection
>>
>>108394943
open models =/= local instances
the 100B+ models are for hosting companies to run inference; if you just so happen to be able to run one, that's a benefit for you.
case in point, look at mistral's website:
>Mistral-Small-4-119B-2603
https://legal.cms.mistral.ai/assets/d0b7b04d-dcb5-412d-bb45-c63b1475b805
>open source deployment
>Minimum required is one of:
>• 4xH100
>• 2xH200
>• 1xB200
>Recommended:
>• Disaggregated inference across 16 H200s: 8xH200 for prefill, 8xH200 for decode
>>
>>108394680
>vibecoded trash doesn't work
wow
>>
>>108395004
>16 H200s
That is like $600k to run mistral small?
>>
File: moonbench.png (95 KB, 1111x682)
>>
>>108395062
qrd
>>
>>108395045
yeah, it's so you can split prefill and decode onto separate GPUs
https://docs.modular.com/mammoth/disaggregated-inference/
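rough picture, my simplification of mistral's recommended setup:
request -> [prefill pool: 8xH200, compute-bound] -> KV cache handoff -> [decode pool: 8xH200, bandwidth-bound] -> tokens
splitting them lets each pool be batched and tuned for its own bottleneck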
>>
>>108395062
Distribution is all fucked, but god damn, trinity.
>>
File: file.png (252 KB, 400x400)
People thought that the antichrist that would destroy open source AI was Sam or Alex. Nobody expected it would be him...
>>
>>108395088
Looks like a uniform. Needs more black and red. And a funny symbol in it. Something like a cross with more sticks coming out of the ends.
I don't know man. I just see it.
>>
>>108395105
He is polish. We don't like those uniforms here I think.
>>
>>108395132
>polish. We
my condolences
>>
>>108395132
>We don't like those uniforms here I think.
So you just did a color swap.
>>
>>108395082
Is Trinity fixed and was it just the implementation that was bugged?
>>
>>108393070
You know I do, sempai
>>
>>108395173
>m
ngmi
>>
>>108395169
>Is Trinity fixed
Dunno. Was it broken? I don't know how >>108395062 is running it nor which one he's running.
>>
>>108395187
>Was it broken?
When it released it was mistral 4 small levels of retarded, with a much fresher writing style, but retarded.
>>
>>108395185
Thank ye kindly
>>
>>108395187
>>108395169
Trinity's problem is that after a certain context length responses become very short. Other times it's just dumb but fun
>>
>>108394680
you don't need it
if you're new, learn from these I guess: https://unsloth.ai/docs/get-started/unsloth-notebooks
but Unsloth are very shitty coders:
their training library breaks every other day; they had good ideas with their UD quants of R1, but their new quants are retarded; their chat template changes are pointless and usually make things worse; etc
you really don't want to run whatever the fuck this 'studio' is
>>
Wow. Mistral Small 4 really slows down with context.
Geez.
From 17t/s at 0 context to 11t/s at 3kish context.
>>
>>108395290
Please tell us, was Mistral 4 worth the disk space?
>>
>>108395293
Literally just launched it.
Gonna try running it through my app, see how it does with all the data extraction, but at this pace it might take upwards of an hour (qwen 3.5 35B takes around 20 min at 23t/s with batching).
>>
File: poor.png (84 KB, 816x619)
>>108393813
post system ram
>>
>>108395299
(You) me if you post a follow up
>>
>>108394782
Neat, thanks. Will try to play with it over the weekend.
>>
>>108395290
I'm pretty sure that's not how LLMs work. You sure it's not just "slowing down" because it's eating your VRAM as the context increases? What is your batching speed set to?
>>
>>108395322
You can't get a (you) for free like that...
>>
>>108395322
>>108395339
Stop trying to farm (You)s with meta-engagement like this, it won't work.
>>
File: rammaxxing.png (160 KB, 435x548)
>>108395308
>>
>>108395339
>>108395358
Dear R-edditors, have a free Gold Account, it's on (me).
>>
>>108395383
Thank you kind stranger.
>>
>>108393091
>finetuning older models has advantages compatibility-wise
how? do you not know how to finetune?
>>
>>108395336
>batching speed
Is that a thing that exists, or do you mean pp batch size?

>You sure it's not just "slowing down" because it's eating your VRAM
There's 2 gigs of free VRAM, just checked.

>I'm pretty sure that's not how LLMs work.
Isn't it the case that each new token has to attend to every past token in the KV cache, so that as tokens are generated the throughput goes down?
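Back-of-the-envelope (my napkin math): attention for token t reads the whole KV cache, so its cost grows with t; generating N tokens costs on the order of N^2/2 attention reads in total, which is why t/s sags as the context fills.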
>>
>>108395392
yeah
>>
>>108395392
What are these flags set to?
-b
-ub
>>
Who is supposed to run Mistral """"""small"""""" 4?
>>
>>108395398
Aight.

>>108395403
>-b
The logical batch size for prompt processing (prefill, but not like assistant prefill): the most tokens the server schedules per step.

>-ub
The physical micro-batch size: the chunk that actually goes through a single forward pass. -b gets split into pieces of at most -ub tokens, so it matters on single-GPU setups too.
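A sketch of how the two interact (made-up filename and values):
# -b 2048: up to 2048 prompt tokens scheduled per step (logical)
# -ub 512: those get chopped into 512-token chunks per forward pass (physical)
llama-server -m model.gguf -c 16384 -ngl 99 -b 2048 -ub 512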
>>
>>108395308
>>108395379
how much did these systems cost? let me live vicariously through better ram price times
>>
>>108395201
I have never seen a Japanese model. So much for the superpower in the minds of tranny incels.
>>
>>108395379
8 x $92.99 so $745 or whatever.
>>
>>108395308
Cards have never been cheap.
Ram was cheap.
But Nvidia, may Allah bankrupt that company, has never lowered its damn prices in all the years I've wanted to buy cards. High GPU prices are just a part of being a millennial or zoomer, like expensive housing.
>>
Anyone here willing in so much as when?
>>
The day has come to do the deed.
>>
Is the potato important?
>>
>>108395529
v4 is being released?
>>
how do you disable thinking on mikupad?
>>
>>108395573
<think></think>
>>
>>108395584
thanks anon
>>
>>108395588
some models don't need the opening <think>; you can read the jinja template for your specific model.
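e.g. with a ChatML-style template like qwen's, the prefill amounts to (a sketch):
<|im_start|>assistant
<think>

</think>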
>>
>>108395601
that's fine, <think></think> did work with qwen27B
>>
File: 43554.png (14 KB, 577x123)
>>
>>108395408
Investors.
>>
>>108395636
are you a boy
>>
>>108395636
*thoughtfully ticks a checkbox on my clipboarded notes*
cute boy sounds... good... but what about cute boy smells?
>>
File: file.png (13 KB, 402x432)
I want to erp. I've got 16gb of VRAM and 64gb RAM. I'm still using AriRP Nemo 12B and it's just awful sometimes so I write it alone instead.
What do you type into here, nowadays?
>>
>mistral small 4
gguf status?
>>
>>108395742
Mistral Nemo 12B
>>
>>108395742
That was a weird way to word it, sorry. I'm stoned. I'm looking for recommendations, please and thanks.

>>108395749
Damn. The one I'm using is a finetune of that, unfortunate.
>>
>>108395742
try glm 4.7 flash
>>
Dipsy Wipsy
>>
>>108395772
Touching Dipsy's wipsy in a tender embrace.
>>
I'm very depressed localbros, never has it been so OVER! there are no new models, China abandoned us, and now we are waiting for nothing.
>>
I've never been happier localbros, never has it been so GOOD! i'm still having a blast with Qwen 3.5 and don't even notice the time flying by and social credits being added to my account.
>>
>>108395896
I dunno, I'm happy with the sort of prose qwen3.5-27b creates merely by turning off thinking, eg:
"Married! Yes! We're married! A dirty, stinky, cum-filled wedding! Fuck, I love it! I love that you're hard again already! You're a machine! A cum-machine built just for me! Yes! I want to ride you! I want to be on top! I want to look down and see your face while I crush you with my weight! Even though I'm scrawny, I'm gonna feel so heavy on you! Heh... heh... I'm gonna make you watch me break!"
>>
>>108395915
I can't read this because you are abusing code tags. Shame.
>>
>>108395408
Weird guys capable of spending thousands on GPUs just to goon
>>
File: youmadeit.png (80 KB, 400x873)
Mistral Small 4 just smells like a Llama 4 failure. Meta had to advertise Scout and Maverick as "17B" to pretend they were any good for their size and that the company didn't suddenly pivot to datacenter-only models for their latest big release. I also bet it sucks because they similarly had to sanitize their training data due to lawsuits or regulations.
The outputs are so sloppy I wonder if it's mostly Nemotron datasets through and through.
>>
>>108393044
4o is dead, so it checks out
>>
>>108395742
Qwen3.5 35B-A3B Heretic or 27B heretic. Pick the latter if you can fit it into VRAM (even if quanted). If not then pick the former.
Make sure it's heretic or some other uncensored version.
>>
>>108393004
A human's eyes are not that large.
>>
>>108396185
You're absolutely right. It's a stylized version of a person - a common way of representing a human without being exact. You can find other examples of stylization in various art forms such as animation, drawing, and sculpture.
>>
>>108396200
This is a fascinating insight.
>>
File: 1759391408509321.jpg (500 KB, 3554x3815)
>>108393004
>>
Why are they- trepidating the Instantiations of They+, and putting solipsists above?
>>
File: image (83).jpg (757 KB, 3072x1024)
All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.