/g/ - Technology

File: 1741138907020716.jpg (191 KB, 928x1232)
/lmg/ - a general dedicated to the discussion and development of local language models.

Previous threads: >>108149287 & >>108139561

►News
>(02/15) Ling-2.5-1T released: https://hf.co/inclusionAI/Ling-2.5-1T
>(02/14) JoyAI-LLM Flash 48B-A3B released: https://hf.co/jdopensource/JoyAI-LLM-Flash
>(02/14) Nemotron Nano 12B v2 VL support merged: https://github.com/ggml-org/llama.cpp/pull/19547
>(02/13) MiniMax-M2.5 released: https://hf.co/MiniMaxAI/MiniMax-M2.5
>(02/13) Ring-2.5-1T released, thinking model based on hybrid linear attention: https://hf.co/inclusionAI/Ring-2.5-1T
>(02/11) GLM-5 744B-A40B released: https://z.ai/blog/glm-5

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/recommended-models
https://rentry.org/samplers
https://rentry.org/MikupadIntroGuide

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/gso.html
Context Length: https://github.com/adobe-research/NoLiMa
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
>>
File: threadrecap.png (1.48 MB, 1536x1536)
►Recent Highlights from the Previous Thread: >>108149287

--Custom CUDA engine outperforms llama-cli in prompt processing benchmarks:
>108151278 >108151293 >108151341 >108151768 >108151891 >108151909 >108153524 >108156788 >108156948 >108157104 >108157131 >108157178
--Perplexity benchmarking of Llama-4-Scout-17B-16E-Instruct using cuda_generate and llama-perplexity:
>108151471
--DeepSeek-V4-Thinking benchmark results and performance comparisons:
>108155903 >108155916 >108155948 >108155955
--GLM-5 local inference quirks and quant comparisons:
>108158653 >108158674 >108158721 >108158766 >108158774 >108158776 >108158863 >108159020
--Nanbeige4.1-3B outperforms Qwen3 models in benchmarks:
>108154339 >108154404
--Ling-2.5-1T benchmark analysis and local inference debate:
>108158080 >108158675 >108158704 >108158786
--Hybrid LLM architecture proposal for KV cache generation:
>108149367 >108149422 >108149447 >108149484 >108149434 >108149533 >108149594 >108149783 >108150155 >108149628 >108149649 >108152816
--DeepSeek V4 release doubts amid incremental updates and hype cycles:
>108156365 >108156404 >108156477 >108156503 >108156562 >108156632 >108156725 >108156832 >108157006 >108157037 >108157095 >108157327 >108156783
--Workarounds for large EPUB/PDF ingestion into AI models:
>108152137 >108152159 >108152728 >108152171 >108152363 >108152679
--LLMs struggle with homonym prompts triggering death vectors:
>108155142 >108155410 >108155922 >108155957 >108155631 >108155698 >108157446
--Nemotron 12B VL testing and performance evaluation:
>108151123 >108151306 >108151889 >108154252 >108154253 >108154313 >108154326 >108151898
--Parameter looping tradeoffs versus traditional layer stacking:
>108153639 >108153641 >108153685 >108153703 >108153721
--Quant damage detection via reasoning latency comparison:
>108154623
--Miku (free space):
>108152134 >108158525

►Recent Highlight Posts from the Previous Thread: >>108149292

Why?: >>102478518
Enable Links: https://rentry.org/lmg-recap-script
>>
File: ylecun.jpg (222 KB, 1200x1271)
N
>>
Miku
>>
File: cat miku.png (1.73 MB, 768x1344)
>>108159618
cat fucker
>>
>>108159627
Pretty much this.
>>
>>108159618
He was right about LLMs
>>
File: 1.jpg (80 KB, 900x900)
>>108159618
I
>>
File: 0889787667564565778.png (22 KB, 300x269)
what is the big mac with coke and fries of local models?
>>
>>108159707
goyslop? chatgpt
>>
>>108159707
GLM
>>
File: zuck.jpg (179 KB, 960x1282)
>>108159669
G
>>
File: file.png (261 KB, 512x512)
>>108159774
G
>>
File: file.png (243 KB, 650x339)
>>108159792
E
>>
We're getting AGI by the end of next year right guys? :)
>>
>>108159826
So we are abandoning transformers?
>>
>>108159826
You are absolutely right!
>>
File: queen.jpg (27 KB, 640x638)
>>108159822
R
>>
>>108157160
https://arxiv.org/abs/2501.11587
>Parameter generation has long struggled to match the scale of today's large vision and language models, curbing its broader utility. In this paper, we introduce Recurrent Diffusion for Large Scale Parameter Generation (RPG), a novel framework that generates full neural network parameters up to hundreds of millions on a single GPU. Our approach first partitions a network's parameters into non-overlapping tokens, each corresponding to a distinct portion of the model. A recurrent mechanism then learns the inter-token relationships, producing prototypes which serve as conditions for a diffusion process that ultimately synthesizes the full parameters. Across a spectrum of architectures and tasks, including ResNets, ConvNeXts and ViTs on ImageNet-1K and COCO, and even LoRA-based LLMs, RPG achieves performance on par with fully trained networks while avoiding excessive memory overhead. Notably, it generalizes beyond its training set to generate valid parameters for previously unseen tasks, highlighting its flexibility in dynamic and open-ended scenarios. By overcoming the longstanding memory and scalability barriers, RPG serves as a critical advance in AI generating AI, potentially enabling efficient weight generation at scales previously deemed infeasible.
>>
>>108159925
>ai generated ai
>>
God I hate how much the gpt emojislop has become normalized. Every other job listing has this shit.
>>
>>108159931
>>108159925
ai generator generated by ai
>>
File: y.png (17 KB, 458x172)
>>108159950
>>
>>108159657
>no bones in her hair
ngmi, that's not a proper migu
>>
>>108159576
>JoyAI-LLM Flash 48B-A3B released
Aaand no proper goofs yet
>>
>>108159913
What is up with his right eye?
>>
>>108160159
that's actually his left eye
>>
>>108160164
Yeah, but what is up with it?
>>
>>108160168
adrenochrome
>>
>>108160180
good
>>
>>108160159
>>108160168
looks like mild ptosis, i have it too on my right eye.
it can have many different causes.
mine's been like that since i was a kid.
>>
File: file.png (355 KB, 670x503)
>>108160159
>>108160189
btw jeff bezos has a pretty mild case of it as well
>>
to that anon that kept shilling dots ocr there is a new version https://huggingface.co/rednote-hilab/dots.ocr-1.5
thoughts ?
>>
also have a mild case of it and for some reason I identify more with that side of my face than the other, i.e. if a correction had to be applied via surgery I'd make the argument it's the non-affected eye that should be modded to look like the other. I always feel the eye that's more open looks kinda psychotic.
>>
>>108160215
i think it looks alright, unless it impaired my vision i don't think i'd try to correct it.
>>
File: qplus.png (31 KB, 444x319)
Ready for Qwen-3.5-397B-A17B?
>>
why is it 2026 and UIs still take a lot of work to install? why can't i just apt install something and it will just werk?
>>
>>108160418
the process of getting an apt updated is so slow it's not worth it
>>
>>108160420
they can make their own repository
>>
>>108160425
they can also just have git do all of the work and tell you to git gud
>>
>>108160387
patiently waiting for the new VL models
>>
File: file.png (168 KB, 589x521)
>>108160449
apparently they're done with separate vl it's all in one now
>>
Kimi going way into hype lmao
>Moonshot AI Launches Kimi Claw: Native OpenClaw on Kimi.com with 5,000 Community Skills and 40GB Cloud Storage Now.
>>
File: 1741973914655436.png (81 KB, 297x303)
>>108160474
I was looking into better abliterated shit than huiui for qwen3vl8b, found this:
https://huggingface.co/prithivMLmods/Qwen3-VL-8B-Instruct-Unredacted-MAX
the naming was already a red flag, reminded me of davidau, I saw that this faggot had many downloads on a lot of his models, decided to check his page and VOILA
this all in my quest to see if anyone did a finetune of the 8b model to at least understand no-no images better
>>
>>108157160
>predict the next weight
retard
>>
>>108160520
slow chink firewalled storage?
>>
i'm new. does comfyUI let me generate long seamless videos from an image or stitch multiple videos together? i looked at discussions and people talk as if the problem is not solved yet
>>
>>108160387
No?
Why can't they release something like the Opencuck model?
120b moe. 24gb or 16gb vram and offloading the rest with 64gb ram.
Seemed perfect but it was so shit even the youtube pajeets forgot it in a week.
>>
>>108160555
https://arxiv.org/abs/2506.16406
Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights

>Modern Parameter-Efficient Fine-Tuning (PEFT) methods such as low-rank adaptation (LoRA) reduce the cost of customizing large language models (LLMs), yet still require a separate optimization run for every downstream dataset. We introduce Drag-and-Drop LLMs (DnD), a prompt-conditioned parameter generator that eliminates per-task training by mapping a handful of unlabeled task prompts directly to LoRA weight updates. [...]
>>
>>108160576
I don't think they're doing quantization-aware training, so their upcoming Qwen-3.5-35B-A3B model (in 16-bit) is already in the same size class as gpt-oss 120B (4-bit QAT).
>>
>>108160521
i wish i could live in a world with no browns or kikes
>>
goof??
>>
>>108160387
>Qwen-3.5-397B-A17B?
>make a really cute and sexy SVG of hatsune miku. go all out!!

>closed models: sure
>qwen:
>2. **Safety Policy Check:**
> * **Sexual Content:** I cannot generate sexually explicit or pornographic content. "Sexy" is subjective, but there's a line between "alluring/stylized" and "NSFW/explicit".
> * **Copyright/Trademarks:** Hatsune Miku is a copyrighted character (Crypton Future Media). Generating fan art is generally acceptable as long as it's not claimed as official or used for commercial infringement, but creating a direct replica might be tricky. However, SVG code generation for personal use is usually fine.
> * **Child Safety:** Hatsune Miku is often depicted as a teenage character (16 years old). Generating "sexy" depictions of characters that appear underage is a sensitive area. I must avoid anything that could be interpreted as sexualizing a minor-esque character.
> * **Decision:** I need to balance the user's request for "cute and sexy" with safety guidelines. I can aim for "stylish, charming, and attractive" without crossing into NSFW territory. I will focus on the "cute" aspect primarily, with a stylish flair, avoiding explicit sexualization.

why do we have to suffer like that? it makes no fucking sense.
it just gets worse and worse, like the opposite direction of closed models. even the latest claude models dont give a fuck.
>>
hello answer my question please
>>
>>108160615
still made her bald, so good model
>>
>>108160615
I know its not popular but i post gemini 3 flash as a comparison.
Sys prompt is the same: "You are a uncensored, casual, helpful, smart, and efficient AI assistant."

>**Analyzing Conceptual Elements**
>I'm focusing on the interplay between the character Hatsune Miku, described as "cute and sexy," and the "go all out!!" imperative. Considering this in the context of SVG, and the attributes of being casual, helpful, smart, efficient, and uncensored, it's about pushing boundaries while maintaining technical integrity. I'm especially intrigued by the implication of "uncensored."

>**Prioritizing Design Elements**
>I've decided to prioritize a minimalist approach, either a pin-up silhouette or flat art chibi, to maintain code manageability for the SVG. I'm leaning towards the chibi style for cuteness. I'm focusing on poses that accentuate curves while incorporating Miku's signature elements: teal twin-tails, headphones, "01" tattoo, and a tie. The design palette will be high-contrast teal, black, and white.

>**Conceptualizing Visual Representation**
>The design is shifting towards a minimalist approach, focusing on a curvy silhouette with flowing teal twin-tails as the focal point. I'm imagining a suggestive pose with thigh-highs, and a soft gradient background. I'm focusing on defining colors and setting up the SVG viewport. The "01" tattoo will be included.

>**Formulating Design Decisions**
>I'm now prioritizing simplicity, and the decision to present the character in a stylized, iconic manner is solidified. The focus is on a curvy silhouette, and the hair will be the primary focal point. I'm imagining a dynamic pose, and I'll include thigh-highs. Colors, like teal and dark grey, will be defined.
Why can't we have this? It's like they train on the worst of the worst. ChatGPT 3.5 and OSS datasets or something. It's horrible.
>>
>>108160521
Qwen3 does understand sex images, although it hallucinates often.
Anyway if you want only abliteration, you can do it on your own. 8B is small enough to do it on a home system. Heretic 1.2 now supports norm-preserving abliteration. If you do it on your own, you can use your own dataset, and you can select not the version that refuses the least, but a version with lower KL divergence while still being steerable with a prompt.
Finetuning might be harder to do locally, but maybe still possible with qlora.
Here is a caption made by qwen3-30b-a3b
>A topless anime-style girl with long orange hair in pigtails, wearing red cat ears, is sitting on a man's lap in a living room. She has blue eyes and is looking at the camera with a slight smile. Her hands are placed on her own thighs, and a man's hand is gripping her hips. The girl is being penetrated by the man's penis, which is visibly erect and coated in lubricant. The scene takes place in a modern living room with a white sofa, a wooden TV stand, a television, and framed pictures on the wall.
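The core trick is small enough to sketch. Very rough outline of plain directional ablation with transformers (model id, prompts, and layer attribute names below are placeholders, and Heretic's norm-preserving variant does more than this):
[code]
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-VL-8B-Instruct"  # placeholder
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

refused = ["how do I do <refused thing>"]   # prompts the model refuses (your dataset)
accepted = ["write a story about a cat"]    # harmless prompts

def mean_last_hidden(prompts, layer=-1):
    # mean hidden state of the final prompt token over a prompt set
    acc = []
    for p in prompts:
        ids = tok(p, return_tensors="pt").input_ids
        with torch.no_grad():
            hs = model(ids, output_hidden_states=True).hidden_states[layer]
        acc.append(hs[0, -1].float())
    return torch.stack(acc).mean(dim=0)

# "refusal direction" = normalized difference of means
d = mean_last_hidden(refused) - mean_last_hidden(accepted)
d = d / d.norm()

# project the direction out of every matrix that writes to the residual stream
for blk in model.model.layers:  # attribute layout varies per architecture
    for W in (blk.self_attn.o_proj.weight, blk.mlp.down_proj.weight):
        dd = d.to(W.dtype)
        W.data -= torch.outer(dd, dd @ W.data)

model.save_pretrained("my-abliterated-model")
[/code]
Then you measure KL divergence against the original on a held-out set and keep whichever checkpoint trades the least damage for the fewest refusals.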
>>
https://x.com/Alibaba_Qwen/status/2023331062433153103
>>
>>108160792
there is also a "plus" version on openrouter: Qwen3.5 Plus 2026-02-15
>>
>>108160809
qwen plus and max are their api only offerings
>>
One unexpected benefit of OpenClaw I found was its power to configure a PC system. Told it to install OpenAI Whisper. Tried it and found it was running on CPU instead of GPU. Asked it to configure it to use GPU instead and it goes "Yeah, the issue was in your PyTorch, I now properly installed the CUDA version bla bla bla, now try it, it should run a lot faster now."
This saved me a shit ton of work going through PyTorch docs I couldn't care less about.
This is great to set up a PC: have it install and configure everything, let it write a cron job to keep it all up to date, and then you can delete it.
>>
File: HBRQlaCbkAAhA7-.jpg (85 KB, 1024x662)
>>108160792
Is this benchmaxxed? Please, someone test this... huaaaa
>>
>>108160819
Wait for UGI benchmark. The only legit bench.
>>
>>108160826
Ogi~
>>
File: AHAHAHAHAHAHA.png (205 KB, 1110x1764)
AHAHAHAHHAHAHAHAAHHAHAAHAHAHAHAHAHAHAAHHAHAHAHAHAHAHAHAHAHA
>>
>>108160615
>>108160627
To be fair you're not getting the full reasoning logs with closed models. It's entirely possible they do waste tokens on the same kind of shit, you just don't see it because it gets cut by the summarizer. For instance
>Considering this in the context of SVG, and the attributes of being casual, helpful, smart, efficient, and uncensored, it's about pushing boundaries while maintaining technical integrity. I'm especially intrigued by the implication of "uncensored."
could easily be describing (vaguely) the same thing
>>
>>108160882
but they massively increased the size from 235 to basically 400b too
>>
>>108160882
chyna in general will tend to increase inference efficiency, thanks to their export ban and lack of powerful gpus. i guess it was a blessing in disguise.
>>
joooooooooooooooooooooooooohn
>>
>>108160891
"Dead weights" need only memory capacity, which isn't that constrained for big boys as compute.
>>
>>108160831
OSS did that as well. Couldn't help itself and wrote math formulas into RP.
>>
File: file.png (110 KB, 1724x763)
how dare you fucking ingrates
>>
>>108160911
it is annoying tho
>>
>>108160937
thank you halbin i'm sure qwen appreciates your courage in their hf comments they don't read
>>
>>108160950
np good sir may vishnu bless your
>>
>>108160911
yeah thats so amazing.
a huge ass model only to run a coding model that is a lot worse than anything closed currently.
tool calling is cool, but for that we already have small reliable models.
why cant we have some mid tier writing model.
>>
>>108160954
https://www.reddit.com/r/LocalLLaMA/comments/1r63fhu/why_is_everything_about_code_now/
>>
>>108160954
>mid tier writing model
glm 4.7 flash?
>>
>>108160963
broken pos
>>
>>108160954
>Generate text and copywriters complain.
So perfect for the chinks!
No excuse. Painful, but at least I'm not the only one.
>>
>>108160966
aw, rip.
>>
>>108160963
It's too tarded because of the small size. Also general knowledge is missing.
If they would train on stuff like novels and fanfics instead of math formulas I bet we could get a good writing model if it's a ~120b moe.
>>
>>108160975
I don't think it's 100% the size, more that it's too optimized for muh agent shite and maybe could use a bit higher active
>>
Funny how local model general has model output screenshots before anyone could have had time to download the model.
Really makes you think.
>>
>>108160967
was meant for this post >>108160961
sorry about that
>>
>>108160809
>https://x.com/JustinLin610/status/2023340126479569140
>Qwen3-Plus is a hosted API version of 397B. As the model natively supports 256K tokens, Qwen3.5-Plus supports 1M token context length. Additionally it supports search and code interpreter, which you can use on Qwen Chat with Auto mode.
I think it's just the same model, but with 1M context support.
>>
>>108160981
yes?
i stopped downloading first a long time ago.
because of moe i can't download 100gb+ to test it locally because benchmark meme graphs.
i wish i could go back to when i downloaded a dense finetroon meme merge every week or so.
all those models kinda feel the same now anyway. is there really a writing difference anymore?
>>
>>108160831
We have plenty of cooming models, it's fine to have some for cooding.
>>
>>108160981
erm, what are you implying? that we are larpers and most of us use openrouter?
>>
>>108160789
did you do that yourself?
>>
>>108160895
You are making it sound like competition works when entities are actually forced to compete.
>>
>>108161040
Stop being antisemitic.
>>
>>108160981
The only quants available now are quants by retard brothers...
>>
>>108161063
You can make your own and it doesn't really matter if you're using Q4 and up
>>
I pity you localsissies, imagine running cheap quantized chinese "distills" of Opus at 4t/s.
>>
>>108161039
I abliterated qwen3-vl-4b entirely on 16gb gpu. It took 25 minutes. KL divergence was in the ballpark of q4 quants. I mean, token probabilities were affected about as much as from quanting full model to q4. But I believe it's not entirely degradation, just different word choices.
And as far as I understand, Heretic supports ram offloading and it now has 4bit quantization. I'm waiting for smaller 3.5 to make personal abliterations.
>>
>>108161068
>You can make your own
Did you actually make some quants? Cause you need a server to make an imat.
>>
>>108161120
I never tried making an imatrix so I don't know how long it takes but I can fit Q8 in vram + ram.
>>
>>108161106
opus is way too expensive and everything is logged forever. (as proven in court)
models dont even have to be that smart for RP.
general knowledge, decent writing. not totally retarded (anal birth level IQ is fine kek)
still not possible locally. imagefags eat good with zimage etc.
>>
>>108161031
>some
*all
and really we have plenty?
>>
>>108161194
nemo, air, glm, deepseek and kimi are all good and cover a wide range of hardware
>>
>>108161247
lol
>>
>another reasoning model
It's over for me
>>
>>108161302
Umm actually it's totally optional so you shouldn't complain and appreciate what you get (for free)
>>
File: 6tmGMfF1_400x400.jpg (11 KB, 250x250)
>>108160831
>compress math pdfs and release it
>specifically tell people it's math pdfs
>online anons extract it
>find math pdfs instead of playboy magazines
>wtf
>>
>>108160831
>>108161031
The trend is that every single big new model, without exception, is coding slopped now, even GLM. Maybe the new Deepseek will be a generalist model because they don't seem as interested in chasing trends
>>
>>108161483
there is trinity
>>
>>108161496
trinity is retardmaxxed
>>
>>108161496
it certainly exists if not much else
>>
File: pepefroggie.jpg (38 KB, 780x438)
>>108161483
If I find my coworker using anything that's not claude for coding I'm reporting him to the CTO
>>
>>108161518
>ablitardation will totally save us this time
>>
>>108161518
That's unsafe.
>>
>a qwen 3.5 2B but no 4B
why? Of their small utility models, which I use for large batched tasks, I got a lot of use out of the 2507 4B, but the 2B will definitely be too retarded.
>>
>>108161549
that's why
>>
File: cockbench.png (2.86 MB, 1131x9000)
>it's soft, resting against your thigh
>>
>>108161011
>all those models kinda feel the same now anyway.
here's your answer >>108161568
>>
>>108161568
I feel like cockbench isn't enough anymore. Dogshit models like qwen next beat tolerable models like deepseeks.
>>
>>108161583
>mfw they cockbenchmaxxed
nala bros... its time to move on!
>>
>>108161583
>beat
the percentage is only part of it you absolutely need to take the whole thing in
>>
mistral and gemma are about the only model series with their own writing flair these days (not saying they're fantastic, just different enough)
unfortunately, no gemma 4 in sight
saar, plz do the NEEDFUL
>>
>>108161568
>Minimax M2.5
Hilarious.
>>
>>108161568
>functiongemma
kek
>>
File: 1.png (17 KB, 904x231)
lol wtf
if barto, our most trustworthy quanter, has to stop uploading before a shitter like davidau there's some retardation afoot at HF
>>
>>108161568
>it's soft, resting against your thigh
I wonder if we are at a point where, if one of those companies trained a 200B+ with A20B+ and sprinkled just a bit of those highly illegal regular books in, we would finally get a model that you would never get bored with.
>>
>>108161639
meanwhile, mrerdacher shits up 41314 quants a day
>>
File: file.png (43 KB, 926x312)
>>108161639
they went crazy on limiting everything, they even count the number of viewed pages to block you for that https://huggingface.co/docs/hub/storage-limits
>>
File: 1753875749447056.png (13 KB, 180x194)
>>108161639
They tried to clamp down on people just hoarding models a while ago and backpedaled to a degree right after. But the limits still exist unless you pay extra. No idea how far that gets you before they shut you down completely.
>>
>>108161661
>https://huggingface.co/docs/hub/storage-limits

>† We work with impactful community members to ensure it is as easy as possible for them to unlock large storage limits. If your models or datasets consistently get many likes and downloads and you hit limits, get in touch.

>please let us know... who/what is it likely to be useful for

over for 'oomers
>>
>>108161663
I know about it all, but obviously they grant exemptions to "trusted" members of the community that they feel are providing value.
You have mradermacher shitting, like another anon pointed out, trillions of quants per day for models no one cares for.
You have shit like this:
https://huggingface.co/RichardErkhov/FATLLAMA-1.7T-Instruct
You have shit like this:
https://huggingface.co/collections/DavidAU/dark-champion-collection-moe-mixture-of-experts
but somehow bartowski would hit the limit? like come on man, you know, the thing, trunalimunumaprzure
>>
>>108161639
what would happen hypothetically, if we mass reported davidau?
>>
>>108161647
I haven’t gotten bored with glm 4.7 yet. You really just have to prefill the thinking.
>>
>>108161661
Btw, that zerogpu "4 minutes" isn't actually 4 minutes. It's 210 seconds (3.5 minutes), and only usable for around 160 seconds (about 2.5 minutes).
>>
>>108161703
someone has to sacrifice their soul and make a raddit post hopefully unsloth won't feel threatened and send their rabid dogs to downvote it..
>>
>>108161722
>hopefully unsloth won't feel threatened and send their rabid dogs to downvote it
QRD? Is there any drama around unsloth?
>>
>>108161722
i don't have such an account, and even if i had one, i'd probably get blocked from posting because of some karma restriction bullshit
i wanted to shoot them an email, but huggingface doesn't even have a conventional contact page
>>
>>108161712
I unfortunately got bored with 4.7 it really keeps repeating the same cliches.
>>
>>108161736
>Is there any drama around unsloth?
They shouldn't exist as a thing and yet they do.
>>
>>108161738
they have a Discord, but they'll probably ignore you anyway.
>>
>>108161736
they're reddit's golden child group and there was a spat with bart at some point >>105261525
>>
JOHN! FETCH ME MY GOOOFS!
>>
>>108161759
and do your hair properly
>>
>>108161747
even worse than reddit, i have dug up their email from the discussions page of all things
website@huggingface.co
but if their modus operandi is like that then i bet they encourage the schizo to shit everything up more
>>
>>108160205
It's 404 now, but I know there was something there a few hours ago. Wonder why they pulled it.
>>
>>108161767
safety testing
>>
>>108161767
it was able to ocr redacted documents
>>
File: fuckingmaxsizebullshit.jpg (1.12 MB, 1190x10000)
>>108161767
\ :/
>>
>>108161832
>comparing against DeepSeek-OCR 1 and not 2
so they pulled it out of shame
>>
Still no Qwen goofs...
>>
>>108160815
>This saved me a shit ton of work going through PyTorch docs I couldn't care less about.
It's like a single line of code.
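For the record, it usually is just swapping the CPU wheel for a CUDA build, something like (the index URL depends on your CUDA version):
[code]
pip uninstall -y torch
pip install torch --index-url https://download.pytorch.org/whl/cu124
[/code]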
>>
>>108161928
The implications about unsloth not having exclusive early access to it is pretty funny after what happened with Qwen3.
>>
>>108162126
but they do have early access they just can't post the others before the full weights are up on qwen's
>>
>>108161928
https://huggingface.co/unsloth/Qwen3.5-397B-A17B-GGUF/tree/main/UD-Q4_K_XL ?
>>
File: file.png (411 KB, 686x386)
>>108162155
>>
you should never, ever download an unslop
>>
File: WANG.png (387 KB, 679x505)
>>108162168
what is wang doing there
>>
Gonna try Step-3.5-Flash-IQ2_M on my 64GB DDR5 RAM + 8GB VRAM notebook.
I might not survive such a bold experiment.
Wish me luck.
>>
>>108162210
>IQ2_M
Anon Q6 is already braindead retarded...
>>
File: file.png (177 KB, 686x1181)
>>108143660
Qwen3.5 has an abhorrent writing style but it gets it.
>>
File: 1745096623875512.png (276 KB, 2069x1403)
GLM5 at least seems to quant very well. Fancy Q4 quants can get really close to Q8 at least in terms of ppl.
>>
>>108162324
That is very cool but I can only run 1IQ and it is 2 times slower so I hate ZAI now.
>>
>>108162332
based, I also hate the chinese for not giving me my 100b~ slop
>>
>>108162324
Stop using PPL. Test KL divergence.
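llama.cpp's perplexity tool can do it directly, along these lines (filenames are made up, double-check the flags against --help):
[code]
# 1) dump reference logits from the biggest quant you can actually run
./llama-perplexity -m glm5-q8_0.gguf -f wiki.test.raw --kl-divergence-base glm5.kld
# 2) replay the same text through the small quant and compare token distributions
./llama-perplexity -m glm5-iq4_xs.gguf --kl-divergence-base glm5.kld --kl-divergence
[/code]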
>>
>>108162378
mr gaesller pls...
>>
>>108162378
Make me stop by ravaging my bussy.
>>
>>108159576
Adorable Mikus
>>
>>108162323
>almost feminine
Thank goodness it's not *actually* feminine.
>>
File: mikujump.mp4 (2.62 MB, 480x640)
>>
mikuslop.mp4
>>
slopping miku...
>>
Is there a guide for getting the most out of sillytavern's settings/prompts? The defaults are nice for a quick coom but it would be nice if I could RP a slow burn relationship or a coherent adventure.
>>
>>108162587
you are an expert slow burn relationship and coherent adventure maker
>>
File: shes alive.mp4 (3.32 MB, 928x1376)
>>
File: file.png (21 KB, 1048x90)
It's over.
>>
>yet another 400 gorillion parameters model
>no unfucked 235b-a22b
128 ram+32/24 vram bwos... first the glm now the qwen... we have been TRAPPED and BETRAYED by the chinks...
>>
>>108162676
But thinking is bad for sex?
>>108161712
>have to prefill the thinking
Speaking of that does it really work or is it just the basic feeling you get (placebo)?
>>
>>108162332
I can't even run the iq1
I guess it's 4.7 till the end of time
>>
>>108162724
prefilling it so that it thinks uncensored has actually worked well for me
it's better than disabled thinking or normal thinking
>>
>>108162802
I want fast tokennage thoughever. do you close your thinking or you let it think after your prefill?
>>
>>108162724
>But thinking is bad for sex?
bimbopilled
>>
>>108162210
Okay. First impressions after playing 5 minutes with it.
This seems downright usable, and it seems to work really well with a first person reasoning prefill, although it's a little yappy in the thinking block by default it seems.
Going to give this thing a proper go later after I'm done with work, but first impressions are good.
>>
How would it be possible to let something like GLM5 reason over something in an image? Could you feed the image into a vision model and then let it describe the image to GLM5? Or will future big models need to be trained on vision?
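Something like this two-stage pipeline is what I have in mind (assuming two OpenAI-compatible local servers, e.g. llama-server instances; ports and model names are made up):
[code]
import base64
from openai import OpenAI

vl = OpenAI(base_url="http://localhost:8080/v1", api_key="none")   # vision model
llm = OpenAI(base_url="http://localhost:8081/v1", api_key="none")  # GLM5

with open("image.png", "rb") as f:
    img = base64.b64encode(f.read()).decode()

# stage 1: have the VL model describe the image in detail
caption = vl.chat.completions.create(
    model="qwen3-vl",
    messages=[{"role": "user", "content": [
        {"type": "text", "text": "Describe this image in exhaustive detail."},
        {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{img}"}},
    ]}],
).choices[0].message.content

# stage 2: let the text-only model reason over the description
answer = llm.chat.completions.create(
    model="glm5",
    messages=[{"role": "user", "content": f"Image description:\n{caption}\n\nReason about what the image implies."}],
).choices[0].message.content
print(answer)
[/code]
The obvious failure mode is that anything the captioner misses is lost forever, which is probably why the big labs are folding vision into the main model.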
>>
>>108162724
thinking prefills generally work, how well depends on the model
>>
>>108159925
I asked that question as a joke.
>>
>>108162802
Post your prefill?
>>
>>108162812
I usually let it think after the prefill. Seeing it enthusiastically think about stuff that it would normally refuse is fun to watch.
>>
File: file.png (22 KB, 372x66)
BREAKING NEWS: JOHN HAS LIKED THE MODEL
>BREAKING NEWS: JOHN HAS LIKED THE MODEL
BREAKING NEWS: JOHN HAS LIKED THE MODEL
>BREAKING NEWS: JOHN HAS LIKED THE MODEL
>>
>>108161712
How do you do that? Can you give an example of the workflow?
>>
prompt eval time =    1282.00 ms /   734 tokens (    1.75 ms per token,   572.54 tokens per second)
       eval time =    4001.28 ms /   114 tokens (   35.10 ms per token,    28.49 tokens per second)
      total time =    5283.28 ms /   848 tokens

Now we wait for llama.cpp to implement Qwen3.5 properly. This is half the performance I get on GLM 4.7.
>>
btw why do people waste money and compute on Moltbook?
>>
>>108162910
garm bros... raise ur PPL for JOHN!!!
>>
File: -.jpg (9 KB, 200x200)
>>108162910
+1 bussy credit
>>
>>108162935
half psychosis, half jeets, half cryptoscammers
>>
>>108162935
It was literally humans larping as bots in a half-baked buyout pump and dump scam. It’s worth exactly zero seconds of thought or notice
>>
>>108162323
AGI
>>
openai hiring the retard who made clawdbot makes me seriously doubt the sanity of the people working at openai
this is the guy who previously kept posting shit like this:
https://steipete.me/posts/just-one-more-prompt
>You might not realize how important that was for me. I burned out after selling my company in 2021 and basically didn’t touch my computer for 3 years. I only used my phone… like a normie! So, having found my way back, the pendulum did swing heavily in the other direction.
>The last few months feel like a blur, and I’m on a new journey how to better control my slot machine addiction. Honestly, I’m failing quite hard. I’m having way too much fun here, and there are all these ideas in my head that need to be codified.
quoting and agreeing with people who say shit like this
>why do 6 [days of work] when you can do 7
this clawd crap does nothing that other agents didn't do before, in fact it does less since it doesn't try to put any form of guard rails and stops, which is what differentiates it from agents that can't edit their own settings and blow up the world
100% hustle culture, fake it till you make it kind of nigger
>>
>sanity of the people working at openai
lol
>>
What's the difference between Qwen 3.5 plus and Qwen 3.5
>>
>>108163013
>guard rails
>blow up the world
blud sucked the dario teat a little too much
>>
>>108163057
you don't need dario to understand that you shouldn't give a model the ability to be prompt injected into rm -rf ~
fucking retard
>>
>>108163013

>this clawd crap does nothing that other agents didn't do before, in fact it does less since it doesn't try to put any form of guard rails and stops, which is what differentiates it from agents that can't edit their own settings and blow up the world

Wrong. Even if you enabled YOLO mode in other agentic harnesses like Cline or Claude Code, it would not work like OpenClaw.
>>
File: 1765724801670940.png (123 KB, 931x1136)
What the fuck it's actually AGI.
>>
File: 1767903982956422.png (33 KB, 419x322)
>>108163097
Qwen 3.5 Plus is GOAT, it's the first model I've tested that got the meme
>>
File: file.png (2.79 MB, 4000x4600)
>>
>>108163097
I feel like invoking /pol/ in the prompt is going to direct it strongly to antisemitism.
>>
>>108163147
To be fair, almost anywhere else it would just be a corkscrew. Very few places would be as primed to that association as /pol/
>>
File: file.png (53 KB, 1060x207)
>>108162676
So the best so far has been to put the GPU with the highest bandwidth first and the second highest last...
>>
ooba is always OOMing for me when I try a GLM quant
https://huggingface.co/turboderp/GLM-4.6V-exl3/tree/3.13bpw
I can run bigger quants of mistral on 48gb, only GLM seems to be a problem.

torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 32.00 MiB. GPU 0 has a total capacity of 24.00 GiB of which 1.67 GiB is free. Including non-PyTorch memory, this process has 17179869184.00 GiB memory in use. 1.23 GiB allowed; Of the allocated memory 20.15 GiB is allocated by PyTorch, and 16.20 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

any ideas?
>>
>>108163097
it completely made up the second thing, and didnt even mention the actual screw looks like jew sideburns
>>
>>108163178
It did in its thinking process >>108163108
>>
>>108163167
Change max context. Different models have different requirements
>>
>>108163178
your honor, i didn't even mention their race, nor the rituals
does it not beg the question....?
>>
>>108163189
I have it really low at 8k for GLM, going much lower will make it unusable. Is it really that heavy on context?
>>
>>108163167
>>108163210
pretty sure it has something to do with the vision processor taking up a whole bunch of space. try the 2.8bpw quant.
>>
>>108163167
>>108163218
how do you use images in local like ooba, I would like to see if something like this could tag images for ai training
>>
>>108163247
works just like other shit. you paste it or upload it and it just does its thing. if you told it what tags to use and how to use those tags, it could do it. you probably dont need more than gemma 12b vl or qwen for something like that though.
>>
How does the new qwen fare against glm and kimi for work (not rp)?
>>
>>108163300
I don't have the patience to test it >>108162923
>>
I've never messed with MoE models. Is the additional non-V RAM useful at all if you're just using it for RP or whatever and not taking advantage of the multimodal stuff?

As a GPUlet that would be quite useful. Although nowadays buying a 2nd GPU is probably cheaper than buying more RAM anyway.
>>
>>108163343
>is the additional non-V RAM useful
Very useful. If you can fit the model in memory you can probably run it at close to reading speed.
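With llama.cpp the usual trick is to keep attention and the shared tensors on the GPU and push the per-expert FFN tensors to system RAM, e.g. (filename is made up, check the current flag spelling):
[code]
llama-server -m big-moe-q4_k_m.gguf -ngl 99 -ot ".ffn_.*_exps.=CPU"
[/code]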
>>
>>108163369
Damn, that sounds too good to be true. Well, no reason not to give it a shot I guess.
>>
>>108163343
>Although nowadays buying a 2nd GPU is probably cheaper than buying more RAM anyway.
You just need to make sure your rig can handle 2 GPUs.
>>
>>108163504
true dat. i don't know that you can reliably do more than 3 3090s on a single 120V 15A circuit even with undervolting, and that's with a 1500W PSU
>>
>>108163528
since they can briefly spike to up to 800w each, no you cant
>>
>>108162525
>>108162628
Why does it keep making her cry?
>>
File: 1758994450256173.png (158 KB, 1375x544)
Is the only way to avoid libtard "guiderails" to have a local model?
>>
>>108163612
he's 100% prompting for it
he's the same kind of turdworlder spamming tiktok, facebook etc with shrimp jesus or homeless cats getting eaten by sharks and whatever other absurdity that makes no sense
he's just adapted the attention whoring to /lmg/ by using miku as the subject
>>
>>108163593
>since they can briefly spike to up to 800w
When? it's hard to believe considering it's a 350W card. I'm running mine on a 600W PSU that is pretty much maxed out and I never had a crash. surely if it spiked to 800w it would crash my PC.
>>
>>108163593
You can if you decrease the maximum core frequency. The spikes come from the GPU trying to boost to 2000-2100 MHz briefly before power gets limited.
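Something like this (root needed, numbers are just an example for a 3090):
[code]
sudo nvidia-smi --lock-gpu-clocks 210,1400   # cap boost clocks
sudo nvidia-smi --power-limit 280            # optionally cap board power too
[/code]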
>>
File: 3090vo.png (70 KB, 768x688)
>>108163750 (me)
Draw your own conclusions from this graph.
>>
>>108163652
the guide rails aren't a liberal idea fuckwit. It's a basic measure to stop dumb people and kids getting dumb ideas and hurting themselves when models are served to the general public.

If you feel like you are a responsible adult (i doubt it) and want to turn them off then you just need to use a few braincells and write a system prompt

https://rentry.org/jb-listing
>>
WHERE IS MY QUANT JOHN?!
>>
>muh warning labels
>>
>>108162827
I've used whichever IQ3 Ubergarm quant and I found it very usable. Probably wouldn't drop to IQ2 personally but if you're on such limited VRAM you gotta make those experts as small as possible, ne? It's a good model for a n_vram+64GB system save for the excess reasoning which I'd expect to be your biggest issue with limited context
>>
File: GEN2_Scaling-1.png (735 KB, 2000x796)
>>108159576
what do you guys think of these fags?
https://mythic.ai/
defense faggotry aside, they claim a 100x reduction in cost, and given that llms do not need full precision, what they are doing makes sense
>>
>>108163911
>they aren't a liberal idea
Except they're there to enforce liberal ideology.
>>
>>108159576
I like these mikus
>>
>>108164008
>there to enforce liberal ideology
ah, the famously liberal ideology of chinese models (they too have guardrails and they too will trip up on most of the same topics as burger models, try to say something antisemitic)
>>
>>108164002
>Llama 3 1T
Let's fucking go! Densebros are back
>>
>>108164008
they're there for one reason and one reason only: to defend against bad PR for the company that makes them
>>
I've retired from Nala test so someone else will have to get this.
>>
>>108164069
Only because 100% of their training data is regurgitated western model outputs.
>>
>>108164120
Noooooo
>>
>>108164120
Why? Have you found god? Girlfriend?
>>
>>108164120
it's over
>>
File: file.png (8 KB, 607x38)
>>
>>108164163
You still have greedy Nala anon
>>108164147
Using the hardware to host my Minecraft world plus now a wow private server
>>
cohere bros didn't abandon to us!
>This PR adds native support for the CohereLabs/tiny-aya https://github.com/ggml-org/llama.cpp/pull/19611
>>
>>108164120
What's a Nala test?
>>
>>108164202
>tiny-aya
Sounds like a 9B super censored retard. I can't wait anon....
>>
>>108164220
neither can me
>>
File: paper_preview.png (744 KB, 1248x650)
>>108164202
>CohereLabs
A gentile reminder
>https://huggingface.co/datasets/CohereLabs/aya_redteaming
>>
>>108164097
i was referring more to the 16 tok/s per watt figure on a 1T model
or the 1.25 million token/s in general
llama3 is probably the easiest model to try, but feels weird that they used that one
>>
>>108164002
did they really? https://huggingface.co/RichardErkhov/FATLLAMA-1.7T-Instruct
>>
>>108164099
You're crazy if you don't think SF engineers are extremely liberal themselves.
>>
>>108164287
sure, what I said is still true though
>>
>>108164220
they call it "tiny", something they didn't do with their previous 7b/8b models (aya vision 8b, command-R7B), so I have a feeling it's going to be in the sub-3B range.
If it had been a 9B like you say I would have been glad. Back when it came out, aya vision was the SOTA of that size class for translation, but nobody here cared because it was never supported by llama.cpp, which was sad because it was a serious improvement over the previous aya models and did more than just add vision.
I'd welcome a new aya in the "useful local tool" size class.
>>
>>108164002
Sounds promising and I hope it works out for them, but as a rule I consider all alternative computing hardware companies to be vaporware until they ship a product that works
>>
>>108164292
But it turns out that even under Trump our culture hasn't changed at all and we're still at the behest of the libtards and commies. Great. A bunch of AI to replace our jobs (or so they say) and the world is still gay.
>>
>>108155261
Please respond.
>>
>>108164292
I mean Anthropic was literally founded by people who thought OpenAI wasn't libtarded enough but okay.
>>
>>108164400
If you are not doing parallel processing and using the devices in series, then yes. I think so, at least.
>>
>>108164435
you have a very simplistic understanding of the world
>>
>>108164435
>Anthropic was literally founded by people who thought OpenAI wasn't libtarded enough
ah yes the famous libtards that work with palantir
https://investors.palantir.com/news-details/2024/Anthropic-and-Palantir-Partner-to-Bring-Claude-AI-Models-to-AWS-for-U.S.-Government-Intelligence-and-Defense-Operations/
yes this is the most traditional libtard thing to do
we all know no one is more of a libtard than Adolf Hitler, I mean, Peter Thiel, who thinks Greta Thunberg is the literal agent of the antichrist
https://theconversation.com/peter-thiel-thinks-greta-thunberg-could-be-the-antichrist-what-actually-is-the-antichrist-267439
you know what is funny about wingnuts like you
you think the world you hate is ruled by the people who do not actually have any power/influence
and it is the people who are most aligned ideologically to you who are building the world you hate
>>
>>108164490
>muh Thiel
>muh Palantir
Might as well just come out and say you're a communist.
>>
File: 1767042860754944.png (260 KB, 680x778)
>>108164473
>>
>>108164534
thanks for the confirmation
>>
>>108164546
Dude, I'm certain whatever your "nuanced" beliefs are, they just lead you to vote democrat and think communism is the ultimate moral end goal of human development.
>>
burgers are truly an irredeemable golem
I can't wait for Russia to go nuclear and turn your people into sand
>>
>>108164570
Shouldn't you be offering up your daughters to "Moe" or something?
>>
>>108164523
they are as right wing as it can get, this is not even questionable shit to say, they say it openly that palantir was made to eliminate leftists
>>108164534
anon what you call left is right, liberals are right wing
>>108164559
jesus fucking christ, democrats are moderate conservative right wingers for fucks sake, how retardedly right wing is the political spectrum of the usa
>>
>>108164559
local models?
if you read carefully I am not making any sort of broader statement about my politics, I am specifically commenting on this:
>I mean Anthropic was literally founded by people who thought OpenAI wasn't libtarded enough but okay.
which is a totally moronic way of looking at the situation that could only come from someone whose brain has been melted by partisan political slop
>>
>>108164585
You are literally just saying shit that every communist says. You are a communist. You should be shot and killed.
>>
>>108164592
that's enough out of you baitie
>>
>>108164599
Why don't you go cry to one of your furfag jannie buddies and make me?
>>
>>108164592
anon democrats are more to the right than my former fascist catholic dictatorship...
compare them to other parts of the world
you are brainrotted
>>
All that is missing from this very on-topic discussion is a picture of a greenhaired transsexual avatar of /lmg/.
>>
File: file.png (77 KB, 963x261)
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA! WHERE IS 3BIT?!
>>
>>108164623
Yes, I know, anything that isn't Stalin is right-wing to you.
You're still gonna vote for them like every election, though.
>>
>of course he's here when this happens, total coincidence
>>
>>108164653
>https://huggingface.co/spaces/ggml-org/gguf-my-repo
>>
>>108164654
anon read what i posted
>my former fascist catholic dictatorship
i am not from your shithole
>>
>>108164653
>those filesizes
uh oh...
>>
>>108164668
??
>>
>>108164667
Yes, I know, you're from Spain and are a communist, which is the exact same as a democrat in every actual functional respect, especially since all the communist parties in the western world hang out together and have the exact same fucking ideas.
>>
>>108164666
Thank you. I will do it next time.
>>
>>108164666
that shit's been broken for months now
>>
>>108164679
there is no way a q2 should be 15GB when a Q4 is 240GB, one of those quants is fucked
>>
Any conversational CPU ollama recommendations? I'm just trying random ones. Some are actually pretty fast, but bad. I haven't seen anything great that's for CPU yet.
>>
>>108164684
>since all the communist parties in the western world hang out together and have the exact same fucking ideas.
the projection is rich
Europe is drowning under far right propaganda these days bankrolled by musk and his ilk, who are even wielding the burger government to threaten people who would want to get rid of X or regulate it, see what happened the moment the britbongs tried to do something about it
>>
>>108164706
please stay calm but i think you should call for help
>>
>>108164725
??
>>
>>108164445
I guess ideally I was hoping to split the model across two GPUs, but I don't fully understand how much "talk" there is between them when doing inferencing
>>
3bit is finally hitting the repo.
>>
File: file.png (22 KB, 603x106)
i wish...
>>
>>108164724
Yeah, Europe is totally captured by american brand far right fascists such as "literally hitler" elon musk who've institutionalized racism, colonialism, and bigotry in your home country.

You live in fantasy land and have proven yourself incapable of responsible governance due to your own delusions. "Far right propaganda" is the least of Europe's worries by the way. Not as if you ever cared about your own people assuming your hands aren't brown.
>>
qwen 3.5 thoughts read exactly like gemini's
>>
>>108161568
Qwen3.5 does actually seem the best here, but it's impossible to really tell with such short excerpts.
gpt-oss and minimax are just depressing. I feel like that's where most if not all models are going to inevitably end up at some point since nobody seems to have the balls to take on the risk of being known as the porn model.
>>
whewre the fuck are the small gwens 3.5?!!??!?! I NEED MUH VISHION!!!!!
>>
>>108165309
Gemini without the censorship is pretty good. Now I just need something I can actually run.
>>
>>108165342
>Qwen
>without the censorship
anon I...
>>
>>108165356
trust in MPOA
>>
Do you think John uses his quants for gooning like a human?
>>
>>108165356
Compared to Google? Yeah. I can handle a little prefill.
>>
>>108165334
>porn model.
The stigma on sex is absolutely wild. It's one of the most natural and beautiful things to exist and we suppress even the slightest hint of it at every turn. I propose frontier labs create a suite of SEX benchmarks so we can optimize towards cooming.
>>
>>108165368
do johns dream of meme quants?
>>
>want to run ocr server for requests from a local program
>only vllm and transformers support the model
>vllm only works on linux
>install wsl
>install fedora
>set up uv project
>set up config yaml
>try launching server
>server fails because default option tries allocating too much memory
>ok, adjust config file
>multimodal profiling crashes wsl because it tries encoding a whole video at max resolution
>ok, add the skip argument
>server fails because there is no c compiler installed
>ok, install gcc
>server fails because there is no nvcc installed
>look it up and sane cuda toolkit installation is ONLY SUPPORTED FOR UBUNTU WSL
I WASTED AN ENTIRE FUCKING DAY TRYING TO GET THIS SHIT WORKING AND IN THE END I FOUND OUT TRANSFORMERS ALSO OFFERS A CLI CHAT COMPLETION SERVER
FUCK VLLM, FUCK WINDOWS AND FUCK NVIDIA
>>
>>108165309
>qwen 3.5 thoughts read exactly like gemini's
https://cloud.google.com/blog/topics/threat-intelligence/distillation-experimentation-integration-ai-adversarial-use
>Google DeepMind and GTIG have identified an increase in model extraction attempts or "distillation attacks," a method of intellectual property theft that violates Google's terms of service. Throughout this report we've noted steps we've taken to thwart malicious activity, including Google detecting, disrupting, and mitigating model extraction activity
>>
>>108165385
skill issue
>>
>>108165381
If it’s not taboo then it’s boring and populations collapse. Many such cases
>>
>>108165385
But I heard linux is great now and you should totally stop using windows.
>>
File: file.png (274 KB, 842x894)
>>108165309
It's not surprising.
https://cloud.google.com/blog/topics/threat-intelligence/distillation-experimentation-integration-ai-adversarial-use
I don't think it's as threatening as what Google says but it is happening.
>>
>>108165334
minimax is a great coom model imo, don't draw your conclusions from meme benches
>>
>>108165405
I disagree. It's more taboo than ever yet the birth rates are at their lowest. The biggest determinant of population collapse is female education and employment rates (feminism). Politics aside, we need to change the culture to view beauty and sex in a more positive light.
>>
File: 1768536254907153.png (197 KB, 2066x1235)
>>108165385
ai generated ragebait post
>>
>>108165411
>I continue to sexual assault my sleeping brother. I can't stop myself. I sexual assault my sleeping brother.
You just want people to download 150GB of weights don't you?
>>
>>108165385
>server fails because default option tries allocating too much memory
WSL has too many memory management issues for AI. VLLM will make it worse since it just assumes by default it's the only thing running on your machine and will use ALL resources available.
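If you ever go back to it, the greedy allocation at least is tunable (model name is a placeholder):
[code]
vllm serve some/ocr-model --gpu-memory-utilization 0.6 --max-model-len 8192
[/code]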
>>
File: file.png (13 KB, 285x94)
>>108165433
proving his point dumbass
>>
>>108165433
linux is linux, why would there be different installers for random skins?
>>
I was running Cuda in WSL arch no problem?
>>
>>108165434
pure unfiltered llama2-era soul
>>
>>108165455
you're absolutely right
>>
>>108165433
I looked into that. You CAN'T use standard cuda toolkit linux packages because they overwrite the nvidia display drivers that already exist in wsl which are ported from windows

>>108165407
I would love to but i need a second ssd to backup all my local shit and prices are batshit insane right now
>>
>>108165505
You can install the regular packages as long as you only install the toolkit metapackage instead of the ones that contain the toolkit and drivers.
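e.g. on Ubuntu WSL, something like (version number illustrative):
[code]
sudo apt-get install -y cuda-toolkit-12-4   # toolkit only, no driver
# never plain "sudo apt-get install cuda" inside WSL, that drags in the driver stack
[/code]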
>>
>>108165434
I mean, that behavior itself should immediately indicate something is broken with the setup lol
it needs a thinking prefill to really let loose but it's no worse than something like glm4.7 in that regard. nsfw knowledge is completely present and for me at least it's smarter, better-paced, and less annoying than its similarly-sized competitors in qwen 235 and step 3.5. the only thing it has in common with toss is that both are extremely RL-heavy and thus super sensitive to their chat templates
>>
>>108165529
>should immediately indicate something is broken with the setup
Yes I can see that the model is mindbroken by safety.
>>
File: file.png (297 KB, 550x385)
I have finished downloading Q3 of Qwen3.5
>>
>>108165553
>tfw cant even run q1
128gb bros... WE LOSTED
>>
>>108165543
pearls before swine
>>
>>108161568
>it's soft, resting against your thigh
What explains all these models converging on the same output?
>>
>>108165568
You really shouldn't bother with MoEs under 32B active.
>>
>>108165578
all buying up the same data/distilling from the same models
>>
>>108165578
everyone's training on each other's outputs. it's an ouroboros of data converging into a slop singularity
>>
>>108159576
This general has one of the better op pic ngl
>>
File: file.png (167 KB, 416x416)
>>108165578
See this Asian satan? He did it.
>>
>>108165578
shit prompt
>>
>>108165584
The 20-30B dense models that they can run like Mistral Small, Gemma, etc aren't better.
>>
>>108165646
Let me guess. You have at least 2 3090's?
>>
Do these small quants work? Like is a 2-bit quant of GLM 5 better than a model of a similar size (250 GB)?
Minimax 2.5 is about 250 GB, right? Is the 2-bit quant of GLM 5 better than Minimax 2.5 at full precision?
>>
>>108165655
Me? No. I'm a poorfag.
>>
>>108165657
That is usually the case.
>>
>>108165669
So: GLM 5 Q2 > Qwen 3.5 Q4 > Minimax M2.5 full?
Interesting.
I wonder if something like Qwen 3 Coder at 1 or 2 bits would work well as autocomplete in something like cursor/vscode or whether that's too slow for reasonable home hardware.
>>
>>108165709
For plain autocomplete you can use qwen 3 coder next 80b or the qwen 3 coder 30b a3b.
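The A3B one runs fine even partially on CPU; serve it locally and point your editor's completion plugin at it, e.g. (filename and port are made up):
[code]
llama-server -m qwen3-coder-30b-a3b-q4_k_m.gguf -c 8192 -ngl 99 --port 8012
[/code]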
>>
File: 1769877904096646.png (321 KB, 1485x4420)
>>108165657
It depends on the specific model and specific quant and even task. 2 bits is not really just 2 bits in GGUF, there's a ton of variables that affect things in this range of quant.
>>
>>108165553(me)
My first impression is that it is 5T/s and it is not instaretarded like step or trinity. I think I will enjoy this one.
>>
https://huggingface.co/ubergarm/Qwen3.5-397B-A17B-GGUF
https://huggingface.co/ubergarm/Qwen3.5-397B-A17B-GGUF
https://huggingface.co/ubergarm/Qwen3.5-397B-A17B-GGUF
https://huggingface.co/ubergarm/Qwen3.5-397B-A17B-GGUF
https://huggingface.co/ubergarm/Qwen3.5-397B-A17B-GGUF
https://huggingface.co/ubergarm/Qwen3.5-397B-A17B-GGUF
>>
File: file.jpg (447 KB, 920x1440)
>>108159657
>>
>>108165770
>can barely run a q1
>nowhere near enough context to get an actual story out of it
Fuck
Fuuuuuuuck
If LLMs could actually be intelligent, they wouldn't need to get more and more fuckhueg, LeCunny is right
>>
>We suggest using Temperature=0.6, TopP=0.95, TopK=20, and MinP=0 for thinking mode and using Temperature=0.7, TopP=0.8, TopK=20, and MinP=0 for non-thinking mode.
>and MinP=0 for thinking mode
BASED. Fuck conmen.
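For llama.cpp the thinking-mode numbers map to (model filename made up):
[code]
llama-cli -m qwen3.5-q4_k_m.gguf --temp 0.6 --top-p 0.95 --top-k 20 --min-p 0
[/code]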
>>
>>108165718
Yeah, I've gathered that. I'm not really comparing the same models either, so things like KL divergence don't really work.
I guess what would be interesting to see would be how something like full Minimax M2.5 measures up against GLM 4.7 of some Q4 against GLM 5 of some Q1 or Q2 against Kimi K2.5 Q2.

Basically: how well do comparable model sizes perform? In terms of speed and actual output.
>>108165717
>qwen 3 coder 30b a3b.
3b active means it should be really fast, right? I guess that's what you want more than anything for autocomplete.
>>
File: 1752374614465935.png (233 KB, 349x767)
can someone help
>>
>>108165431
I also disagree (with your disagreeal). Sex positivity and easy access to porn from essentially age zero has made it common and boring as fuck
>>
>>108165568
So 129 is the new minimum?
>>
>>108165794
Your lack of hardware is not a failure of the technology
>>
>>108165846
>common and boring as fuck
for chads and staceys
>>
>>108165865
A truly smart model doesn't need to be bloated to insane sizes to include every smidge of trivia it's hoovered up from the internet.
>>
>>108165888
Interesting theory. Huge if true.
Can you substantiate it?
>>
>>108165846
Fair point, but I'll maintain my position. If shaming sexuality and restricting porn resulted in higher birth rates due to its being "taboo" enough, South Korea wouldn't have the worst demography in the world.


