/g/ - Technology


Thread archived.




A guy just fine-tuned Qwen3.6-35B-A3B to imitate Claude Opus 4.7's reasoning style — basically he took Opus's chain-of-thought traces and used them as training data, so the model now "thinks" the same way Opus does, wrapped in <think>...</think> tags.
The wild part is the efficiency. It's a 35B MoE model but only ~3B parameters are active per token, which means you can actually run this thing on a single A100 or H100. No cluster needed.
And it's fully open. Apache 2.0. Weights are public. Training dataset is public.
This is essentially reasoning distillation — taking what makes a frontier model good at thinking and compressing it into something accessible.
Not saying it matches Opus on benchmarks. It probably doesn't. But the trajectory is clear — the gap between "I can afford this" and "this is actually good" keeps shrinking.
https://huggingface.co/lordx64/Qwen3.6-35B-A3B-Claude-4.7-Opus-Reasoning-Distilled
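The recipe OP describes is standard supervised fine-tuning on reformatted teacher traces: wrap the teacher's chain-of-thought in <think> tags so the student learns to emit reasoning before its answer. A minimal sketch of the data-prep step (function and field names are illustrative, not taken from the linked repo):

```python
# Hypothetical data-prep step for reasoning distillation: turn one
# (prompt, teacher reasoning, teacher answer) triple into an SFT pair
# whose target wraps the reasoning in <think>...</think> tags.

def make_distillation_example(prompt, teacher_reasoning, teacher_answer):
    """Build one supervised fine-tuning pair from a teacher trace."""
    target = f"<think>\n{teacher_reasoning}\n</think>\n{teacher_answer}"
    return {"prompt": prompt, "completion": target}

example = make_distillation_example(
    "What is 7 * 8?",
    "7 * 8 = 56.",
    "56",
)
```

A corpus of such pairs would then go through an ordinary instruction-tuning run; nothing about the loss changes, only the target format.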
>>
>>108638991
>A guy
you
>imitate Claude Opus
It's been shown that these small model opus finetunes perform worse than their base model on most benchmarks and subjective experience.
>H100
You think you need one of those to run a 35B-A3B?
>Three em-dashes
>>
>>108639192
>You think you need one of those to run a 35B-A3B?
i run that in a 1070 with 32 gb of ddr3 ram
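The anon's setup is plausible on paper. Rough weights-only arithmetic for a 35B-parameter model at common quantizations (ignores KV cache and activations):

```python
# Weights-only memory footprint: 1B parameters at 8 bits = 1 GB.
def weight_gb(n_params_billion, bits_per_param):
    return n_params_billion * bits_per_param / 8

fp16 = weight_gb(35, 16)   # 70.0 GB: needs multi-GPU or heavy offload
q4 = weight_gb(35, 4)      # 17.5 GB: fits in 32 GB of system RAM
```

Note the A3B part only cuts per-token compute: all 35B weights must still be resident in RAM (or streamed), so ~3B active parameters make it fast on weak hardware, not small.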
>>
>>108638991
>35b parameters
That's Mac Mini tier, not A100 lol

Also why are you selling this as something noteworthy when people do distills like that all the time?
>>
>>108638991
use better model to write your post next time and tell it to remove all slop and em-dashes from it
>>
>>108638991
I know what a meter is
I know "para" means "around"
What's a parameter?
>>
why is he listing it using the same thinking tags Qwen always used to begin with as some kind of relevant feature?
>>
>>108638991
At least clear the em dashes before spamming your posts (gens) around.
>>
>>108638991
Pajeet
>>
>>108638991
You literally used AI to write this. What a retard.
>>
when will 4chan get emojis? then all these bots can post using them and they'll be even easier to identify. they can already use emdashes...
>>
>>108638991
4.7 is worse than 4.6
>>
>>108639192
fpbp
op clueless.
>>
>>108640341
it's an imperial measurement for how many paratroopers it takes to solve a problem.
>>
>>108640645
you can use emojis on /sci/ technically, or in filenames
>>
>>108638991
>Just shipped an...
what is it about xitternigs that makes them talk like that?
>>
>>108639192
>three em-dashes

>And it's fully open. Apache 2.0. Weights are public. Training dataset is public.
This line should have tipped me off. I need to get better.
>>
>>108638991
Buy an ad Rajesh
>>
>>108638991
Claude doesn't send actual reasoning traces.
They are thinking summaries designed to prevent exactly this kind of thing.
>>
>>108644474
it's already been seen on the claude leak


