/g/ - Technology


Thread archived.
You cannot reply anymore.




Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107328508

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe
https://github.com/ostris/ai-toolkit

>WanX
https://rentry.org/wan22ldgguide
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd
https://gumgum10.github.io/gumgum.github.io/
https://huggingface.co/neta-art/Neta-Lumina

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
What is the standard practice for leveraging the vram installed on a separate PC on the same network?

I have an RTX 5070 Ti with 64GB of DRAM and it's been mostly fine for Wan 2.2, but I recently added NAG to the workflow and gen time has doubled for 97 frames at 992x720. Can't I just offload the text encoding to some 8GB VRAM PC?
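There's no standard way to split one ComfyUI graph across two machines, but since the text embedding only changes when the prompt does, a common workaround is to precompute the conditioning on the second box and copy the tensors over. A minimal sketch of the save/load half (the shape and file name here are made up for illustration, not Wan's actual embedding layout):

```python
import numpy as np

# On the 8GB box: run just the text encoder and dump the conditioning
# tensor to a file you can copy over the network (scp, NFS share, etc.).
# The random array is a stand-in for the real UMT5 output; the shape is
# hypothetical.
cond = np.random.rand(1, 512, 4096).astype(np.float32)
np.save("cond.npy", cond)

# On the 5070 Ti box: load the precomputed conditioning and wire it into
# the sampler instead of re-running the text encoder every gen.
loaded = np.load("cond.npy")
assert loaded.dtype == np.float32 and loaded.shape == (1, 512, 4096)
```

Calling the small box over HTTP per prompt works too, but cached tensor files are simpler if your prompts repeat.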
>>
File: collage.jpg (3.22 MB, 2975x3710)
>>
File: flux_krea_00039_.png (2.72 MB, 1088x1920)
i reckon i'll have me one of them
fresh baked local model generals.
>>
Blessed thread of frenship
>>
ty baker!
>>>/b/realistic+parody became >>>/r/realistic+parody
>>
File: 1746837732977701.png (732 KB, 1024x1024)
>tfw a 6b model destroys a 32b model in realism lmao
https://www.modelscope.cn/models/Tongyi-MAI/Z-Image-Turbo/picture
>>
>>107332467
is this on huggingface yet? even runnable in comfyui? i don't want to give the chinks my phone number just to download it.
>>
>>107332474
it's supposed to be released today, so we'll have to wait a bit before we can test it out; comfyui has already implemented it, so that's something
>>
>>107332452
>https://rentry.org/wan22ldgguide
Why is there no updated t2v workflow? Which lightx2v should I use with the t2v models?
>>
File: 1755756357473374.png (1.37 MB, 1024x1024)
>>107332467
alibaba is definitely cooking holy shit
>>
File: 1739993131966240.png (745 KB, 498x818)
>huge breasts according to flux 2
>>
Why not try to optimize the model architecture instead of increasing parameters?
>>
>>107332467
>no model at hf
grim
>>
>>107332508
everything should be optimized for 24gb vram
>>
File: 1760471328186989.png (1.4 MB, 768x1024)
>>107332508
>Why not try to optimize model architecture instead increasing parameters?
that's what alibaba is doing, and they do a great job at it
>>107332467
>>107332503
>>
File: Flux2_00030_.png (1.58 MB, 1024x1024)
>>107332467
Same prompt converted Chinese -> English
>In this nighttime outdoor scene, the lighting is dim, with weak light sources in a low-light environment (such as distant lights). In the close-up composition, a young Chinese woman wears a light gray top with black backpack straps visible on her shoulders. She has long, dark hair, which is dynamically flowing in the wind, with some strands falling around her shoulders and face. Her facial skin has a natural texture in the dim light, and her eyes are looking towards the camera with a natural expression. The background consists of blurred trees, grass, and a fence: the outlines of the trees are faintly visible in the dim light, the grass is dark in color, and the fence has a mesh-like texture. Overall, in this low-light environment, details such as the movement of her hair, the texture of the backpack straps, the outlines of the trees, and the texture of the grass are rendered realistically.
>>
>>107332488
i will be stalking the thread today with great interest then. It looks promising!
can't believe this is a 1280pix image, look at that. damn.
>>
File: BASED.gif (2.25 MB, 636x640)
>>107332523
>small model
>seems to know a lot of IPs
>the realism is on point
>Apache 2.0
I kneel again China, you guys are the future.
>>
File: borzoi googly eyes.png (295 KB, 648x680)
>that image of the chick in her lingerie
holy shit it might actually be able to do nsfw out of the box
>>
File: 1755769459686956.jpg (740 KB, 2048x1390)
>>107332528
this is humiliating for bfl holy shit
>>
File: god bless the chinks.png (139 KB, 1900x790)
>>107332560
>>107332562
and it's only the turbo model, just imagine the improvement once we get the "best version"
>>
File: 1740003630255120.mp4 (1.36 MB, 832x480)
4 steps/4 steps is the way (kijai 2.2 MoE distil high, 2.2 lightning low).

the golden retriever dog on the left fires a huge blue lightning bolt from their paws at the man wearing glasses. The man with glasses flies off his chair to the right through a window into the sky, flipping over and over, over the side of a mountain cliff during the day. he lands far below and a giant lightning bolt hits his body where he lands, creating lots of smoke and fire.
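For reference, the high/low split in these 2.2 workflows is just routing each denoising step to one of the two experts by a noise boundary; a rough sketch of that routing logic, with an illustrative boundary rather than Wan's actual default:

```python
def route_steps(sigmas, boundary=0.6):
    """Send steps above the noise boundary to the high-noise expert,
    the rest to the low-noise expert (Wan 2.2 MoE style)."""
    return ["high" if s >= boundary else "low" for s in sigmas]

# 8 evenly spaced sigmas from 1.0 down to 0.125
sigmas = [1.0 - i / 8 for i in range(8)]
plan = route_steps(sigmas)
print(plan.count("high"), plan.count("low"))  # 4 4 -> the "4 steps / 4 steps" split
```

Real workflows do the same thing with two KSampler (Advanced) nodes whose start/end steps meet at that boundary.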
>>
>>107332577
>just imagine the improvement once we get the "best version"
improvement like none of us being able to run it kek
>>
>>107332586
the difference between base and turbo is that turbo is a distilled 8-step model; they're both 6b parameters
>>
>>107332599
ah ok, forgive me for not knowing/ being retarded. don't dock my social credit score.
>>
File: file.png (388 KB, 625x374)
>>107332503
>no background blur
>>
>>107332603
I'm in a good mood today so I'll only dock your izzat score instead kek
>>
Reminder for ChromaGODS

Try your prompts with "aesthetic 9," in positive and "aesthetic 1," in negative if you havent until now
>>
>>107332467
I am serious when I say this, lodestone has to finetune it and add NSFW shit to it.
>>
>>107332467
modelscope download --model 'Tongyi-MAI/Z-Image-Turbo' --local_dir 'Z-Image-Turbo'

>requests.exceptions.HTTPError: The request model: Tongyi-MAI/Z-Image-Turbo does not exist!
Huh? Just trying to follow the instructions. How do I download this?
>>
File: you suck bfl.png (2.12 MB, 1670x1415)
Flux 2 fucking sucks, and that yellow tint makes it obvious they trained their model on synthetic 4o imagegen shit
>>
When are we getting Z-Image? The Flux 2 download is 30+ gb total even with gguf, kek
>>
>>107332652
>How do I download this?
you can't, it's not officially there yet
>>
>>107332532
>can't believe this is a 1280pix image, look at that. damn.
since it's a small model you can go for bigger resolutions and it won't be too slow, it's pretty smart if you ask me, I think the (big res + normal size model) is a better combo than (1k res + giant model)
>>
>>107332614
DO NOT DOCK IZZAT BLOODY BITCHOD

>>107332655
its like the chinks waited for BFL to fail again so they could BTFO them. absolutely SAD!
>>
File: 1757371441591070.mp4 (1.57 MB, 832x480)
one more of the dog shock guy:
>>
File: 1735396514537025.png (637 KB, 1136x912)
>>107332655
qwen edit + 4/8 step lightx2v lora is the way. fast and good, also makes great pepes.
>>
File: img_00047_.jpg (1.02 MB, 1264x1592)
>>107332562
damn
>>
File: kek.png (26 KB, 220x165)
>>107332679
>its like the chinks waited for BFL to fail again so they could BTFO them.
they're so petty, I love that
>>
>>107332665
Oh I see.
Hopefully Soon TM then.
>>
>alibaba releasing 2 open weight image models by two different teams
so what's the deal with wan??? why is it api only now?
>>
>>107332705
>why is it api only now?
what? the next iteration of QIE will be local and that Z-image model will be local as well
>>
>>107332713
>will be, will be, will be...
>but nothing happens
stop lying
>>
Why am I filling up my vram at this stage? Models are fully offloaded, 196gb, 5090.
>>
>>107332748
>they added huggingface links for no reason
why are you doubting them? they always delivered
>>
>>107332705
>>107332748
you came in here looking for a fight, but all you're getting is me calling you a faggot nigger.
>>
>>107332562
>>107332695
so if i understand correctly it's another team than the Qwen team inside Alibaba that did this. if they managed to get this kino at 6b, just imagine if they scaled it up to 12+b; if I were an employee on the Qwen team I would be worried as fuck, the other team seems to be more talented than them
>>
For me, its Z-Image.
>>
>>107332751
No you didn't.
Virtual VRAM with a donor device means the CPU is used for the overflow; whatever fits into your GPU is still used.
Set the device parameter (the third setting) to CPU if you want the agonizingly slow CPU inference.
>>
File: kek.png (1.91 MB, 1408x768)
>>107332679
>its like the chinks waited for BFL to fail again so they could BTFO them. absolutely SAD!
kek
>>
>>107332508
that would encourage people to use local AI instead of API
>>
Bros if those images aren't cherry picked I am genuinely excited for Z-Image.
I am gonna cope and hope that it can do light NSFW (booba) even.
>>
>>107332713
reading comprehension level: gpt-0.5
>>
Please don't be too excited for this model, Alibaba will see they have struck gold and they'll keep it for themselves :(
>>
File: Flux2_00034_.png (1.49 MB, 1408x768)
>>107332797
>>
File: ComfyUI_temp_pqugj_00036_.png (3.33 MB, 1824x1248)
https://files.catbox.moe/2hcnb6.png
>>
>>107332842
this, that's why you should always shit on every mistake to make the company want to qualify themselves for you; you gotta counteract the normgroid praise that they will be showered with anyway
>>
>>107332562
To be fair, the Flux 2 image is the only one that's actually dim.
>>
https://files.catbox.moe/vpr1ig.png
>>
>>107332851
how did it manage to even copy the logo? lmao
>>
https://files.catbox.moe/t1cwmr.png
>>
>>107332871
>Panel 3 (Bottom Left): This panel is split into two sections.
>Left Side: A yellow caption box reads "ONE DAY LATER..." Below it is a logo featuring a stylized 'e' or 'a' curve with the text "ALIBABA Z-IMAGE (6B)".
>Right Side: A group of four diverse office workers gathers around a computer, looking intrigued and happy. One character asks, "A new 6B local model? Z-Image?"
Beats me
>>
>>107332868
Why would you catbox censored images?
>>
File: ComfyUI_temp_iubdp_00106_.png (2.28 MB, 1824x1248)
https://files.catbox.moe/5y9hti.png
>>
>>107332807
As long as they don't intentionally poison the model like BFL does with Flux, it should be easy to add NSFW using lora/finetune.
>>
File: 1763710556613020.png (432 KB, 661x645)
>>107332797
>>107332851
I'm not gonna lie this logo looks pretty good, if an Alibaba employee is lurking here, you can use it as the official Z-image logo lol
>>
I know flux is trained on itself and requires 9 pages of essays from an ai to prompt an image, but surely the default settings should do better than this..?

>>107332790
I'm retarded, forgot to set the amount for the actual model that matters.. It's genning now.
>>
>>107332899
they never do that, Alibaba is based and give us normal models with the goated apache 2.0 licence
>>107332907
unfortunately it looks less slopped after 40 steps (it's too long bruh)
>>
File: 1747481106292273.png (1.96 MB, 1120x1440)
LOCAL WON
>>
>>107332899
We don't know how well it will respond to lora training; plus, doing NSFW as a lora is usually iffier than the base model knowing it natively, and such loras can be difficult to combine with others.
It's better that the base model knows it already.
>>
File: ComfyUI_temp_iubdp_00052_.png (3.17 MB, 1824x1248)
>>107332889
It's not censored, it was meant to be wholesome image :/
https://files.catbox.moe/4r4jf3.png
>>
>>107332467
Oh it'll also be an edit model, I wonder if it'll be better than QIE (probably not but it'll be worth a try)
>>
File: BrunoLipBite.png (968 KB, 660x640)
>more images being added to that page
>skimpier and skimpier girls
>anatomy understanding is pretty good even for a bunch of 1mp base images
please don't fuck us over
please don't fuck us over
please don't fuck us over
>>
>>107332932
>It's better that the base model knows it already.
Of course, but beggars can't be choosers, and even if the model doesn't know about NSFW it's still miles better than BFL training/poisoning their Flux models with tons of synthetic images depicting breasts without nipples and crotches without genitals.
>>
File: Flux2_00036_.jpg (2.85 MB, 1824x1248)
>>107332854
Nice work as usual. Same prompt, default workflow, sa_solver_pece

>>107332889
nta, bc sharing is caring. I can just copy the prompt and see how flux2 does without asking
>>
>>107332964
>please don't fuck us over
TRUST DA CHINKS
>>107332981
>it's still miles better than BFL training/poisoning their Flux models with tons of synthetic images depicting breasts without nipples and crotches without genitals.
this, fuck bfl, the chinks are showing you can improve your models without being a lazy fuck and just stack more layers to solve the problem
>>
>>107330558
>"kijai's nodes are better"
>does 4 steps with native, 6 steps with kijai
gee I wonder why it's "better". do you even look at the settings, retard?
>>
File: Capture.jpg (92 KB, 720x1280)
>>107332467
the chroma footfag is gonna have a blast with that one lol
>>
File: 20 steps Q6_K.png (2.9 MB, 1536x1024)
My benchmarking prompt tested on Q6_K Flux 2:
>Detailed photograph RAW of seven smiling friends of different races that are at a nightclub concert with dim lighting that is shining on their faces, behind them is a crowd of people dancing while fighting with large swords, everyone is holding a sword in their left hand and an intricate beer glass with differently colored beer in the right hand. Far behind them above the DJ there is a sign which has "Minimum drinKing age 021!" written on it in stylized cursive letters.
>>
>>107333014
can someone try that prompt on modelscope, I wanna see if Z-image can handle such complex prompt >>107332467
>>
>when it understands the mission, just in a roundabout way
well okay
>>
>>107333014
>>107333024
>yellow tint
lmao are you serious bfl?
>>
File: 40 steps Q6_K.png (2.84 MB, 1536x1024)
>>107333014
Same seed with 40 steps
>>
>>107332532
I mean, the big part is its coherency. It got the Chinese characters right if this isn't an edit prompt and is actually a from nothing gen.
>>
https://huggingface.co/fal/FLUX.2-Tiny-AutoEncoder
>Tiny AutoEncoder trained on the latent space of black-forest-labs/FLUX.2-dev's autoencoder. Works to convert between latent and image space up to 20x faster and in 28x fewer parameters at the expense of a small amount of quality.
hold up, maybe this can save flux 2
>>
File: ComfyUI_00042_.jpg (833 KB, 2048x2048)
>>107332897
Took this prompt for flux 2.
I think that dype works for flux 2. This is 2048p. The first attempt without dype at 1024 looked like shit.
627s gen rofl.
>>
>>107333001
You already got embarrassed in the previous thread, my illiterate idiot. There is no need to continue doubling down on your stupidity.
>>
>>107333010
Chromajeet is going to seethe because it’s not chroma, that’s all he does
>>
>>107333048
man the biggest quality loss and the fastest thing in these workflows IS the vae, fucking that over for a couple of seconds speedup is worthless, especially for edit workflows
>>
File: 1758642774080723.png (3.37 MB, 3489x1518)
>>107333039
>It got the Chinese characters right if this isn't an edit prompt and is actually a from nothing gen.
it's a normal image from turbo, not the edit model
https://modelscope.cn/models/Tongyi-MAI/Z-Image-Turbo/picture
>>
>>107333061
I'm not that anon. also, you're not even using the same light loras between the workflows. you're a giga fucking retard nigger
>>
>>107333061
>doesn't use the same loras
>doesn't use the same number of steps
>still doubles down and thinks he's right
why are you like this?
>>
>>107333088
anyone that disagrees with wanschizo is a samefag according to him
>>
>>107332463
catbox? all my krea gens look insanely washed out
>>
>>107332467
>>107332503
I never expected a 6b model to look this good, what's their secret sauce?
>>
>>107332508
bulk and cut, bulk and cut
>>
>>107333048
I dunno. Given the details at play and the Chinese models likely not using it, this isn't going to be a case of what happened with the first FLUX VAE where everyone just reuses it.
>>107333069
Holy shit, that is some good stuff right there. Now to see if anyone has money left over for fine tuning on these models or if that is dead...
>>
gonna make a better comfyui for people to use. it's reached eol and they seem to be wasting that billion pretty quickly. hope you're well anon.
>>
>>107333088
>>107333094
Anon, loras have nothing to do with the teacache effect on hands.
I know you're still butthurt over being repeatedly exposed as a complete and utter idiot, but do at least try to get over your loss.
>>
File: ohh mommy.png (1.3 MB, 1024x1024)
>>107332467
https://www.reddit.com/r/StableDiffusion/comments/1p77dli/some_images_i_generated_using_the_zimage_model/
pretty good
>>
>>107333142
>I'm too retarded to make an actual 1:1 comparison
there's nothing we can do for you anon, you're too braindead for this
>>
>>107333142
wrong, faggot. it is known that the 1022 lora has that effect. alright, I'm done with your down syndrome ass
>>
>>107333096
nah the problem is you're a socially retarded autist who is too stupid to vary your prose, and end up falling into all the same patterns.
>>
>>107333144
where the is second hand
>>
>>107333123
Novel text encoder.
Probably a lot of pre-training data set pruning to prevent low quality images from degrading the model.
And aggressively specializing on realism probably.
It will likely be bad for non-realism, artistic, editing, etc. use cases.
Honestly smaller models should do that instead of trying to become useless jack of all trades.
>>
>107333159
>107333142
exhibit a
>>
File: 1739469907958759.png (3.71 MB, 8208x1445)
>>107333144
here's some more examples with prompts
>>
>>107333158
>it is known that the 1022 lora has that effect
Which wasn't being used in any post, you laughably stupid idiot xD
>>
File: 20 steps Q6_K.png (1.74 MB, 1024x1024)
>>107333014
>>107333031
Another seed and resolution
>>
>>107333123
we are comparing a realism image model to a general model that does gen, edit, multi-ref and 4mp. i think it's great to provide smaller specialized models but they are going to be different sizes based on that alone
>>
>>107333175
> 1girl, standing
>>
>>107333031
in my case that greta gen is actually just flux.

>>107333116
sorry, was using a node that had krea in the name field for some reason. that's just cyberrealistic flux.

been meaning to check progress on krea checkpoints.. last i tried it, kinda schizo.
>>
>>107333186
thanks for this but benchmarks usually show like what device and speed as well
>>
File: the famous 1girl.png (1.44 MB, 1151x1236)
>>107333196
> 1girl, standing
>>
>>107333196
> 1fag, crying
>>
>>107333154
>autist still can't let go of 1:1 comparisons
>even when the anon who DID the 1:1 comparison clarified to you that it shouldn't matter, and that the base template shouldn't have this issue
Bizarre how dumb you are my dude. Ppl with severe autism like you can never move on lol
>>
never change /g/
>>
File: ComfyUI_00043_.jpg (911 KB, 2048x2048)
Yeah dype seems to work just fine for flux 2, very nice.
>>
1boy, crying, soaked in urine, pajamas, gamer chair, cat ear headphones
>>
Alright zimage is nice yeah yeah okay
aesthetically it's good, size is good, but what's necessary are the technological advancements since then:

reference based image gen
training, ideally with said references and/or multiple res/aspect ratio
prompt enhancement
controlnet and similar (depth, canny, normal, etc.)
inpainting, outpainting, whatever variant
text editing

you get the idea, that stuff.
>>
>>107333191
>a general model that does gen, edit, multi-ref
let's be serious here: Qwen Image Edit is 20b, do you think it's that much inferior to Flux 2 (32b)? and it'll be even closer once we get the next iteration of QIE this week
>>
>>107333209
anon, you lost the argument, time to swallow your pride and let it go man
>>
File: 1726075449183071.jpg (7 KB, 249x250)
>>107333168
>Honestly smaller models should do that instead of trying to become useless jack of all trades.
idk why barely anyone does that. one fuckhueg model after another drops and dies after the initial hype because no one has the hardware to modify them. Focus for local should be on small models that are easy to train so you can readily modify them for whatever you need, instead of trying to pack a trillion parameters into them and failing anyway
>>
File: 1742042169772414.png (2.88 MB, 2189x1065)
>>107333234
>but what is necessary are the technological advancements since
>reference based image gen
there will be a Z-image edit though, we won't just get an image model
>>
>>107332751
you're using the full 60+gb model but only offloading 4gb. You need to set virtual_vram_gb to like 32 or more. works for me on a 3090 with it set to 40
>>
Is there an ai model in comfy that I can ask to describe an input image to then feed the prompt into flux?
>>
File: Qwen_00093_.png (1.93 MB, 1328x1328)
>>107333186
Composition actually reminds me of qwen a lot. Quality is better, of course, but qwen one was pretty WIP and not shooting for any sort of quality at all.
It's impressive how it resolves faces into different ones as steps go by, though.
>>
>>107333237
no idea what your point is, the anon was asking why a 6b model is good at realism vs a 32b one. i have no idea about qie and flux2 edit comparisons.

i suppose QIE is probably better for image edit since it's an image edit model and not a general model? that was literally what i just finished saying.
>>
File: 1751580226090127.mp4 (1.23 MB, 832x480)
wan 2.2 kino:

the golden retriever dog on the left fires a huge blue lightning bolt from their paws at the man wearing glasses. The man with glasses flies off his chair to the right through a window into the sky in a skydiving pose with an electric aura around his body, flying over the side of a mountain cliff during the day. he lands far below in a large electric field and a giant lightning bolt hits his body where he lands, creating lots of smoke and fire.
>>
File: 20 steps Q6_K, 2.jpg (307 KB, 2048x2048)
>>107333014
Image at 2048x2048 resolution instead, coherent but much more slopped.

It also seems to have trouble capitalizing the K in "drinKing" across seeds, so it seems it's overfit in some aspects.

>>107333205
I'm obviously just benchmarking the quality of the model. Although I'm using Q6 instead of Q8 which I'll switch to later
>>
>>107333320
so basically flux 2 is mid at everything because it tries to do everything at once? but the point was that it was supposed to be good at everything because it's a giant 32b model, that bloat was supposed to make that model a good jack of all trades
>>
>>107333309
Yes, a VLM. There's joycaption and others.
https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
or you can use google ai or something if you want to go that route
>>
"You've come this far have you? Watch me obliterate The World with my secret weapon - DJ Play will summon up AI demons! Let's get down!"
>>
>>107333337
never said any of that but you do you
>>
>>107333345
shit forgot link lmao
https://www.youtube.com/watch?v=FZREn9f0tzU
>>
>>107333345
That dude resigned.
>>
>>107333338
Thanks.

>>107333333
hexaget
>>
>>107333345
the face consistency is all right but again the plastic skin ruins the whole thing
>>
File: Z-image turbo.png (2.11 MB, 1024x1280)
>>
File: arrest.jpg (604 KB, 3986x832)
Sora - Nano Banana Pro - Flux 2 Pro. Which did it better? Is Flux 2 comparable to the closed source stuff?
prompt: A gritty photo-journalistic nighttime street scene in Osaka’s Dōtonbori district, shot with a long telephoto lens that compresses the dense field of neon billboards. The iconic Glico running-man sign glows brightly in the background, surrounded by a collage of saturated, overlapping advertisements: a large pink billboard with bold white letters resembling “Calbe,” various LED panels, and storefront lighting casting hard reflections onto the wet pavement. In the foreground, three Japanese riot police officers in full tactical gear aggressively secure the scene beside two marked police sedans with rooftop beacons flashing red and blue. Their uniforms include dark ballistic vests, riot helmets with half-raised visors, radios, and reflective patches catching the neon glare. One officer is a woman, shown in a low-angle, close-range perspective arresting a yakuza member. The yakuza man lies prone on the ground, face turned sideways, grimacing in frustration or anger. His hands are behind his back as the female officer forcefully applies handcuffs, her expression focused and stern. The two male riot cops stand nearby holding MP5 submachine guns. One officer keeps his MP5 shouldered and aimed down the street, posture tense. The other male officer shouts at surrounding pedestrians, ordering them to disperse, his body half-turned toward the crowd. Passersby in the midground appear blurred from motion, emphasizing urgency and chaos. Harsh mixed lighting from neon signs, police strobes, and storefronts creates deep contrast, sharp highlights on metal surfaces, and reflective color spill across the scene. Slight atmospheric haze softens the distant signage but leaves the officers sharply defined. The overall tone is raw, documentary-like, and captures a tense moment of police action in a busy urban night setting.
>>
>>107333358
starting to think the plastic skin anon is the same as the fake cum anon in /gif/ and it's one very long running joke
>>
File: document_0.jpg (53 KB, 512x512)
>>
>>107333358
yeah no kiddin'. just taking a fun break from 1girls for a sec to do a concept i've been laughing about for months.
i swear if we get one more parametermaxxed shit checkpoint releasing before the end of the year im gonna have a stroke.
PLEASE SAVE US Z.

>>107333378
oh please do not start lumping me in with another poopdickschizo this is the first time i've personally posted flux in this general.
>>
>>107333123
>what's their secret sauce?
they decided to stop acting like retards and train their model with real images instead of synthetic shit
>>
>>107333376
Just the Flux 2 pro img
>>
>>107333376
didn't read the prompt but the first image looks like straight anime ass.
>>
File: shire.png (1.46 MB, 1344x752)
>>
>>107333394
Sora and their yellow tint; name a more iconic duo
>>
File: let this be the one.png (178 KB, 656x381)
>>107333387
>i swear if we get one more parametermaxxed shit checkpoint releasing before the end of the year im gonna have a stroke.
same anon, I feel ya
>>107333387
>PLEASE SAVE US Z.
*raises his hands*
>>
File: 20 steps Q6_K.png (1.86 MB, 1024x1024)
>>107333333
Here's also a prompt from >>107333175
Lol
>A woman standing on a rainy city sidewalk holding a transparent umbrella, making a cute pouty expression. Raindrops on the umbrella sharply detailed, neon reflections in puddles. Her floral top appears slightly damp at the shoulders. Street signs and headlights create soft bokeh lights. Photorealistic mood.
>>
>>107333408
>name a more iconic duo
peanut butter and jelly
>>
File: ComfyUI_00023.png (2.74 MB, 1200x1800)
>>107333309
I use Gemma 3 27B Q4 in LM Studio. You can even add multiple images at a time and have it combine them into a prompt if you want.
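LM Studio exposes an OpenAI-compatible server, so the caption request is an ordinary chat completion with the image inlined as a base64 data URI. A sketch of building that payload (the model name and the localhost:1234 endpoint are LM Studio's usual defaults, not verified here):

```python
import base64

def caption_request(image_bytes: bytes, model: str = "gemma-3-27b") -> dict:
    """Build an OpenAI-style chat payload asking a local VLM to describe an image."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Describe this image as a detailed prompt for a text-to-image model."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }

payload = caption_request(b"\x89PNG...fake bytes for the sketch")
# POST this payload as JSON to http://localhost:1234/v1/chat/completions,
# then feed response["choices"][0]["message"]["content"] into your prompt node.
print(payload["messages"][0]["content"][1]["type"])  # image_url
```

For multiple reference images, append more image_url entries to the same content list.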
>>
>>107333412
that's quen image right?
>>
File: 1740811055942702.jpg (807 KB, 1790x1277)
>>107333412
>>
cmon china release the new qwen edit.
>>
Tested low resolution, 8 steps fast gens on a Q4 that fits in 24gb vram.
Seed variability without any hacks is better than qwen, but nothing to write home about. The goblin witch is always on the right, but I like that faces and poses are different.
The model seems like it has lightning built in. Gens at 8 steps have mostly converged, but trademark low step dithering remains. There is no collapse in composition at 368x368.
Speed is around 20 seconds per this meagre gen.
>>
>another skimpy preview has hit the Z page
i'm gonna come. I'M GONNA COME!
xi-sama, forgive me for what i'm about to do to my social credit score.
>>
>>107333447
lmao
>>
File: 20 steps Q6_k.png (3.32 MB, 1328x1328)
>>107333412

>>107333423
It's Flux 2
>>
>>107333447
bfl released their model in the worst moment I swear to god
>>
>>107333451
the hair texture is great, I really love the details this shit has probably the best local vae
>>
>>107333476
>he was focused on the hair detail
haha lol GAY
>>
>>107333447
I knew I made the right choice by not downloading this 50gb bloat plastic shit
>>
>>107333447
Z-image is better but makes the woman a chinaman automatically. Not saying it's a bad thing
>>
File: flux2__00031_.png (2.75 MB, 1920x1088)
>i have taken the GGUF pill
>>
File: file.png (463 KB, 500x628)
>>107333447
>Ilya Sutskever said yesterday that scaling is dead and that we should focus more on optimizing the training process
>One day later Z-image has proved just that
we definitely live in a simulation, it can't just be a coincidence lol
>>
>>107333508
to be fair, bfl is a western model so the default human will be an european type, for a chinese model the default human will be from east asia I guess
>>
I swear all these bloatmaxxed models are a psyop to drive people to use API models
>>
>>107333526
>optimizing the training process
Isn't the data quality biggest bottleneck for most things AI now?
>>
File: 1744150076421862.png (1.46 MB, 832x1248)
I fed a certain image into joycaption and didn't read what it spit out lol
>>
>>107333552
Kek I recognize what photo you used
>>
File: let's gooooo.png (1.21 MB, 1024x1024)
>>107332467
>It can do 90's animes
GOTY
>>
Flux 2 verdict:

It's a better but much more bloated version of Qwen Image that might be saved with good LoRAs and speedups. But given that even Qwen isn't popular at its own size, this basically won't ever be saved, especially since, unlike with Qwen Image, you can't even train a Flux 2 LoRA on a 24GB GPU.

And with Z image about to drop, which is much better out of the box while being a small model, Flux 2 won't even be in the spotlight long enough for anyone to waste time and money trying to fix it.
>>
>>107333588
really doesn't look great though honestly. but not too surprised; it's probably outside Z's area of expertise.
probably why the women get progressively less dressed with each new preview photo, and why this is the only anime example kek
>>
>>107333600
>really doesn't look great though honestly.
it's still from the turbo model though, it'll look better on the base model
>>
>>107333596
thank you for the detailed update from obese coomer news
>>
by the way there's a demo available for Z, for any of you who can brute-force through the Chinese and have a fake phone number to give them for signup kek
>>
File: 1755751478926553.jpg (1.19 MB, 1248x1824)
>>107333596
Flux 2 is decent, but it feels like a very minor step in most aspects compared to chroma, and with less stylistic knowledge across the board. A good finetune can make it desirable though.
>>
File: IMG_20251126_192715.jpg (306 KB, 1659x1416)
The model has a dedicated flux2 scheduler, which is almost identical to beta 0.9/0.8 at 512. Curiously, it is resolution-dependent (which should be the right way to approach this for other models too): resolution works like an automated shift, making the slope steeper as resolution rises. I've only noticed a tangible difference in slope above 1024.
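For reference, this is how Flux 1's published scheduler already handles resolution: the shift parameter mu is a linear function of the latent sequence length, so bigger images get a steeper schedule. A minimal sketch of that idea; whether Flux 2 reuses the same constants (base_shift 0.5, max_shift 1.15, 16px-per-token packing) is an assumption on my part:

```python
import math

def resolution_to_mu(width: int, height: int,
                     base_seq_len: int = 256, max_seq_len: int = 4096,
                     base_shift: float = 0.5, max_shift: float = 1.15) -> float:
    """Linear map from latent sequence length to the schedule shift mu."""
    # Latents are packed into (w/16) * (h/16) tokens; longer sequences
    # get a larger mu, i.e. a steeper schedule.
    seq_len = (width // 16) * (height // 16)
    m = (max_shift - base_shift) / (max_seq_len - base_seq_len)
    b = base_shift - m * base_seq_len
    return m * seq_len + b

def shift_sigma(mu: float, sigma: float) -> float:
    """Push a uniform sigma in (0, 1] through the shifted schedule."""
    return math.exp(mu) / (math.exp(mu) + (1 / sigma - 1))

# Higher resolution -> larger mu -> more of the step budget spent at high noise.
mid_512 = shift_sigma(resolution_to_mu(512, 512), 0.5)
mid_1024 = shift_sigma(resolution_to_mu(1024, 1024), 0.5)
```

With these constants, 1024x1024 produces exactly max_shift (4096 tokens), which would line up with the slope change only becoming visible above 1024.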
>>
>>107333596
>And with Z image about to drop which is much better out of the box
We don't know that. To me it looks slopped in previews. Let it drop first.
>>
File: not ready.jpg (252 KB, 1440x1080)
>>107333588
>>107333010
>>107332695
>>107333451
>a turbo model managed to get this much praise
Z-Image be like:
>>
>>107333641
That shit will cost hundreds of thousands if not millions to finetune, it's DOA
>>
>>107333655
It looks like it leans towards Asian women, which might indicate it's overfit in other ways that matter, but outside of that there's no question it's more realistic and detailed
>>
File: Z-image turbo test.jpg (103 KB, 1024x1024)
It definitely looks good but I'd like to see something other than Asian people though.
>>
>>107333665
>krillin is BFL getting blowed up but there's no goku around to 'avenge' his death
you love to see it.
>>
How does emphasis work in flux vs sdxl models? do i just put parentheses around the specific words/sentences of my prompt and that's it?
>>
>>107333718
yes, and if you want a specific weight you write (prompt:2) for 2x, for example; fractional weights like (prompt:1.5) work too
>>
>>107333725
thanks babe
>>
File: img_00071_.jpg (455 KB, 1264x1592)
>>
>>107333718
Emphasis is a CLIP-only thing, so SD 1.5, 2.1, SDXL.
No other text encoder uses it. Don't use it with flux.
>>
>>107333734
I don't think knees are supposed to work like that...
>>
File: interesting.png (2.25 MB, 2546x963)
>>107332467
>https://www.modelscope.cn/models/Tongyi-MAI/Z-Image-Turbo/picture
So it can do what Nano Banana Pro does? You give it a vague prompt and it'll figure out the rest by itself? sounds based
>>
>>107333744
>>107333718
>Don't use it with flux.
debo with his misinformation again; in ComfyUI the code works with every single model
>>
>>107333744
It 100% works with natural language text encoders like T5. I've personally tested this myself and so has another anon.

i.e., you can do "the man has a (large penis:1.5)" with Chroma, WAN, etc.
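Worth noting why both sides can be partly right: the `(text:weight)` syntax is a frontend feature, not part of any text encoder. The UI parses the weights out of the prompt and then applies them to the encoded conditioning, so nothing about it is inherently CLIP-specific. A simplified sketch of the two halves (real implementations like ComfyUI's typically interpolate against an unweighted or empty-prompt encoding rather than scaling raw embeddings, and also handle nesting and the bare `(text)` shorthand):

```python
import re

def parse_emphasis(prompt: str):
    """Split "(chunk:weight)" spans out of a prompt; plain text gets weight 1.0.

    Simplified: no nested parentheses, no bare "(chunk)" shorthand.
    """
    out, pos = [], 0
    for m in re.finditer(r"\(([^():]+):([0-9.]+)\)", prompt):
        if m.start() > pos:
            out.append((prompt[pos:m.start()], 1.0))
        out.append((m.group(1), float(m.group(2))))
        pos = m.end()
    if pos < len(prompt):
        out.append((prompt[pos:], 1.0))
    return out

def apply_weights(token_embeddings, token_weights):
    """Scale each token's embedding vector by its weight; encoder-agnostic,
    which is why the same trick can be applied to CLIP, T5, etc."""
    return [[x * w for x in tok]
            for tok, w in zip(token_embeddings, token_weights)]
```

Whether the *model* actually responds well to scaled conditioning is a separate question per architecture; the syntax itself will always "work" in the sense of being parsed and applied.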
>>
>>107333376
Sora fails hard at everything.
>>
>>107333768
>>107333772
yeah it works alright. I needed it to balance out the huge titties, wide hips, and Hugo Boss number on abby here. Without emphasis it really killed detail and accuracy on the outfit.

https://files.catbox.moe/3jdvv1.png
>>
>>107333802
>nazi shit
cringe
>>
>>107333802
>nazi shit
based
>>
>>107333802
cringe

>>107333819
yeah if you're an edgy 12 year old.
>>
>>107333828
holomeme didnt happen, you cant even prove its mathematically possible, sorry
>>
File: 1762322156475960.jpg (1.33 MB, 2668x2018)
We really got cucked with the dev model, flux 2 pro looks way more realistic
>>
>>107333841
>implying I care about some fake news from jews
nazi shit is still cringe
>>
>>107333856
>a 6b model somehow BTFO's at the very least qwen
How?
>>
>>107333857
if youre not jewish what part of it is cringe, soi?
>>
File: ComfyUI_04649_.png (614 KB, 832x1216)
>>107333856
They want you to pay for their API.
Of course it does.
>>
>>107333876
>what part of it is cringe
forced edginess is cringe. you might as well say you like my little pony while you're at it, faggot.
>>
File: Z-image turbo.png (1.51 MB, 1024x1280)
>>107333872
looks like the non-Qwen team at alibaba is the more talented one lol
>>
>>107333892
sir, this is 4chan. there's nothing edgy about posting nazi stuff here
>>
>>107333856
Qwen Image is so unrealistic it looks like an NBA 2K game lool
>>
>>107333892
>calls someone a faggot
>gets triggered at a nazi uniform
reddit is most definitely more your speed lil bro
>>
>>107333856
>Celtics
Nice
>>
>>107333904
>this whole site is cringe thus I must be cringe too
kill yourself newfag.
>>
>>107333892
what are you even sperging about? im gonna cum to abigail shapiro either way. you're the only one crying about it to this extent. why even care this much?
>>
ok chinaman
give it to me
what is the catch?
>>
nazi larp is cringe, commie larp is cringe, larping is cringe.
cringe nigger alert
>>
>>107333929
>what is the catch?
it's only great at realistic shit, not so much everything else
>>
File: Z-image turbo.png (846 KB, 720x1280)
oh momyyyy
>>
File: ComfyUI_00045_.jpg (1020 KB, 2048x2048)
"4k quality, highly detailed professional photo. a portrait photo of a beautiful nordic woman with blue eyes and blonde hair. she stands in a forest during a sunset. she is backlit by the sunset as her face is lit by a sliver of light from above. her skin is realistic with freckles. her expression is disgusted with the viewer, frowning. the depth of field of the background is beautiful and there is a bit of dust particles in the volumetric haze."

Ok, I abandon flux 2.
>>
>>107333952
Do we know when it will be released? I want to run that shit on ComfyUi ffs.
>>
>>107333944
Why does Pinkie Pie have a massive hunchback?
>>
>>107333206
> multiple men, standing
>>
File: 1752095934303075.jpg (1.64 MB, 2592x2304)
https://www.reddit.com/r/StableDiffusion/comments/1p7a1g3/some_anime_style_images_i_generated_with_z_image/
For those wondering how well it fares at anime
>>
>>107333963
that one million gigabyte mistral text encoder is really proving its worth here
>>
File: 1739751800916514.png (3.52 MB, 1248x1824)
>>
>>107334005
for a model "specialized in realism" it produces better anime images than your regular Qwen/Flux slop lool
>>
>>107333963
kek

>>107334005
>>107333946
are you sure about that?
>>
>>107334029
>are you sure about that?
I rest my case, seems like this model is pretty good at everything
>>
>>107333963
>no idea how to prompt
>the model must be bad

go back to sdxl anon
>>
>>107333944
ponyfag, makes sense why you're cringe. you'll troon out by next year.
>>
File: WHAT A LOSER.png (239 KB, 640x478)
>>107334039
you lost to a 6b model you bfl-cuck employee
>>
>>107334005
Aieee sam altman save me, it's too dangerous
>>
>Flux 2 can't "obtain view from side" on images
lol...
>>
>>107334058
*it worked after 4 seeds...
>>
>>107333685
it knows who JJ Lin is?
>>
>>107333952
>great at realism
yeah ok whatever, I sleep
>>107334005
>great at anime
REAL SHIT??
>>
Why aren't we using FastWan?

>Sparse distillation for Wan2.1 and Wan2.2 to achieve 50x denoising speedup

https://github.com/hao-ai-lab/FastVideo

Looks like they've released I2V: https://huggingface.co/FastVideo/CausalWan2.2-I2V-A14B-Preview-Diffusers/tree/main
>>
Damn I wanna fine-tune z image.
>>
>>107334073
does it work with newer lightning loras at actual minimal quality hit?
how does it compare to painter nodes?

testing all this is annoying so people use what works
>>
>>107334073
Why aren't you? Post some examples. Shouldn't take too long, right?
>>
>6B parameters
>modern text encoder
>good at realism
>good at anime
>nearly Qwen-Image levels of prompt comprehension
>less slopped and better vibes than basically any other base model
>apache 2 license
Yeah, I'm thinking this is the model to finally kill SDXL once and for all. Fucking finally.
>>
File: img_00072_.jpg (507 KB, 1264x1592)
>>107334005
i wonder how do they do this with such a small model
>>
>>107334112
>Yeah, I'm thinking this is the model to finally kill SDXL once and for all. Fucking finally.
you have to respect how long SDXL managed to stay relevant though, almost 3 years lmao
>>
>>107334005
> 6 fingers
>>
File: alex jones approval.gif (2.12 MB, 177x210)
>it's even REALLY fucking good at anime out of the box
this is it.
>>
>>107334113
a lot of performance and quality was always on the table; look at how much Pony v6 improved the quality of its base model back in the day, bigASP, LoRAs, etc. Small tunes can drastically improve models, let alone when the company doesn't fuck up the model itself from the start
>>
>>107334112
There's always a catch. SDXL survived all other models, chances are this one too
>>
>>107334121
>turbo model
>>
BFL employee seething in this thread for being cucked by a small model lmao
>>
>>107334096
>>107334100
Calm your tits, I just found it
>>
>>107334135
>SDXL survived all other models
SDXL is literally doing this aura shit
https://www.youtube.com/watch?v=IRPI3lSACFc
>>
>>107334112
You also need it to respond well to training and not be unreasonably slow at inference (you can't tell purely from param count), but yes, honestly the most exciting model in a while, to say the least.
>>
>>107334157
>You also need it to respond well to training and not be unreasonably slow to run inference from (can't tell purely from param size)
https://www.modelscope.cn/models/Tongyi-MAI/Z-Image-Turbo/summary?version=master
>It offers sub-second inference latency on enterprise-grade H800 GPUs
this shit is hella fast, probably faster than SDXL
>>
File: YASSSSS.gif (760 KB, 498x243)
>>107334122
>this is it.
I've waited for this moment for so long, I'd almost given up believing it; local is saved
>>
>>107334112
> apache 2 license
To kill SDXL you need much more. This is like Illustrious-based models compared to Pony: just slightly better.
>>
>>107334206
>apache 2 license
>To kill SDXL you need much more.
what's better than Apache 2.0?
>>
>>107334214
Ligma 3.0
>>
File: do it.png (174 KB, 640x640)
>>107334005
I'm not gonna lie, if this Alibaba team can cook such a great image model, what's preventing them from doing the same with video? I can see them making something better than Wan 2.2
>>
Yume is anime SOTA
>>
>>107334233
I almost forgot that existed..
>>
>>107334214
It was an accidental quote.
>>
>>107334174
Wow.
Z-Image is the GEM that saved local diffusion from API demons.
All Hail Alibaba
All Hail Chinese overlords.
>>
File: plus 15 social credit.png (414 KB, 1280x720)
>>107334251
>>
>>107334258
+15 izzat credits
>>
>>107334233
And other hilarious jokes you can tell to yourself!
>>
when are your chinese overlords releasing Z?
>>
File: 1758243525681183.jpg (1.02 MB, 2460x1586)
>>107334005
https://www.reddit.com/r/StableDiffusion/comments/1p7a800/zimageturbo_anime_generation_results/
Here's some more anime kino
>>
>>107334282
they said today, so we're waiting...
>>
File: Z promotional image.jpg (80 KB, 1024x1024)
realistically speaking, how can white w*men compete?
>>
Z-Image sounds too goid to be true
>>
>>107334313
pls I really hope it's not some bait-and-switch shit, it's waay too good
>>
>>107334282
comfy already supports it; the PR has been merged, so it's imminent. A PR for flux 2 scheduler previews also just went in, if anyone is brave enough to update.
>>
>>107334282
>>107334324
soon enough, Comfy implemented the inference code yesterday, so the release will come right after
>>
>>107334300
this is actually pretty good, and it looks like it knows some character IPs; no more migu spam, that's always a good thing
>>
File: 1632792585100.png (486 KB, 500x762)
i don't even care about my thanksgiving pig out in less than 24 hours, all i care about is genning 1girl (huge booba:1.5) with the new best realistic chinese model
thank you xinnie the poo
>>
>>107334339
Asian 1girl >>>>>>>>>>>
>>
>>107334339
>i don't even care about my thanksgiving pig out in less than 24 hours,
to be fair, I know this model will make it my best thanksgiving ever, we peaked lol
>>
>>107334314
Actually existing may help.
>>
>>107333447
>>107332562
The flux2 output is obviously following the prompt better despite the skin texture. Prompted for dim lighting, but got a blast of hipster flash; noted that the shoulders of the top should be wet, but got a top with no shoulders.

It's just a different kind of slop, where it's still deciding what looks nice regardless of the ask. And with the flux images, by providing plastic skin by default it's at least not directly contradicting the prompt.
>>
>>107334324
>>107334326
>>107334327
If Z is actually what it seems, then FLX2 is DOA
>>
>>107334343
>implying some giga autist won't immediately have lora training figured out in less than 5 hours from release
or even just tags that seemed to work through brute forcing
>>
>>107334346
existing within the digital world is preferable to ours
>>
File: wut.jpg (12 KB, 185x185)
What was Flux again, 12B? I can barely run that shit quantized to Q6_K on my shitbox but it's very slow so if Z-Image is half the size that should be good news for me (sometime in the future when it gets quantized).
>>
>>107334314
by virtue of not being a bug
>>
>>107334355
it's a turbo model, the base model will respond better to prompts, and even if it's not the case you will be able to use the "reasoning" shit to make it even better at prompt understanding >>107333756
>>
File: flux2__00044_.png (1.68 MB, 832x1216)
>>107334143
kek
>>
that is fucking insane. no way a turbo'd 6b is doing this. we're being memed.
>>
>>107333376
Z-image with the same prompt; Euler, 9 steps. Can't get it to run more steps; it gives me an img-not-found error
>>
File: 1741916672643899.png (431 KB, 800x582)
>>107334381
>no way a turbo'd 6b is doing this. we're being memed.
don't underestimate china, they got this
>>
>>107334388
>9 step model somehow nearly matches fucking nano banana
wuh?
>>
>>107334366
>What was Flux again, 12B?
yep, so it's half the size but it looks way better and knows way more concepts
>>
>>107333756
lolol it's the incredibly useful 'solve a math problem on a whiteboard image' demo again
>>
>>107334336
Does it know teto?
>>
so whats the catch? theres always a catch
>>
>>107334381
JPEG the strongest
>>
>>107334400
I wouldn't go that far, the text is noticeably more slopped.
But the overall quality is very good, yes.
>>107334422
China will win AI race.
Which I am totally fine with.
>>
>>107334413
>it's the incredibly useful
you have no idea how useful it actually is, you can literally say something really vague like "make a comic about this subject" and it'll do everything, including the script, like Nano Banana Pro

for example, on that example >>107332797
I used this simple prompt
>Create a manga page (in color) on the following topic: a new local 32b image model called “Flux 2” (developed by the German company bfl) has just been launched and users are testing it, but they end up being disappointed (not realistic enough and too big). A day later, the Chinese company Alibaba releases a new local 6b model called “Z-Image” that is much better than Flux 2, almost as if they had waited for bfl to release their model so they could humiliate them right after (they are really petty).
>>
>>107334366
>just downloaded Q4_0
>between the model and encoder it's 35GB
>35GB

Thinking of just giving this one a miss and waiting for Chroma 2 or some shit. I'm already running out of space on a 2TB SSD (and that's after getting rid of all the models I didn't use).
>>
I have a bad feeling about this, I don't buy it after all the failed models
>>
>>107334422
>so whats the catch?
you have to accept the chinese state mandated bugwife who will record your sex to send to alibaba for nsfw model training of which you will only get the distilled turbo model.
>>
>>107334370
and by not eating them (also bats, snakes, cats and dogs).
>>
>>107333262
>idk
what you don't know could fill a library
>>
>>107334422
>so whats the catch? theres always a catch
so far the only thing that could go wrong is nudity, and even that isn't that serious lol
>>
>>107334400
Goes to show that the "pile more layers on it" is super flawed, at best

another z-image one, I tried the prompt improved-version of the F-35 pic I tried last thread:Professional quality modern analog photograph with visible film grain and vibrant color palette, featuring a single F-35 Lightning II in mid-air performing a tight pirouette maneuver. The aircraft’s entire fuselage and wings are covered in intricate, full-color irezumi-style tattoo illustrations — including traditional motifs such as koi fish, dragons, cherry blossoms, and wave patterns — rendered with fine linework and saturated pigments that contrast against the jet’s metallic surface. Colorful smoke trails in magenta, cyan, and gold emanate dynamically from both wingtips, curling through the air in response to the aircraft’s rotation. The shot is framed as a beauty portrait, captured from a low-angle three-quarter rear perspective to emphasize motion and artistry. Lighting is bright daylight with soft directional highlights reflecting off the jet’s curves and inkwork. Background is a clear sky gradient transitioning from pale blue at the horizon to deep azure overhead. The image appears as a full-page editorial spread in a professional aviation magazine; overlaid text in English is clearly legible and positioned along the bottom margin in a clean, sans-serif typeface, reading: "F-35 Lightning II: Where Stealth Meets Tradition". Text block is horizontally centered, 12-point size, with 1.5x line spacing and subtle drop shadow for readability against the background.
>>
>>107334355
This level of cope...

Flux open models will always look like plastic, which makes them worthless for anything realistic involving people
>>
>>107334421
>Does it know teto?
I hope so
>>
>>107334214
AGPLv3
>>
>>107334439
I wonder if adding more steps (20, 40, etc.) will improve the text. Unfortunately modelscope gives me an "img not found" error if I try more than 9 steps
>>
What's the best local diffusion model that can generate good-looking images with a minimal prompt? Good-looking meaning something the average boomer isn't going to recognize as AI.
>>
File: Z-image turbo.jpg (161 KB, 864x1152)
>>
>>107334005
gonna wait for danbooru finetune
>>
>>107334445
It's cool, but
a) it's done by having an LLM think about the initial prompt and then generate a very detailed prompt for the image generator, so the generator itself isn't doing the heavy lifting there (autoregression isn't a requirement). The important thing is complex prompt adherence.
b) this is useful for people who don't know what they want and have the mindset of commissioning a synthetic artist. If you want to act as an artist yourself, you need to maximize control over the image, not delegate large aspects of it to the computer.
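Point (a) really is just prompt preprocessing; the whole "reasoning" stage can be sketched as below, where `call_llm` and `generate_image` are hypothetical stand-ins for whatever backends you actually run, not any real API:

```python
# Hypothetical two-stage pipeline: an LLM rewrites the vague request, and
# only the expanded prompt ever reaches the diffusion model.
EXPAND_TEMPLATE = (
    "Rewrite this image request as one detailed paragraph covering subject, "
    "composition, lighting, and style:\n{prompt}"
)

def expand_prompt(prompt: str, call_llm) -> str:
    # All of the "thinking" happens here, purely in the text domain.
    return call_llm(EXPAND_TEMPLATE.format(prompt=prompt))

def generate(prompt: str, call_llm, generate_image):
    detailed = expand_prompt(prompt, call_llm)
    return generate_image(detailed)
```

This also makes point (b) concrete: if you want full control, you can simply skip `expand_prompt` and feed your own detailed prompt straight to the image model.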
>>
>>107334502
>>107334502
>>107334502
>>107334502
>>
>>107334519
>If you want to act as an artist yourself, you need to maximize control over the image
you have the choice to use it or to not use it, so everyone is happy
>>
>>107334422
the examples so far have what look like severe compression or scaling artifacts
>>
>>107334490
The answer is most probably yes but by how much is up in the air.
I guess we will be able to test locally when it releases SOON™
>>
>>107334224
>alibaba's only real competition is going be themselves
I really want this to happen now, sounds funny as fuck
>>
>>107333379
nice
>>
>>107332443
Offloading 16 GB of VRAM to CPU with anon's loading technique >>107324741 on fp8 (the workflow is Nvidia-optimized), it's only 2 mins per gen on my 3090. You might be doing something wrong, or GGUF isn't optimized yet.
>>
>>107332467
>>107332503
This is great, is it uncensored? Also it looks as though the Chinese have never seen a Chroma output, this is not groundbreaking realism y'know.
>>
>>107335955
using reference images makes it take longer
>>
Can someone please tell me what files to download to use Qwen in forge-neo?
Search engines are shit nowadays, I can't find anything, and nothing Copilot tells me works...



