/g/ - Technology


Thread archived.
You cannot reply anymore.




Character Traits: Young East Asian Woman Edition

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107334502

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe
https://github.com/ostris/ai-toolkit

>WanX
https://rentry.org/wan22ldgguide
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd
https://gumgum10.github.io/gumgum.github.io/
https://huggingface.co/neta-art/Neta-Lumina

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
File: 1750417508230724.png (443 KB, 1200x900)
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
It's here gentlemen
>>
ITS UP
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo/tree/main/transformer
>>
is there a workflow for z-image yet?
>>
>>107336636
>>107336638
>3 safetensor files
>fp32
what the fuck am I supposed to do with that?
>>
yawn

wake me comfy implements it
>>
Most important question: can it easily be trained on cunny?
>>
>>107336638
gotta wait for the fused comfyui version
>>
>>107336639
no workflow, no files that can be directly used on ComfyUi
>>
>>107336650
>what the fuck am I supposed to do with that?
touch yourself to the idea of being smart enough to figure out inference without baby visual scripting ui
>>
File: 1736197942411076.jpg (807 KB, 1790x1277)
Black Forest Labs got spawnkilled today and you're laughing?
>>
>>107336676
bvrutal mog
>>
File: Z-image turbo.png (685 KB, 1024x768)
>>
>>107336676
man the backgrounds in zimage are trash
>>
File: 1741566111348066.png (25 KB, 120x130)
>>107336676
>>
>>107336650
most diffusion AI and trainers have supported multi-part models for a while now

that's irrespective of just about everyone here preferring single large files. and there probably will be merged files by comfyorg or kj, third parties, and various gguf model derivative creators very soon
>>
File: 1732839677653387.png (515 KB, 644x937)
>>107336676
>Black Forest Labs got spawnkilled today and you're laughing?
Yes.
>>
>>107336639
There is support for Z-Image in comfy, so perhaps if you git pull there will be an example in the templates?
>>
>>107336651
>wake me comfy implements it
he already implemented the inference code, now we need compatible files
>>
>>107336721
there isnt an example yet
>>
uhm, comfy update? I sleep until chink-image update
>>
it's here.
https://huggingface.co/Comfy-Org/Z-Image-Turbo_repackaged
>>
>>107336726
it shouldn't be too hard to run it, it's the same shit for everything, transformers, text encoder, vae...
>>
>>107336704
>and there probably will be merged files by comfyorg
This, they usually merge so that there is just one diffusion model, one text encoder and one vae file
>>
>>107336731
You motherfucker
>>
>>107336731
BASED

Thank you Comfy!
>>
File: 1754187749863423.png (57 KB, 220x220)
>>107336731
oh you...
>>
>>107336638
Is this a brand new model or another flux tune? How uncensored is it?
>>
>>107336638
>24gb
So that is the impressive 6gb small thing that is blowing 32gb out of the water?
>>
File: file.png (188 KB, 383x377)
Why did BFL phone it in? And Z Image makes them look retarded.
>>
>>107336750
it's 12gb actually, since that one is f32 but that's overkill we always use bf16
>>
>>107336750
it's 6b parameters smoothbrain
>>
just show me the fucking cunny already
>>
>>107336731
comfy, you need to make it public
>>
>>107336749
>How uncensored is it?
we don't know until we can try it locally *shrug*
>>107336749
>Is this a brand new model
brand new model, by another team from Alibaba (not qwen team)
https://www.modelscope.cn/models/Tongyi-MAI/Z-Image-Turbo/summary?version=master
>>
>>107336779
It's basically lumina image 3.0 btw
>>
>>107336731
he made a typo, it's really here
https://huggingface.co/Comfy-Org/Z-Image-Turbo_nigger_faggot_67
>>
File: 1605809028493.jpg (23 KB, 398x500)
>so many models in such a short time
fuck yeah
>>
COMFY MUST BE DRAGGED INTO THE STREETS AND SHOT
>>
>>107336750
fp32 retard
scale down to fp8 or quantize to q8 and you'll get your 6gb
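for anyone lost on the 24gb vs 12gb vs 6gb numbers, a rough sketch of the arithmetic. It assumes the "6b parameters" figure quoted in this thread and counts weights only (no text encoder, no vae, no quantization overhead), so treat the results as ballpark:

```python
# Rough weight-only size estimate: parameters x bytes per parameter.
# The 6e9 figure is the parameter count quoted in the thread, not a measured value.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp8": 1}


def weight_size_gb(num_params: float, dtype: str) -> float:
    """Return the approximate weight size in GB (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


Z_IMAGE_PARAMS = 6e9

for dtype in ("fp32", "bf16", "fp8"):
    print(f"{dtype}: ~{weight_size_gb(Z_IMAGE_PARAMS, dtype):.0f} GB")
```

that is where the fp32 upload gets its ~24 GB, bf16 halves it, and an 8-bit format halves it again.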
>>
>>107336792
Thanks, anon
>>
File: ZImage_Output_16156.png (1.31 MB, 1216x832)
Z-Image is a bit better than I expected. Not perfect, the prompt adherence isn't mind blowing and the English text support is nothing to write home about, but for what it is it's quite nice (largely only because of the reduced step count via distillation though, it wouldn't be impressive at all if it were an undistilled model that needed a standard amount of steps).

Distillation does also mean training it will have all the same annoyances as Flux, unfortunately.
>>
>>107336790
it's the long awaited bigma, finally!
>>
so. strengths of z image over other models? it doesnt look plastic like flux, what else?
>>
>>107336795
based
>>
File: 1744328306574080.png (104 KB, 1450x766)
>>107336812
>Distillation does also mean training it will have all the same annoyances as Flux, unfortunately.
it's all right anon, they'll release the base model too, but later
>>
>>107336795
>>107336823
we don't have time for your comfy derangement and samefagging right now, julien
>>
>>107336790
Not really, it's guidance distilled, Lumina isn't. You'll be able to train Z-Image but it'll be exactly the same as training Flux was in terms of caveats and limitations.
>>
File: 1750640019722480.png (1.03 MB, 576x1280)
>>107336821
>apache 2.0
>it knows more character IP, celebrities and styles (like jojo style)
>>
>>107336812
What's important is making Flux 2 look like SD 3.0. Maybe the time they spent making it the safest model ever should've been spent on making it good.
>>
HAPPENING HAPPENING LOCAL WON LOCAL WON!!!
!!!!!
>>
>>107336843
>it's guidance distilled
They're releasing the base model, too
>>
https://huggingface.co/Comfy-Org/z_image_turbo
>>
File: file.png (479 KB, 894x894)
>>107336846
>Maybe the time they spent making it the safest model ever should've been spent on making it good.
>>
WHERE IS THE GODAMN .SAFETENSOR FILE. LET ME RUN IT FFS
>>
>>107336831
Remains to be seen if that one is better enough relative to whatever speed hit it has versus Turbo, though.
>>
>>107336812
>Distillation does also mean training it will have all the same annoyances as Flux
The Z-Image Base model is undistilled, it is what you will train on.

Flux dev was only available as distilled.
>>
>>107336855
>https://huggingface.co/Comfy-Org/z_image_turbo
ok it's the real one, let's gooo
>>107336862
dude it's a 6b model it's gonna be fast
>>
>>107336855
WHERE IS THE FUCKING VAE FUCKERRRRRRRRRRRR
>>
here it is sirs
https://huggingface.co/Comfy-Org/z_image_turbo/tree/main
>>
>>107336855
>>107336876
that one has no vae, but you can catch it here
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo/tree/main/vae
>>
>>107336676
Actually it's quite impressive that a company in rural europoor manages to develop their own model that can compete with models from megacorporations with 400B to multiple trillion in marketcap + government subsidies + built in AI education + imported researchers.
Results might be disappointing but at least they encourage OSS competition.
>>
>>107336877
Based
>>
Sweet, 2 new image models to play with, hunyuan to try and isn't ltx2 due this weekend? Localsisters, we're back again
>>
>>107336894
>develop their own model that can compete with models from megacorporations with 400B
they can't even compete against a 6b model ;-;
>>
>>107336845
can it work with image inputs? like qwen edit, or img2img?
>>
>>107336911
you'll have to wait for the release of the edit model >>107336831
>>
>>107336875
> it's a 6b model it's gonna be fast
retard
>>
>>107336894
It doesn't cost that much money to train a model, don't let them fool you. Flux 2 sucks because they undertrained it for its parameters and spent 80% of their time censoring it so you can't generate a JC Penny's lingerie catalog.
>>
>>107336911
see >>107336831
Z-Image-Edit - To be released
>>
>>107336846
It doesn't make Flux 2 "look like SD 3.0" in any way unless you're delusional lol, Z-Image subjectively looks a bit better aesthetically for photographic gens but the prompt adherence and everything else is quite noticeably worse than Flux 2. IRL when locally run Flux 2 is somewhat LESS "censored" than the original Flux was in my experience so far, anyways. I guarantee you that 100% of the people making a big deal out of their generic safety shpiel targeted at inference providers on HF were either ESL jeets who literally don't read English well or just people who have some kind of financial motive to discredit it.
>>
>>107336894
didn't they have a video model they said they'd release, before nothing happened? they slop and censor their models too, so they don't get too much grace from the foss community
>>
>>107336904
kek

true though, but surely all the time generating and training on synthetic images of breasts without nipples and missing genitalia was worth it for BFL, surely you want to be SAFE don't you anon ?
>>
>>107336877
> 19 hours ago
>>
>>107336875
It's a 6B model with 4B Qwen3 as the text encoder, which isn't exactly THAT lightweight in a vramlet context itself.
>>
>>107336934
he's not the one deciding when the model will be released, so he did this shit in advance and had to wait for the creators to give the go-ahead
>>
>>107336925
Okay BFL employee, no one will use it because it's too big and it's not good for how big it is. Congrats on making the next gigantic turd, sorry about the trauma of the Flux 1.0 dead baby gen causing you to censor everything remotely not a Rated-PG stock photo.
>>
https://huggingface.co/Comfy-Org/z_image_turbo/tree/main/split_files
transformer + text encoder
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo/tree/main/vae
vae
>>
>>107336918
>so you can't generate a JC Penny's lingerie catalog
Releasing local models that are more censored than public big tech SAAS models, surely BFL is on the right track!
>>
No workflows yet right
>>
>>107336853
give me your prompt I'll test it out on z-image turbo
>>
>>107336953
Ah, so you're either that one schizo I've talked to before who has some kind of aggressive specific anti-BFL agenda and thinks anyone who is just sensibly objective is in cahoots with them, or a different but identical schizo kek
>>
>>107336972
it's not complicated, it's just your regular transformer + text encoder + vae text2image workflow, you put 8 steps + cfg 1 and you're good to go
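a minimal sketch of why cfg 1 is the natural setting here, assuming the usual classifier-free guidance mix (the numbers are made up, and real samplers do this on tensors rather than lists). At scale 1 the formula collapses to the conditional prediction, so the negative prompt has no effect and an implementation can skip the unconditional pass:

```python
def cfg_combine(cond, uncond, scale):
    """Standard classifier-free guidance mix, element-wise over plain lists."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


cond = [0.5, -1.0, 2.0]    # made-up conditional prediction
uncond = [0.25, 0.0, 1.0]  # made-up unconditional prediction

print(cfg_combine(cond, uncond, 1.0))  # identical to cond
print(cfg_combine(cond, uncond, 4.0))  # pushed further away from uncond
```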
>>
>Z-Image is basically completely uncensored, can do pussy out of the box
holy FUG
>>
File: Z-Image turbo.png (463 KB, 640x960)
>A six panel Internet meme. In the second panel, a famous Irish male actor in black high neck shirt and dark hair tilts his head slightly to the left and asks "Show me the real successor to stable diffusion." In the fourth panel, he says "I said the REAL successor." In the sixth panel, he says "Perfection." The first panel is the logo of stable diffusion, which looks like a paint palette. The third panel is the logo of the word "flux", which is a letter F with cyan neon effect. The fifth panel is a logo of "Z-Image-Turbo", which is a yellow icon of thunderbolt.
>>
>>107337004
is it properly sideways?
>>
>>107336975
Historical colorized WWII-era photo, slight motion blur, film grain, slightly faded.
a white american newspaper boy wearing a wool coat and newsboy hat. He's running in the streets of new york, holding up a newspaper to show the viewer, has one hand up next to his mouth and he's yelling happily. Focus on newspaper.
The newspaper is named "The Jew York Times" from the year 2025.
Headline says in bold text "LOCAL WON". another smaller headline says "BFL becomes BTFO"
>>
Can someone please tell me what files to download to use Qwen in forge-neo?
Search engines are shit nowadays, I can't find anything, and nothing Copilot tells me works...
>>
>>107336987
nta but which text encoder node do i use?
>>
File: 1734962950933666.png (227 KB, 500x378)
>>107337004
>>
Just to temper expectations I think, just like many other distill models, the prompt adherence of this thing is rather poor.
It is prone to only partially follow your prompt.
Or maybe it is a quirk of these newer, smarter text encoders, I don't know.
It can also do only a few people and characters consistently. Its specific style knowledge also seems poor.
>>
>>107336961
He's just outright lying in a way that takes two seconds to disprove though (typical behavior if he's the exact schizo I think he is). I literally posted a Flux 2 pic of a lady in skimpy white lace lingerie last thread.
>>
>>107336985
>schizo
>schizo
>schizo
filtered
also no Flux 2.0 gen attached
>>
File: zimage___0001.png (1.3 MB, 896x1152)
it's actually really fast, gotta debug some stuff on my workflow though
>>
>>107337004
Does it pass the Hunyuan Image 2.1 blowjob test?
>>
>>107337018
the one that loads your safetensors text encoders models?
>>
>>107337024
>no Flux 2.0 lingerie catalog attached
>>
File: zimage afro woman.png (1.26 MB, 1024x1024)
Here is a z-image gen for the luls
>>
>>107337036
holy moly
>>
>>107337021
you may
https://files.catbox.moe/9hgygi.png
I don't even know if the workflow is right but it seems to be working.
>>
>>107337050
>>107337036
this look really good, we're so back!
>>
>>107337032
> anyone who has opinions based on easily observable fact and not literal fabricated bullshit I made up on the spot is wrong
yeah ok there kek
>>
bros, share your z workflow. also, vae link broken
>>
>>107337057
>apache 2.0
>great skin texture
>fast as fuck
>small model
>can do vaginas and boobs
lmao this is the fucking dream model, god bless the fucking chinks
>>
>>107337066
amazing how you said nothing
>>
>>107336855
>full model is 12g only
dang
>>
File: QWEN_00127_.png (1.59 MB, 896x1152)
naisu


its super fast!
>>
File: zimage__00002_.png (1 MB, 896x1152)
holy shit, this could be something
https://files.catbox.moe/mwuseh.png
>>
>>107337067
>share your z workflow
https://files.catbox.moe/9hgygi.png
>vae link broken
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo/tree/main/vae
>>
Is Flux 2.0 the first model that can be classified as obese?
>>
recommended sampler/cfg/etc settings for Z Image?
>>
>>107336894
They can fuck off bro.
>>
>>107337045
>I win because you heckin' didn't repost an image you already posted in the literal last thread!

why are you people like this lmao, why do you think this kind of tactic is going to work, is it a zoomer-specific thing or something
>>
>>107337087
hunyuanimage 3.0 is the true obese one lol
>>
File: 1748635265430239.png (1.22 MB, 832x1248)
10 seconds on a 3090. damn
>>
>>107337057
>>107337021
HOLY SHIT

what about closeup on the pussy though, does it turn into qwen style body horror?
>>
>>107337018
Any and set it to qwen.
>>
>>107337097
>you already posted in the literal last thread
prove it
>>
>>107337103
Doesn't matter, it means they didn't lobotomize it like Flux 2.0, which they wrote an essay about how they censored it in every way possible from the dataset to lobotomies to red teaming prompts to give bad results.
>>
>>107337102
>10 seconds on a 3090. damn
lmao, it took me 5 minutes on flux 2 on my 3090, and all I had was slop
>>
File: QQCiRqVdYpc1_V2sIliPK.png (806 KB, 1024x768)
>>107337057
>>
File: zimage__00003_.png (1.14 MB, 896x1152)
looks like you want cfg 1

15 steps euler/beta
>https://files.catbox.moe/uay0f5.png
there are limitations here though
>>
vae added

https://huggingface.co/Comfy-Org/z_image_turbo/tree/main/split_files
>>
>>107337112
>it's the 'prove it' guy
>>
>>107337077
12gb at its full size, it means no more quant copes anymore, it's beautiful
>>
File: r.jpg (139 KB, 832x1488)
>>107336901
yes, they got us many decent releases.

>>107337006 >>107337057
impressive start
>>
File: 1758055281190134.png (1.85 MB, 1282x848)
Gotta test the seed variance more, but for this promo prompt from the promo image on the right I get the 4 images on the left

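Character traits: young East Asian woman, shoulder-length medium hair with the ends curled slightly inward, dark brown hair with wispy air bangs; wearing large black round-framed glasses with blue details on the temples; light makeup, pale pink lips, a small stud earring in the left ear, a thin chain necklace around the neck; wearing an off-white knit top.
Style and mood: fresh, everyday selfie style, soft lighting, the subject has a natural smile and a friendly expression.
Background details: behind her is a round sign bearing the text “L NREODNAZ SOVEK ALLERHAO”; the background wall combines marble texture and wood, with warm lighting.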
>>
>>107337138
it's undertrained and probably not captioned properly, but it's fixable with minimal effort
>>
>>107337138
>limitations
it doesn't look too bad, that can be so easily fixed
>>
>>107337144
>it's the "'prove it' guy" call out guy
>>
File: 4278921117.jpg (8 KB, 299x168)
What's the chance lllyasviel comes back just to add support for this shit to Forge. I don't want the noodles.
>>
>>107336638
Any 6gb version out for us vramlets?

Captcha: DVRRM
Gimme da V-ram
>>
Question, if I may:

Why are y'all posting all this lame shit here, while the technology is way better these days? Look at what this dude posted (mildly NSFW): >>107336734

Also, what is he using? WAN 2.2 ?
>>
>>107337170
Just cut your dick off bwo
>>
>>107337072
>small model
Isn't the file 24GB?
>>
>>107337173
distorch2
>>
>>107337173
>Any 6gb version out for us vramlets?
run the model on fp8 mode
>>
>>107337173
not yet I think but maybe the comfyui-multigpu distorch2 loader can offload some to system RAM?
>>
File: A8W208uSoLmaLyWVd66Ud.png (787 KB, 1024x768)
Think it's pretty obvious that Pony V8 is going to be based on this model, especially since we're getting a pre distilled version.

Only thing I don't like is that the seed variation is minimal
>>
>>107337176
shutup fag and get lost
>>
>>107337185
no, it's a 6b model so it's 12gb at bf16
https://huggingface.co/Comfy-Org/z_image_turbo/tree/main/split_files/diffusion_models
>>
File: 1736043765716838.png (622 KB, 500x653)
>>107337157
prompt: woman

OK, it's not bad like qwen was
>>
File: QWEN-FD_00019_.png (2.77 MB, 1080x1920)
>>
File: ComfyUI_08838_.png (1.65 MB, 1152x1152)
>>107336320
Yeah I guess we will see. One thing it's got going for it though is that it's an edit model, so that's pretty neat that we have one this good at photorealism, more converged on small details and got exceptionally better celeb/IP concept knowledge than Chroma (possibly also better architecture).

However detail wise Chroma still does a better job at defeating slop, going based on these two threads here

https://www.reddit.com/r/StableDiffusion/comments/1p77dli/some_images_i_generated_using_the_zimage_model/?captcha=1

https://www.reddit.com/r/StableDiffusion/comments/1p7d69e/image_comparisons_between_flux_2_dev_32b_and/#lightbox

It seems like Z image is a finetune of a model that is already a bit slopped, which would explain why there is still some slop left. Anyways, given it's lightweight and non-distilled, I guess it's nothing a couple of finetunes or LoRAs can't fix.

Also about the details, that's one thing I'm seeing Chroma do better consistently. I mean, I've got plenty of images to test when it comes out, but can this Z model produce images as sharp as the ones Chroma can out of the box?

>>107336636
Oh, actually, beautiful. Time to test (note pic rel is Chroma).
>>
holy crap it's so fast
kinda excited for the base non-turbo version
>>
File: 1739990718350626.png (87 KB, 1211x667)
>>107337173
offload some of the model to the ram with this node
https://github.com/pollockjj/ComfyUI-MultiGPU
>>
>>107336676
>that'll be another $1bn investment round please (+tip)
>>
File: z-i-t.jpg (140 KB, 896x1152)
>>107337217
yep and it is really quite good while being fast
>>
>>107337170
just learn comfy. i used forge last year. but lllyasviel is too slow
>>
File: 1758796356738478.png (1.22 MB, 896x1152)
>>
File: z-i-t.jpg (145 KB, 896x1152)
>>
>>107337170
Sorry bro gotta transition to comfy to use any model newer than 2024. Those are the rules.
>>
File: 1761419123821565.png (169 KB, 310x293)
https://files.catbox.moe/k3gb2l.png
>>
>>107337256
prompt?
>>
>>107337256
this looks so good, can't believe it's just a 6b model
>>
I swear to fucking god these permissively licensed open source Chinese models are a CCP funded psyop to destroy western AI industries.
Why else would you make something this good and then give it away for free.
Not that I'm complaining.
>>
>>107337186
>>107337188
>>107337194
>>107337219
Thx, but i need it for Forge.
>>
>>107337256
oi oi catbox
>>
>>107337204
Wrong. I'm just impressed by the tech advancement that that video demonstrates (like any person with a brain would be)

Why are y'all still posting lame shit like a semi-blurry photo of some girl with 4.5 fingers standing around in a bar?
>>
File: fuck'em western dogs.png (86 KB, 204x247)
>>107337273
>Not that I'm complaining.
me neither
>>
What are the odds that LoRAs will be compatible between base and distill?
The distill will likely be not great to train.
>>
File: z-i-t.jpg (546 KB, 2048x2048)
higher resolution gens also seem to work pretty well
>>
>>107337265
im feeling unsafe rn
>>
>>107337273
remember when deepseek knocked dollars off OAI shares? that
>>
>>107337286
>waa waa my 28b MoE model is better than a 6b model
no it's not, get the fuck out lol
>>
https://huggingface.co/spaces/Tongyi-MAI/Z-Image-Turbo

if you wanna just quickly try without setting up locally, even with queue and 5 H100s in use an image takes just 7 secs and they dont have any additional filters here
>>
>>107336853
this took 16 SECONDS on my 7900 XTX vs 90 seconds for same prompt with Chroma. Better text, prompt adherence, and anatomy.

I think I'm about to delete qwen image. have to do more comparisons first.
>>
File: z-i-t.jpg (105 KB, 896x1152)
>>107337267
> anime, sakimichan \(style\), dramatic lighting, cold colors, ray tracing, HDR, goth girl, solo, petite, wet clothes, black micro shorts, big belt with skull decoration, highleg panties, black crop top with skull print, spiked collar, underboob, pantyhose, goth makeup, platinum blonde hair, twintails, shiny skin, high heels boots, stone steps, night, rain, dynamic lighting, modern, vogue, historical architecture, standing, dutch angle, looking at viewer

it can be shortened
>>
>>107337286
when we could be posting tim cook in a frock? fuck off
>>
>>107337057
>flux_vae.safetensors on the workflow
oh they used that vae I see
>>
>>107337246
>>107337261
I use comfy for TTS, but it sucks for visual work compared to Forge. Oh well.
>>
>>107337138
When is their execution scheduled? Xi won't let them live for too long.
>>
30s for 4 images on 3090 with Z
>>
>>107337179
I already did. Now what?
>>
File: Z_00014_.png (3.36 MB, 1080x1920)
>>
why dont they release the models as single files or quantized files
>>
>>107337123
dude there's no practical reason to give a shit about what BFL wrote on HF lmao, nothing you say is going to magically change the capabilities of Flux 2 IRL just because you want to be mad about something. FLUX 2 works in a mundane completely expected manner that's not worse than the original Flux in any way as far as "censorship".
>>
File: 1742669879013621.png (635 KB, 2160x1440)
>he didn't pull
>>
>>107337273
All the Anthropic CEO talks about anymore is banning open source Chinese models, they're really starting to get to them
>>
File: 808470.png (1.57 MB, 896x1152)
>8/8 [00:08<00:00, 1.09s/it] (3090)
AIEEEE this is so fast, it's even faster than fucking SD1.5 lmaooooooo
>>
Is there a good way to upscale wan2.2 outputs?
Somewhere after the final video combine output or right before would probably work, right?

I like my current workflow, so I really just want something to slot in, rather than something brand new.
>>
>>107337328
comfy is fucking awful for all media kek
>>
>>107337273
>give it away for free
Because the paying customers don't have $800 GPUs? Why are people so retarded? Some serious brainlets making retarded decisions because they can't critically think.
>>
>update comfyui
>new offloading method is suddenly hot garbage
go figure
>>
so do we have a certified LDG workflow yet or do i have to sit here with my dick in my hand
>>
File: 1732969766457980.jpg (517 KB, 2048x2048)
Flux knees status: capped
>>
>>107337428
>>107337085
>>
File: 1738145011682237.jpg (1.27 MB, 2048x2048)
>>107337370
>8/8 [00:40<00:00, 5.05s/it]
it's still fast at 2048x2048 but it's obvious it wasn't trained for such high res
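a quick sanity check on the two progress bars, assuming both gens ran on the same 3090 with the same settings and that the 1.09s/it run was the 896x1152 image it was posted with. It only compares the pixel-count ratio to the per-step time ratio:

```python
def total_seconds(steps: int, sec_per_it: float) -> float:
    """Total sampling time for a run, ignoring model load and VAE decode."""
    return steps * sec_per_it


low_res = (896, 1152)    # resolution of the 1.09 s/it gen
high_res = (2048, 2048)  # resolution of the 5.05 s/it gen

pixel_ratio = (high_res[0] * high_res[1]) / (low_res[0] * low_res[1])
time_ratio = 5.05 / 1.09

print(round(total_seconds(8, 1.09), 2))  # sampling time at the lower resolution
print(round(total_seconds(8, 5.05), 2))  # sampling time at 2048x2048
print(round(pixel_ratio, 2), round(time_ratio, 2))
```

roughly 4x the pixels costs a bit more than 4x the time per step, which fits attention cost growing faster than linearly with image size, but two data points are not a benchmark.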
>>
File: ComfyUI_temp_jdmdl_00001_.png (1.32 MB, 1152x1152)
>>107337214
Same prompt
>>
>>107337102
catbox please
>>
File: 3818047352.png (1.55 MB, 1680x720)
>>
File: ZIMGUps_00002_.jpg (1.75 MB, 2160x3840)
>>107337342
>>
Is SDXL finally obsolete?
>>
File: z-i-t.jpg (264 KB, 2048x2048)
>>107337330
i'm not sure xi will care to rouse the coomers? he also learned this is not worth risking the mandate of heaven when he tried to rouse the gamers.

>>107337269
yea, its cool
>>
>>107337463
Already was.
>>
File: ComfyUI_08841_.png (1.43 MB, 1152x1152)
>>107337440
Quite fast though, which is interesting.
>>
>>107337463
SDXL only becomes obsolete when a new model gains as many loras and finetunes as it.
>>
File: ZImage_00021_.png (1.18 MB, 1152x896)
Z-Image feels like being back in the SD1.5 days where you can roll a batch of 4 images at a time in 10 seconds, god bless
>>
File: Z Image Turbo.png (2.75 MB, 1920x1080)
>Detailed photograph RAW of seven smiling friends of different races that are at a nightclub concert with dim lighting that is shining on their faces, behind them is a crowd of people dancing while fighting with large swords, everyone is holding a sword in their left hand and an intricate beer glass with differently colored beer in the right hand. Far behind them above the DJ there is a sign which has "Minimum drinKing age 021!" written on it in stylized cursive letters.
>>
>>107337196
ok but pony sucks, what are the chances we can get another noobai?
>>
>>107337478
Will be never because porn is effectively censored for distribution going forward.
>>
>>107337482
good text for 6B
>>
>>107337326
Or genned videos that are indistinguishable from reality.
>>107337309
>a 6b model
So why post this yesteryear's stuff here? Post some cutting edge shit or fuck off.
>>
>>107337438
Workflow pls
>>
Is this the new go-to model for most local use?
>>
File: zimg_0016.png (2.02 MB, 1080x1920)
you can toss a face detail on and it's still like 30s. good god. it's not as _good_ as flux but it's way way better value
>>
File: 1746077248026276.jpg (628 KB, 2048x1244)
>>107337438
Since it's fast as fuck you can increase your steps without much problem, don't stick to 8
>>
Total Chang Supremacy.
Western Dogs on suicide watch.
Praise Xi.
The white man begs for thick juicy Chinese researcher cock.
>>
>>107337485
>seven smiling friends
It's only six, picture still looks good though.
>>
it does paintings too. has its own prompting quirks, but holy fuck. other than NSFW, there's almost no reason to use flux, qwen, chroma, lumina, or SDXL compared to this.
>>
File: z-i-t.jpg (266 KB, 2048x2048)
>>107337283
just plug prompt >>107337324 into any of the already catboxed workflows in this bread, saves you extra nodes you didn't need so far and unconventionally named checkpoints
>>
>>107337496
>>107337085
>>
>>107337455
that splash makes no sense but otherwise cool pic
>>
File: ZIMG_00009_.jpg (419 KB, 1080x1920)
>>107337457
>>
>>107337485
>>107337512
I was genning at fullhd instead of whatever is the more usual res
>>
File: 1755495491449030.png (3.35 MB, 1152x1536)
>>
will that fox fucker ever decide on an icon for the bypass button
>>
>>107337485
this looks really good, I love the skin texture and that soft lighting, total chang victory!
>>
if this thing can be trained.. dear god
>>
>>107337505
officially speaking, yes
>>
Would my 4070 Ti Super (16GB) be good for this model?
>>
>>107337515
prompt?
>>
>>107337482
try to increase the number of steps for better text
>>
File: ZImage_00026_.png (1.54 MB, 896x1152)
>>107337539
lmao you're right this is the third time he's changed it. but this one actually makes sense i think
>>
We finally have mogao at home, bros
>>
>>107337170
No one should have to subject themselves to spaghetti. There are some forks of Forge that are being updated and can run the latest models, and you can use SwarmUI to sidestep the spaghetti. It's all in the OP
>>
File: z-i-t.jpg (97 KB, 896x1152)
>>107337564
shouldn't really be a problem. if needed use the multigpu distorch2 loader
>>
>>107337564
it's 12gb nigga
>>
File: 1749777890175824.png (1.57 MB, 1024x1024)
>A 1700's painting style of Elon Musk, he is playing a N64 video game,
kek
>>
File: 2601649848.png (915 KB, 1152x896)
It refuses to give me BIG boobs though.
>>
File: z-i-t.jpg (125 KB, 896x1152)
>>
>>107337598
it's very thin biased unless you want blubbering whales
>>
>from now on, all 1girls will have asian features
>>
File: ComfyUI_08847_.png (2.15 MB, 1152x1152)
Nice
>>
>>107337598
There's no mainstream text captions that have captions for breasts, you'd have to see what Qwen VLM says for an image for a well endowed woman.
>>
>>107337564
I'm using a 4070 12gb and it's great
>>
Can someone objectively explain why this model is better than FLUX.2 despite being over 5 times smaller? Without memes. Is it because FLUX.2 tries to do too much stuff at once, all the different styles or dataset? Or why?
>>
File: 1638690581044.jpg (1 MB, 2475x1124)
Check
suiseiseki desu 
with Z-Image-Turbo please? The Flux they open-weighted is terrible.
>>
File: z-turbo_00007_.png (2.99 MB, 1152x1536)
damn, crushing it right out of the box. can't wait to see how good this can get once people start dialing it in.
>>
>>107337625
https://huggingface.co/spaces/Tongyi-MAI/Z-Image-Turbo check yourself
>>
>>107337618
because flux devs have bad taste
>>
>>107337267
>>107337269
>>107337283
looks like the million xl slop images posted to civitai KEK
not to say i think the model is bad but holy fuck raise your standards anon
>>
>>107337618
likely better dataset, less safety
>>
>>107337632
Vramlet desu, otherwise I wouldn't be testing with a crappy API.
>>
File: ComfyUI_08848_.png (1.83 MB, 1152x1152)
>>107337611
This fly fishing prompt is the first result I got that is as sharp as Chroma. Given that the models are probably not prompted the same way, how fast the images generate, and that it's just a first try, I'll take it.
>>
File: zimage.png (2.13 MB, 1024x1536)
sirs...
>>
...i have to update comfy...dont i? getting errors

>Error(s) in loading state_dict for Llama2: size mismatch for model.embed_tokens.weight: copying a param with shape torch.Size([151936, 2560]) from checkpoint, the shape in current model is torch.Size([128256, 4096]).
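
The size mismatch above says the checkpoint's embedding table is [151936, 2560] while the model the loader built expects [128256, 4096], so the installed code is constructing a different text encoder than the one in the file. That is consistent with the guess that an update is needed. A minimal sketch of that kind of shape check in plain Python; `find_shape_mismatches` and both dicts are hypothetical illustrations, not ComfyUI or PyTorch code:

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for tensors whose shapes differ."""
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        # Only compare tensors that exist on both sides.
        if model_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Shapes copied from the error message; the dict layout itself is an assumption.
checkpoint = {"model.embed_tokens.weight": (151936, 2560)}
current_model = {"model.embed_tokens.weight": (128256, 4096)}

for name, (got, expected) in find_shape_mismatches(checkpoint, current_model).items():
    print(f"size mismatch for {name}: checkpoint {got}, current model {expected}")
```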
>>
>>107337488
Unless some new weeb savior appears, ain't no way anyone but Pony will finetune this model
Every corpo doing weebstuff ran out of money, see neta, illust, animagine
>>
File: 000.jpg (586 KB, 1440x2304)
holy mog, guess which one's flux2
>>
>>107337645
Sir i linked you a fucking HF space which runs those models online
>>
>>107337533
right thumb got fucked up

compare to (warning for semi-gays: if you are worried about turning fully gay, do not click) >>107336734
>>
>>107337510
this anon is correct. ACTUALLY, you can even use CFG on this turbo model. 4 CFG works fine though it slops a little and changes output, I think 1.5 or 2.5 CFG might be a good sweet spot worth testing.

pic is 20 steps, 1.5 CFG. Could be worth it if you want a negative prompt.

>>107337515
>>107337570
Masterpiece realist oil painting by Czeslaw Znamierowski.
Extreme very distant low angle side profile shot of many strong and confident Chinese PLA red army soldiers wearing full Chinese PLA military uniforms, some of them holding black modern Kalashnikov rifles, on top of a hill in the middle of a ruined warzone full of rubble, one man is raising a billowing flag with golden bold text saying "6B" sewn into it. like the famous iwo jima photo and screaming a war cry.
Bold red text at top says "TOTAL VRAMLET VICTORY".
Dramatic cinematic lighting, sunrise, snowstorm blowing in wind, epic.
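
On the CFG remark in this post: the usual classifier-free guidance combination is uncond + scale * (cond - uncond). At scale 1 the unconditional (negative prompt) prediction cancels out, which is why a negative prompt only has an effect above CFG 1. A minimal sketch with plain lists standing in for model outputs; `apply_cfg` is a hypothetical helper, not a ComfyUI function:

```python
def apply_cfg(cond, uncond, scale):
    """Standard classifier-free guidance mix, element by element."""
    return [u + scale * (c - u) for c, u in zip(cond, uncond)]


cond = [1.0, 2.0]    # stand-in for the positive-prompt prediction
uncond = [0.5, 4.0]  # stand-in for the negative-prompt prediction

print(apply_cfg(cond, uncond, 1.0))  # identical to cond: negative prompt is ignored
print(apply_cfg(cond, uncond, 1.5))  # pushed away from uncond, past cond
```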
>>
File: ComfyUI_18640_.png (1.43 MB, 1024x1024)
>>
File: zimg_0023.png (2.14 MB, 1080x1920)
>>107337618
better is subjective. for fast generation of realistic images this is better if you aren't concerned with super high fidelity
>>
>>107337653
>Every corpo doing weebstuff ran out of money, see neta, illust, animagine
except novelai who are doing just fine...
>>
>>107337370
is there a q8 or single file yet? I still dont know how to use the default batch of files
>>
File: ZiMG_00017_.jpg (392 KB, 1080x1920)
>>107337639
>>107337661
>>
>>107337648
>tfw making chubby indian women in chroma
>now in z

time to get my zutt on
>>
Btw this model can generated naked <18 [CENSORED], are you sure ETHICISTS won't complain about it on Xitter and get it taken down?
>>
anyone know the command line arg to force fp32 in ForgeUI? Forgot it and can't find it online anywhere
>>
first load is pretty slow on 4060 but 8step is like 20 seconds

impressive, hope someone will do an illustrious-style finetune using full model
>>
File: 1754835087913.png (1.26 MB, 1024x1024)
>>107337660
Oh. My mistake. It seems to need actual prompting but cool.
>>
File: ZiMG_00020_.png (2.46 MB, 1080x1920)
NOICE
>>
File: 1748576941856962.jpg (362 KB, 2048x2048)
>>107337625
this is what i got with just suiseiseki desu
>>
>IP range is blocked from uploading images again
FUCK YOU GOOKMOOT I WILL NEVER BUY A PASS
>>
File: ComfyUI_08846_.png (1.6 MB, 1152x1152)
>>
File: gen_01.jpg (844 KB, 1408x768)
>>107337256
>>
At this speed i can train a zimage lora on my saved realistic chroma gens with the specific women i like and speed up those by x10

Hopefully Ostris publishes the vid soon
>>
>>107337057
Ahh, it being uncensored out of the box means it will most likely train these concepts very well meaning you can easily improve them with loras / finetuning, which will also be fast since the model is 6B, what a time to be alive

Chinks FTW!
>>
>>107337715
Imagine the loras at this speed
>>
>>107337706
at least thats temporary, 4chins wont let me post images or vids because of "incognito" or some shit
>>
File: 3453899854.png (2.51 MB, 2048x1024)
>>107337611
Quite good indeed
>>
File: ComfyUI_08854_.png (1.17 MB, 1152x1152)
Close but no dice
>>
File: ZImage_00032_.png (827 KB, 896x1152)
>>
>>107337706
>>107336734

You shouldn't upload gay shit to 4chan. It's in the rules.
>>
>>107337706
it looks like a firefox issue, i got that same message and it worked fine in edge lmao
>>
File: ComfyUI_temp_nsdns_00006_.png (3.04 MB, 1408x1792)
we are so back
>>
File: ComfyUI_08844_.png (1.32 MB, 1152x1152)
>>107337728
Yep, better than all API models I've tried for the same prompt.
>>
Yellow Fever General
>>
>>107337696
BigASP guy had just recently decided to use Wan2.2 5B for BigASP 3.0, I think he should hold his horses and wait for Z-Image Base before he goes into full training.
>>
>>107337631
>>107337703
>>107337697
>>107337646
>>107337598
Why are you all gening women with a Down syndrome?
>>
>>107337745
>>107337746
Looks like flux1dev.
>>
>>107337762
>kink shaming
>>
File: z-anime.png (2.71 MB, 1080x1920)
>>
File: ZiMGFD_00003_.jpg (427 KB, 1920x1080)
>>
>>107337754
Looked up this BigASP guy, but doesn't he only do realism finetunes? Didn't see any illust-like model on his repo
>>
File: ComfyUI_08856_.png (1.26 MB, 1152x1152)
>>107337765
Yeah, the model is still a bit slopped and not as detailed as Chroma, but it's still refreshing to see a model this good, and it's just Turbo. Let's see how non-Turbo fares against it.
>>
help a guy out, could someone point me in the direction of where to learn about setting z-image turbo up? it's comfyui isn't it?
>>
>>107337798
any good nsfw finetune needs cartoons because it makes the model learn better
it's why I have furries even though I'm not a furry, because you need the model to know more context
>>
File: z-turbo_00022_.png (2.6 MB, 1536x1152)
>>
File: ComfyUI_18664_.png (1.34 MB, 1024x1024)
Pretty good with realism
>>
File: brazilian_pentecostal.png (999 KB, 896x1152)
>>107337610
>what is prompting
>>
>>107337677
use this single file
https://huggingface.co/Comfy-Org/z_image_turbo/tree/main/split_files/diffusion_models
>>
File: 1674168175267.png (934 KB, 1024x1024)
>>107337703
Best attempt with
Suiseiseki, Rozen Maiden, A high-quality 2D anime-style illustration of a girl with very long, wavy chestnut-brown hair. She has heterochromia, with her right eye being red and her left eye green. She is wearing a white headdress with long, heavy lace veils draping down past her shoulders. She wears an elaborate deep emerald-green Victorian gown. The dress features a high-necked white ruffled bodice with a black ribbon bow, a corset-style waist, and a voluminous floor-length skirt with tiered layers and black ruffled edges. The coloring is flat and cel-shaded.
It doesn't seem to want to do heterochromia, much less follow the color arrangement.
Thanks, anon.
>>
>>107337810
Yep, download and install comfyui and use a basic sdxl image genning workflow
>>
File: z-i-t.jpg (79 KB, 1168x896)
>>107337765
it is certainly better if you ask me

>>107337711
looks nice, yes
>>
File: 1399231694.png (1.31 MB, 1248x832)
>>107337823
cute
>>
File: ZiMG_00026_.png (2.76 MB, 1080x1920)
>>
>>107337732
increase your steps
>>
>updated comfy
>now it just loops loading

Oh come the fuck on
>>
Z-Image is the most exciting thing for me since Wan given its quality despite how small and fast it is. Here's hoping it trains well, then it's a small miracle.

Alibaba must have wanted to shank BFL though with this release timing, lol what an absolute murder.
>>
File: ZImage_00036_.png (1.73 MB, 896x1152)
if this distilled 8 step shit is already so good i have high hopes for the base model and subsequent loras/finetunes
>>
File: ComfyUI_08861_.png (1.39 MB, 1152x1152)
>>
>>107337852
is comfyui really this unstable now? I'd hate to be an investor paying for this shit lmao
>>
are fp16 and bf16 the same?
>>
File: 2058653894.png (1.26 MB, 896x1152)
>>107337858
indeed
>>
File: syusdus.png (1.97 MB, 1080x1920)
>>107337828
>>
>>107337868
no, bf16 is lower precision (fewer mantissa bits) but has a wider exponent range
>>
File: ComfyUI_08862_.png (1.15 MB, 1152x1152)
>>107337841
Interesting, here's 10 steps.
>>
>>107337868
no, transformer models are on bf16 and text encoders are on fp16. don't ask me why, I guess both train better that way
>>
Fresh

>>107337882
>>107337882
>>107337882

Fresh
>>
>>107337874
but is it better than fp8 or still lower? sorry for the dumb questions.
>>
Welp guess I'm a Zigger now
>>
File: z-i-l.jpg (96 KB, 1168x896)
>>
>>107330399
>>107330336
>>
File: 1741924356601568.png (2.54 MB, 1280x1280)
>>107337855
>Alibaba must have wanted to shank BFL though with this release timing, lol what an absolute murder.
indeed, this model is fucking amazing
>>
>>107337867
Looks like it. I updated as I normally would and got this new error...

>Alert Error saving setting Comfy.CustomColorPalettes: TypeError: NetworkError when attempting to fetch resource.

It's a good thing it's not my work ComfyUI setup
>>
File: ComfyUI_temp_nsdns_00018_.png (3.04 MB, 1600x1600)
feels good man
>>
>>107337885
It's better than fp8, it's also better than fp16 in some ways (more stable even though less precise)
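
To put numbers on "more stable even though less precise": fp16 uses 5 exponent bits and 10 mantissa bits, while bf16 uses 8 exponent bits and 7 mantissa bits. fp8 variants have only 2 or 3 mantissa bits. A minimal sketch that derives the largest finite value and the spacing just above 1.0 from those bit counts, assuming IEEE-style formats where the top exponent code is reserved for inf/NaN; `float_format_stats` is a hypothetical helper:

```python
def float_format_stats(exponent_bits, mantissa_bits):
    """Return (largest finite value, gap between 1.0 and the next value)."""
    bias = 2 ** (exponent_bits - 1) - 1
    # The all-ones exponent code is reserved for inf/NaN, so the top usable code is one below it.
    max_exponent = (2 ** exponent_bits - 2) - bias
    max_value = (2 - 2.0 ** -mantissa_bits) * 2.0 ** max_exponent
    epsilon = 2.0 ** -mantissa_bits
    return max_value, epsilon


fp16_max, fp16_eps = float_format_stats(5, 10)
bf16_max, bf16_eps = float_format_stats(8, 7)

print(f"fp16: max={fp16_max}, eps={fp16_eps}")
print(f"bf16: max={bf16_max:.4e}, eps={bf16_eps}")
```

bf16 takes coarser steps between values but overflows far later than fp16, which is the range-versus-precision trade the replies are describing.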
>>
>native 2048x2048 gens all run properly with no monstrosities or fucked up background

what the FUCK are those chinks cooking?
>>
>>107337920
thank you
>>
fresh thread btw
>>107337882
>>107337882
>>107337882
>>107337882
>>107337882
>>
>>107337901
kek based
>>
>>107337826
ty
>>
>>107337732
>>107337875
Huh. Wonder what the optimal number will be.
>>
>>107337485
still has ai face
but everything else including hands and background faces look superior to everything that came before it

brains vs bloat
they should employ more brains


