/g/ - Technology


Thread archived.
You cannot reply anymore.


Total VRAMlet Victory Edition

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107336625

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe
https://github.com/ostris/ai-toolkit

>Z
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
https://huggingface.co/Comfy-Org/z_image_turbo

>WanX
https://rentry.org/wan22ldgguide
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
File: ZiMGFD_00004_.png (2.82 MB, 1080x1920)
gm
>>
>>107337882
>Z
>Wan(X)
>Neta(Y)ume
heh alphabetical
>>
Why are y'all posting all this lame shit here, while the technology is way better these days? Look at what this dude posted: >>107336734
>>
>release a great model anyone can run to give all white men yellow fever and take over the world
i kneel...
>>
Z is DOA. Shit model.
>>
File: z-i-t.jpg (112 KB, 1168x896)
>>107337882
ty for bake
>>
blessed thread of frenship
>>
>>107337897
how are you getting the sharpness?
>>
File: ComfyUI_temp_nsdns_00019_.png (3.14 MB, 1600x1600)
AHAHAH LOCAL WON
>>
File: ZiMGFD_00006_.png (2.99 MB, 1080x1920)
>>107337928
fast laplacian sharpen node
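
For anyone wondering what a Laplacian sharpen does in general, here is a minimal NumPy sketch. It is not the code of the node mentioned above; the function name `laplacian_sharpen` and the `strength` parameter are made up for illustration, and the real node may use a different kernel or blur step.

```python
import numpy as np

def laplacian_sharpen(img: np.ndarray, strength: float = 0.5) -> np.ndarray:
    """Sharpen a float image in [0, 1] of shape (H, W) or (H, W, C)."""
    # Pad by replicating the border so the output keeps the input size.
    pad = [(1, 1), (1, 1)] + [(0, 0)] * (img.ndim - 2)
    p = np.pad(img, pad, mode="edge")
    # 4-neighbour discrete Laplacian: sum of neighbours minus 4x the centre.
    lap = (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) - 4.0 * img
    # Subtracting the Laplacian exaggerates local contrast at edges.
    return np.clip(img - strength * lap, 0.0, 1.0)
```

Flat regions are left untouched (their Laplacian is zero), which is why this mostly affects edges and fine texture.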
>>
File: step-compare.png (2.13 MB, 4096x1536)
same seed, prompt, euler beta.

steps: 8, 16, 26, 50
time: 14s, 25s, 40s, 76s
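
A rough way to read those numbers is to fit a straight line through them, assuming generation time is roughly a fixed overhead plus a constant cost per step. That linear model is an assumption, not something measured separately; the sketch only uses the four data points above.

```python
import numpy as np

steps = np.array([8, 16, 26, 50], dtype=float)
seconds = np.array([14, 25, 40, 76], dtype=float)

# Least-squares line: seconds ~= per_step * steps + overhead
per_step, overhead = np.polyfit(steps, seconds, 1)
print(f"{per_step:.2f} s/step, {overhead:.2f} s fixed overhead")
```

If the fit holds, most of the time is spent in sampling steps and only a small part is fixed cost such as text encoding and VAE decode.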
>>
Back to the 1girl mines I go!
>>
>>107337899
This thread is for GPU-poor weebs
>>
File: ComfyUI_23447_.png (2.81 MB, 2048x1280)
this is a fun model
>>
>>107337949
more steps doesnt do much ehh
>>
File: 1751642843798502.png (2.54 MB, 1280x1280)
Rip Flux 2 lmaooo
>>
File: z-i-t.jpg (533 KB, 2048x2048)
>>
File: file.png (816 KB, 2442x876)
>127% activity
>highest on 4chinz
heh
>>
File: 1740345217312105.png (3.41 MB, 1280x2048)
Reminder that from this point onwards all future newfags will be living in a world where zimage exists and will never know all the struggles oldGODS had to go through to get a good asian 1girl...
>>
can someone do a sampler comparison? i'm too lazy
>>
BRO LOOK IT CAN DO 1GIRL HOLY SHIT 1GIRL!!!! SHE'S STANDING THERE!! LOCAL IS SAVED!!!!

post something good and ill download the model
>>
File: 1742160179195759.jpg (841 KB, 2048x2048)
>>107337962
>more steps doesnt do much ehh
it does when you go for really high resolution, 2048x2048 will create 3 arms at 8 steps, but at 15 steps it works
>>
You need more than 1girl? What are you, a faggot?
>>
>>107337989
>post something good and ill download the model
give me a prompt nigga, what do you want to see?
>>
File: ZiMG_00034_.png (2.75 MB, 1920x1080)
>>107337989
nogens maldin
>>
Until they fix the consistency problem all models are only good for 1girls.
>>
File: ZImg_00029_.png (1.93 MB, 1440x1152)
how bout 7girl
>>
>>107338017
now arr rook same
>>
File: ComfyUI_23448_.png (3 MB, 2048x1280)
>>107337989
>>
File: 1758211583476974.png (73 KB, 2078x413)
AIEEEEEEE
>>
File: z-turbo_00037_.png (3.19 MB, 1920x1280)
>let me get uhhh...
>>
I'm more excited for the edit model tho, I hope it can run on my 16 vram.
can I img2img with this one yet?
>>
Can I get a QRD on the Bong Tangent scheduler? Trying out neoForge and I don't know this one, and it gives very interesting results for XL: anatomy and details are extremely good, much better than with other schedulers, but the image is much noisier and has a sort of low-contrast amateur look to it. It's actually perfect for emulating early 2010s smartphone photos.
>>
File: Comfy's prompt.png (2.28 MB, 1280x1280)
:^)
>>
>>107338030
doesn't matter for fapping
>>
File: 1744555440646133.png (2.45 MB, 1280x1280)
>>107338048
>I hope it can run on my 16 vram.
it's a 12gb model so you're good
>can I img2img with this one yet?
you can i2i with every single image model
>>
>>107338059
I thought they shot you in the streets
>>
>>107338027
truly TLP-like
>If you have a sexual fantasy, it is impossible to attain it. If you have a pornographic sexual fantasy, it is impossible to attain it-- except in porn. In real life you could sleep with a nurse who is a slut, a slut who is a nurse, a nurse pretending to be a slut, a slut pretending to be a nurse, but the only place to find a slutty nurse is porn. Porn, which exists, offers you the impossible, which can’t.
>>
>>107338045
>barkeep, your finest chinkslop. with extra 1girl please
>here you go sir
>but bartender, this is kino
>exactly
>>
File: ZImage_00049_.png (1.46 MB, 1024x1536)
>>107337989
>>
File: 56441235445.jpg (612 KB, 2394x1941)
Alright, so first thoughts? Chroma looks a lot less boring and slopped for amateur photography look. It also has more seed variety.

But z-image is still neat for a turbo model. It's twice as fast as HD Flash at least.
>>
File: ZiMG_00039_.png (2.47 MB, 1440x1152)
>>107338082

kekw
>>
>>107338094
z-image has the best color balance, and the perspective isn't as fucked
>>
>>107338009
[spoiler]2girls[/spoiler]
>>
what is the ideal sampler and steps
>>
>>
>>107338094
chroma looks sharper but we're comparing to an 8 step distilled untuned model, so we'll see in the long run.
either way a win for us localfags
>>
Official comfyui's workflow
https://github.com/comfyanonymous/ComfyUI_examples/tree/master/z_image
>>
>>107338037
>What are they doing at Blackforest
Censoring
>>
>>107338114
whichever looks best to you unironically
the fact that you ask this question tells me youre not cut out for kino though unfortunately
>>
>>107337951
>>107337992
What's up with this grain artifact?
>>
File: z-turbo_00044_.png (1.76 MB, 1152x1536)
>>
File: z-i-t.jpg (366 KB, 2048x2048)
>>107338048
with some system ram, probably? you can also already run the mighty qwen image edit right now.

>>107338017
wan, qwen and others can prompt 2-4 characters or so most of the time
>>
File: ComfyUI_07431_.png (1.46 MB, 912x1200)
I wonder what the BFL cucks are thinking now.
they must be extremely seething, drowning in seethe literally
>>
>>107338123
Why the FUCK should I trust Comfy or his ilk to prepare a proper workflow lmao
They all suck
>>
File: 1736884431271682.png (195 KB, 1580x888)
>>107338123
>https://github.com/comfyanonymous/ComfyUI_examples/tree/master/z_image
are you retarded comfy, there's no negative prompt at cfg 1 lool
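
The point about cfg 1 follows from the usual classifier-free guidance formula. A minimal sketch, assuming the sampler combines predictions the standard way; the arrays are random stand-ins for model outputs, and `cfg_combine` is an illustrative name, not a ComfyUI function.

```python
import numpy as np

def cfg_combine(cond: np.ndarray, uncond: np.ndarray, scale: float) -> np.ndarray:
    # Standard classifier-free guidance: start at the negative/unconditional
    # prediction and move toward the positive one by `scale`.
    return uncond + scale * (cond - uncond)

rng = np.random.default_rng(0)
cond = rng.normal(size=(4, 4))   # stand-in for the positive-prompt prediction
neg_a = rng.normal(size=(4, 4))  # stand-in for one negative prompt
neg_b = rng.normal(size=(4, 4))  # stand-in for a different negative prompt

# At scale 1 the negative term cancels, so both negatives give the same result.
same_at_1 = np.allclose(cfg_combine(cond, neg_a, 1.0), cfg_combine(cond, neg_b, 1.0))
# At any other scale the negative prompt changes the output.
differs_at_4 = not np.allclose(cfg_combine(cond, neg_a, 4.0), cfg_combine(cond, neg_b, 4.0))
```

With scale 1 the expression reduces to `cond`, so whatever is typed into the negative box never reaches the image.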
>>
>>107338037
If you read their essay about how much censoring they did beyond the dataset you'd know why. Lobotomies always hurt the model, and antagonistic prompt training is worse. Oh, and that many parameters are easy to undertrain, and they learned nothing from Flux 1, which was also severely undertrained with many layers literally doing nothing.
>>
File: 4545512845411.jpg (787 KB, 2394x2095)
>>
>>107338129
All z image output so far looks like it has terrible compression artifacting, but people are blind to it because it has the particular slop tuning they wanted.
>>
File: ZiMG_00042_.png (2.24 MB, 1152x1440)
>>
>>107338132
prompt?
>>
>>107338037
>What are they doing at Blackforest
Shibari nights
>>
>>107338154
you mean the particular anti slop tuning which before this was unheard of especially for chinaman
>>
>>107338030
That's true for all base models, it's how large scale training works, it generalises every concept and since celebrities are no longer trained by name, you can't prompt them for unique features.

Luckily there is lora training, and with this being such a small model training will be REALLY fast compared to Chroma / Flux / Qwen etc
>>
>>107338135
What I'm saying is until a model can produce essentially a perfect image based on a prompt and an input image that doesn't require gacha regenning hoping to get a good seed, it's only good for toy things.
>>
>>107337944
thx
>>
anyone train a z- image lora yet?
>>
>>107338123
>>107338138
>>107338140
Spoonfeed me what is the /ldg/ approved workflow? I just arrived.
>>
File: 993.jpg (22 KB, 316x82)
poggies?
>>
>>107338132
prompt 4 of those girls pointing and laughing at the 5th girl while she's on the ground crying after dropping her spaghetti.
>>
Are the Germans seething yet
>>
File: 1749470386285766.jpg (628 KB, 2048x1244)
>>107338178
just use comfy's workflow it works well (with more steps than 8 if you want, you'll get better results and less slopped results)
>>
>>107338072
>you can i2i with every single image model
How?
>>
File: ZImg_00038_.png (2.35 MB, 1440x1152)
total chink supremacy
again
>>
>>107338154
yeah it looks like built in camera sensor noise, but this is unironically what people want because they're so used to post-processed iphone pictures

nevertheless, it's a very fun model cause it's so fast, i'm grateful
>>
>>107338194
Basic image to image is just a base image with noise.
>>
>>107338194
look at a youtube tutorial?
>>
>>107338161
Anti-slop tuning = fine tune on random frames of cell phone video from 2010?
>>
File: ComfyUI_temp_mkcch_00003_.jpg (998 KB, 2048x2048)
Symmetrical grid of four anime characters separated by ornate gold borders, anime style. Hatsune Miku in bottom right corner. Chun-Li in top left corner. Raiden Shogun is in top right corner. Princess Peach in bottom left corner. Each girl is looking at viewer, striking a playful expression. There is a heart symbol in the middle of the grid, separating the girls.


pretty nice adherence for sparse instruction, didn't fuck up the position in 4 runs, needs a lot of booru finetune to work for anime styles and concepts though
>>
File: 1746282221256362.png (2.4 MB, 1536x1024)
nice
>>
>>107338094
Chroma is for goofy wonky fun that's just outside the edge of realism.
>>
File: 1745316822497208.png (2.16 MB, 1280x1280)
>>
>>107338154
There is no dataset of millions of perfectly crisp well annotated 4K images. Every model will have compression artifacts and other noise and seeing as you'll never make a good dataset, it will stay that way.
>>
>>107338210
unironically yes, most of the zoomers here haven't seen a full frame sensor DSLR photograph in their life and think it's all AI
>>
File: z-turbo_00047_.png (2.43 MB, 1152x1536)
>>107338159
https://files.catbox.moe/fzcd4z.png
>>
File: Z Image.png (3.08 MB, 2048x1280)
Can any other models gen 2048x1280 out of the box?
>>
>>107338210
You prefer 2.5d hyperslop that even the newest of fags can generate with WAI and three tags? kys
>>
File: ZIT.png (2.16 MB, 1280x1280)
IDEMOO
>>
>>107338129
The grain is there simply because that's what most real-world photos on the web have due to compression, and this model has been trained on lots of real-world data, unlike Flux / Qwen, which are primarily synthetic.

I'll take this 1000% over synthetic Flux / Qwen slop any day.
>>
File: 1751420437488637.jpg (548 KB, 2048x2048)
>>
flux2 status - forgotten
>>
>>107338203
That's not i2i. i2i should let you say "take this girl, wearing the same clothes and make her do something else"
>>
File: 445451254545.jpg (668 KB, 2394x2102)
>>107338122
Yep, Z-Image tunes should be good, but either way Chroma HD Flash is a distillation too.
>>
Once again you can safely disregard anyone complaining about le compreshun because they are unable to comprehend kino.
>>
>>107338240
Make the guy lie down and look at miku like that while the girls are standing and looking at him with disdain.
>>
>>107338244
the edit functionality is pretty good, shame it's tied to a lacklustre model
>>
>>107338248
that's not what ever was, stupid summer zoomer
>>
>>107337973
>Flux 2, we hardly knew ya
>>
>>107338244
i do wonder sometimes, did BFL forget their core audience? or do they actually have some customer base of API-only retards that pay for censored slopgens
>>
File: 1752450063720966.png (141 KB, 1972x833)
>>107338248
oh so you're asking for an edit model, we won't get that today
https://www.modelscope.cn/models/Tongyi-MAI/Z-Image-Turbo/summary?version=master
>>
>no negatives
erm..... rlly?
>>
>>107338239
this output is useless for any application that's not trying to fake being a smartphone photo
>>
will there be a non turbo version anons?
>>
>>107338179
>newfag doesnt know about --disable-api-nodes nor can look at the .bat which just calls that
>>107338248
>newfag #2 literally doesnt know what i2i is
>>
>>107338256
Qwen Image Edit is uncensored and smaller and thus faster. Flux 2.0 has no purpose besides running corporate image gen.
>>
>>107338263
Their main target is businesses
>>
File: ComfyUI_temp_mkcch_00005_.png (2.39 MB, 1024x1536)
>>
>>107338248
how new
>>
>>107338269
let's be real most people want to gen 1girl smartphone photos
>>
>>107337981
It's the journey, not the destination

I'm not coping, truly I'm not
>>
>>107338268
it's a turbo model yes, for negatives we'll have to wait for the base model
>>
>>107338253
did you also buy into lomography and ccd sensors?
>>
>>107338263
>>107338279
their core audience is redditors and advertisers
>>
File: ComfyUI_18799_.png (1.36 MB, 1024x1024)
>>
>>107338291
>redditors
redditors are making fun of flux 2, they hate it as much as we hate it >>107338037
>>
>generating passably real images is actually not desirable
what kind of mental illness is this
>>
>>107338299
>>107338291
lol nice
>>
>>107338291
Redditors are just porn addicts that won't admit to it because they don't objectify women openly.
>>
>>107338250
I like z image more. Less lewd tho
>>
File: ComfyUI_00117_.png (2.58 MB, 2048x2048)
the visual effect here is an artifact of using the wrong sampler, kind of cool though...

i wish anime eyes were less slopped in z-image, details in general can smudge but you can deal with that w/ upscale models or w/e, but it's a shame eyes are still fucked

still extremely good for a small model though
>>
>>107338094
Z-Image looks more aesthetically pleasing if somewhat slopped, but I want to fuck Chroma
>>
>>107338121
kek
>>
File: ZIT.png (2.18 MB, 1280x1280)
>>107338235
>>
File: ZiMG_00058_.png (2.08 MB, 1152x1536)
>>
File: zimage_c_00001_.jpg (1.84 MB, 4096x4096)
Here's what Z Image does if you try to generate at 4K*4K. 2K*2K works well enough.
>>
File: ComfyUI_276367_.png (1.2 MB, 1024x1024)
>>
>>107338228
thanks anon
>>
File: ComfyUI_00213_.png (1.6 MB, 1024x1024)
>>107338213
Tried your prompt and luckily it got Raiden on the first try. A booru finetune would be really awesome
>>
File: 45441245214.jpg (1.22 MB, 3290x2541)
This prompt is where Z-Image has performed best against Chroma compared to everything else I've tested.
>>
does zimage know who sadako is?
>>
I can't describe how happy to see BFL utterly raped. Never seen someone eat so much crow since Flux trumped SD3. Karma is a bitch lmao.
>>
File: ZIT.png (2.28 MB, 1280x1280)
>>107338332
lmaoo
>>
>>107338285
sad but true
just another form of slop
>>
File: ZImage_00060_.png (1.87 MB, 1024x1536)
Early 18th century Daguerreotype of President Donald posing for a photo dressed as an aristocrat, monochrome colors
>>
>>107338345
>implying the future is dystopian noodle niggardry
>>
>>107338367
>>107338372
it's insane how realistic this looks, damn...
>>
File: ComfyUI_18833_.png (1 MB, 1024x1024)
>>
>flux 2 is over 60 gigs
lel
>>
>>107338244
Truly DOA

Here is our new model, large and slow, super censored, plastic skin
>>
File: zimage_00001_.png (1.45 MB, 1024x1024)
bros what shift to use for this model?
OWO
>>
File: 3534533457.jpg (458 KB, 1328x751)
>Homer simpson in a photorealistic style, drinking a beer while looking at the sunset, with writing in the sky in cursive way 'Flux was never good'
Doesn't seem like it can convert a 2D character into 3D, unless my prompt is shit
>>
File: 1743424976059091.png (3.09 MB, 1790x1277)
>>107338389
not only it's that big but it's worse than this 6b model keeek
>>
>>107338072
>>107338135
Nice, I will give it a try then.
>>
File: 1739062103014901.jpg (453 KB, 2048x1902)
>euler
>simple
>>
>>107338406
Flux wrecks Z on the prompt adherence there. Z barely incorporated the bokeh more than the default, and ignored the thing about the shoulders.
>>
File: cumra.jpg (83 KB, 1012x624)
>>107338345
>>
File: ZIT.png (2.08 MB, 1280x1280)
>>107338367
it's fun when you can prompt for some celebrities desu, what anime IP can it do too?
>>
File: zimage_00002_.png (1.13 MB, 1024x1024)
>>107338398
just realized this is on LUMINA2 arch lmao, LUMINA BROS WE WON!!!!!!!
>>
File: 1741748568533913.png (803 B, 82x62)
You are happy zimage is 6b so you can run it on your 1060
I am happy zimage is 6b because I can gen it at 8 images in a batch of 2048x2048 at fp16

We are not the same. (but we both gen big booba 1girls)
>>
>>107338398
shift is unironically snake oil disable it
>>
https://files.catbox.moe/zxvsta.png
https://files.catbox.moe/fpggo7.png
https://files.catbox.moe/sw0m0g.png

the textures actually look satisfying, mirrors seem wonky
>>
>>107338423
what are we calling this weird skin disease?
Z-rash?
>>
>>107338173
I'm pretty sure models are commercially useless if you can't img2img with it, preferably passing multiple images at the same time.
>>
>>107338421
>Flux wrecks Z on the prompt adherence there.
Z-image turbo is a distilled model, we'll get better prompt adherence on the base model (and it'll also have a reasoning feature)
>>
File: flux2_bf16_c_00028_.jpg (940 KB, 1600x1600)
>>107338365
Disagree, I think it still has a lot of potential. It's too early to say what the limits are between the two of them.
>>
File: billie.jpg (968 KB, 1664x2432)
>>107338415
it just works
>>
attention: very original post

ahem

based chinks

fuck bfl, you safety cucks. go distill your piss
>>
anon that base model WILL be API there's just no way it won't be
we are giving them so much advertising
>>
>>107338435
is this the new "just a few more epochs!"?
>>
>>107338406
The flux umbrella seems super fucked to me
>>
File: 1741609138060270.png (6 KB, 287x81)
Reminder more kino to come
>>
File: 1755331659288800.png (3.41 MB, 1536x1536)
This is literally better than my chroma output with the same prompt
HOW?
>>
File: ComfyUI_temp_ahcuj_00009_.png (2.76 MB, 1408x1792)
z-model lora training when?
>>
File: 1732923292198130.png (581 KB, 676x507)
>slowly realizing im gonna have to train a bunch of nude loras like with flux again
>>
how do i prompt this thing good
>>
File: kek.png (2.3 MB, 1280x1280)
>>
>>107338441
clean af! HOW???
>>
File: 1744496421551798.png (198 KB, 1000x1000)
Will z-image work on 8gb vram and 16 RAM?
>>
Goonerbros..... we won!
>>
File: 56454521545.jpg (400 KB, 2394x1345)
>>107338319
Right, on that one it does look significantly more slopped though.
>>
>>107338443
add a fuck cumfart for garbage software
>>
File: file.png (2.2 MB, 1024x1536)
1girl squatting laughing at viewer bros... WERE SO BACK!!!!!!!!!!!
>>
File: zimage_c_00002_.jpg (1.46 MB, 3072x3072)
Z image 3K*3K also breaks down but not as badly as 4K. The way it fails is also a bit more interesting than usual.
>>
>>107338269
That only shows in higher-resolution images; you can gen at ~1024x1024 and upscale.

Most people running local want a realistic look in photos, not synthetic advertising material like what you get with Flux / Qwen, if that's what you want then you are already covered.
>>
>>107338483
How many steps?
>>
chroma is a melted failbake
>>
File: ComfyUI_temp_ahcuj_00014_.png (3.02 MB, 1216x2112)
THEY EVEN TRAINED IT WITH CELEB IMAGES AHJAHAHHA TOTAL CHINA DOMINATION, WEST IN THE MUD
>>
>>107338475
the one on the left has scary faces though lmao, it's weird to say they both have the same exact VAE desu
>>
>>107338433
Apparently it's called "kino" per the above, but I think your suggestion might be a bit more appropriate.
>>
Any way to get Z image to work in Forge?
>>
>>107338473
yes if you run the model on fp8 mode
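
Back-of-envelope version of why fp8 helps, using the "6b" parameter count quoted elsewhere in the thread. This only counts the diffusion model's weights; the text encoder, VAE and activations are extra and are not estimated here, so treat it as a lower bound rather than a guarantee that it fits.

```python
def weight_gib(params: float, bytes_per_param: float) -> float:
    """Size of the weights alone, in GiB."""
    return params * bytes_per_param / 2**30

PARAMS = 6e9  # "6b" as quoted in the thread

fp16_gib = weight_gib(PARAMS, 2)  # fp16/bf16: 2 bytes per weight
fp8_gib = weight_gib(PARAMS, 1)   # fp8: 1 byte per weight
print(f"fp16: {fp16_gib:.1f} GiB, fp8: {fp8_gib:.1f} GiB")
```

Halving the bytes per weight halves the footprint, which is what brings the weights under an 8 GB card; anything that does not fit would still have to be offloaded to system RAM.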
>>
>>107338345
I mean, the present is ComfyUI, so we live in the future
>>
File: ComfyUI_temp_ahcuj_00015_.png (3.1 MB, 1216x2112)
>>
>>107338503
wait for forge neo to implement it.
>>
>>107338488
40 on that one
>>
File: 1750418485301901.png (1.73 MB, 883x1320)
it knows what an emo girl is
>>
File: file.png (2.18 MB, 1024x1536)
>>
noooo everyone stop genning offensive shit

think of the safety

t. buttmad german
>>
anyone got a json workflow for z-image since comfy is a GIGANTIC PIECE OF SHIT that won't load pngs anymore?
>>
Yall are celebrating too fast. There's no indication they'll be releasing training code
>>
>>107338498
>I found 1 bad image so I'm gonna categorize all the remaining good images as bad too
debo please
>>
Lumina walked so Z could run.
>>
>>107338515
>doesnt know what an egirl or alt style is
OH N-
>>
File: ZiMGUps_00005_.jpg (1.16 MB, 4608x6144)
>>
Installed everything, workflow from comfyui examples, each image takes 10s on 4070ti super, what am I doing wrong?
>>
>>107338231
what do you say to get bigger boobs?
>>
File: zimage_c_00000_.jpg (738 KB, 2048x2048)
>>107338483
2K*2K with the more expected results
>>
I'm noticing some slight Asian same-face but no biggie
>>
File: file.png (2.44 MB, 1024x1536)
>>
>>107338523
>There's no indication they'll be releasing training code
who gives a fuck? ostris will figure this shit out, he's the god of training
>>
File: 1749467677412934.png (2.32 MB, 1536x1536)
No fucking way this is real.
>>
Anyone else notice specific words or maybe phrases being giga slopped?
>>
>>107338525
feel free to reference output here that doesn't look like it came from an mjpeg stream
>>
>>107338539
this is beautiful, I need the prompt anon
>>
>>107338434
Journalists would like to disagree.
>>
the speed of replies shows how stagnant local was previously. qwen image, chroma, neta, hidream, etc were all just bloated garbage. if they weren't, people would've posted gens. really is that simple
>>
>>107338536
halp
>>
>>107338538
Add ", huge incredibly breasts, cleavage," to your prompt, although some images gen where she is naked so play around with synonyms to get the size right while it still not genning a naked woman.
Also keep cleavage to force the model more towards her having a top.
>>
File: lewds.png (2.83 MB, 1080x1920)
>>
File: 1747532803216131.png (1.73 MB, 884x1320)
doesnt do a very good hitler
>>
i just got a huge error when using the zimage comfyui workflow, the clip is set to lumina but i dont know what to set it to.
>>
>>107338508
LET'S FUCKING GOOOOOOOOOOO
>>
File: 1762135874487833.png (3.21 MB, 1280x1280)
>>
>>107338571
>went to sleep for a couple hours
>wake up
>already past three whole threads
knew just from that alone that it dropped kek
>>
I can't BELIEVE what I'm seeing here! "Z model gives better coherency at 4k"?! "The anatomy is more consistent"?! BLASPHEMY! You know what gives you consistent anatomy? PRACTICE! EFFORT! Learning your NEGATIVE PROMPTS like a REAL artist!
>>
>>107338521
works for me but here
https://files.catbox.moe/8ez07o.json
>>
>>107338584
update comfy
>>
>>107338491
catbox?
>>
>>107338523
arr train same
>>
>>107338571
facts, if a model is good, people are gonna spam it, simple as that
>>
>>107338433
It's how black people look when they have bruises, stop being racist
>>
>>107338573
Help with what? Each image IS supposed to take around 10s. I have the exact same card and I'm getting 9s, kek
>>
File: ComfyUI_temp_ahcuj_00024_.png (3.16 MB, 1216x2112)
threads going so fast
>>
File: 1746594702479596.png (1.79 MB, 884x1320)
>>107338580
not stalin either
>>
File: ZImg_00075_.png (2.1 MB, 1440x1152)
>>
>>107338563
Journalists are too busy learning to flip burgers to worry about based Chinese models now.
>>
>>107338620
>does hitler better than stalin
uh oh that soifag in the thread is gonna have another melti
>>
File: ComfyUI_temp_ahcuj_00025_.png (3.32 MB, 1216x2112)
>>
File: 1735492438525725.png (2.79 MB, 1280x1280)
damn even the background people look good, what kind of magic is this
>>
>>107338617
It's too fast! Something must be wrong.
>>
>update comfy
>option to not change the seed after genning is gone
>>
File: 4564564125454.jpg (393 KB, 2394x1720)
https://files.catbox.moe/svdbu3.jpg
>>
>>107338475
More variation and less slopped in Chroma I must say, but also a bit wonky at places.
>>
>>107338648
comfyui is the future so you will own nothing
>>
>>107338434
Any diffusion model is capable of img2img, it's just swapping out the blank canvas txt2img uses for an existing image. Multi-image input just stitches all the images together, either in pixel space or latent space. Edit models where you can give explicit instructions in the prompt and reference objects in the input image are useful but there are obviously commercial uses for normal models too.
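
A toy sketch of that idea, on plain arrays instead of real latents. It assumes a rectified-flow style noising rule (a straight blend between image and noise); other model families use different noise schedules, and real UIs pick the starting point from their sigma schedule rather than like this. All function names are illustrative.

```python
import numpy as np

def img2img_start(latent: np.ndarray, noise: np.ndarray, denoise: float) -> np.ndarray:
    """Blend an encoded image with noise; denoise=1.0 is plain txt2img."""
    # Assumed rectified-flow interpolation: t=0 keeps the image, t=1 is pure noise.
    t = float(np.clip(denoise, 0.0, 1.0))
    return (1.0 - t) * latent + t * noise

def steps_to_run(total_steps: int, denoise: float) -> int:
    # Only the tail of the schedule is sampled, starting from the noised latent.
    return max(1, round(total_steps * denoise))

def stitch_side_by_side(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    # Multi-image input as described above: concatenate along the width axis.
    return np.concatenate([a, b], axis=-1)
```

Lower `denoise` keeps more of the source image and runs fewer steps; at 1.0 the source image is fully replaced by noise, which is just txt2img again.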
>>
>>107338438
>potential
Dead on arrival like Hunyuan 3.0
No one wants a censored obese model.
>>
File: file.png (2.78 MB, 1024x1536)
impressive really, unironically local SAVED
>>
>>107338336
Prompt?
>>
my only disappointment with this model is that it has to be used in cumfart. we need new software that doesn't waste your fucking time
>>
>>107338562
https://files.catbox.moe/pr4n6f.png

>prompt = '''high-contrast manga-style illustration of a female space marine wearing a form-fitting robotic exoskeleton. she has a black bob haircut and narrow blue glasses studded with blinking LEDs. her armored suit is gray and blue. she is leaning back against a railing, in front of a large window looking out into space from low Earth orbit. Half of the Earth's face occupies the view out the window. Mechanical contrivances are visible dangling around the periphery of the frame. The illustration is highly detailed, and the character design is in the style of Ilya Kuvshinov and Tsutomu Nihei.'''
>>
File: 1750146072364727.png (2.29 MB, 1280x1280)
Richad Ross Ist
>>
>>107338601
aighty
>>
File: 1733910037238409.png (1.8 MB, 884x1320)
>>107338620
does a good trump though
>>
>>107338673
Diffusers is there for you
>>
>>107338673
preach brother!
>>
File: zwiggle.png (3.29 MB, 1080x1920)
>>
>>107337882
>big nigga two
A rare gem in this collage of coal
>>
File: 554451254454.jpg (305 KB, 2266x1568)
Also best result compared to Chroma, beats Nano Banana Pro, Qwen, Seedream 4.
>>
File: 1742376740924740.png (1.59 MB, 832x1248)
i want him to piss on her tho
>>
>>107338698
>beats Nano Banana Pro
no it doesn't
>>
>>107338523
>There's no indication they'll be releasing training code
Since when do you need training code from model makers, practically all trainers do their own implementations from scratch

Kohya, OneTrainer, Diffusion-Pipe, AI-Toolkit, one or several of these will have working training days after Z-Image Base drops, I doubt anyone will try to support training Z-Image Turbo though.
>>
File: file.png (2.73 MB, 1024x1536)
>>107338673
>>
>>107338519
In Germany, it's considered a felony to download a Brittany Venti LoRA.
>>
>>107338575
*incredibly huge
>>
File: ComfyUI_276375_.png (2.52 MB, 1536x1536)
>>107338523
ComfyUI natively supports training loras.
>>
>>107338523
That's literally the point of diffusers lmao
Diffusers models all use the same basic pipeline so PEFT Loras should just work.
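
For context on why a LoRA does not need model-specific training code in principle: it is just a low-rank update added to an existing linear weight. A minimal NumPy sketch of the math, not the PEFT or diffusers API; shapes, names and the `alpha` value are illustrative assumptions.

```python
import numpy as np

def lora_merge(W: np.ndarray, A: np.ndarray, B: np.ndarray, alpha: float) -> np.ndarray:
    """Return W + (alpha / r) * B @ A, the usual LoRA update."""
    r = A.shape[0]  # rank of the adapter
    return W + (alpha / r) * (B @ A)

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 6, 2
W = rng.normal(size=(d_out, d_in))  # frozen base weight
A = rng.normal(size=(r, d_in))      # trained down-projection
B = np.zeros((d_out, r))            # up-projection, zero at initialisation

merged = lora_merge(W, A, B, alpha=2.0)
trainable = A.size + B.size         # parameters the adapter actually trains
```

Because `B` starts at zero, the merged weight equals the base weight before training, and only `A` and `B` need gradients. Whether a given trainer wires this into every layer of a new architecture correctly is a separate question.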
>>
File: 1743144108887648.png (2.22 MB, 1280x1280)
>>
I remember how it felt when Flux dropped and fingers were "solved". Z is really good at it.
>>
>>107338712
>t. brownoid

you know I'm right. comfyui sucks fucking ass and is controlled by grifter chink. comfy is a saas dick sniffer that puts flux2 over z
>>
>>107338755
i dont have 92gb of vram so the choice is simple
>>
>>107338741
>>107338422
>>
>>107338760
ok trani, go shill your shitty ui elsewhere now lmaooo
>>
>z-image doesn't understand gyaru
seedream remains winning
>>
File: ComfyUI_18898_.png (1.4 MB, 1024x1024)
Kinda liking Z-Image, it can do more than just realism.
>>
>>107338755
I'm morally opposed to a model that goes the extra mile for safety.
>>
>>107338756
I think the anatomy is better than flux, while having better nudity, skin texture, hands and feet, and that model is 2x smaller, it's incredible really
>>
>>107338741
nobody uses cumfart to train because it's a fucking stupid interface for it
>>
File: 1746807207066677.jpg (1.39 MB, 2048x2048)
This model is almost as good as chroma despite running like 10 times faster.
>>
>>107338666
nice 6s
ironically hunyuan 3 isn't censored at all
>>
Show me a mesugaki
>>
>>107338776
>2x
Four times smaller than Flux 2. Flux 2 is double Flux 1. What those extra parameters are doing is a mystery, they must have embedded a safety LLM with an FBI backdoor.
>>
>>107338781
Nice. Since it's almost Thanksgiving and we have a cool new model could I please get a workflow?
>>
>>107338765
>trani
nta schizo nigger
>>
File: ZImage_00095_.png (1.94 MB, 1024x1536)
>>
>>107338789
it doesnt understand loli :(
>>
>>107338791
>they must have embedded a safety LLM with an FBI backdoor.
lmao
>>
so what workflow does zimage use?
>>
File: shill.png (950 KB, 1688x1132)
>comfy cant announce z-image on his blog because that would mean bumping off the paid ad slots
>>
File: 4545123245.jpg (397 KB, 2394x1421)
https://files.catbox.moe/9qidcp.jpg
>>
>>107338807
oy vey
>>
>>107338618
what sampler/scheduler and steps are you using? mine looks nothing like that.
>>
File: 1763434542300873.png (2.02 MB, 1280x1280)
>so what workflow does zimage use?
>>107338123
>>
>>107338781
>>107338792
*catbox
Because I'm curious about both your prompt and settings.
>>
File: file.png (707 KB, 1512x874)
>>107338804
this
>>
>12gb
How is this good for me, a vramlet..?
>>
>>107338804
>workflow
tired of this word being used for graphs. workflows make finished works, not images for funsies
>>
>>107338807
comfy's actual blog is the one on github.io
that one is just for the investards
>>
>>107338795
It understands ages but like many other models it seems to often abruptly go from too young to too old.
>>
>>107338807
kek, it's all right, he implemented it, that's all that matters
>>
>>107338668
And people haven't even started training loras yet, which is usually when a model really takes off
>>
>>107338825
go "deliver" for your "clients" or whatever, no one here cares
>>
File: 1750949666046437.jpg (447 KB, 2048x2048)
>>107338255
>>
>>107338795
it can gen [CENSORED] realistic and naked
>>
File: z.png (614 KB, 1280x2048)
>>107337981
And z-image too.. quite literally.
>>
>>107338842
Nice.
>>
>>107338828
when was the last update on that? two years ago? they make randoms in comfyorg write the blogs and they just boil down to dicksucking
>>
File: ComfyUI_temp_ahcuj_00038_.png (3.11 MB, 1088x1856)
>>107338807
AHAHAHAHAHHAA
>>
File: 1742315320231227.png (118 KB, 1286x942)
>>107338823
how many GB of VRAM do you have? if you don't have enough you can offload to system RAM
https://github.com/pollockjj/ComfyUI-MultiGPU
>>
>*beep beep beep* anime... *beep beep* finetune *beep* when... *beeeeeeeep*
>>
File: 1736614419691704.jpg (88 KB, 1280x853)
it's so over for api images
>>
File: _0010.jpg (2.56 MB, 1792x2304)
it smudges things at the bottom of the image at higher resolutions
>>
>>107338854
12 lol
>>
why does it take 10 seconds on rtx 4070ti super, am i missing something? why is that so slow
>>
Fresh when ready

>>107338862
>>107338862
>>107338862

Fresh when ready
>>
>>107338850
today
https://comfyanonymous.github.io/ComfyUI_examples/z_image/
>>
>>107338866
you should be fine, offload a bit, like 3gb, and it's still fast
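rough numbers behind that, as a sketch only. it assumes the 6B parameter count mentioned elsewhere in the thread, bf16 weights at 16 bits each, and GGUF Q8_0 at 8.5 bits per weight. the function names and the 2 GiB `reserve_gib` headroom for activations are made up for illustration, real usage depends on resolution, the text encoder and the VAE.

```python
GIB = 1024 ** 3  # bytes per GiB


def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Size of the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / GIB


def offload_gib(n_params: float, bits_per_weight: float,
                vram_gib: float, reserve_gib: float) -> float:
    """How much of the weights would not fit in VRAM.

    reserve_gib is an assumed headroom for activations and other
    buffers; it is a guess, not a measured value.
    """
    needed = weights_gib(n_params, bits_per_weight) + reserve_gib
    return max(0.0, needed - vram_gib)


if __name__ == "__main__":
    n_params = 6e9  # assumption: the "6B" figure from the thread
    for name, bits in [("bf16", 16), ("gguf q8_0", 8.5)]:
        size = weights_gib(n_params, bits)
        spill = offload_gib(n_params, bits, vram_gib=12, reserve_gib=2.0)
        print(f"{name}: weights {size:.2f} GiB, offload about {spill:.2f} GiB")
```

under those assumptions the bf16 weights don't quite fit on a 12 GB card with headroom, while a q8 quant fits without offloading anything.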
>>
File: 1740599645574823.jpg (895 KB, 2048x2048)
>>107338792
https://comfyanonymous.github.io/ComfyUI_examples/z_image/
>>
File: ComfyUI_18940_.png (1.12 MB, 1024x1024)
>>
>>107338795
Try gothic lolita
>>
>>107338807
what is flux 2 good at other than using 35 fucking gigs of VRAM
>>
File: 45456456451.jpg (477 KB, 2394x1794)
>>107338810
Alright, that wraps up testing for now. Overall I prefer Chroma due to seed variety and detail, also flexibility when prompting it. The other side is usually too slopped or the colors are not my type. However I am optimistic that just plain LoRAs can correct some of these issues. Let's see how LoRAs do.
>>
>>107338600
thank you.. i updated the other day before doing flux and it broke the png loading shit.. fucking hate comfy
>>
>>107338906
You can finetune 6B on a 4090, no need to cope with Loras.
>>
File: z_image__00001_.png (1.16 MB, 1024x1024)
>>107338600
output from your prompt
>>
>>107338864
looks like a bit of Erin Burnett or Siobhan Fallon Hogan was mixed into tswift

>>107338906
Chroma's obviously looks more like an amateur photograph. The contortion pose is more 'advanced' (as per the prompt), the towel is actually draped over her for modesty in the Chroma gen whereas it does nothing for that in the other, neither looks much like a barn but Chroma is closer, and the "duck tape" (funny to use the original name, rather than the far more common but arguably incorrect 'duct tape') is actually correct on Chroma whereas z-image has the wrong kind of tape. Obvious win for Chroma.
>>
>>107338886
storming of the winter palace?
>>
fresh btw
>>107338862
>>107338862
>>107338862
>>107338862
>>107338862
>>
File: zimg_0027.png (1.37 MB, 1152x1152)
sampler/scheduler grid:
17 steps, because i fat fingered it
> https://files.catbox.moe/jaskta.png

the experiments must continue
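
for anyone who wants to set up the same kind of grid, a minimal sketch of enumerating the combinations. the sampler and scheduler names below are just example names, not necessarily the ones in the linked image, and `build_grid` is a hypothetical helper, not part of any UI.

```python
from itertools import product


def build_grid(samplers, schedulers, steps):
    """One settings dict per (sampler, scheduler) pair, row-major order."""
    return [
        {"sampler": sampler, "scheduler": scheduler, "steps": steps}
        for sampler, scheduler in product(samplers, schedulers)
    ]


if __name__ == "__main__":
    # assumed example names; swap in whatever your UI actually lists
    samplers = ["euler", "dpmpp_2m", "res_multistep"]
    schedulers = ["simple", "beta"]
    for cell in build_grid(samplers, schedulers, steps=17):
        print(cell)
```

keep the seed and prompt fixed across cells, otherwise the grid compares noise instead of settings.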
>>
File: you.jpg (625 KB, 1280x720)
>>107337989
>>
File: 1750641459671251.png (1.19 MB, 1536x1024)
>>
File: Z6d5.gif (3.71 MB, 430x242)
>we actually got a good model before 2026
there is a god
>>
>>107338407
i'd recommend the gguf q8 version with the distorch2 loader to offload some to system RAM. it's nearly perfect (vs the full version)
>>
>>107338094
what if you hiresfix the chroma pic with zimage?
>>
>>107339719
explain how to do this please?
>>
>>107339805
running the chroma output through an upscaler and then running it through a zimage pass at lower denoise
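a sketch of what "lower denoise" means for the second pass, using the convention common to img2img pipelines where only the last `steps * denoise` steps are run on the upscaled image. `hires_pass_plan` is a hypothetical helper and the exact step accounting differs between UIs, so treat the numbers as illustrative.

```python
def hires_pass_plan(width, height, scale, steps, denoise):
    """Plan a second pass: upscaled size plus how many steps actually run.

    Assumes the img2img convention where denoise (0..1] selects the
    final fraction of the schedule; the earlier steps are skipped so
    the first model's composition is kept.
    """
    if not 0.0 < denoise <= 1.0:
        raise ValueError("denoise must be in (0, 1]")
    run_steps = min(int(steps * denoise), steps)
    return {
        "size": (int(width * scale), int(height * scale)),
        "run_steps": run_steps,
        "skip_steps": steps - run_steps,
    }


if __name__ == "__main__":
    # e.g. a 1024x1024 first-pass output, upscaled 1.5x, refined lightly
    print(hires_pass_plan(1024, 1024, scale=1.5, steps=20, denoise=0.4))
```

lower denoise keeps more of the original image, higher denoise lets the second model repaint more of it.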
>>
>>107340034
ty
>>
File: 1743683371276673.png (244 KB, 379x371)
>>107338094
holy
>>
>>107337934
>pepe has four fingers topkek
not realistic
looks fake


