/g/ - Technology




It Just Works Edition

Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107337882

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe
https://github.com/ostris/ai-toolkit

>Z
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
https://huggingface.co/Comfy-Org/z_image_turbo

>WanX
https://rentry.org/wan22ldgguide
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
>>107338862
First for fuck ai general fags

Nigger ass captcha
>>
>>107338882
Cope
>>
why does z take 10 seconds on rtx 4070ti super, am i missing something? why is that so slow
>>
Damn, we movin FAST. Cant wait till ltx2 drops.
>>
fuck comfyorg and bfl!
praise Alibaba and xi!
>>
>>107338882
>Nigger ass captcha
i dont have to solve the captcha because of ai hehehe
>>
File: 1747581425820573.png (2.44 MB, 1280x1280)
Wait, another thread? it goes so fast I can't keep up lmao
>>107338791
>they must have embedded a safety LLM with an FBI backdoor.
oy vey!
>>
z image has a poor understanding of -kei leading it to feel underwhelming when prompting candid lewdity
>>
File: file.png (1.67 MB, 1152x1152)
>>107338882
>>
File: coil.mp4 (1.02 MB, 1376x928)
>>107338899
checked
chinese UI when?
>>
Safety bros... not like this... our response?
>>
>>107338916
who are they presenting to?
>>
>>107338923
less than a dozen people. a16z should really get rid of the executives at this point
>>
im just happy I can run Flux 2 with 10 VRAM
we're gonna make it bros
>>
@BFL I'd like to congratulate you on the release of Flux 2. It's an amazing model that I'm sure will have a great future. However I am running into an issue. I am trying to prompt for realistic photography but the images seem to be covered in some kind of plastic blur filter. It can't be an issue with the model because I tried the same prompts on Z-image which is 20x faster and smaller and they look fine. Could you guys please tell me what I am doing wrong when it comes to prompting Flux 2 dev? Thank you
>>
File: 1733610279671239.png (3.83 MB, 1416x2128)
>surreal painting of an abstract dream sequence landscape
>>
>>107338894
You have to enable Tensor Cores in BIOS settings -- for some people they are not enabled automatically.
>>
File: 1760894057155068.png (2.48 MB, 1280x1280)
>>
>>107338939
Yes, you forgot to open your image and resave it as a jpg a couple dozen times.
>>
>>107338909
Thank you for posting your portrait but I can already guess you look like that.
>>
File: 1735699197384413.jpg (1.49 MB, 2048x2048)
>>
File: 4545645454.jpg (443 KB, 3704x1495)
>>107338705
Yes it does.
>>
>>107338938
Let us know when your first image finishes in about an hour.
>>
File: 1757576745495859.png (1.55 MB, 1024x1024)
let me gooo, let me gooooo
>>
>>107338956
holy soul nuke
what was your prompt?
>>
What's the verdict? Is anime saved?
>>
File: 1735110295123426.png (3.67 MB, 1920x1088)
>a screenshot from the retro pc game Diablo 2: Lord of destruction.
even the video games it generates look asian
>>
imagine how many newfags will be brought to local now and with no conceptualization of what it was like in the past
>>
>>107338973
Like chroma, it responds well to boomer prompting

>[Style]
Tim Burton-inspired stop motion animation style with exaggerated proportions and gothic sensibilities. Characters have impossibly thin limbs with oversized heads and wide, expressive eyes. The color palette is deliberately muted with splashes of saturated color for emphasis. Textures appear handcrafted with visible stitching and material imperfections. Lighting creates dramatic shadows with strong directional contrast. Environments feature impossible architecture with curling, spindly structures and crooked angles. The overall aesthetic combines macabre themes with whimsical charm.

>[Action]
3d render of two girls, Hatsune Miku and Rem from Re:Zero, Hatsune Miku is dramatically scolding Rem, Rem stands defensively. Miku points assertively as pixelated magic effects flicker around her. Rem looks flustered but defiant. A faux dialogue box reads: "You have every right to be here, but I'm still number one!"
>>
>>107338981
>anime finetune when
oh my sweet summer child
>>
>>107338939
hi, BFL here

you are using the dev version which is just a little inside joke

just use the pro version using our api :)
>>
File: 1758009103008386.png (368 KB, 480x427)
https://files.catbox.moe/tqdv5c.png
>>
File: ComfyUI_18955_.png (1.28 MB, 1024x1024)
>>
>>107338997
me
>>
>>107338967
160 seconds avg
full disclosure: I have 64 GB system RAM
>>
File: ZImage_00108_.png (2.71 MB, 1024x1536)
>>
>>107338997
keek
>>
>>107338997
now gen this with him surrounded by police officers
>>
cloudcucks real quiet now
>>
>>107338996
damn, this is good
>>
>>107338996
Now post this but with the metadata
>>
>>107339019
>model that has almost 0 IP knowledge
damn theyre fuckin owned
>>
>>107338882
>t.
>>
>>107338916
Well, I mean this is what they are selling, safe model output for corporations to make their own adverts and concepts thus having no need for advertising companies which saves money.

For people running local it's of practically zero interest though, meanwhile BFL are competing with big tech SAAS models, their best hope is to be bought by said big tech, else they will just slowly bleed out.
>>
>>107339033
with the edit model you won't need IP knowledge
>>
>>107338987
thanks bud. that's awesome.
>>
File: file.png (1.19 MB, 1024x1024)
>nursing handjob
I'll come back in a few months when the finetunes are done.
>>
>>107339019
we're busy having fun safely with flux 2 available now through comfyAPI
>>
File: deDL_zi_00010_.png (1.17 MB, 1344x768)
has anyone done sampler/scheduler experiments with zimg yet
>>
File: ComfyUI_00046_.png (1.55 MB, 1280x960)
Pretty good. Any tips to make it look more vintage?
>>
>>107339046
still would
>>
>>107339041
Why even release cucked weights then?
>>
File: 1760675324510399.png (3.4 MB, 1920x1088)
amazing. this model fucking rocks
>>
File: zimage_c_00008_.jpg (1.49 MB, 3072x3072)
Traditional hires fix method works on Z image, but at anything above 4MP I'm getting this increasingly strong painterly texture. Also, it takes longer than Hunyuan 3.0.
>>
>>107339051
where's ur nerd glasses and white skin nigbo?
>>
File: Z-Image turbo.png (3.21 MB, 1280x1280)
>>107338981
>Is anime saved?
What do you think?
>>
>>107338972
thought that was lady gaga as elsa from the thumbnail
>>
>>107339029
Just the official workflow with basic dramatic cinematic modeling photoshoot prompt and
>>107338575
>Add ", incredibly huge breasts, cleavage," to your prompt, although some images gen where she is naked so play around with synonyms to get the size right while it still not genning a naked woman.
>Also keep cleavage to force the model more towards her having a top.
>>
File: ComfyUI_temp_xtcuo_00055_.png (3.05 MB, 1080x1920)
>>107338987
>>
>>107339065
the shitposting kino definitely has taken another level kek
>>
>>107338996
>unable to find workflow
DAMN YOU LEONIDAS
>>
>>107339051
grovel a little more and maybe anon will take pity on you if he hasnt already
>>
File: grrr.png (54 KB, 996x510)
I pulled and now I get this shit.
>>
>>107339064
To generate hype

Alibaba just made them look silly
>>
>>107339102
it switched you to windows? fuck. comfy really fucking sucks.
>>
File: ComfyUI_128068_.png (1.99 MB, 1024x1024)
"lol someone carved breasts into a mountain"
>>
File: 1745081761450557.jpg (1.07 MB, 2048x2048)
>>
yo this model is fuckin tits.. takes like .2 seconds to make some decent images.. unlike flux which took like 90 seconds for 'okay' images
>>
>>107339051
I've been using the default in the comfy workflow
>>
>>107339102
update drivers, try again
if doesnt work reinstall
https://github.com/woct0rdho/triton-windows
and sage
if doesnt work, reinstall torch
>>
are there actually, unironically, people who gen on windows?
>>
No negatives is really killing me
>>
>>107339126
rentry.org/debo
>>
File: 1745154706197389.png (2.3 MB, 1280x1280)
>It knows John Lennon
noice
>>
>>107339129
Dualbooting is a hassle and i need adobe and anticheat games
>>
>>107339129
99% of people, yes.
>>
File: ComfyUI_128079_.png (1.9 MB, 1024x1024)
>>107339114
>>
>>107339144
>people
*normal fags
>>
>>107339133
I wished we had a NAG Z-Image
>>
File: zimg_0033.png (1.55 MB, 1152x1152)
sampler/scheduler grid:
17 steps, because i fat fingered it
> https://files.catbox.moe/jaskta.png
>>
File: file.png (1.98 MB, 1152x1152)
1st try, unironically also better than qwen for this kind of shit
>>
>new model comes out
>everyone hates comfyui a little bit more
>>
File: ComfyUI_00057_.png (2.47 MB, 1536x1536)
>they're sending agents of faggotry to seethe in the general now that we have a new chinese SOTA
you love to see it
>>
>>107339157
thank you gridmaster
>>
>>107339159
>>everyone hates comfyui a little bit more
from my perspective he implemented this cool new model almost immediately so i dont see the issue
>>
File: ComfyUI_00052_.png (2 MB, 1280x960)
Thank you Mr. Xi.
>>
>>107339158
>I'm fat and shit
ahahahah
>>
>>107339157
meant for >>107339051
>>
File: ComfyUI_18987_.png (1018 KB, 1024x1024)
>>
>>107339157
Thanks based anon
>>
File: ZImage_00123_.png (1.97 MB, 1920x1080)
>>
>>107339190
>>107339157
>>107339137
>>
>>107339172
>he implemented this cool new model almost immediately so i dont see the issue
the researchers handed them the code instead of having a dev program for any open source repo to do the same. it's fucking selfish and enshittifies the community. fuck comfyorg
>>
>>107339192
get out of my house
>>
>>107339172
it's just anti-comfy schizo sperging out as usual. pay them no attention.
>>
File: 1763428870671062.jpg (992 KB, 2048x2048)
>>
>>107339197
kys trani
>>
Can't wait to try training on Z-Image, please release base soon!
>>
>>107339197
>the researchers handed them the code
i bet that feels so good. id be so pissed if i maintained a ui that didnt get the same treatment. id probably post about how much i hate it here
>>
>>107339213
>please release base soon!
do we know when it'll be released?
>>
>>107339144
99% lmao.. ya right
>>
>>107339212
nta you fucking retard and if you support this shit getting locked to comfy before release then fuck you
>>
>>107339194
This whole NVidia Downton Abbey sponsorship kind of breaks immersion...
>>
>>107339052
Daguerreotype, monochrome, scratches, blur, etc.
I'm noticing it tends to output always the same faces, though, this model might be a 1-trick pony.
>>
>>107339213
In what world would they release base for phree and not paywall it
>>
>>107339228
if youre not using comfy or diffusers it means ur fucking retarded or a literal brainlet
>>
>>107339148
>>
File: 1738215698424927.png (249 KB, 2040x893)
>>107339226
>>please release base soon!
>do we know when it'll be released?
let them cook
>>
>>107339223
cool, when's it going to be in neoforge so I don't have to do this fucking noodle shit that breaks all the time?
>>
File: ComfyUI_temp_ahcuj_00043_.png (2.33 MB, 1088x1856)
>>
z image is gonna train so fast ill try a lora on it while waiting for base
>>
>>107339232
>1-trick pony
That's fine if the one trick is passably real images desu. Why the fuck else do you gen?
>>
>>107338966
>unemployment
>entry level
>CEO

>>107338997
kekd
>>
>>107339223
kek
>>
>>107339157
euler_ancestral seems to be gud through most of them
>>
local won
>>
File: 1480061930691.jpg (51 KB, 400x323)
Does the Z Image use any model sampling nodes or do i just slap loader and prompt and ksampler and call it?
>>
>>107339158
is there a recommended comfy workflow yet
>>
>cumfart shills slurping poop
>>
>>107338987
Which prompt enhancer do you use?
>>
>>107339234
In this world where they've already stated that they will release base and edit it as open weights
>>
china just keeps on dropping Ws all over the place
>>
>>107339289
we dont eat the shit here saar
>>
>>107339232
You need to boomer prompt it
>>
>>107339284
check literally any of the threads since release
>>
>>107339223
lol'd
>>
the jaypeg noise is kind of an issue, its only real issue if anything
i wonder what it stems from.
>>
>>107339310
why are you samefagging something that everyone agrees is a shitty thing for the community?
>>
Does it know any tags?
>>
>>107339307
k there is a page now

https://comfyanonymous.github.io/ComfyUI_examples/z_image/
>>
File: 1735218930518077.png (1.37 MB, 1536x1024)
>>107339293
chatgpt lol
>>
File: ComfyUI_temp_ahcuj_00049_.png (3.24 MB, 1216x1600)
>>
I felt a great disturbance in the cloud, as if millions of GPUs suddenly cried out in terror and were suddenly silenced.

I think that was pony v8 training being paused.
>>
>>107339332
grim
>>
>>107339327
haha lole
>>
>>107339325
that's just sovl, dont worry bout it
>>
File: zimg_0034.png (1.57 MB, 1152x1152)
redid the grid (this is so fast): https://files.catbox.moe/m72bgi.png

>9 steps, cfg 1
>prompt: an analog photo of an asian woman, busty, pale skin, emo makeup, standing in a city street during sunset, long wavy hair, a blue and yellow striped cable-knit sweater, blue jeans, canvas shows, a bustling city street with buildings, shops, and people walking in the distance, warm soft lighting, film grain
>>
we are so back, we have never been this back before.
>>
>>107339327
lmao
>>
>>107339294
I hope they deliver, then. Local has been burned too many times with false promises.
>>
>>107339325
>i wonder what it stems from.
Being trained on real world images, many of which have jpeg noise. It's not a mystery.
>>
WHY IS COMFYUI 2FPS ON FIREFOX????? FUCK YOU FRONTEND """DEVS"""
>>
Ok there's a paper showing how they made that magic happen
https://github.com/Tongyi-MAI/Z-Image/blob/main/Z_Image_Report.pdf
https://tongyi-mai.github.io/Z-Image-homepage/
>>
>>107339360
skill issue
>>
File: 1750773683147559.png (3.68 MB, 1760x1320)
hello baby girl
>>
>it just werks first try
Ancient sinomagic
>>
>>107339360
>just keep scaling the UI with shit nobody asked for
>wtf? why is it so slow?!?!
I fucking hate webdevs
>>
File: you.jpg (625 KB, 1280x720)
>>>107337989
Repost in this one because late to the joke in last.
>>
>>107339363
qrd?
>>
>>107339366
kek
>>
can you use image inputs or do img2img?
>>
>>107338909
isn't it mostly jeets who worship AI tho
>>
File: ComfyUI_temp_ahcuj_00054_.png (3.09 MB, 1216x1600)
>>
No but seriously, how is the model so fast? I mean I get it's a distilled model, but still
>>
Can someone explain to me why threads are moving lightning fast? What did I miss while I was asleep?
>>
File: 1764048438030023.png (38 KB, 1164x269)
>>107339363

uhhhhhhhh, z image base bros?
>>
>>107339395
>No but seriously, how is the model so fast? I mean I get it's a distilled model, but still
write us a qrd once you've read everything lol >>107339363
>>
File: f.jpg (139 KB, 832x1488)
>>107339388
not on the new z-image-turbo. of course we can do it with other local models

it is unclear what you were asking
>>
File: 1739264942549788.jpg (731 KB, 2048x2048)
>>
>>107339402
China saving local

again
>>
>>107339403
they said base is bad and still cooking so hopefully they release it when its done, they do have a huggingface to be released link on the github for it.
>>
>modern card
>plenty of RAM
>but OS is installed on HDD
>VAE decoding large image
>system slows to a halt for minutes at a time
There's no way using an HDD is this fucking bad. Are you joking? I'm going to rip my fucking nuts off what the fuck fuck bullshit nigger faggot. It's literally been decoding for 10 minutes and I can't fucking use my system at all
>>
File: z-i-t.pg.jpg (709 KB, 2048x2048)
>>107339402
z-image-turbo was released and is good as one of the smaller models most people can quite easily run

there are also other new models released but the above is the main reason
>>
>>107339419
how can base be bad? I thought the turbo was distilled from the base one?
>>
File: ComfyUI_19089_.png (1.71 MB, 1024x1024)
Genuinely loving Z-image, great quality and easy to test with it being so fast
>>
>>107339423
>not having exclusively NVME pcie4x drives in 2025
LMAO
>>
File: zimg_0037.png (1.68 MB, 1152x1152)
>>107339332
curious, he's using the flux vae. i'm going to do one more grid
>>
so what is the magic causing a lower parameter model to be better than the 35GB flux 2 model
>>
File: ComfyUI_00066_.png (2.95 MB, 1536x1536)
>a caught on trailcam photo of Elsa from the disney movie Frozen wearing a tattered loincloth running through the woods, the image is shot at night in the middle of the woods.

lol'd. probably should gemini my prompt a bit so it better understands what i mean.
>>
>>107339419
Worst case scenario you just train on distilled Z-Image Turbo, just like you trained on distilled Flux dev, but it would likely be better to train on a non-distilled model.
>>
>>107339423
>using a HDD in 2025 for anything but storage
nigga what
>>
>>107339435
>so fast
it's the best part, it's so fast you aren't afraid to experiment
>>107339439
the bfl cucks focused too much on the lobotomy, at the end they fucked everything up
>>
>>107339423
Nothing to do with HDD, same shit with an SSD. Blame comfy and get an SSD, what fucking year is this?
>>
>>107339434
>>107339240
>>
>>107339439
there is none, you might be better asking how flux 2 manages to waste 35gb on absolutely nothing of value
>>
>>107339409
Not true, i2i works fine with Z image
>>
>base drops
>api only
>>
>>107339439
flux 2 uses 33 of those GB for safety
>>
File: ComfyUI_temp_tlpbk_00005_.png (3.5 MB, 1600x1152)
Anyone tried stability tests above 1MPx?
>>
>>107339439
Producing image quality equivalent to 512x512
>>
File: ComfyUI_00068_.png (3.36 MB, 1536x1536)
Ahh that's better, thanks Gemini.

RAW, UNEDITED TRAIL CAMERA PHOTOGRAPHY of the character ELSA from Disney's Frozen. She is depicted as feral and desperate, eyes wide with panic and a primal expression. She is wearing a TATTERED, MUD-STAINED LOINCLOTH (a strip of rough cloth, not a dress), which is barely covering her and looks ripped from her former gown. She is in a FULL, mid-stride SPRINTING motion through dense underbrush. Her long blonde hair is tangled, dirty, and wild. The shot is captured AT NIGHT, deep in an unkempt, DARK PINE FOREST. Use a low-angle perspective. The camera's flash is the only light source, creating harsh, high-contrast shadows and making the dewy plants glisten. HIGH-SPEED SHUTTER, giving the image a slightly blurred, motion-streaked look. The final image should have dated, timestamp-style text in a corner and a slight 'fisheye' distortion typical of cheap trail cameras. Intense film grain and digital noise.
>>
>>107339436
Yes I know. I do have SSDs but I can't use them right now or I'd have to move all the data off before I install a new system. I thought HDDs were just slow not COMPLETELY FUCKING UNUSABLE.
>>107339449
My old install was on an SSD and it didn't do this bullshit. The image is technically done and it's STILL LOCKED UP WHAT THE FUCK
>>
File: 1748657486035279.png (104 KB, 1453x796)
>>107339465
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
they said everything will be released, relax lol
>>
File: 1755470445614002.png (1.52 MB, 893x1138)
Is Z-image really this good for complex text?
>>
>>107339449
>>107339423
it has to do with hdd when you swap to hdd, get a cheap ssd retards
>>
>>107339471
2048*2048 is the limit before it goes weird. It extends the canvas with low-content area rather than repeating like most other models.
>>
>>107339463
can you use "image inputs" as in multiple images?
>>
>>107339431
I see. Some questions:

Is it censored?
How well does it handle character interaction and overlap?
How heavily is it tied to realism? Can it do cartoon/anime well?
>>
File: 1758541327352337.png (3.66 MB, 1760x1320)
Oh no! Diaper Manjit is washing his feet in the curry again!
>>
File: 1744588513193257.png (1.29 MB, 1322x772)
>>107339483
>the 80b model can't do text
AIEEEEEEEEEEEE
>>
may the great furk protect you in your sleep
>>
>>107339423
>plenty of RAM
If you have plenty of ram then the HDD shouldn't matter, it will just offload to ram and never touch the pagefile on the HDD

Likely you are using too much vram while having the Nvidia driver offload to ram option enabled which you should turn off ASAP
>>
>>107339477
>HIGH-SPEED SHUTTER, giving the image a slightly blurred, motion-streaked look
not how it works but ok
>>
>>107339499
this is me training a 128 rank lora on 22 images
>>
>>107339493
If you're asking if it's an edit model like flux kontext, no. You'll have to wait for the edit version.
>>
File: 1763309814934810.png (456 KB, 1555x1005)
help?

is it supposed to be lumina2 for the text encoder? was default
>>
File: 1750154399279173.png (2.13 MB, 1280x1280)
>>
>>107339510
update comfy
>>
File: ComfyUI_08866_.png (1.44 MB, 1152x1152)
>A ethereal young woman with flowing auburn hair, standing by a misty lake at twilight, surrounded by lush foliage and ancient ruins, in the romantic Pre-Raphaelite style of John William Waterhouse, with intricate details on her gossamer dress, soft lighting casting a dreamy glow, high resolution, oil on canvas texture.

Z-Image, that's a little sad unfortunately.
>>
>>107339510
reinstall the whole thing. see you in three hours
>>
>>107339519
me on the right
>>
>>107339510
did you update comfyui?
>>
>>107339501
>while having the Nvidia driver offload to ram option enabled which you should turn off ASAP
Do I sound like a windowsfag? Fuck you and fuck this gay ass earth

The image has been finished for minutes and it's still locked up what the fuck
>>
File: ComfyUI_temp_tlpbk_00007_.png (3.59 MB, 1088x1920)
res_2s looks good.
>>
>>107339510
switch to neoForge
>>
>>107339535
res_3m is slightly better and faster
>>
>>107339510
Update stable. Do NOT use regular update or dependencies, whatever you do. Happened to me earlier.
>>
>>107339437
nvm it's almost identical
>>
>>107339542
>and faster
>9s/it
nice meme
>>
>>107339540
imagine how much less problems we would have if model researchers didn't just pamper cumfart

>>107339551
what the fuck does this even mean? kek
>>
>>107339535
>>107339542
are these finally working with latest comfy again, last I checked they were kind of abandoned
>>
>>107339391
They worship chatgpt since they can't afford more than a 1050 over there.
>>
File: ComfyUI_08867_.png (1.85 MB, 1152x1152)
>A elegant teenage girl with flowing long hair and flower petals swirling around her, confessing love under cherry blossoms in spring, soft romantic lighting and delicate features, in the shoujo anime style with sparkly eyes, pastel tones, and emotional close-ups, high resolution, watercolor-like.

>>107339403
Yeah, I knew they would do this. They have no reason to give us their base model given that the turbo model is so good and this is basically a promotion for their paid API model.
>>
File: 1763007168212370.png (2.27 MB, 1280x1280)
SIUU
>>
>>107339556
res3m is faster than res2s on my 5090
>>
z-image is looking like the next SDXL killer.
>>
>>107339568
Should have been Will Smith.
>>
File: 1742062096088594.png (1.2 MB, 1024x1024)
k update comfy from the folder worked

wow, this is like SDXL 1.0 speed.
>>
File: ComfyUI_08868_.png (2.57 MB, 1152x1152)
>A pirate crew on a wooden ship sailing through stormy seas, captain with a straw hat grinning wildly, diverse character designs with unique abilities, in the mangaka style of Eiichiro Oda with intricate cross-hatching, exaggerated proportions, and adventurous storytelling, high resolution, black and white manga with color accents.

Interesting result.
>>
File: 1747899701414837.png (3.59 MB, 1248x1824)
>>107339388
you can do img2img like any other model. if you mean reference images then you'll have to wait for the edit model
>>
>>107339576
>KILLS sdxl
>Curb STOMPS pregnant black forest labs
>judo throws sana into the woodchipper
who else is on the kill list for Z?
>>
How many decades till anime finetunes?
>>
File: hoplite.png (1.83 MB, 1568x880)
>>
>>107339557
Use "update_comfyui_stable.bat" you goober
>>
>z-image just spills into sysram instead of being tactfully offloaded
they really didn't bother with that on this model huh, shit is slow as fuck for me
>>
>>107339585
for me it's wondering what the fuck chroma was doing with $200k
>>
>>107339592
>.bat
and I'm the goober?
>>
>>107339585
possibly qwen if z-image edit has comparable quality. imagine the speed once we get proper optimizations for z-image.
>>
>>107339532
>Do I sound like a windowsfag?
Yes
>>
bloatmaxxers, have we been BTFO?
>>
>>107339160
lmao the yuropoor shitty model defense force will be out tomorrow for sure. Downloading it now!
>>
>>107339576
They said this about lumina image 2.0 btw
>>
>>107339594
im on a 12gb 3060 and i can gen images in 25s with comfy ran with no args
>>
>>107339602
Yes, and my comfyui with Z image is working.
>>
File: z-i-l.jpg (297 KB, 2048x2048)
>>107339483
i get a bit more mistakes. maybe it's my settings?

it is not bad at text by current model standards
>>
File: 1741488037804251.png (1001 KB, 1075x830)
how good is lora likeness gonna be?
>>
>>107339620
at slower speeds than linux...
>>
File: 1749918006831150.png (1.31 MB, 1024x1024)
Hatsune Miku in a 4 panel children's comic. She is holding a green leek vegetable in each panel, and is saying something different in a speech bubble.
>>
>>107338981
I have some bad news for you anon...
>>107339494
nta, but
>Is it censored?
Can do tits and vag but not dicks
>How well does it handle character interaction and overlap?
50/50, sometimes it fucks up, sometimes it works, it's at least better than sdxl... I guess?
>How heavily is it tied to realism?
A lot, it's mostly a realism model with some popular cartoon in it
>Can it do cartoon/anime well?
Only the very popular stuff, as usual. More than flux and qwen, but less than chroma
>>
>>107339619
what the fuck this is black magic
>>
>>107339046
lmao
>>
File: 1745471929579768.png (1.51 MB, 1024x1024)
>>107339577
yeah true
>>
>>107339611
memes aside these all have their place depending on what you're working on but for the majority of people on consumer GPUs, why would you bother with flux2?
>>
ZimageGOD BTFO Flux.2 KEK
>>
>>107339594
>>107339645
-> >>107338854
>>
>>107339600
De-distilling, de-slopping and de-censoring Flux Schnell

Obviously if Z-Image had existed back then, Chroma would be based on it rather than Flux Schnell
>>
File: ComfyUI_08869_.png (2.69 MB, 1152x1152)
>A swirling starry night sky over a quiet village with cypress trees in the foreground, vibrant blues and yellows swirling in expressive brushstrokes, in the post-impressionist style of Vincent van Gogh, thick impasto texture, dynamic movement in the clouds and stars, high resolution, oil painting feel.
>>
>>107339645
something must be wrong with ur setup anon
im using the default comfy workflow
im on linux if that helps
>>
>>107339611
just 10 more layers

just 20gb more

just a little more security

bros please
>>
What's the safest?
Z Image or Flux2?
>>
File: ComfyUI_08870_.png (1.77 MB, 1152x1152)
>A surreal landscape with melting clocks draped over barren trees and a vast desert plain, an elephant with impossibly long legs in the distance, in the surrealist style of Salvador Dalí, dreamlike precision and bizarre elements, warm earthy tones with high contrast, high resolution, as if an oil painting.
>>
File: 1735036877579124.png (969 KB, 1096x827)
pony v7 still has a chance... right?
>>
>>107339638
>linux
>comfyui
lol. lmao.
>>
>>107339641
Understood thanks.
Regarding cartoon/anime, can it at least adjust proportions? Can it do chibi characters? Can it gen characters with varying degrees of height and hip/waist proportion?
>>
File: z-i-t.jpg (166 KB, 1024x1024)
>>107339507


>>107339494
not very censored, not porn either

>How well does it handle character interaction and overlap?
decent, but wan and qwen are probably better at it. maybe the base model is stronger than turbo tho, wouldn't be the first time.

>How heavily is it tied to realism? Can it do cartoon/anime well?
i'd basically say yes in terms of agreeable style. no in terms of near-encyclopedic understanding of characters like nai or illustrious
>>
>>107339672
I'm a different anon, 25s just seems like black magic compared to the fatass models we've been getting since flux
>>
File: ComfyUI_08871_.png (1.85 MB, 1152x1152)
>A dramatic self-portrait of an elderly man in shadow, illuminated by a single light source highlighting his thoughtful expression and textured clothing, in the baroque style of Rembrandt van Rijn, rich chiaroscuro with deep browns and golds, intricate details on fabric and skin, high resolution, oil on panel.

Neat, very close to Flux.2's result.
>>
>>107339695
linux is better for all ai related things, including comfyui
>>
I... probably should've specified a realistic image, but there you go. Z is trained on the entire Frozen movie frame by frame.

now watch elsa and the disney style get a lora on civitai the microsecond the base gets released kek
>>
>>107339519
This is purely T2I, right? I can't use an input image for ZImage to reference?
>>
File: ComfyUI_temp_tlpbk_00015_.png (3.46 MB, 1664x1152)
The stability above 1MPx is crazy. The image quality is kinda ass, so I hope the base model will fix that a little. Also doesn't know the kind of freak porn I am into.
>>
File: file.png (14 KB, 292x211)
>>107339704
>25s just seems like black magic
There are people itt waiting half a minute for their image gacha?
>>
File: 1734943812051682.png (2.2 MB, 1280x1280)
>>107339648
https://www.youtube.com/watch?v=I3TwuAQZE58
>>107339713
yep, pure T2I, we don't have the edit model yet >>107339482
>>
Will we get the non distilled Z-image model?
>>
File: 1753251610533598.jpg (2.59 MB, 2048x2048)
Absurd
>>
>>107339708
lmao.
>>
>>107339714
it's fine up to 4mp and then it dies suddenly
>>
>>107339738
they're still cooking it
https://xcancel.com/bdsqlsz/status/1993757819020206173#m
>>
>>107339743
Prompt?
>>
>>107339743
>me in the foreground arguing with a cumfart spaghetti enjoyer
>>
File: ComfyUI_07437_.png (2.88 MB, 2048x1280)
>collages full with many cool pics
>quality of discussion is up
>4 ldg threads in the catalog
>new model that released isnt shit

today was a good day
>>
>>107339758
thank you, I was fucking sad with the whole safety obsessive shit around flux 2, finally something good again
>>
>>107339743
thats fucking hard god damn
i gotta listen to some black sabbath (dio era) now
>>
>>107339713
>I can't use an input image for ZImage to reference
no, for the last fucking time
>>
File: z-i-t.jpg (101 KB, 1024x1024)
>>107339682
z-image is not very safe at all

flux.2 still does some stuff for <safety> even if it's clearly not as bad as flux.1
>>
>>107339767
I'm still waiting for something terrible to happen. every new model was a disappointment, there is a catch here too, you'll see
>>
>>107339743
damn this is good
>>
File: 1744078710690345.png (3.84 MB, 1920x1440)
>>
>>107339782
It'll def be something to do with either the base model or training this model. One of those will be the major downside.
>>
>>107339743
swords look good and are being held correctly, most models fail at this
>>
>>107339782
>there is a catch here too you'll see
yeah the catch is we had to wait all year for this, and it has a jpeg shmegma filter over everything. give it some time. allow yourself a sliver of faith.
oh and the US economy WILL topple due to the money going around the big corpos and now that's totally bunk with this new model that can run on 6gb saar gpus.
>>
>>107339682
>>107339776
>flux.2 still does some stuff for <safety> even if it's clearly not as bad as flux.1
Flux 2 has quite a few paragraphs exclusively on how "safe" their model is. If they could get away with getting rid of women in their dataset, they would.
>>
>>107339743
There's some funkiness with the sword, but fucking hell that's metal as fuck.
A shame today's D&D session got aborted.
>>
>>107339747
Are you retarded ? Everything AI is developed on Linux, all proffessional use of AI is on Linux, the best NVidia drivers for AI are those for Linux.
>>
testing some z-image
there are no truly dark images and no control over lens effects/distortion/aberration, right?
actually, the turbo model feels ultra distilled and limited.
flux2 is slow as shit but seems to have more range/knowledge or else less damage from distillation.
>>
File: 1746191392149408.png (1.18 MB, 1024x1024)
Obviously that chink model knows who Jackie Chan is kek
>>
>>107339663
This node works a lot better than it used to, thanks!
>>
File: file.png (1.38 MB, 1024x1024)
>ask for snake girl with long hair, human head, and snake body
>get pic related
>>
>>107339809
>Are you retarded ?
/g/eets tend to be
>>
File: 1760418209237948.png (1.14 MB, 1024x1024)
a cartoon with a chibi Hatsune Miku pointing and laughing ad a cartoon indian man saying "DO NOT REDEEM!"

8 seconds. it's fast.
>>
>>107339817
Hooooly shit that's awesome.
That looks fucking badass.
Give him the wings too.
>>
>>107339817
cool as fuck pic, this model has SOVL
>>
>>107339809
can verify, since switching to fedora my gens are 3x faster and i can do a lot with low VRAM
>>
>>107339814
agree, feels like z has a lot of low hanging fruit and not much above that
>>
>>107339809
lmfao.
>>
File: dsp WOW WHAT.gif (1.45 MB, 640x452)
>>107339825
>ask for snake
>get snake
WHAT? WOOOW!
>>
File: ComfyUI_08873_.png (1.87 MB, 1152x1152)
>A young girl riding a magical flying creature over lush green valleys and ancient forests, whimsical creatures peeking from the trees, in the enchanting Studio Ghibli anime style with detailed hand-drawn backgrounds, soft earthy colors, and a sense of wonder, high resolution, as if from a fantasy film.
>>
>>107339825
The model has good taste.
feral > girl monster > monster girl
>>
File: 1735623988791746.png (1.05 MB, 1024x1024)
>>107339830
>>
>>107339835
i don't like how all of the newer models have no seed variation. give me back my gacha!
>>
Fresh when ready

>>107339853
>>107339853
>>107339853

Fresh when ready
>>
>>107339847
kiki's delivery service, encoded by yify
>>
>>107339817
Make him bitch slap his ugly dyke daughter!
>>
>>107339856
>page 1
>only two thirds of the image limit
Retard.
>>
>>107339857
lmao
>>
File: ComfyUI_08874_.png (2.66 MB, 1152x1152)
>>107339814
I can confirm Flux.2 has significantly better artist/mangaka knowledge. I will post side-by-side comparisons in a sec.

>A lone warrior in massive dark armor wielding an enormous jagged sword, standing atop a battlefield strewn with fallen enemies under a blood-red eclipse, intricate Gothic architecture crumbling in the background, demonic entities emerging from shadows, in the mangaka style of Kentaro Miura with incredibly detailed cross-hatching, dramatic high-contrast black and white, muscular anatomy, baroque ornamentation, and visceral dark fantasy atmosphere, high resolution, as if from a seinen manga masterpiece.
>>
>>107339862
We're about to hit bump limit
>>
>>107339814
It has a poor range of vocabulary. Stuff Seedream understands, like the concept of gyaru, is completely lost to z-image. The dataset needs to be improved for future versions. It is 85% of the way there which is incredibly impressive for the size, but it just needs a bit more worldly knowledge.
>>
>>107339869
And?
>>
>>107339817
based
>>
>>107339874
Now the thread falls off the catalogue, with your post being the 311th
>>
>>107339871
You can tell with a better dataset it'd be incredible. It knows some really obscure things, and really well
>>
File: 53453454.jpg (535 KB, 1334x756)
>>107339696
>Can it do chibi characters?
It can but you need some boomer prompt like >>107338987
>Can it gen characters with varying degrees of height
Yeah it's fine
>and hip/waist proportion?
That's another problem, it doesn't seem to be that precise for anime/cartoon stuff
It will have a hard time with huge breasts, but not impossible
>>
File: file.jpg (304 KB, 1792x2304)
>>
File: ComfyUI_00022_.png (1.27 MB, 1024x1024)
>>107339731
I get 9s for the bf16 on a 3090
finally a model with both reasonable speed and quality that can fit snugly
I never even bothered with the qwens or flux 2 (lol)
>>
>>107339936
thanks
>>
>>107339798
>>107339782
no, everything will be good bros
>>
anyone getting this crap with zturbo -
Error(s) in loading state_dict for Llama2:
size mismatch for model.embed_tokens.weight: copying a param with shape torch.Size([151936, 2560]) from checkpoint, the shape in current model is torch.Size([128256, 4096]).
size mismatch for model.layers.0.self_attn.q_proj.weight: copying a param with shape torch.Size([4096, 2560]) from checkpoint, the shape in current model is torch.Size([4096, 4096]).
size mismatch for model.layers.0.self_attn.k_proj.weight: copying a param with shape torch.Size([1024, 2560]) from checkpoint, the shape in current model is torch.Size([1024, 4096]).
size mismatch for model.layers.0.self_attn.v_proj.weight: copying a param with shape torch.Size([1024,
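
For reading a traceback like the one above, here is a minimal sketch in plain Python (stdlib only). `parse_mismatches` and `summarize` are hypothetical helper names, not part of ComfyUI or PyTorch. It pulls the checkpoint and model shapes out of each complete `size mismatch` line. On the lines quoted above, the checkpoint's embedding is 151936 x 2560 while the model being built expects 128256 x 4096, which suggests the loader instantiated a different text encoder architecture than the file contains. The truncated last line is skipped rather than guessed.

```python
import re

# One complete "size mismatch" line, as printed in the error above.
MISMATCH = re.compile(
    r"size mismatch for (?P<name>[\w.]+): copying a param with shape "
    r"torch\.Size\(\[(?P<ckpt>[\d, ]+)\]\) from checkpoint, the shape in "
    r"current model is torch\.Size\(\[(?P<model>[\d, ]+)\]\)"
)


def parse_mismatches(error_text):
    """Return (param_name, checkpoint_shape, model_shape) per complete line."""
    rows = []
    for m in MISMATCH.finditer(error_text):
        ckpt = tuple(int(x) for x in m.group("ckpt").split(","))
        model = tuple(int(x) for x in m.group("model").split(","))
        rows.append((m.group("name"), ckpt, model))
    return rows


def summarize(rows):
    """Read vocab size and hidden size on each side from the embedding row.

    Assumes the embedding weight is laid out as (vocab, hidden).
    Returns None if no embed_tokens row was found.
    """
    for name, ckpt, model in rows:
        if name.endswith("embed_tokens.weight"):
            return {
                "checkpoint": {"vocab": ckpt[0], "hidden": ckpt[1]},
                "model": {"vocab": model[0], "hidden": model[1]},
            }
    return None
```

Paste the full error text into `parse_mismatches` and pass the result to `summarize`. If the two hidden sizes differ, the problem is in which encoder type the loader node picked (or an outdated install that doesn't know the architecture), not in the sampler or the prompt.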
>>
>>107340124
post workflow
>>
>>107340171
https://pastebin.com/raw/7C03TCVY

downloaded it from reddit
>>
>>107340191
404
screenshot is enough. Do you have text encoder set to qwen image and correct vae?
>>
z image is the model we were waiting for. Nsfw out of the box, knows tons of characters, small, fast...
Once they drop the base model finetunes should start flooding everywhere
>>
File: 1744184867639988.png (302 KB, 2091x1199)
>>107340203
>>
>>107340234
wrong vae I think?
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo/tree/main/vae
>>
>>107340228
>knows tons of characters
It's pretty spotty, pretty random
>>
File: file.jpg (397 KB, 1792x2304)
>>107340269
it uses flux vae

>>107340234
you sure you updated comfy?
>>
>>107340228
Nah we need a model that takes in reference images like Flux2 or QIE for it to be The One
>>
>>107339817
kek
looks good
>>
>>107340453
they are also releasing a non-distilled and an edit model later
>>
>>107339360
weirdly it renders just fine on icecat
>>
>>107340409
sigh, updated it and now everything is broken
>>
>>107340580
The edit model only takes in one reference image though
>>
File: 1762315072806622.png (2.42 MB, 1440x1120)


