/g/ - Technology


Z Image Won Edition

Discussion of Free and Open Source Diffusion Models

Prev: >>107988202

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>Klein
https://huggingface.co/collections/black-forest-labs/flux2

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>NetaYume
https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0
https://nieta-art.feishu.cn/wiki/RZAawlH2ci74qckRLRPc9tOynrb

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
File: Zimage_base__00120_.png (1.66 MB, 1024x1024)
nom nom nom
>>
>>107989768
>not even a MB
you can make the collage bigger anon
>>
Inb4 during and after schizo apocalypse
>>
>>107989768
>>Maintain Thread Quality
>https://rentry.org/debo
>https://rentry.org/animanon
why did you put these ones in the op? the new ones are better because they weren't written by a mentally ill troonoid
>>
>>107989789
Kill ani
>>
>>107989789
maybe we can include this one instead?
https://rentry.org/barbie
>>
Blessed thread of frenship
>>
>>107989768
>>107989785
>>
File: Zimage_base__00125_.png (1.52 MB, 1024x1024)
>>
>>107989807
Nigger
>>
>>107989784
4chanXT automatically converts to JPG when the image is too large.
>>
File: Zimage_base__00126_.png (1.69 MB, 1024x1024)
>>
>>107989810
Yeah rushing at just 405 replies, damn
Kill ani
>>
Been trying to narrow down how to get crisper gens with LTX-2 without getting the usual cooked look.

Winner settings so far:
>Base model
>Step 1: 8 steps manual sigma, CFG 1, Distilled lora at 0.6, euler
>Step 2: Manual Sigmas CFG 1, Distilled lora at 0.8, res_2s

Things to try:
- Step 1: CFG 4-5 with default negative prompt, 30 steps euler, no distill, step 2 at 0.8 distill, res_2s
- bump generation resolution to 1535 longest side.
>>
>>107989806
what about blimbo and sticker anon? where is colonscopy rentry?
>>
>>107989820
oh, you can use this
Bakery: https://rentry.org/ldgcollage
it allows you to set the megapixel
>>
>>107989811
>>Klein
>solar panels upside-down
how... lora-like

378/500, the result is incoming.
>>
>>107989848
meant *zbase
>>
File: Zimage_base__00128_.png (1.64 MB, 1024x1024)
>>
I tried downing a gguf of z base but it no work in Comfy yet? I also nuked NAG at the same time pls halp
>>
>>107989832
Yeah that's how this collage was created but it's easier to let it be converted instead of guessing with that.
>>
So single subject ZIB loras are fine on ZIB but slightly worse on Turbo. There's no benefit whatsoever in training a ZIB lora and using it on ZIT, vs just training on ZIT with Ostris V2 adapter.
>>
You faggots need Jesus and a GIRLFRIEND.
>>
Professional enterprise diffusor here. These models are only interesting if you've never played with Nano Banana Pro before. Once you unlock the power of ComfyUI API Nodes you'll never go back
>>
zbase is crap, but ace step will save China's face

https://ace-step.github.io/ace-step-v1.5.github.io/
>>
>>107989884
send me a dickgirl plz
>>
>>107989871
I already got better results on dedistilled than with the adapter before, despite conflicting claims. I learned not to trust statements of random anons ITT, better try things yourself
>>
File: BFL Klein.jpg (243 KB, 1242x1526)
>>107989885
>>
SOON
>>
File: file.png (3.87 MB, 1152x2048)
>>
>>107989921
>custom advanced snake oil node
I'm afraid
>>
>>107989921
How bout you shut the fuck up and Show shit when it's finished grease golem
>>
>>107989940
I'd be soooo ashamed to be asian rn after zbase released.
>>
>>107989931
What is she aiming at
>>
>>107989909
I'm saying I did what my comment said, my self.
>>
come on, ace step, what's the hold up?
>>
File: file.png (2.44 MB, 1280x1280)
>>107989955
the guy posting loli in the previous thread
>>
Been using ComfyUI. It's pretty good, but I have been struggling with image manipulation. Simple shit like changing the color of shirt or changing the background setting for a person standing at a bus stop. That sort of thing. It's a pain in the ass trying to do this in Comfy. Been following guides and trying to get it to work but failing every time.
Is there an alternative to comfy that's easier to learn and use? I'm on Linux with AMD btw
>>
>>107989998
If you mean inpainting, try forge or one of its forks
>>
>>107989998
>Been using ComfyUI. It's pretty good
stop lying, faggot
>>
File: lino.safetensors_00002_.png (3.18 MB, 1694x1286)
diffsynth style lora in 5 minutes
this is based
>>
zimage edit WHEN
>>
Ok guys
I gave you time.
What is z image now?
Complete:
Z image is ...
>>
>>107989998
you're using an edit model, right?
>>
>>107990032
Nothing special.
>>
>>107990032
basically what I expected base to be, not interesting for inference
>>
is something wrong with ZImage?
>>
File: 1745472305653780.png (78 KB, 380x483)
So how is ZBase vs. ZTurbo in terms of image quality? Given that they claim it only has "high" quality vs. Turbo's "very high".
>>
File: F2Kb__00024_.png (1.93 MB, 1024x1024)
Klein is great.
>>
File: z-image_00002_.png (1.46 MB, 1024x1024)
ok I downloaded and tested Z image base, anyone knows what the optimal settings are for this?
>>
>>107989992
ooh perty
>>
>>107990047
Body horror problem like Klein.
>>
File: Zimage_base__00135_.png (1.71 MB, 1024x1024)
>>
>>107990053
I can't gen anything without artifacts, regardless of sampler+scheduler+steps combo on my 5090
>>
>>107990061
sure, type man rm
>>
>>107990053
Non-Turbo has more variety and may give you 2D- or CG-looking artstyles about half the time without being asked, at least in my testing. Turbo tends to give you a more-realistic look more easily.
>>
>>107990029
unrefined, just like it says on the box.

you niggers expecting a finished product are stupid.
it's like buying a giant chunk of marble and getting pissed that it doesn't morph into the statue of david.
>>
>>107990114
I'm white.
>>
>>107990109
Dunno if it's true in your case but apparently sage attention affects zimage gens pretty badly.
>>
>still no ramtorch
the quantcuck nightmare continues
>>
>>107990088
Not nearly as pronounced as my Klein 4B distilled testing. Sometimes a finger or eyeball can be off.
>>
>>107990116
>chunk of marble
ahahahahahahahahahahahah

HA-HA
>>
File: zbase_21.jpg (457 KB, 1440x1120)
z base is kino, it's so good at styles
>>
File: Comparison2.jpg (4 MB, 8064x1728)
`a professional DSLR photograph of a busy street in San Francisco.`
>>
>>107990061
would breed both
>>
>>107990130
kino style
>>
Honestly, flux always was also a shitmix. It's a problem. This is shitmixing, it isn't prompt following, because it's just triggering the "lora". That's why you get zit face and flux face.
>>
>>107990125
none of these models have that much of an anatomy problem unless you're using them wrong, it's a pretty dumb way to judge them, none of them are anything remotely like e.g. SD3 (which was just broken)
>>
>>107990130
this doesn't reflect who jesus was
>>
>>107990034
No I was using stable diffusion 1.5. I am just learning this stuff. I can generate images which are coming out great and it wasn't hard setting that up with the nodes. But trying to change existing images sucks (because I don't know how to do it)
>>
>>107990122
thanks, I knew something was wrong with my gens.
>>
>>107990061
that's basically what it looks like, IDK what you're expecting really
>>
>>107990152
I wish I was this retarded it's probably bliss
>>
>>107990130
Sadly, they can stamp stuff. It's not capable of putting a turtle in a dress. it can do a turtle. and a dress with a woman in it... stamp. stamp. stamp. Once you see the stamps you can't unsee it. stamp stamp stamp.
>>
Before going to sleep, I laughed at the Z boys hard, and after waking up, I realize that there are the first signs of contact with reality, but their brains are still in a state of cognitive dissonance, ready to deny reality.
And I have to laugh myself half to death again.

The prime example of human stupidity caught up in herd mentality. Kek
>>
>>107990155
It represents who Jesus is, the destroyer of jewish abominations like non-white immigration.
>>
>>107990172
you believe in st.paul heresy
>>
File: 76.png (1.61 MB, 1024x1024)
>>107990130
aesthetic tuning completely rapes the artistic capabilities of models, its infuriating, base is much superior for this kind of stuff, this is also why sd 1.4/5 was peak kino for artsy stuff, it was trained on raw internet material with no sloppy aesthetic tuning, pure unfettered art.
>>
File: F2Kb__00025_.png (1.97 MB, 1024x1024)
>>107990166
You'll never matter.
>>
>>107990157
that's your issue. look into qwen edit. default workflows are available in comfy
>>
File: z-image_00054_.png (1.65 MB, 832x1216)
>>
File: z-image_00055_.png (1.81 MB, 832x1216)
SOVL
>>
>>107990175
You have no potential to be worthy of my enormous value.
>>
>>107990137
klein won
>>
>>107990188
>stamps

>>107990179
^ why are the phones the wrong way?

stamps.

stamp stamp stamp stamp.
banana banana banana banana.
>>
File: 1748614688356011.png (1.22 MB, 720x1280)
Zisters, how we doin?
>>
>>107990125
Klein frequently has missing limbs.
>>
File: file.png (2 MB, 864x1536)
>>
Z won. Again.
>>
>>107990225
unmatched kino very nice anon
>>
File: file.png (2.52 MB, 864x1536)
>>
>>107990235
thanks
>>
File: it's fucking.gif (500 KB, 300x191)
>>107990032
>>
>>107990130
kek
>>
>>107990168
> cognitive dissonance
I'm having such a good time
>>
>>107990225
Prompt?
>>
>>107990225
>image you could do on tons of models
ok
>>
>>107990277
that post is retarded if you read it though, and he's wrong: they DON'T work better on ZIT if they're trained on Base than the same one already trained on ZIT would
>>
Where are the sampler / scheduler grids
>>
>>107990285
Stop asian hate
>>
File: z-image_00056_.png (1.81 MB, 1216x832)
>>
>>107990137
>photograph
ambiguous
>>
I think I completely forgot how to negative prompt well after using turbo and kedit
I am lost
>>
lora dumpster fire

save the asian race from shame, ace step 1.5!
>>
>>107990307
I member when women didn't have glued on fingernails.
>>
File: somebody's fetish.png (2.62 MB, 1792x1024)
>>107990125
>>107990153
>>107990221
Example pics. Klein 4B-dist at 768x768 was pretty screwy for me, so I went up to 1024 after a few tries. That fixed the eyes, but still fucked up the horse legs repeatedly.
>>
>>107990288
Everything is retarded, every single post, comment and upvote around this topic.
That's the funny thing about it.
Especially the comments that compare it to SD1.5 and seriously argue how much freedom they have with this model.

Really, this is the best model release I've ever had. I just enjoy and delight in being.
>>
>>107990315
prompt?
>>
>>107990307
it's a logically constructed English sentence that should not result in a photographer, or a DSLR camera by itself, with this kind of model.
>>
File: Flux2-Klein_00003_.png (1.63 MB, 1024x1024)
>>107990321
>Young Greek woman with yellow eyes and dark red hair with bangs and sidelocks, tied in a flat bun behind her head with a dark purple ribbon. Wearing a white sleeveless tunic with a dark purple corset, and a wide dark purple gold-patterned sash over her right shoulder, across her chest, and around her waist. Her arms are bare, with dark purple fingerless gloves covering only her hands. Black thighhigh stockings and pantyhose with garterbelt, black harness straps hanging from her waist, gold armored sandals, and a black choker around her neck. She is glaring fiercely at the viewer and aiming a white bow and arrow, while riding a large black warhorse with a white mane, amber eyes, and black-and-gold faceplate, rearing its front hooves off the ground in a cloud of dust. The horse and rider are in a sunlit canyon under a blue sky. Dynamic close-up action shot from below. Cinematic photo.
>>
>>107990315
9b base, ok?

I did download 4b, but I haven't moved it to my comfy, cuz into why person of non-white appearance?
>>
>>107990320
based
>>
>>107990315
>>107990344
if increasing the steps worsens the images (artifacts) you can try res_2s or res_3s. they will be 2x and 3x slower respectively, but they tend to help with this kind of thing a bit
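
A rough sketch of that trade-off, assuming generation time scales with the number of model evaluations and that res_2s and res_3s cost about 2 and 3 evaluations per step (the multipliers are the ones quoted above; the function name and the 30-step figure are only illustrative):

```python
# Relative cost per step, taken from the post above (euler is the 1x baseline).
EVALS_PER_STEP = {"euler": 1, "res_2s": 2, "res_3s": 3}


def model_evals(sampler: str, steps: int) -> int:
    """Approximate number of model evaluations for one generation."""
    return EVALS_PER_STEP[sampler] * steps


if __name__ == "__main__":
    for name in EVALS_PER_STEP:
        print(f"{name}: {model_evals(name, 30)} evaluations for 30 steps")
```

Under that assumption, 15 steps of res_2s cost about the same as 30 steps of euler, so halving the step count is a fair starting point when comparing the two.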
>>
>>107990344
when are we going to get a model that can actually draw a bowstring over a face?
>>
File: image.png (612 KB, 720x720)
>>107990315
768x768? Are you generating on a potato, my dear Saar?

My first z base gen (hugging space)
> A smelly indian sitting in front of its computer. His computer is a potato.

Im impressed
>>
>make the girl in source image follow dance from a reference video
how do I do it? best model for this?
>>
>>107990345
I tried 9b-dist once, but it was either really slow or flat-out too big for my hardware to run, so I didn't download 9b base.
>>
File: F2Kb__00026_.png (1.86 MB, 1024x1024)
Klein.
4B
>>
File: z-image_00004_.png (1.27 MB, 1024x1024)
>>107990377
this is flux-2-klein-base-9b-fp8:
>>107990378

picrel is zbase. :rolleyes:
(yes I know this isn't a forum)

Trying to decide. Delete zbase or no?
>>
>>107990363
iGPU laptop. I can go bigger, but not without kicking the text encoder out of RAM, which slows down prompt testing/iterating.
>>
>>107990388
prompt:
A photograph in bleach-washed Kodachrome style.

A bioluminescent turtle has orange fabric stretched on it and giant black pump heels fitted to its feet awkwardly.

there is a runway event, we see the backs of phones.

****
the only thing I like about zbase is how good it is at generating inhuman monsters with weird little eyes like that. freaky!
>>
Retard here.
Could I generate coomer slop on a 3060ti?
What are the least amount of steps I could take to get there?
I perused some of the links in the OP but I don't know most of these terms.
Thanks for any help, I'm literally stupid
>>
>zib gen without sage attention : 360s
>zib gen with sage attention 2++ : 301s
yeah I'd really like the weird artifacts issues to be corrected, the speedup is appreciable
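
For scale, the two timings above work out as follows (plain arithmetic on the 360s and 301s figures from the post; the helper names are made up for the example):

```python
def speedup(baseline_s: float, optimized_s: float) -> float:
    """How many times faster the optimized run is."""
    return baseline_s / optimized_s


def time_saved_fraction(baseline_s: float, optimized_s: float) -> float:
    """Fraction of the baseline time that the optimization removes."""
    return (baseline_s - optimized_s) / baseline_s


print(f"{speedup(360, 301):.2f}x faster")               # 1.20x faster
print(f"{time_saved_fraction(360, 301):.1%} less time")  # 16.4% less time
```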
>>
>>107990388
Please don't take that away from me. Just believe in it for two more weeks.
>>
>>107990344
>rearing
i have a theory that a term like "rearing its front hooves" isn't clear to the model, so it just scrambles the legs. "a horse lifting its front hooves while standing only on its hind legs" is clearer. i've had to do a lot of prompt restructuring to avoid this kind of thing, but once the prompt works it works well
>>
>>107990410
>>
Ace step 1.5 is available now (If you are an influencesaar.)
>>
>>107990328
Zimage is an ESL model
>>
File: z-image-upscaled_00021_.png (1.67 MB, 1608x1096)
>>
>>107990378
>>107990388
Z-Image gets the phone camera angles more correct, interestingly. I am impress.
>>
>>107990413
Yes, but it won't be very fast and you'll be limited to quants (these are sort of like "compressed" versions of models). If you have an okay amount of RAM (32 GB at least) you can utilize that as well, but VRAM is always going to be faster.
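
A back-of-the-envelope way to see why quants matter, assuming weight size is roughly parameter count times bits per weight. This ignores the text encoder, VAE and activations, so real usage is higher; the 4B and 9B sizes are the Klein variants mentioned in the thread, and the bit widths are illustrative:

```python
def weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    for params in (4, 9):
        for bits in (16, 8, 4):
            size = weight_size_gb(params, bits)
            print(f"{params}B model at {bits}-bit: ~{size:.1f} GB")
```

By this estimate a 9B model needs about 18 GB at 16-bit, 9 GB at 8-bit and 4.5 GB at 4-bit, which is why an 8 GB card like the 3060 Ti ends up on lower-bit quants or offloading to system RAM.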
>>
File: 1759371067875252.png (2.85 MB, 1752x1168)
>>
is base out?
>>
>>107990473
not yet, 2 more weeks
>>
>>107990473
yeah
>>
>>107990199
>klein won
Except that's not what SF looks like
>>
File: image(1).png (1.35 MB, 1024x1024)
What a bummer.
Now that z base is a flop, we don't have a base model that would make sense as the next generation of finetunes.

Total diffusion death.
>>
File: F2Kb__00028_.png (2.2 MB, 1024x1024)
>>107990433
saucy gal

>>107990452
yeah. funny how it just threw the shoes in there like uh. done.

>>107990363
Klein's putting it up. Working on seeing if I can prompt the idea of a computer that actually is potato shaped, and I think maybe I need to make it a potato or something that happens to have computer features.
>>
>some seeds are worse than others
It's real
>>
>>107990371
/g/ is so useless
>>
File: F2Kb__00029_.png (1.89 MB, 1024x1024)
>>107990483
whoa, calm down bro!!!
>>
File: z-image_00062_.png (1.61 MB, 1216x832)
>>
Z image is such a shitshow, I'm returning to SDXL and ControlNet. You losers can keep pretending it's a finetuner's dream or whatever cope you need, later, retards.
>>
Has ChatGPT always been racist towards Indians too?
>>
File: ComfyUI_00006_.png (1.39 MB, 960x1280)
base + lora isn't that bad for a first try
>>
>>107990565
i got brave's ai to create an entire list of racist quotes that it came up with on its own, using southern vernacular

making ai misbehave is a lot of fun
>>
File: F2Kb__00030_.png (1.65 MB, 1024x1024)
>>107990547
>>
the dust is starting to settle aaaandd... ouch.. yeahhh that's gonna leave a mark. looks like z-image flopped massively
>>
Noob Z will be another overfitted trainwreck and Tongyi Lab will shit out yet another half-assed overfitted finetune like clockwork.
>>
>>107990570
man torso
>>
File: kekmax.png (130 KB, 259x262)
>>107990570
jej
>>
File: ComfyUI_temp_npyqf_00001_.jpg (476 KB, 1694x1286)
>>107990592
>>107990604
iterating is gonna be slow but there's plenty of gains to make
>>
>>107990589
Noob Z is not even happening. Nobody believes that shit anymore, especially when they can't even deliver the rest of the z-slop family. It was but another marketing trick to get people to tune in
>>
>>107990610
none of this is better than ZIT in any way
>>
>>107990610
whats your dataset like
>>
File: F2Kb__00032_.png (1.71 MB, 1024x1024)
>>107990604
>>
File: 1756481184391022.png (2.18 MB, 1168x1752)
looks like the <think> trick still works
>>107990548
nice
>>
>>107990611
This is textbook chinese culture, same story with llms: they sold vapor with DeepSeek, everyone lost their minds, and then every other model release afterward turned into low effort slop.
>>
Any point in still combining NAG with zimage base being able to go cfg>1?
>>
>>107990413
for comfy, follow the setup instructions in their docs
https://docs.comfy.org/installation/manual_install

it's pretty straightforward but you'll have to install a few things and figure out how virtual environments work (think of it as a condom for comfyui)

once you have comfy running (it won't open automatically; it prints a local address, http://127.0.0.1:8188 by default, that you open in your browser as its interface), you can just browse templates and choose the one you want to use (wan2.2 for video, for example). you'll have to download the model files (.safetensors, which are huge) into the appropriate folders, but it tells you where to put them.

if you don't want to learn how to do these things i guess you can just use the installer, but i've heard that's limited in function
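
a small sketch of where downloaded files usually end up, assuming the stock ComfyUI folder layout under `ComfyUI/models` (the helper, the `kind` keys and the example filename are made up for illustration; trust whatever path the template's download prompt shows over this):

```python
from pathlib import Path

# Assumed subfolders of ComfyUI/models in a default install.
MODEL_SUBDIRS = {
    "checkpoint": "checkpoints",
    "diffusion_model": "diffusion_models",
    "text_encoder": "text_encoders",
    "vae": "vae",
    "lora": "loras",
}


def target_path(comfy_root: str, kind: str, filename: str) -> Path:
    """Build the path a downloaded model file should be saved to."""
    return Path(comfy_root) / "models" / MODEL_SUBDIRS[kind] / filename


print(target_path("ComfyUI", "lora", "example_lora.safetensors").as_posix())
# ComfyUI/models/loras/example_lora.safetensors
```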
>>
>>107990646
It was 100x slower with NAG. I still had it hooked up when I first tried it and it would've taken 15min for a 1024*1024 gen.
>>
>>107990570
i came buckets
>>
File: Zimage_base__00142_.png (1.77 MB, 1024x1024)
>>
>>107990570
>isn't that bad
Insane cope, your dopamine expectations were manipulated to death by Chinese culture.
>>
>ltx2
verdict?
>>
>>107990629
><think> trick
what's that? unless you mean llm prompt generation




All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.