/g/ - Technology


Thread archived.




Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107368734

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe
https://github.com/ostris/ai-toolkit

>Z
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
https://comfyanonymous.github.io/ComfyUI_examples/z_image/

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
File: z0001.jpg (2.4 MB, 2048x2048)
>>
File: We're so fucking back!.jpg (2.01 MB, 3840x2160)
This model is black magic, how can it look so good while being a fucking 6b model, goddam.
>>
1girl, standing, looking at viewer,
>>
>>107372526
dancing in the ballroom of the titanic?
>>
File: 3595139124.png (3.93 MB, 1216x1856)
>>
>train realistic lora
>apply to animu!?
>>
all this improvement and prompt adherence and we're still just prompting 1girl, standing
>>
File: 1456636796.jpg (3.86 MB, 1792x4783)
So what's the verdict on Euler Normal? I feel like I'm liking the Euler Beta outputs more.
>>
File: ComfyUI_00647_.png (1.81 MB, 1280x896)
>>107372526
>under the sea dance
nice
>>
File: 1740715856679791.png (3.32 MB, 1440x1920)
>>107368173
if anyone cares about my training experimentation: i added an additional 2000 steps to a total of 6000 and it greatly improved the lora in every aspect. so for me right now it seems that i should aim for 6000-7000 steps for a 1024px based lora.
>>
>>107372569
>So what's the verdict on Euler Normal?
supposedly it's the official scheduler to use
>>107368919
>>107369419
>>
File: ComfyUI_temp_pfprh_00171_.jpg (3.23 MB, 2048x2048)
>>
>>107372589
I volunteer to go in first
>>
>>107372579
what optimizer and learning rate?
>>
>>107372564
>>107372579
How do you train Z?
>>
File: 1736954416437640.png (3.17 MB, 1920x1080)
>>107372526
>>
File: 1813723217.jpg (2.94 MB, 3200x1792)
>>107372579
That's a lot of steps.
>>
anything with which one can run z-image with just an exe like kobold ? just click and done nothing else
>>
File: 649969154.png (1.56 MB, 1600x896)
>>
File: 1740899629600546.png (3.11 MB, 1224x1840)
>>107372605
All default settings except for lowering rank to 4. AdamW8bit, and 0.0001 LR. I also changed timestep type to Sigmoid from Weighted for the last 2000 steps. Oh and i also added custom bucket sizes to fit my chroma dataset.
>>
File: metal_giar_chinese_film.jpg (910 KB, 1536x2048)
>wrong image
>>
File: Z Image Turbo.png (3.49 MB, 1536x1536)
>>
here comes the tifasaaaaar spam
>>
File: ComfyUI_00056_.png (3.89 MB, 1536x2048)
At least the characters know how to play musical instruments.
>>
File: z-image_00031_.jpg (1.47 MB, 1664x2432)
>>
>>107372608
ostris ai toolkit
>>
what the fuck is the point of z-image if it cant do porn? pointless
>>
>>107372617
share the workflow with the Lora node attached please
>>
File: 2514204216.png (3.28 MB, 1248x1824)
>>107372654
If you want good likeness, why would you do rank 4? Try 1500 steps, batch size 2, rank 32.
>>
>>107372684
woops coombrain logged on
>>
File: 3439656041.png (1.76 MB, 1600x896)
>>107372692
I'm using forge-neo, sorry.
>>
>wahhh it cant do jewish goyslop pedowood content wahhh
>>
>>107372684
wait for the coom loras lol
>>
>>107372711
Could you take a screenshot of the upper part of the webUI so I can look at the settings?
>>
File: 1744756432098948.png (3.37 MB, 1920x1080)
>>107372614
>>
File: 4ch.png (227 KB, 1276x846)
first good use of my 10gig connection
>>
File: ComfyUI_00059_.png (1.73 MB, 1024x1536)
I guess every woman will look kind of chinky unless you specify otherwise
>>
>>107372696
i've always gotten better results with low ranks for characters. maybe i should try high rank next.
>>
>>107372711
>>107372651
>>107372617

Okay so with this, any SDXL gen cannot compete if it were not for copyrighted characters or specific styles. SDXL literally got BTFO with these compositions in terms of OC, neither NetaYume
>>
File: 982426560.png (215 KB, 2502x734)
>>107372727
Sure, I guess.
>>
>>107372702
porn is all AI is good for.
>>
>>107372684
this but unironically, if a model can't approximate every single entry on the list of paraphilias Wikipedia page, it's literally over for the team of chucklefucks who wasted time training it
>>
>>107372761
this on windows?
>>
File: 1745493812647162.png (308 KB, 1649x847)
>>107372744
Comfy must be so happy this model exists, I'm sure a lot of people jumped to spaghetti land just to try Z-image
https://xcancel.com/tphuang/status/1994392411917947097#m
>>
File: 283514060.png (1.5 MB, 1600x896)
>>107372750
That's weird, because rank 4 is very bottlenecked.
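
For a rough sense of why rank matters: a LoRA adapter on one weight matrix adds two low-rank factors, so its trainable parameter count grows linearly with rank. A minimal sketch, assuming a hypothetical 4096x4096 layer (the size is only for illustration, not Z-Image's actual dimensions):

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by one LoRA adapter.

    The adapter is two factors: A with shape (rank, d_in) and
    B with shape (d_out, rank).
    """
    return rank * (d_in + d_out)


# Hypothetical layer size, chosen only for illustration.
D_IN, D_OUT = 4096, 4096

for rank in (4, 32):
    print(f"rank {rank}: {lora_params(D_IN, D_OUT, rank)} params per adapted layer")
```

Going from rank 4 to rank 32 gives the adapter 8x the capacity per layer, at 8x the file size.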
>>
File: ComfyUI_00366_.png (2.81 MB, 1920x1440)
Rapunzel
https://civitai.com/models/2176561?modelVersionId=2451052
>>
>>107372761
thanks so it was Lumina after all, time to say good bye to comfy
>>
>>107372777
ye
>>
File: 1756010505980525.png (3.16 MB, 1920x1080)
>>107372734
>>
>>107372780
and his workflow still had a redundant text note to confuse people,
a bypassed flow node that is set to 3 which confuses people since even if bypassed it still is in effect with the default value 3
and he set an opinionated 9 steps generation for some reason on an 8 step by default model to increase gen times by 10% just for the fuck of it
but the bypassed thing he DIDNT include is a lora loader node to help people who will obv be wanting to use loras with this but wont know with which node and where to place it

this is all the reason many FOSS is a joke to the public, because autistic low iq retards cant put themselves in the shoes of someone who isnt them for 1 second to save their life.
>>
>>107372780
>>107372744
ehemmmm
>>
>>107372857
but neoforge is windows only no?

i cloned the git repo but its all just bat files and shit
>>
>>107372862
Yeah, it's only windows, but fuck Comfy
>>
File: Seed Variation.png (2.78 MB, 1336x1582)
>>107372857
But can you do that on Neo Forge though?
https://www.reddit.com/r/StableDiffusion/comments/1p9mypu/even_more_improved_zimage_turbo_variation/
>>
>>107372761
did you try hiresfix or controlnet or img2img? I could do that with Neta Lumina in NeoForge
>>
File: ComfyUI_00606_.png (2.18 MB, 1280x1088)
samir you're breaking the car
>>
>>107372870
literally don't care, I prefer waiting to using fucking nodes
>>
>>107371727
>Z-turbo seems very overcooked. Very little variation from seed to seed.

I'm really not appreciating anons bitching about this. Variation is FUCKING BAD, it kills style consistency from generation-to-generation which is horrendously awful when you're trying to create a lot of images in the same style like assets for a video game.

Variation is so easy to incorporate in a million different ways in your workflow, you can insert noise, randomize prompts, and there's tons of workflow nodes in comfyui that influence the math. The default should be plain-as-fuck oatmeal.
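
One of the options mentioned above, randomizing prompts, can be sketched in a few lines. This is only an illustration: the function name, the option lists, and the idea of tying the prompt RNG to the image seed are assumptions, not part of any particular UI.

```python
import random


def vary_prompt(base, options, seed):
    """Append one randomly chosen value per option group to the base prompt.

    A dedicated Random(seed) instance keeps the choice reproducible, so the
    same seed always yields the same prompt.
    """
    rng = random.Random(seed)
    extras = [rng.choice(choices) for choices in options.values()]
    return ", ".join([base, *extras])


# Hypothetical option groups, purely for illustration.
OPTIONS = {
    "lighting": ["overcast", "golden hour", "neon"],
    "lens": ["35mm", "85mm"],
}

for seed in range(3):
    print(seed, vary_prompt("1girl, standing", OPTIONS, seed))
```

The base prompt stays fixed, so style terms are untouched and only the listed groups vary.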
>>
File: psxcheeks_0010.png (1.14 MB, 896x1152)
ngl, i don't even like anime but zimg has me bricked tf up

>>107372886
kek
>>
File: z-image_00035_.jpg (990 KB, 2048x2048)
>>
>>107372857
based
>>
>>107372892
>ngl
>bricked tf up
i fucking hate ebonics so goddamn much holy fucking SHIT

anyways, can't wait for the base model fr fr ongah no cap straight bussin to 1girl slop with my unc prompts
>>
>>107372886
lamoo
>>
>it's still not out
come on
>>
>>107372857
I'll name my first born child after him for saving me from noodles.
>>
>>107372857
nice, especially after comfy fucked up loras for z-image. it's clear comfy is a tasteless hack, even sdxl gens look better in a1111/forge than they do in comfyui.
>>
>>107372579
>so for me right now it seems that i should aim for 6000-7000 steps for a 1024px based lora.
how many hours does that take you guys on a 3090 rtx?
>>
>>107372935
bout tree fiddy
>>
>>107372935
about an hour per 1000 steps at 1024px on my 4080
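
Taking that figure at face value, the 6000-7000 step target works out to roughly 6-7 hours. A quick sketch of the arithmetic; the one-hour rate is the 4080 number from this post, and a 3090 would need its own measured rate:

```python
def training_hours(total_steps, hours_per_1000_steps=1.0):
    """Estimate wall-clock training time from a measured per-1000-step rate."""
    return total_steps / 1000 * hours_per_1000_steps


for steps in (6000, 7000):
    print(f"{steps} steps: ~{training_hours(steps):.1f} h")
```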
>>
File: 3891.jpg (741 KB, 2048x1536)
Sadly nano banana is still more precise than z-image


> A highly realistic, low-quality amateur smartphone photo taken in a suburban backyard under overcast evening light.
>
> **Subject:** A pale Caucasian female cosplayer with a soft, fleshy, non-athletic build. She has black hair styled in a thick side-braid resting on her left shoulder. She wears pointed elf ear prosthetics and gold earrings. Her expression is relaxed with a gentle smirk, head tilted slightly to the side, looking directly at the camera.
>
> **Pose:** She is NOT standing stiffly front-facing. Her torso is turned slightly to her right (approx 15 degrees) in a natural, relaxed off-center stance.
>
> **Anatomy & Physics:** Extremely massive natural breasts. They hang low and heavy in a teardrop shape due to gravity.
>
> **Costume Precision:**
> She wears a deep emerald green velvet fantasy outfit.
> 1. **The Top:** A hooded mantle/capelet. The front consists of **two very wide, rectangular velvet panels** that drape over the bust.
> 2. **Safety & Fit:** These panels are **wide and heavy**, covering 75% of the breast volume to ensure distinct coverage, leaving only a deep cleavage visible.
> 3. **Gravity:** The panels hang straight down vertically past the waist, weighted by heavy gold metal tassels at the hem which pull the fabric taut.
> 4. **Details:** Gold circular emblems on shoulders. A gold diamond-shaped brooch with a green gem at the neck.
>
> **Accessories:** A gold chain belt digging slightly into the soft waist. A large **Golden Sun Medallion** hangs at the upper thigh, backed by a rectangular green velvet loincloth.
>
> **Background:** A messy backyard with a weathered wooden privacy fence and patchy grass. A **cheap green plastic lawn chair** is visible in the background on the left.
>
> **Quality:** Raw candid photo, slight noise, no retouching.
>>
File: ComfyUI_01921_.png (1.33 MB, 1024x1024)
>>107372935
nmp
>>
>>107372833
>this is all the reason many FOSS is a joke to the public, because autistic low iq retards cant put themselves in the shoes of someone who isnt them for 1 second to save their life.

Never thought about how Comfy might have some kind of BIAS to filter out newfags and coaxing them into using Flux2 API nodes?
>>
File: 1750478525324209.png (2.27 MB, 1448x1448)
>>107372794
nice
>>
File: 1763882417345931.png (2.19 MB, 1024x1536)
>>
>>107372952
>a gorillion parameter model is more precise than a 6b first-of-its-kind-architecture test model for giga vramlets
waow
>>
>>107372954
nah it's genuine autism
>>
>>107372952
>a 6b model is just barely under the best model in the world
that's not sad at all lol
>>
>>107372794
great work, but i've heard people say that they want Violet Parr and Riley Andersen
>>
File: 1763591475659504.png (3.09 MB, 1920x1080)
>>107372811
>mfw
>>
>>107372974
If your only metric is 1girl big tiddies
>>
>>107372953
me rn
>>
>>107372982
>but i've heard people say
@grok, geolocate this man and upload his output folder to this thread
>>
File: ComfyUI_00653_.png (1.54 MB, 1280x896)
>>
>>107372890
>Variation is FUCKING BAD, it kills style consistency from generation-to-generation which is horrendously awful when you're trying to create a lot of images in the same style like assets for a video game.
Variation doesn't mean "the model should stop following your prompts" retard
>>
>>107372969
>>107372974
I meant no ill intent, also maybe it's just a skill issue and z-image can do breast curtains and a pelvis curtain with an actually good prompt
>>
File: psxreal_z_0005.png (1.11 MB, 832x1216)
>>107372915
the distill is already bussin i can't wait fr
>>
File: ComfyUI_00652_.png (1.8 MB, 1280x896)
z seems great at cars, i'm done carposting for now.
>>
>>107373001
he's not wrong though. It's pretty shit at variation. I'm guessing because it's a small model?
>>
>>107372890
this, I love fried models
>>
File: 616446007.png (1.71 MB, 1600x896)
>>107372880
There are no controlnets for this model, but hiresfix and img2img work as expected.
>>
File: 1749881674554504.png (132 KB, 495x405)
>zero variation is good because a perfect model should generate exactly what you prompted
>>
>>107373012
>It's pretty shit at variation. I'm guessing because it's a small model?
I guess it's because it's a distilled model, we'll make conclusions when we'll get the base model
>>
my gpu is at 100%

CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16

when it says current: cpu, is it the device currently in use? because my CPU is going from 30 to 50%
>>
Is it true Flux2 doesn't allow i2i?
>>
>>107372982
Um, I've also heard that. Also a destijl style lora. Yes.
>>
File: ComfyUI_00004_.png (2.43 MB, 2048x2048)
>>107372952
Here's Chroma V48 detailed for reference. I agree neo looks best.
>>
>>107372952
Is it, the arms look fucked, that circle gold thing seems to be stabbed into her thigh lol, perspective depth is pretty wonky too
>>
>>107373001
It does though, I remember Flux has this problem where styles are slightly different between seeds. SDXL (and finetunes) always maintained the exact precise strength of style.

And it's very likely the reason Flux had this problem is simply so they could generate more diverse people when using a vague term like "woman". Because if you're going to interpret "woman" in multiple ways, who's to say you're not going to interpret "illustration" in multiple ways (and other style terms)?
>>
>>107373048
>Chroma V48 detailed
ironic name, the details are ass
>>
>>107373053
>It does though
then it's shit variation, no one asks for that, obviously
>>
File: z-image_00040_.jpg (1.34 MB, 2048x2048)
>>
>>107372985
lel
>>
As a rule of thumb how many steps do you train a style 32 rank lora per 1024x1024 image in the dataset?
>>
File: Untitled.png (1.32 MB, 993x736)
> training a 2-layer lora for zimg
> still the same damn size as training the full net
wups
>>
File: 1665434769469907.jpg (67 KB, 600x600)
>>107373080
>>
>>107372952
nano banana iteratively improves flaws on image generations, this ability can be added to any local model once sufficient local vision capacities are available & someone writes the agentic glue code
>>
>>107373075
>It's bad when it's bad and it's good when it's good
Wow real fucking insightful.

You're not going to get anywhere unless you think like a programmer.
>>
civitai download isn't working for me. I get "The connection has timed out", just me?
>>
File: 1761503369187686.jpg (128 KB, 960x960)
my fedora is going to explode when this model is finetuned for nudes
>>
>>107373106
again, you're thinking that to get variation the model has to stop following your prompts, it's completely retarded. I have no idea why you believe such nonsense, are you low IQ anon? >>107372870
>>
>>107373050
It's not the only example I made.
But I agree it's pretty slopped compared to Z-image.

https://files.catbox.moe/4hsjjr.png
https://files.catbox.moe/jegabs.png
https://files.catbox.moe/e8srok.png
https://files.catbox.moe/mc420m.png
https://files.catbox.moe/vv0p1w.png
>>
Ok, so who wonned and who losted?
For me

Wonners:
Z image Turbo
SDXL

Losers:
Black Forest Lab
Lodestone
NetaYume
>>
File: 1753686170347090.png (3.18 MB, 1920x1080)
>>107372985
>>
>>107372526
yeah it's crazy, but for me it means the bigger models should look way better than what they do
>>
>>107372684
based, if a model can't do futa with veiny cocks then its worthless.
>>
you can train z on civit
I just need to know if I have to retag my images since they're danbooru-style tags
>>
>>107372765
this. making surreal landscapes and memes is fun for first few months but after that it's mostly just porn and waifus.
>>
>>107372579
> to a total of 6000
> 6000
maybe you should raise lr instead
>>
>>107373197
>after that it's mostly just porn and waifus.
Wan is still about memes though
>>
>>107373157
Winners:
Z Image Turbo (Alibaba)
ComfyUI

Losers:
BFL
Tencent
The 10x Forge forks competing against each other.
>>
>>107373157
>Z image Turbo
right now at least
>SDXL
not since noobai

>Black Forest Lab
yes
>Lodestone
if he convinces the z team to put up with his autism then he will be winning bigly unfortunately
>NetaYume
he doesnt seem to mind, hes releasing the final version and then moving on to z
>>
wan 2.5 when
>>
>>107373227
>ComfyUI
>winner
the general hasn't had a comfy hateboner this hard since the UI wars. they fucked up so many things in the past few releases and chose to put a spotlight on flux2. everyone sees comfyorg for what it is, a grifty company that will sell off everything and throw all the users under the bus when they get the chance
>>
>>107373227
>Tencent
which is?
>>
>>107373219
yeah and i dont really care about it
>>
>>107373268
and I don't care that you don't care
>>
>>107373276
and i don't care that you don't care that i don't care
>>
>>107373276
I care that you don't care
>>
>>107373263
it really was the cherry on the cake that their PR took a hit just adding a model and advertising the shittier one. sick of the API node shilling constantly on their discord too. fuck grift chink and they should get rid of him
>>
>>107373227
No matter how much comfy is ahead with model support and corpo money people still fucking detest it and try to avoid using it at all costs. He's so fucking mentally ill for forcing this garbage noodle ui on everyone. there are examples of good node interfaces like blender but comfy fucking sucks
>>
>ornate intricate hyprefine details actually works
we've never been so back
>>
>>107373263
https://huggingface.co/Comfy-Org/z_image_turbo
>679,715 downloads

https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
>31,451 downloads

ComfyUI seems to be winning pretty hard.

>>107373267
They released that 80B hunyuan image 3 model.
>>
>>107373263
>>107373294
that's mad cope, he probably gathered thousands of new users since the Z-image release, those new people have no idea about his ancient shenanigans and they don't care lol
>>
>>107373307
Does HF account for model download numbers being pilfered like this? That's so fucked up.
>>
>>107373227
>winners
>comfyUI
How funny, I set myself a personal goal to make sure Comfy doesn't become the default UI in 2026, at least I managed to get him to stop posting here. Guess I got to keep shitposting then...
>>
>>107373157
Absolutely Winning
>Z Image
Basically immortal
>SDXL
>WAN
Coping
>Chroma
>Neta Yume
>APIkeks
Disaster
>Flux 2
Failbake
>Hunyuan 80b
>Pony v7
>>
diffusion-pipe now supports Z-Image (and Ostris's adapter for training the Turbo version).
>>
File: 1747316763790683.jpg (456 KB, 1875x1787)
>>107373307
>>679,715 downloads
HOLY FUCK??
>>
>>107373319
I've been saying this since they started doing this. it's fucking stolen valor
>>
>>107373320
Hi Julien
>>
>>107373307
what the fuck
>>
>>107373309
This. Like it or not, Comfy is really the ONLY inference application at this point, supports local models within hours while the 'competition' take months to support a subset of local models if at all.

I'd love competition but there simply isn't any.
>>
>>107373350
huggingface has nothing to do with the code
>>
>>107373319
It's their own fault for shipping models in stupid formats like: https://huggingface.co/Tongyi-MAI/Z-Image-Turbo/tree/main/transformer
>>
File: zimg_852.png (3.37 MB, 1536x1536)
>>107373320
>make sure Comfy doesn't become the default UI in 2026
It's already the de facto default UI, and has been for a long time

Get a grip
>>
File: 1733964493678071.png (3.43 MB, 1920x1080)
>>107373161
>>
File: ComfyUI_00005_.webm (1.16 MB, 1008x560)
>>107372485
Please help, people. What am I doing wrong?
>>
>>107373370
mummy
>>
HF should force comfy-org account to purchase enterprise subscription. Imagine how much it costs them to be free CDN lmao
>>
>>107373263
>the general
you mean (You) and one other cumfy FUD schizo?
>>
>>107373357
give chinaman a break hes literally saving local and stop apologizing for kumfart
>>
File: ComfyUI_00061_.png (1.6 MB, 1536x2048)
>>
>>107373307
nobody fucking wants to deal with diffusers format
>>
>>107373390
I hope they try it. Comfy going and building their own model hosting infrastructure to compete against huggingface would be a huge loss for them and a win for everyone else.
>>
>>107373307
>679,715 downloads
>in 3 days
this is insane what the hell, what the helly?
https://www.youtube.com/shorts/xuDf5WM0ba8
>>
>>107373370
I already made him stop coming here to shill his grift garbageUI. I did something about it, now I just have to keep doing my work
>>107373332
I'm not Julien, in fact I kicked Julien out of /g/
>>
>>107373426
>comfy building anything of value
More chance of the sun exploding tomorrow
>huge loss for them
If anything it would be a huge win for them anyway, comfy using HF as a CDN provides no value to HF, the majority of users aren't even aware that HF is providing the downloads so they receive no service interaction or additional users. By removing comfy's free CDN they'd push more users towards diffusers and gain more site users
>>
>>107373451
we get it, you're a deranged retard
>>
>>107373357
>could just pr the convenient model
>fuck up the leaderboards instead
>>
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo/discussions/26
SAAR 6B IS A GIANT MODEL I CANNOT RUN IT ON MY GTX970 SAAR PLZ DO THE NEEDFUL
>>
>>107373324
>diffusion-pipe now supports Z-Image
Ahh, time to start training!
>>
File: 1751935402504541.png (2.04 MB, 1448x1448)
>update comfy to use z-image
>now my wan workflow is broked
>>
File: 1747127036494375.jpg (714 KB, 2048x2048)
>>107373019
What upscaler are you using?
>>
>>107373441
Z-Image Turbo is REALLY popular, and Comfy is where you run it locally

Even people with potato gpu's can run it and it delivers insane quality for its size, but yeah ~700k downloads already...
>>
>>107373307
>https://huggingface.co/Comfy-Org/z_image_turbo
>>679,715 downloads
Flux 2 status?
>>
>>107373514
>Introducing the distill of our new open weights model!
>(gets downloaded 700k times in three days)
>So like I was saying, introducing the distill of our new SaaS model!
>>
>>107373514
What do you mean?
>>
>>107373527
we are fucked
>>
>>107373541
It hasn't happened yet but you know it's going to.
>>
>Can't make porn with it
Explain Why I should care
>>
File: 1743666254899856.png (221 KB, 3085x615)
>>107373522
>Flux 2 status?
168,078 :^)
>>
>>107373514
runs best on neoforge
>>
>>107373514
GTFO ComfyAnon
>>
yesterday the team said "pushing forward with the open-sourcing process." and "long live open-source!" please for the love of god chang do not let us down
>>
>>107373527
>>107373541
>>107373545
stop dooming omg :(
>>
File: 2491584473.jpg (724 KB, 1792x2304)
Okay so with the current LoRA implementation you have to run it at 0.75 strength or it gets overcooked. That doesn't seem right, but oh well at least it works.
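
What the strength slider does, in general terms: the low-rank update is scaled before it is added to the base weights. A minimal sketch of that idea with plain nested lists. It is not ComfyUI's actual code, and real implementations usually also fold in an alpha/rank scale factor, which is omitted here:

```python
def apply_lora(weight, lora_b, lora_a, strength):
    """Return weight + strength * (lora_b @ lora_a) for plain nested lists.

    weight: d_out x d_in, lora_b: d_out x rank, lora_a: rank x d_in.
    """
    rows, cols, rank = len(weight), len(weight[0]), len(lora_a)
    return [
        [
            weight[i][j]
            + strength * sum(lora_b[i][k] * lora_a[k][j] for k in range(rank))
            for j in range(cols)
        ]
        for i in range(rows)
    ]


# Toy 2x2 example with a rank-1 adapter (values are arbitrary).
base = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0], [2.0]]
a = [[1.0, 1.0]]

print(apply_lora(base, b, a, strength=0.75))
```

At strength 0.75 only three quarters of the learned delta is applied, which is why an overcooked LoRA can still look fine at reduced strength.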
>>
>>107373561
>:3
ZiT will do the same path as Illustrious
>>
I can't wait for the day we throw comfyui in the trash
>>
>>107373514
Damn, the official release only has 30k downloads.
Comfy strong. Deserved, too.
>>
I wish I could be a fly in black forest offices so I could hear and laugh at them shitting their suits
>>
WHERE THE FUCK IS RADIANCE SHILL, NETA LUMINA SHIL AND CHROMA SHILL!?!!? WHERE ARE MY NORMAL AND ORGANIC EVERYDAY POSTERS!?!?!
>>
File: 1757284086215368.png (3.33 MB, 1920x1080)
>>107373378
>>
>>107373573
>Comfy strong. Deserved, too.
kill yourself comfy shill
>>
>>107373573
what does stealing valor have to do with the app? those downloads should be the lab's
>>
>>107373594
you are alone against 679,715 Comfy fans btw, you lost
>>
can i use Z-Image on a 12gb card or should I stick with Flux for now
>>
>>107373379
Post noodles. Could be bad shift or sampler/scheduler combo.
>>
>>107373584
Crazy right? also they were doing it 100% as a hobby, just to have fun
>>
4am sunday in china if it doesnt release today im killing myself and the other 4 anons ITT
>>
The base model isn't even out but people are training loras on Turbo like crazy already, this is what winning looks like
>>
>>107373584
>CHROMA SHILL
>NORMAL AND ORGANIC POSTERS
keeek
>>
>>107373610
eat your onions
stick with flux
pay for your API nodes
>>
>>107373607
I downloaded it from comfy but I use forge because it's better
>>
File: 156165165.jpg (1004 KB, 2321x1294)
>>107373620
By noodles you mean this?
>>
>>107373610
what? flux is 2 times bigger than Zimage, if you can run flux you can run that model
>>
>>107373646
thanks for putting yet another screenshot of griftui in the thread! thank you so much!
>>
>>107373584
I was a Chroma shill up until Z-Image since it was the best for realism and NSFW.

Z-Image totally wipes everything else off the map when it comes to image generation, and since it's not been poisoned against NSFW like the Flux models, introducing / enhancing missing part will be easy.
>>
File: chroma_00027_.png (1.15 MB, 1024x1536)
kinda weird going back to chroma, feels so slow now

>>107373646
turn down cfg, try 4
>>
>botting download stats to dupe investors
>>
>>107373646
WAN22 consists of a high-noise model and a low-noise model, you're only running the high-noise half. It's not the same thing as WAN 2.1.
Load up Comfy's default WAN22 workflow and start from there.
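
The two-model setup described here means the denoising steps get divided between the models: the high-noise model takes the early steps and the low-noise model finishes. A sketch of that split, where the function name and the 50% boundary are assumptions for illustration (check the default workflow for the real values):

```python
def split_steps(total_steps, boundary=0.5):
    """Split step indices into (high_noise_steps, low_noise_steps).

    boundary is the fraction of steps handled by the high-noise model;
    0.5 is an assumed value, not an official setting.
    """
    switch = int(total_steps * boundary)
    steps = list(range(total_steps))
    return steps[:switch], steps[switch:]


high, low = split_steps(20)
print("high-noise model runs steps", high)
print("low-noise model runs steps", low)
```

Running only the first list and stopping, as in the posted workflow, leaves the latent still noisy.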
>>
File: NetaYumev35.png (1.79 MB, 1024x1536)
>>107373584
NetaYume...so based!
>>
>>107373672
yeah wtf theres no way it has over half a mil downloads
its hyped but no way
>>
>>107373658
>I was a Chroma shill up until Z-Image since it was the best for realism and NSFW.
based, the more people move on from this failure the better, it poisoned the discourse of this general so much I'm glad it's finally over
>>
File: 1750862825026278.png (293 KB, 1616x831)
>>107373672
>>107373682
not ready, this model will be the Deepseek of the diffusion ecosystem
https://xcancel.com/tphuang/status/1994392411917947097#m
>>
>>107373610
im a vramlet so im using the gguf
>>
>>107373705
oh ononono it wil be SaaS, I can imagine it
>>
>'ComfyUI-EulerFlowMatchingDiscreteScheduler': With the current security level configuration, only custom nodes from the "default channel" can be installed.
fuck off comfy
>>
>>107373656
sorry about your brain damage
>>
File: ai-toolkit samples.jpg (3.98 MB, 2803x8192)
I'm training a Z Image Turbo DALL-E 3-like Girls LoRA on ~110 1024x1024 images at default settings, unquantized, and the samples in ai-toolkit look terrible a lot of the time: body horror, concept bleeding, very heavy overfitting. But when I test the checkpoints in comfyui, the lora keeps getting better and better likeness over time without breaking down when tested at ~0.7 strength, which seems to be the sweet spot no matter the checkpoint, like multiple people realized.

And now I'm at 9250 steps and will train until it actually starts breaking down.

Test your LoRAs in comfyui at resolution higher than 1024x1024, like 1536x1536, and at 0.7 strength instead of looking too much at ai-toolkit samples and continue training until the model actually breaks down in comfyui itself.

Is this kind of output breaking mid training before it fixes itself later common?
>>
whats the think tag or whatever? is it snake oil or does it work? what should i write?
>>
File: radiance.png (2.15 MB, 832x1488)
WHERE IS HE HIDING?
>>
>>107373741
>Is this kind of output breaking mid training before it fixes itself later common?
it's a distilled model it won't fare well on lora training, wait for the base model to do that
>>
>>107373675
I have absolutely no idea what you are talking about. This was my first time doing this.
>>
>>107373752
bookend your prompts with <think> </think>
>>
>>107373763
We aren't just training a distilled model, the training adapter fixes that problem to a large degree. And the LoRA works well, especially given it's not a full base model nor a large model.
>>
>>107373799
>the training adapter fixes that problem to a large degree.
meh, I doubt that desu, it's a hack at the end of the day
>>
I removed the GitHub star from cumfart. sad to see this project go to shit
>>
>>107373741
How can I continue training a completed job? Picking up where it left off. Is that even possible?
>>
>>107373812
based
>>
>>107373778
I don't know where you got that workflow from, but as it stands it is not correctly set up to run WAN 2.2.
Click on Templates on the left hand bar of Comfy UI, click Video and then select WAN 2.2 Image to Video. Use that as a starting point.
>>
>>107373826
Just edit the job to be more steps and start it again. You will need to still have the last checkpoint in the output folder.
>>
ZIT is really good at drawing prepubescent girls
>>
>>107373741
What am I even looking at? Are you trying to turn every concept under the moon into a busty girl?
>>
>>107373853
Based, thank you
>>
Anyone else unable to get the Euler Discrete sampler to import? i even udooted everything :(
>>
>>107373866
No, there are some moth and the bee costume images in the dataset that I want to see the evolution of during training to know if something is going wrong.
>>
>>107373888
>No
Damn
>>
>>107372485
>>107369172

Tried this and it's not following my prompt at all (only 1% of the time). Not sure what I'm doing wrong.

https://files.catbox.moe/nqakuq.png
>>
>>107373527
>We've partnered with ComfyUI to make our API available on day 1!
>>
File: 1750256366519640.png (245 KB, 2213x827)
>>107373937
increase the number of steps and go for something smaller, personally I go for this (0.01 -> always only the first step + 20 steps)
>>
why the fuck did they make the apis the main method of income? the company doesn't offer anything of value and they trapped themselves
>>
File: 3881841443.png (3.06 MB, 1344x1728)
Ah fuck it, I'm reverting back to the old lora implementation until they figure this out.
>>
>>107373960
because no one knows how to make any money off of AI shit and the only way anyone has figured out so far is charging for APIs a la reddit
>>
>>107373960
grift chink mentality. thank you grifter chink, I can see why you aren't accepted by mainland china
>>
File: 1.jpg (1.09 MB, 1800x2195)
>>107373937
I use advanced KSampler for this. First KSampler gets an empty text conditioning
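A rough sketch of how the step split between two chained advanced KSamplers could be worked out, assuming the first one (empty conditioning) covers a small fraction of the schedule and the second one (real prompt) finishes it. The `start_at_step` / `end_at_step` names match the advanced KSampler inputs; the `split_steps` helper and its rounding rule are made up for illustration, not taken from any workflow in this thread.

```python
import math
from typing import Tuple

StepRange = Tuple[int, int]  # (start_at_step, end_at_step)


def split_steps(total_steps: int, first_fraction: float) -> Tuple[StepRange, StepRange]:
    """Split a schedule into two consecutive step ranges.

    Assumption: the first sampler always gets at least one step, and the
    second sampler always gets at least one step as well.
    """
    if total_steps < 2:
        raise ValueError("need at least 2 steps to split between two samplers")
    if not 0.0 < first_fraction < 1.0:
        raise ValueError("first_fraction must be between 0 and 1 (exclusive)")
    # Round up so a tiny fraction such as 0.01 still yields one step.
    boundary = math.ceil(total_steps * first_fraction)
    boundary = max(1, min(boundary, total_steps - 1))
    return (0, boundary), (boundary, total_steps)


# 20 steps with a 0.01 fraction: the first sampler only handles step 0.
first, second = split_steps(20, 0.01)
print(first, second)
```

With this rule any fraction below 1/total_steps collapses to "first step only", so the value mostly matters once it gets larger than that.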
>>
>>107373972
>no one knows how to make any money off of AI
You scam investors, no?
>>
Given how well the turbo model trains, even if base comes out, unless its at least 3x better whats the point of using it lol
Maybe a lora trained on base along with a lightning lora will be superior to just a turbo lora i guess
>>
>>107374010
shut the fuck up
>>
File: 1750182137992217.png (57 KB, 941x631)
is it not possible to continue training a lora with ai toolkit? i deleted the finished one from the queue
>>
>>107373843
I see. I'll try that.
>>
>>107374027
Recreate the job with the same settings and place the last checkpoint you have in the output folder before starting the job
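If you aren't sure which file is the last checkpoint, something like this can pick it out of the output folder. It assumes checkpoints are saved as `<job_name>_<step>.safetensors` with a numeric step suffix; that naming scheme and the `latest_checkpoint` helper are assumptions for illustration, so check what your trainer actually writes.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed naming scheme: <job_name>_<step>.safetensors,
# e.g. my_lora_000002500.safetensors (hypothetical example).
STEP_PATTERN = re.compile(r"_(\d+)\.safetensors$")


def latest_checkpoint(output_dir: str) -> Optional[Path]:
    """Return the checkpoint with the highest step number, or None if none match."""
    best_step = -1
    best_path: Optional[Path] = None
    for path in Path(output_dir).glob("*.safetensors"):
        match = STEP_PATTERN.search(path.name)
        if match is None:
            continue  # skip files without a step suffix
        step = int(match.group(1))
        if step > best_step:
            best_step, best_path = step, path
    return best_path
```

Files without a step suffix are ignored, so a final export without a step number would not be picked up by this sketch.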
>>
>>107373992
investment != revenue but all slimy ai corpos like to count it
>>
>>107374049
the comfyorg CEO looks slimy irl. even has the audacity to call comfyui a fucking OS lmfao
>>
Even if ZiT becames SaaS this turbo model it's at least ten times better than SDXL. As an anime slopper I'm already happy with it.
>>
>>107374074
>it's at least ten times better than SDXL
that is kind of a stretch but I do agree it's better
>>
>>107374074
Yeah and you can mix styles with loras and/or upscaling the base zit gens with sdxl if you want artist variance.
>>
>it would be fine if we only got turbo
again shut the fuck up you giant faggot
>>
File: file.png (178 KB, 498x498)
>>107374074
>Even if ZiT becames SaaS this turbo model
Uhmm aktually ZiT means Z-Image Turbo so we already got that model, you have to say ZiB instead
>>
File: 1755553835656688.png (1.35 MB, 1280x720)
>>
I popped a zit on my dickshaft. can I make it saas?
>>
File: ComfyUI_00121_.png (3.52 MB, 2048x1280)
>>107374010
>unless its at least 3x better whats the point of using it lol
more concepts like surveillance footage that were lost in the process of tuning for high quality portraits
>>
>>107374100
You clearly are not an anime slopper. I have been fighting with SDXL for 3 years. This model alone, naked as it is, gives me renewed hope for anime, I already wonned, now I'm waiting for shitmerges and Loras
>>
File: 1745220803981999.png (1.45 MB, 1280x720)
>>107374141
>>
>>107374074
Nobody ever finetuned previous models enough for them to be worth using besides Illust and Noob, what makes you think someone will finetune a model 3x bigger than sdxl for weebshit? When Illust had to go SAAS because they ran out of money? When the NETA team gave us an undercooked model because they ran out of money? And they were only 2B models. Lodestone doesn't have money anymore, and Pony will prune artists tags again.
Even if Ali does an anime finetune themselves, if their previous models are anything to go by, they will prune h and artists.
>>
>700k+ download in 1 day
>already on civit
>already loras
Z MODEL BASE WILL BE SAAS, GET OVER IT, IT'S TIME FOR YOU TO MOVE ON AND SIMP FOR ANOTHER MODEL
>>
>>107374164
If you're not trolling, base will be so much better because of >>107374153 and the fact that it wouldn't be a distilled aesthetic tune. You should not be completely satiated with Turbo.
>gives me renewed hope for anime
Yume did that. Zimg Base may actually do it though, but not if we pretend like Turbo is the end all be all.
>>
Z-Image can accidentally generated unwanted CSAM content.
BFL has rigorously removed CSAM and NCII content from FLUX.2's training set and actively monitors abuse in the community.
Why would anyone that's not a pedo use Z-Image again?
>>
>>107374175
So, it's over?
>>
File: 1749619671269628.jpg (352 KB, 1792x1317)
NegPIP extension for Neo Forge updated
https://github.com/hako-mikan/sd-webui-negpip
https://xcancel.com/hakomikanx/status/1994799151566262472
>>
>>107374189
>already loras
Remind me why do I still have to wait for the base? We already won. We won. This is the SDXL moment of 2025 except it's times better, the moment is better because with SDXL we had to wait many months for it to become really usable. With this we'll have everything in mere weeks.
>>
>>107374201
BasedBasedBasedBasedBasedBased
>>
sdxl is all you need.
>>
>107374197
>Z-Image can accidentally generated unwanted CSAM content.
this is true, I got a couple of young girls when generating the 4 voluptuous women standing around the researcher in the hut, and I never prompted for young stuff

no one cares though
>>
>>107373951
Okay, much better now. Thanks!

>>107373980
Yeah, was using that wf and testing this new one out now.
>>
File: 635357075.png (3.65 MB, 1344x1728)
>>
File: 1745682980445932.png (1.42 MB, 1280x720)
>>107374201
>Neo Forge
>>
>>107374225
>hands too small
>neck too short
>>
File: file.png (34 KB, 461x250)
Trust
>>
Already deleting my SDXL loras folder and Comfy folder, goodbye dumb model and dumb UI
>>
File: 98172775.png (3.3 MB, 1344x1728)
>>107374240
for you
>>
>>107374255
thank god neoforge exists
>>
>>107374261
>>107374225
Are you a female? :3
>>
>>107374250
I have a feeling Ali is letting Tongyi fuck around because they already have the WAN 2.5 team for video SaaS and the Qwen Image team for image SaaS
>>
File: 1761464511377727.png (1.28 MB, 1280x720)
>mfw :[
>>
File: 2225294052.png (3.47 MB, 1344x1728)
>>107374289
nope
>>
>>107374300
could you see if you can gen the word "Scanner"? I kept getting "Scanler" in my gens.
>>
what settings should i set
54 images of an anime girl
>>
File: 634807930.png (1.26 MB, 896x1152)
>>107374323
That looks fine to me.
>>
File: 1753063391580298.png (1.3 MB, 1280x720)
>>
So do we forgive them for Wan 2.5 api?
>>
>>107374347
ok i sent it
>>
File: 1757729927662643.png (616 KB, 1024x1024)
>Flowmatch Euler Discrete Scheduler
>>
>>107374352
No, absolutely not. Local video is perpetually stuck in 5 second hell now
>>
>>107374369
>>107369114 + >>107369419
>>
File: 3399343427.png (1.54 MB, 896x1152)
>>107374367
godspeed
>>
>>107374385
>Local video is perpetually stuck in 5 second hell now
The Israelis will be releasing LTXVideo in december. Start studying the Talmud in preparation
>>
File: 1742996900552048.png (1.45 MB, 1280x720)
>>107374308
try to increase the resolution or the number of steps, that helps (try to increase the shift from 3 to 5 too if you're still at 3)
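For anyone wondering what raising the shift does: a minimal sketch, assuming the sampler uses the usual flow-matching time shift `sigma' = shift * sigma / (1 + (shift - 1) * sigma)`. Whether the Z-Image nodes apply exactly this formula is an assumption here, and `shift_sigma` is just an illustrative helper name.

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """Remap a noise level in [0, 1] with the common flow-matching shift formula."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


# Compare how shift 3 and shift 5 remap a few evenly spaced noise levels.
for sigma in (0.25, 0.5, 0.75):
    print(sigma, round(shift_sigma(sigma, 3.0), 3), round(shift_sigma(sigma, 5.0), 3))
```

Under that assumption a higher shift pushes every intermediate sigma upward, so more of the step budget is spent at high noise, where overall composition (and the rough layout of text) gets decided.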
>>
>>107374299
I hope they mog the Wan team with their own video model.
>>
>>107374394
>>107369419
>lemao, actually you don't need that custom node, the "Normal" scheduler is the exact same as the EulerDiscreteScheduler, the more you know
kek welp alright then
>>
>>107374423
Wonder if Flux 2 can write that or did they do a safety pass on text too
>>
>>107374427
why would Ali management allow that
>>
>>107374299
it won't last, they'll consider Tongyi to be the top dog and they'll put pressure to enshittify their shit :(
>>
>>107369172
I tried this and the results are way too random. It tends to stick to the original "empty" prompt so you get complete nonsense. The other workflow it's based off of seems more consistent without it just being the same image over and over.
>>
>>107374451
WAN 2.5 is architecturally more impressive and technologically advanced than Tongyi. Imagine posting on this board and thinking you know what Ali's plans are for the next 6 months
>>
File: ComfyUI_11564_.png (1.51 MB, 1024x1024)
It's gonna be a fun weekend testing old prompts on z-image. It brutally mogged all local models before it and even many of the API models at measly 6b param with distillation on top of it, imagine when they scale this
>>
>>107374466
>WAN 2.5 is architecturally more impressive and technologically advanced than Tongyi.
how do you know that? you don't even have the model in your own hands
>>
What's the best upscaling model nowadays? I want to upscale a tileset for a 2d game.
Gemini does it "well" but it doesn't maintain aspect ratio, so it completely misses the entire point of the tileset
It's a tileset of 16x16 pixel tiles, I want them to become 64x64
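For reference, a baseline that keeps the grid exact: plain nearest-neighbor scaling by an integer factor, so every source pixel becomes a 4x4 block and the aspect ratio cannot drift. This is a pure-Python sketch on nested lists (`upscale_nearest` is an illustrative helper, not a library call); it adds no new detail, it only gives a pixel-accurate 64x64 tile to compare model outputs against.

```python
from typing import List, TypeVar

Pixel = TypeVar("Pixel")


def upscale_nearest(pixels: List[List[Pixel]], factor: int) -> List[List[Pixel]]:
    """Repeat every pixel `factor` times horizontally and vertically."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    out: List[List[Pixel]] = []
    for row in pixels:
        # Stretch the row horizontally.
        wide = [pixel for pixel in row for _ in range(factor)]
        # Duplicate the stretched row vertically (fresh list per output row).
        out.extend(list(wide) for _ in range(factor))
    return out


# A 16x16 tile scaled by 4 comes out as exactly 64x64.
tile = [[(x, y) for x in range(16)] for y in range(16)]
big = upscale_nearest(tile, 4)
print(len(big), len(big[0]))
```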
>>
>>107374478
>imagine when they scale this
I can't because that goes against the principles of the team kek
>>
>>107374451
>>107374466
Nigga, Wan is literally a product of Tongyi group
Wan is short for Tongyi Wanxiang (通义万相)
Qwen is short for Tongyi Qianwen (通义千问)
>>
>>107374489
cos he's chinky
>>
>>107374478
>imagine when they scale this
The entire point is that scale is not all you need
>>
>>107374454
try to go for this -> >>107373951
>>
>>107374490
Forgot the pic lol
https://files.catbox.moe/074itc.png
>>
>>107374493
you know what I meant, the team that managed to make the godlike 6b Z-image model, and not the team that made a slopped 20b Qwen Image model
>>
>>107374490
>>107374501
you dont need ai for this lmfao
>>
>>107374323
>>107374347
>540 steps
>fine
oh no no no
>>
Tried a few zit loras. They work, but they all mess up the image quality.
>>
>>107374506
I want the tileset to look good upscaled, so yes I do
>>
>>107374323
Not nearly enough
Try the default 3000 steps
>>
>>107374509
>>107374323
Oh shit, I'm stupid I thought it was 100 epochs kek
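Since steps and epochs keep getting mixed up, here is the arithmetic as a small sketch. It assumes one step processes `batch_size` images, with no dataset repeats or gradient accumulation (those assumptions and both helper names are illustrative, not taken from any trainer's settings).

```python
def epochs_for_steps(steps: int, num_images: int, batch_size: int = 1) -> float:
    """How many passes over the dataset a given step count amounts to."""
    return steps * batch_size / num_images


def steps_for_epochs(epochs: int, num_images: int, batch_size: int = 1) -> int:
    """Step count needed for a number of full passes (rounded down)."""
    return epochs * num_images // batch_size


# 54 images at batch size 1:
print(epochs_for_steps(540, 54))             # 540 steps in epochs
print(round(epochs_for_steps(3000, 54), 1))  # 3000 steps in epochs
print(steps_for_epochs(100, 54))             # 100 epochs in steps
```

Under those assumptions 540 steps over 54 images is only 10 epochs, while 100 epochs would be 5400 steps.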
>>
>>107374489
>how do you know that? you don't even have the model in your own hands
because it can do video+audio which is strictly more complicated than image diffusion what the fuck...

>>107374493
>Official usage guide for Alibaba Cloud Tongyi Wanxiang 2.2 video generation model in ComfyUI
He's right. oops


Anyways wait for the kikes for updates in video in December. My point still stands that I'm very confident they won't SaaS the ZBase since they already have Image SaaS set up
>>
any tips for making wan22 suck less? I'm trying to turn this drawing of cicadas into real cicadas flying
>>
>>107374525
>because it can do video+audio which is strictly more complicated than image diffusion
and yet they made Qwen Image (20b) and this shit doesn't hold a candle against that 6b model, so yeah, what the fuck...
>>
>>107374518
>>107374524
it was 1000 buzz, if it doesn't turn out well I'll resend it
>>
File: ComfyUI_00349_.png (1.44 MB, 1024x1024)
>>107373654
i actually gave it a try and it worked better than i thought
>>
>>107374545
>>
>>107374537
It'll be way undercooked, sorry for misleading you.
>>
I propose we call ourselves /cldg/ chinese local diffusion general
Fuck censored copyright no c*nny API BIASED western companies
>>
>>107374535
nigger there are at least three different teams inside the Tongyi Lab. The lab is a lab. It's probably an entire building on one of Ali's campuses
>>
File: ComfyUI_11557_.png (1.48 MB, 1024x1024)
>>107374491
Why wouldn't they do it?
>>107374497
True, but models being bigger do help though
They threw vramlets a bone, but that doesn't mean it has to be like that forever since it would hold them back
>>
>>107374563
>True, but models being bigger do help though
Are you dense? They are proving that isn't necessarily true.
>>
>>107375271

https://www.youtube.com/watch?v=Kmve1_jiDpQ
>>
>>107374208
days
>>
>>107374299
maybe they are (long term strat) priming users for the time when chinese made gpus become available world wide?
>not expensive, not as good as ngirida but it works

anyways it is sane development vs bloat+frameworks+legoblocks+webdev+ohmyscience10yearsexpert approach
as it should be

it is a good model


