/g/ - Technology


Thread archived.
You cannot reply anymore.




Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107362793

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe
https://github.com/ostris/ai-toolkit

>Z
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo
https://comfyanonymous.github.io/ComfyUI_examples/z_image/

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
why aren't people testing flux 2 dev?
>>
>>
File: Wan garbage.jpg (39 KB, 1308x686)
Why is it not working?
>>
File: 1762383313438.png (916 KB, 1280x1120)
pathetic ping pong men NEED bbChroma to Zatisfy their wives
>>
>>107364569
z-image seems to have more potential when you consider the size
>>
>>107364569
Not enough VRAM
>>
File: flux2_bf16_c-0130.jpg (233 KB, 1600x1600)
>>107364569
it seems I am the only one
>>
File: 1762217834893.png (1 MB, 1280x1120)
"AH! Chroma-kun, you're hurting me with your 9 billion parameters...."
>>
>>107364569
Same reason people aren't testing Chroma
>>
Is it possible to use prompt weights with z-image?
>>
>>107364605
What is your take on the z model? I only trust your point of view and opinion.
>>
File: 1762203729035.png (1.18 MB, 1280x1120)
>>107364627
It has a tiny 6 billion cock, very short and inadequate compared to bbchroma's 9 incher.
>>
File: ComfyUI_00037_.png (3.98 MB, 2048x1280)
"A National Geographic photograph from the 1970s" doesn't seem to be working, neither does any variations of "80s vintage photo"

Or maybe it is and I'm too young to know what 80s documentary footage looks like, and I'm assuming it should look more "vintage" than this
>>
is reddit screaming for help because comfyui is fucked? this place has been kicking and screaming over spaghetti code all day
>>
>>107364662
>actually good model drops and it attracts newcoomers
>>
>>107364569
>why aren't people testing flux 2 dev?
we did >>107362123
>>
A friend is asking if he can do lore accurate Kuruminha in Zmodel
>>
>>107364662
I made a tapermonkey extension (thanks to claude) that removes all the "questions" flair, that way I filter out easily 80% of the saars lmao
>>
File: ComfyUI_00064_.png (3.8 MB, 2048x1280)
>>
>>107364689
perfect for loras
>>
>>107364593
you can run with gguf
>>
>>107364676
nobody who hasn't done this shit before is coming to use it. normies are saas slaves forever at this point
>>
>>107364662
Most normies have been using site generators like Civitai or Tensor. With these sites restricting NSFW gens lately and a new uncensored model being released, there's probably a bunch of newfags trying to figure out how to install local.
>>
File: f2.png (1.23 MB, 832x1248)
>>107364569
people did test flux2 for a while. it's not bad.
>>
>>107364679
Isn't that one that shows flux2 obeys prompts better?
>>
what image generator will allow me to generate accurate railings, given a specific railing model, around houses and shit

nano banana pro does it perfectly but i wondered if a local model could work
>>
>>107364717
good thing that we'll get a base model that'll use CFG and make the prompt adherence better
>>
File: flux2_bf16_c-0121.jpg (442 KB, 1600x1600)
>>
>>
>>107364569
>supports up to 10 image reference
how good is it?
>>
>>107364745
it takes forever, dont have patience for that bullshit
>>
File: 1762198599164.png (1.39 MB, 1216x1408)
So can Z do cool stuff or is it just real life failbake?
>>
File: f2.png (1.37 MB, 832x1248)
>>107364722
i can imagine qwen image edit or flux kontext might be the models to try.

no, I haven't actually tried railings around a house
>>
>>107364741
cuteness jumpscare
>>
>>107364722
None. I think it's pretty clear that the community only wants a 1girl generator, so put up with mediocrity or pay for an API.
>>
>>107364745
>>supports up to 10 image reference
>how good is it?
- no image ref -> 4 min
- 2 image refs -> 7 min
I didn't have the patience to try 10 image refs lool
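
For a rough idea of what 10 refs might cost, here's a back-of-the-envelope sketch that draws a straight line through the two timings above. The linear scaling is purely an assumption (two data points can't confirm it, and cost could easily grow faster than that), so treat the result as a loose guess, not a measurement.

```python
# Two measured points from the post: (number of reference images, minutes).
MEASURED = [(0, 4.0), (2, 7.0)]


def minutes_per_reference(points=MEASURED) -> float:
    """Slope of the straight line through the two measured points."""
    (n0, t0), (n1, t1) = points
    return (t1 - t0) / (n1 - n0)


def estimate_minutes(num_refs: int, points=MEASURED) -> float:
    """Linear extrapolation; ASSUMES cost grows linearly with reference count."""
    n0, t0 = points[0]
    return t0 + minutes_per_reference(points) * (num_refs - n0)


if __name__ == "__main__":
    print(f"per extra reference: {minutes_per_reference():.1f} min")
    print(f"10 references (linear guess): {estimate_minutes(10):.0f} min")
```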
>>
>>107364711
since internet started porn has been on it and today there is enough of it for 6000 life times, are they retarded - yes, yes they are.
>>107364716
>it's not bad
it is. no fine details.
observe the grainy ocean in the back, bear looks fake too.

someone posted nature photo on civit when they opened flux2; grass on the ground is just a green blur, no fine details as well.
>>
File: ComfyUI_09174_.png (2.08 MB, 1152x1152)
>>107364380
Right, because popular thing = best to use right? That's why Flux.1 dev base was more popular than Chroma right? Quite sad, really. Just blind hatred for Chroma, both models have their strengths and weaknesses.
>>
>>107364745
use case for 10 image reference?
>>
>>107364773
>because popular thing = best to use right
if it's better why isn't anyone using it
>>
File: 1762197948974.png (1.54 MB, 1216x1294)
Real life failbake it is
>>
File: ComfyUI_09179_.png (2 MB, 1152x1152)
>>
>>107364773
Good models get popular at some point yes, I'm not playing your elitism game, you're not the chosen one who saw the potential of a model that no one saw, you spam your oversaturated images daily for more than 6 months at this point I think people have all seen what Chroma is capable of, and they all went to the conclusion it's not worth the hype, it won't happen dude, you can't force people to pretend your shit is realistic, it's not, Z-image convinced people it was a good model in less than 2 days because it is actually realistic, it's not that deep
>>
File: ComfyUI_00073_.png (3.93 MB, 2048x1280)
didnt mean to share a 6toes gen sorry

>>107364722
>i wondered if a local model could work
your use case sounds perfect for nano banana pro and I can't in good faith recommend you explore local for your use case. it will be worse quality and more expensive for no benefit unless you really care about google knowing about you making railings for houses
>>
>>107364794
>>107364750
>>107364644
>>107364605
>>107364578
we getting raided or what?
>>
>>107364779
having an intentional composition instead of gacha rolling
>>
>>107364779
10 different styled toe nails
>>
>>107364807
He's just a lowly schizo who drops in occasionally to shit post. I think he's only been genning for a couple of weeks so it's safe to disregard him.
>>
>>107364820
The fuck is a netayume?
>>
File: 1762289727471.png (1.11 MB, 1280x1120)
>>107364807
I admit I must make Z-image edits instead of posting about dead models (yume). All in due time.
>>
>>107364773
>That's why Flux.1 dev base was more popular than Chroma right?
flux has actually better anatomy and details, chroma lost those flux qualities when trying to undistill schnell,
>>
>>107364807
hes just brown
>>
asking in new thread:
is there a way to use the already loaded qwen model for LLM inference (I want to translate to chinese) directly in comfy?

The only things I found involved running llm inference in a separate program and just using comfy as a frontend.
>>
>>107364802
>oversaturated

Actually in Z images the saturation is too low. Nothing a tune can't fix, but maybe look at what real images look like before you decide what looks real and what doesn't.

https://www.dpreview.com/sample-galleries/9434362346/iphone-15-pro-sample-gallery/5666861150
>>
>>107364827
A step up from XL Illustrious with more anime knowledge than chroma but could perhaps be beaten by Z we shall see.
>>
File: flux2_bf16_c-0147.jpg (439 KB, 1600x1600)
>>
I hope chroma 2 happen on flux 2
>>
>>107364837
>hmmpf! only me knows what a real image look like, that's why chroma is not popular, because only me can see through it realness
holy main character syndrome, you're not that dude
>>
File: 1761779244137.png (2.17 MB, 2560x1280)
>>107364841
>step up from XL

lmao, you really believe that shit?
>>
File: ComfyUI_00089_.png (3.25 MB, 2048x1280)
>>107364779
>use case for 10 image reference?
lora without training
you give it checkpoints of a girl of every 3 years of her life from 3 to 30 and you can make an image of her any age in between those checkpoints idk

>>107364836
>is there a way to use the already loaded qwen model for LLM inference (I want to translate to chinese) directly in comfy?
from intuition my answer is "not really" since you're loading it as an encoder not as a full LLM

but I may be wrong. what you could do is get an ai to poke around the ComfyUI code if you're interested in a more definitive answer

>>107364837
>Actually in Z images the saturation is too low.
apparently the TAEF1 vae helps with this according to the reddit post with Z-image tips and tricks
It also makes VAE decode faster which can save you half a second per gen
I use it for my gens and see no reason not to use it
>>
File: 1761780700655.png (2.44 MB, 2560x1256)
>SDXL vs Yume
Yume was a failbake from the start
>>
you were already laughed out of the thread for your shit comparisons twice kek
>>
>>107364851
Why the fuck would Chroma 2 even use Flux 2? That shit's a massive hog. A Z-Image-base finetune would make way more sense
>>
File: 1762393361879.png (1.15 MB, 1280x1120)
>>107364883
is the "laughed out" in the room with us?
>>
File: 1753608327861751.png (1.05 MB, 2680x1536)
>>
As soon as AniStudio adds Z-image support we should add it to OP.
>>
its always the same fucking retards here
>netalumina's strongest soldier
>chromakek's BRO LODESTONE FUCKED IT UP AFTER v29, here's the same image i post
>>
>>107364898
Is that before or after it gets i2i support
>>
>>107364893
Why wouldn't you want furry fuckers to donate to a dead project lol
>>
File: f2.png (993 KB, 832x1248)
>>107364769
>no fine details.
as far as I can tell this varies with the gen - much like images online.

> bear looks fake too
i too do feel this is a common issue on flux2.

you can't tell much from one image but even in series of images: most feel fake. wrong shadows/lighting, weird character interactions, something or other. almost no subject is simple enough that it wouldn't feel quite fake.
>>
>>107364897
>perfect text
that's nano banana pro right?
>>
just keep seething :]
>>
File: 1762194991014.png (1.67 MB, 1216x1408)
>>107364894
>>107364644
>>107364605
>>107364578
This is nuts, why do asian women burn coal so hard? does chang really have a tiny 2B cock?
>>
File: ComfyUI_00102_.png (3.74 MB, 1280x2048)
>>107364902
>its always the same fucking retards here
bro there's like 10 regulars in this general and like 50 unique users a month
what do you want/expect from 4chan in 2026 like seriously it's a miracle this site is even still up kek
>>
is 5070ti considered vramlet
>>
>>107364851
could lode even afford it? lol
>>
>>107364911
why complicate things? screenshot of blender with a qwen-image-edit image loaded into it
>>
File: lol.png (2.71 MB, 1445x1771)
>>107364851
>I hope chroma 2 happen on flux 2
why? it's worse than a 6b model
>>
File: ZImage_00386_.png (382 KB, 768x1024)
tried training Z-Image against my perfectly scaled pixel art dataset, i can't get it to align to the pixel grid perfectly even after overtraining
also it's very very biased for flat pixel art even though my dataset contains soft images
interestingly the best i was able to do in the past was Flux 1 and Wan, while Qwen was hopeless and Chroma was kind of not listening very well
will retry on the base if they release it
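
if anyone wants to measure "aligned to the pixel grid" instead of eyeballing it, here's a minimal sketch. It assumes the gen is already loaded as a 2D list of color values and that you know the intended scale factor; `is_grid_aligned` and `misaligned_blocks` are made-up helper names, not part of any trainer or ComfyUI node.

```python
def misaligned_blocks(pixels, factor):
    """Return (block_row, block_col) for every factor x factor cell that is not one flat color."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    bad = []
    for by in range(0, height - height % factor, factor):
        for bx in range(0, width - width % factor, factor):
            ref = pixels[by][bx]
            cell = (pixels[y][x]
                    for y in range(by, by + factor)
                    for x in range(bx, bx + factor))
            if any(value != ref for value in cell):
                bad.append((by // factor, bx // factor))
    return bad


def is_grid_aligned(pixels, factor):
    """True when the image size is a multiple of factor and every cell is a flat color."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    if height % factor or width % factor:
        return False
    return not misaligned_blocks(pixels, factor)


if __name__ == "__main__":
    clean = [[1, 1, 2, 2], [1, 1, 2, 2]]
    smeared = [[1, 1, 2, 2], [1, 9, 2, 2]]  # one stray value inside the first cell
    print(is_grid_aligned(clean, 2), is_grid_aligned(smeared, 2))
    print(misaligned_blocks(smeared, 2))
```

Real gens go through a lossy VAE decode, so an exact equality test like this will almost always fail on raw output; comparing with a small color tolerance would be the obvious next step.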
>>
File: z-i-t.jpg (105 KB, 1024x1024)
>>107364931
kind-of. but you can run nearly everything with offloading to system RAM

also z-image-turbo as the current hotness in the thread is not really an issue
>>
>>107364807
Respect the BBC transfem, chud
>>107364832
> Anyone I don't like is brown
Cope
>>
File: flux2_bf16_c-0166.jpg (480 KB, 1600x1600)
>>
>>107364803
i think youre right
>>107364760
i also think youre right
>>107364752
i tried qwen on their website but even that failed

thats actually a good test for ai generation
only nano banana pro can do it
>>
>>107364911
nano banana pro's text handling isn't perfect. It's equal or below seedream (or whatever they're using on tiktok live)
>>
Blessed thread of frenship
>>
File: ComfyUI_09187_.png (2.25 MB, 1152x1152)
>>107364876
>TAEF1 vae
Interesting, I'll check it out.
>>
Can someone link the hottest Chroma model right now? Is it still HD-1 o algo? Tomorrow we making some rice bunnie-Z.
>>
How do I run mistral flux fp8, an 18gb text encoder for flux on my 5070ti?
>>
>>107364931
>is 5070ti considered vramlet
5070ti with 64 or 96gb of ram is king of the vramlets until video + audio comes out

>>107364972
you should, it seems to be a free lunch
>>
File: ZImage_00410_.png (411 KB, 768x768)
this is from another Z-Image training run i tried training on isometric pixel art dataset, again not happy with the result, it can't maintain perfectly angled straight lines and align to pixel grid properly
i got better results on other models
the plus side is this one can actually do text i guess
really hoping it's just because the turbo model is turbotrained on realistic shit
>>
>>107364984
>again not happy with the result
why do you guys expect to get good results on a distilled model? it's not meant to happen, we have to wait for the base model to get serious shit
>>
>>107364989
i'm not bitching about the model, just sharing my conclusions after experimenting
>>
>>107364596
Using flux to do anime of all things... man some of you guys are hardcore masochists
>>
>>107364984
yeah the distilled training kinda sucks. this might be dead in the water if the base model isnt a big improvement.
>>
Half of all anime pics probably qualify as CSAM or NCII
>>
File: ComfyUI_07922_.png (1.57 MB, 944x1280)
>>107364569
because flux2 is garbage.
Big Fucking Losers really dropped the ball with this. what a bunch of retards.
>>
btw where did ostris get this mysterious 160mb de-distillation training adapter from?
>>
File: 1762203259725.png (1.21 MB, 1280x1120)
>>107364984
its gonna be a failgen just like yume even with the base model
>>
File: Wan garbage 2.jpg (125 KB, 1563x784)
Why doesn't the end result video actually move? It's not even 1 second long??
>>
>>107365017
>NCII
>C
You have one chance to explain your case.
>>
>>107364984
I have a theory that any small loras basically rely on the base model's dataset having a lot of that stuff to begin with, so the small lora can kinda "piggy-back" off the original dataset.

Z-Image absolutely has big knowledge gaps. So I think you just need an even larger dataset.

And yeah, of course it won't be as good as SDXL or Flux or whatever, because those models already have a lot of pixel art inside them.
>>
does qwen image still have its place?
>>
>>107365031
https://x.com/ostrisai/status/1994107044572573821
https://x.com/ostrisai/status/1994427365125165215
>>
>still sperging out
>>
>>107365036
Your promptfu is weak. What is spinning? The bottle? The camera? In what direction?
>>
>>107365043
Best model to train non-anime on 24gb vram
>>
File: 1752106136541766.png (63 KB, 1175x779)
>>107365036
what are you doing dude? you have to use this node to set the number of frames
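
The frame count is what decides clip length, so a video that "doesn't move" and lasts under a second usually just has too few frames. A minimal sketch of the arithmetic, under two assumptions that are not stated in this thread: output is saved at 16 fps, and valid lengths have the form 4n + 1. Check both against your own workflow before trusting the numbers.

```python
ASSUMED_FPS = 16  # assumption: adjust to whatever your save/combine node uses


def clip_seconds(num_frames: int, fps: int = ASSUMED_FPS) -> float:
    """Playback duration of a clip with num_frames frames."""
    return num_frames / fps


def frames_for_seconds(seconds: float, fps: int = ASSUMED_FPS) -> int:
    """Nearest frame count of the form 4n + 1 for the requested duration (assumed constraint)."""
    n = max(0, round(seconds * fps / 4))
    return 4 * n + 1


if __name__ == "__main__":
    print(clip_seconds(1))        # a single frame is effectively a still image
    print(frames_for_seconds(5))  # frame count to request for roughly five seconds
```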
>>
>>107365043
No it's the most censored out of Qwen Image, Wan 2.2 T2I and Z-Image-Turbo
>>
>>107365045
ok so it's a bit of a hack on top of a hack, maybe i should temper my expectations of the current training process and hope that stuff improves eventually
>>
>>107365045
testing it rn
>>
>>107365053
Yeah the adapter was trained using a small selection of pics only.
>>
>>107365052
but it has loras
>>
>>107364976
text encoder doesn't really need to run on gpu, cpu/system RAM is usually still decent

smaller text encoder quant may be an option

if you must try a big full model and it doesn't automagically work somehow, use comfyui-multigpu and offload as much as you need to system RAM, it'll shuffle stuff back to the gpu as needed
>>
File: flux2_bf16_c_00069_.jpg (1.79 MB, 2560x1600)
>>107365010
>>
>>107364984
The whole idea of "pixel art" where pixels are larger than a pixel is retarded.
If you trained the base model on pixel art that isn't upscaled maybe you'd get better results.
>>
>>107365090
That's not how VAE's work, retard
>>
pixel diffusion models when
>>
Ah shit is this base model gunna be DPOd to hell and back
>>
why are there a billion anime checkpoints on civit when they all look the same? am i missing something? im not even trying to shitpost
>>
>>107365090
it's not retarded, i've been doing this for a long time so know way more than you about it.
it's because it's impossibly hard to gather a dataset of sufficient resolution to train at 1 pixel = 1 pixel and also the VAE would fuck up fine details anyway.
you can't just train a model on 64x64 pixel resolution images and expect good results, that's why people scale the pixels up to make it actually work
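
for anyone unsure what "scaling the pixels up" means here: it's integer nearest-neighbor upscaling, where each source pixel becomes a factor x factor block with no blending. A minimal pure-Python sketch on a 2D list of color values (`upscale_nearest` is a made-up name for illustration; a real dataset pipeline would use an image library instead):

```python
def upscale_nearest(pixels, factor):
    """Repeat every pixel factor times horizontally and vertically (no interpolation)."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    out = []
    for row in pixels:
        wide_row = [value for value in row for _ in range(factor)]
        # Copy the widened row so each output row is an independent list.
        out.extend(list(wide_row) for _ in range(factor))
    return out


if __name__ == "__main__":
    sprite = [[1, 2],
              [3, 4]]
    for row in upscale_nearest(sprite, 2):
        print(row)
```

The point of using an integer factor is that block edges stay crisp, which is exactly the property a model has to reproduce to "align to the pixel grid".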
>>
File: 1754607643975782.png (295 KB, 758x1288)
Did anyone training models on ai-toolkit test any of those advanced options?
>>
>>107365119
People can't into prompting so they need an entire model that by default has a style they like.
>>
>>107365119
90% of them are shitmixes that just merge 40 models together. Most of them are based on Noob or Illustrious, the two ACTUAL major finetunes.
>>
>>107365119
we call those "jeetmixes" or "sloptunes" or even "cancermixes" at least you recognize they all look the same because some (stupidly) swear by them
>>
damnit is there any llm that is up to date on how to deal with comfyui? these assholes keep giving 1 month old instructions
>>
>>107365137
gemini 3 with search engine, elon's mecha hitler
>>
>>107365137
No
Most LLMs have 2024 cutoff
>>
>>107364569
don't need to
>>
>>107365125
I think I'll try Differential Guidance
>>
>>107365123
It is retarded because you're working with way more data and a way bigger latent space than you should.
You're just saying that current models aren't suited for pixel art which doesn't make it less retarded.
>>
>>107365137
No, but if you give it a direct link to the github project it may be able to parse it for you.
>>
>>107365119
To bury worthwhile tunes such as Nlx, Art-illustrious, NTD, 2DN, epitune's stuff.
>>
>>
imagine not being able to gen kino on base finetunes or regular base models holy fuck id kill myself
>>
File: ComfyUI_09202_.png (2.05 MB, 1152x1152)
>>107364980
That's interesting. Tried it in conjunction with variation node and it's def. a net improvement.
>>
>>107365137
I asked Grok and it worked
>>
File: Flux2Img_00006_.png (2.58 MB, 1440x1152)
>>
>>
>>
>>
File: ComfyUI_09207_.png (1.89 MB, 1152x1152)
>>107365178
Not a perfect set of variations, but it's something
>>
>>107365178
>variation node
what's that?
>>
File: 1758640500830476.png (1.19 MB, 1024x1024)
I've been enjoying this CGI painting style with z-image.
>>
>>107365216
This workflow:
https://www.reddit.com/r/StableDiffusion/comments/1p99t7g/improving_zimage_turbo_variation/

(Set the batch size to 1)
>>
File: ComfyUI_07966_.png (1.53 MB, 944x1280)
so gentleman, what are you training?
>>
File: ComfyUI_00490_.png (1.37 MB, 768x1088)
>>
>>107365232
nothing, i'm being a patient boy and waiting for base.
>>
File: flux2_bf16_c-0176.jpg (618 KB, 1600x1600)
>>
File: 1743090815255378.jpg (665 KB, 2560x1520)
>>107364876
>TAEF1 vae helps with this according to the reddit post with Z-image
hmm, the effect is very subtle but something is happening yeah lol
https://www.reddit.com/r/StableDiffusion/comments/1p8462z/comment/nr3is87/?utm_source=share&utm_medium=web3x&utm_name=web3xcss&utm_term=1&utm_content=share_button
>>
File: 1762101173496.png (1.96 MB, 1328x1280)
>>107365133
hey, still better than netayume
>>
File: file.png (1.34 MB, 1024x1024)
>>107365229
>https://www.reddit.com/r/StableDiffusion/comments/1p99t7g/improving_zimage_turbo_variation/
>Prompt: Beautiful woman riding a bike
it seems to destroy the prompt adherence though
>>
File: 1762285673906.png (1.25 MB, 1280x898)
>>107365269
what a riced model
>>
>>107365243
>and waiting
I train while watching youtube or tv shows or a movie or play videogames or read a book or while I take a shit etc.
>>
File: 1732985400627483.mp4 (513 KB, 720x720)
>>107364741
>>
File: 454454545125.png (998 KB, 815x728)
>>107365269
Does it? Have yet to see that for any prompt I've tested. These are nine variations of

>Amateur photograph, view from behind as cute Japanese woman rides a bike down Japanese street, bunny cosplay, thick thighs and fishnet leggings, heels, hair tied to a bun
>>
File: ComfyUI_09218_.png (1.95 MB, 1152x1152)
>>
>>107365304
yoooo wtf
>>
>>107365318
SSSSSSNNNNNNIIIIIIIIIIIIIIIIIIIIFFFFFFFFFFF

ouughh fuck. pungent.
>>
File: ComfyUI_09220_.png (1.73 MB, 1152x1152)
>>
>>107365119
they're essentially just "style loras" without needing to load the lora.
>>
>>107365291
>I train
and I wait.
>>
>>107365281
The small asian woman is paralyzed by the black man's bbc. He buries his massive manhood deep in her tiny snatch and immobilizes her as he slowly plants his primitive seed. She feels every pulsation as he dumps gallon after gallon of rich seed
>>
>>107365318
>>107365349
slop, reeks of chroma
>>
File: file.png (1.29 MB, 1024x1024)
me and my gf
>>
>>107365060
2250 steps
>>
File: Flux2Img_00014_.png (3.33 MB, 1440x1152)
I'm less than impressed, this is the third revision of making it simpler
>>
is there going to be a new iteration of chroma? i know spark came out recently but just curious.
>>
>>107365398
Did someone try to grave rob Qwen
>>
File: 1733116715657681.png (2.74 MB, 1751x1573)
>>107365229
>https://www.reddit.com/r/StableDiffusion/comments/1p99t7g/improving_zimage_turbo_variation/
>An old 1980s-style photograph with the look of a slightly faded family snapshot: soft film grain, subtle chromatic aberration, muted contrast typical of consumer point-and-shoot cameras of that era, natural indoor lighting with slight unevenness
>A woman is skateboarding in New York
heeh, not bad actually, some of those actually follow the style
>>
>>107365417
that's supposed to be safety tape
>>
File: 778455844845.jpg (242 KB, 2394x1326)
>>107365252
Yeah, seems to mess w/ small details a bit unfortunately
>>
>>107365398
Qwen is just Alibaba too so they won either way.
>>
File: flux2_bf16_c-0163.jpg (264 KB, 1600x1600)
>>
>trillions of dollars spent on developing AI
>all everybody does is make 1girl slop
we are all monkeys
>>
File: ZImg_00013_.png (3 MB, 1152x1440)
thanks for another excuse to post this
>>
File: ComfyUI_00133_ - Copy.jpg (663 KB, 2048x1280)
>>
File: 1742849094709800.png (2.6 MB, 720x2000)
>it can do manga pages
neat
>>
>>107365450
catbox me boss. you have activated my jungle fever. and fevor.
>>
File: file.png (1.57 MB, 1024x1024)
>>107365445
>all everybody does is make 1girl slop
me and my gf after cloning technology become available
>>
>>107365445
have you ever seen a painting called mona lisa
>>
File: 1.png (55 KB, 695x253)
>>107365449
rice bunnies are big hypocrites
>>
File: ComfyUI_00002_.png (970 KB, 1024x1024)
BUILT FOR BBC
>>
File: 1747766548562764.png (1.53 MB, 630x991)
Z Image Turbo, DALL-E 3-like Girls lora
Training it longer than needed to see how it breaks.

I'll train it 250 steps more before testing if any of the checkpoints are worth anything.

https://files.catbox.moe/01cbo7.jpg
>>
File: 1760603130362327.png (2.29 MB, 1152x1536)
>>
File: 1737087770804222.mp4 (1.94 MB, 720x640)
>>107365310
>>
>>107365032
this art style is disgusting.
>>
File: 1764382082.png (2.19 MB, 1504x1504)
>>
File: ComfyUI_00005_.png (1.1 MB, 1024x1024)
I like chroma v50
>>
>>107365460
mona slop
>>
File: file.png (1.46 MB, 1024x1024)
It can generate the mona lisa almost perfectly but I couldn't get it to reproduce american gothic.
>>
>>107365511
you want american gothic? here's 1girl sameface instead
>>
>>107365451
>X model can do Y
>neat
Dopamine addict post. Really looking forward to your colored by AI manga volume, don't flake out or let it sit unfinished!
>>
>>107365511
does it recognize yamamura sadako?
>>
Please care about Chroma.
>>
File: Karen.png (180 KB, 633x356)
>>107365519
>Dopamine addict post. Really looking forward to your colored by AI manga volume, don't flake out or let it sit unfinished!
>>
stop complaining about 1girl or i'll start posting the manliest, hairiest and sexiest 1male
>>
>>107365536
>or i'll start posting the manliest, hairiest and sexiest 1male
you post that like its a threat
>>
File: file.png (1.31 MB, 1024x1024)
>>107365520
I might as well have put 1girl in the prompt.
>>
>>107365525
The problem is it's basically a NSFW Flux that drops all the stable diffusion lora support. It's also huge. Z image is smaller and faster so training it to be NSFW is going to be much faster.

Z image is also more photorealistic and better at natural prompting
>>
File: ComfyUI_00006_.png (1.28 MB, 1200x1024)
>"C-chroma-san, please be gentle... my husband Z-hang has never been this rough"
>>
>>107365374
>slop, reeks of chroma
They are Z image, though the taef1 VAE messes it up a bit. Even with that, neither model is perfect with mesh.
>>
>>107365541
its crazy that it doesnt recognize her.
>>
>>107365499
got dam
>>
>>107365539
>you post that like its a threat
its no fun if you're into it, tch
>>
>>107365553
NTA it knows what milena velbas nipples look like, but can't generate her well.
>>
File: file.png (1.56 MB, 1024x1024)
>>107365536
I gave "the manliest, hairiest and sexiest 1male" to GLM 4.6 with the enhancer prompt.

>A single adult male with a heavily muscular physique stands in a three-quarter view, filling the frame. His skin is tanned and weathered. He has thick, dark, curly body hair covering his chest, abdomen, and arms. His face features a strong jawline, high cheekbones, and a full, well-groomed dark beard that covers his lower face. His expression is confident with a direct gaze. He is wearing only dark, worn denim jeans. The setting is the interior of a rustic log cabin; warm, flickering firelight from an off-camera fireplace casts dramatic shadows across his torso, highlighting the definition of his muscles and the texture of his body hair. The background consists of rough-hewn wooden logs that recede into shadow.
>>
>>107365567
do you check every new model for milena velbas' nipples? i'll stick to chroma since it recognizes sadako.
>>
>>107365526
Did I miss something? Feels like everyone has new toy syndrome but not much actual use or any real purpose for this stuff.

>>107365451
I'm interested in your comment, do you actually color manga pages? Do you plan on using this model for that? Tell me more, which volumes or what kind of manga do you plan to color? Because I actually do and I need your help.
Or are you just an empty shill?
>>
>>107365568
the dreaded 1man, standing...
>>
O M G how does it just KNOW her???
>>
File: z-image_00018_.png (2.27 MB, 1024x1568)
>>107365386
It actually transfers the character pretty well. ai-toolkit seems stable enough ngl
>>
File: ComfyUI_00509_.png (2.04 MB, 1280x1280)
blood looks a bit weak, i'll try again tomorrow
>>
So when's Z base dropping? Hopefully around midnight (Saturday in China).
>>
Dammit I tried the diffusion-pipe fork that had z-image support and it did a shit job. Guess I'll try ai-toolkit.
>>
>>107365586
>I need your help.
What do you mean?
>>
File: ComfyUI_00492_.png (902 KB, 1088x768)
>>
File: 1749363473304068.png (2.8 MB, 800x2000)
>>107365586
no i generated that entire manga panel using prompt, no loras and just z alpha

you have to do like

**Panel 1 (Top Left, Wide Establishing Shot):** and describe panel 1 and then

**Panel 2 (Top Right, Medium Close-Up):** describe etc etc
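
The panel-by-panel layout above is easy to template. A minimal sketch that assembles that kind of prompt from a list of panels; the function name, the default style line, and the sample descriptions are all placeholders invented for illustration, and only the `**Panel N (placement, shot):**` header format comes from the post.

```python
def build_manga_prompt(panels, style="black and white manga page"):
    """Join a style line and one '**Panel N (placement, shot):** description' block per panel."""
    blocks = [style]
    for number, (placement, shot, description) in enumerate(panels, start=1):
        blocks.append(f"**Panel {number} ({placement}, {shot}):** {description}")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    page = [
        ("Top Left", "Wide Establishing Shot", "a quiet street at dusk"),
        ("Top Right", "Medium Close-Up", "a girl looks up, surprised"),
    ]
    print(build_manga_prompt(page))
```

Keeping the panels as data makes it cheap to reorder them or swap a single description and re-gen without retyping the whole page.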
>>
File: ComfyUI_00008_.png (922 KB, 1200x1024)
>>
>>107365612
will that 1girl hurt me if i follow her into that room?
>>
File: based.png (1.32 MB, 720x1280)
>>
>>107365624
it would be extremely painful
for you
>>
File: ComfyUI_00332_.png (2.08 MB, 1152x1536)
>>107365536
I like stealing 1girl prompts off of civitai and making it gayer
>>
File: 1747755075430976.png (2.33 MB, 1249x1355)
>no prompt
so basically it just outputs something close to the training dataset? lol
>>
File: file.png (2.52 MB, 1024x1552)
Your new party member.
>>
>>107365689
why do you need to steal prompts from civit when most of them are just "masterpiece, best quality, score_9, score_8, score_7, score_5, score_4, score_3, score_2, score_1, HDR, CINEMATIC, 1girl, standing"?
>>
File: 1751331567491841.png (2.34 MB, 1373x1439)
>>107365696
it sure has more seed variation than when you have a prompt kek
>>
File: awfgvawsgvawgvaw.png (1.19 MB, 592x795)
>lora style i was looking forward to trying in flux ages ago that didn't work out has been made for z-image
>and its fucking WAY better than the flux version
this is insaaaaaaaaaane

https://civitai.com/models/2174416/technically-color-z

>Error: Maximum file size allowed is 4MB
cringe site
>>
>>107365707
>score_9, score_8, score_7, score_5, score_4, score_3, score_2, score_
what's that even for? something to activate a lora?
>>
>>107365726
It's Pony slop, but for some reason retards use it on every model.
>>
Someone is false flagging to create the impression that Chroma is Indian-coded. I don't know why they're doing that, but it's very transparent.
>>
File: ComfyUI_00009_.png (1.32 MB, 1200x1024)
>"Uooh, your 9B crotch smells so nice.. unlike Z-hang's pathetic 6B cock.."
>>
>>107365726
theyre quality tags for pony but i was mocking it because 90% of the morons that post their images are using pony modifiers on noob/illustrious
>>
>>107365738
With which Sims version are these made?
>>
fyi
>flux2 dev on 5070ti
>1024x1024 20 steps, measuring sampler time
>1. gguf 24 gb virtual vram = 174 s
>2. torch compile + 1. = 148 s
>3. sage attention + 2. = 146 s
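
For anyone who wants those numbers as relative gains, a quick sketch of the arithmetic. The three timings and the 20-step count are the ones posted above; the labels and the `speedup_rows` helper are just illustrative names.

```python
STEPS = 20  # step count from the post

# Sampler times in seconds, as posted
timings = [
    ("gguf, 24 GB virtual VRAM", 174.0),
    ("+ torch compile", 148.0),
    ("+ sage attention", 146.0),
]


def speedup_rows(timings, steps):
    """Return (label, sec_per_step, speedup_vs_first, fraction_of_time_saved)."""
    baseline = timings[0][1]
    rows = []
    for label, seconds in timings:
        rows.append((label, seconds / steps, baseline / seconds, 1.0 - seconds / baseline))
    return rows


rows = speedup_rows(timings, STEPS)
for label, per_step, speedup, saved in rows:
    print(f"{label}: {per_step:.2f} s/step, {speedup:.2f}x, {saved:.1%} less time")
```

So torch compile accounts for nearly all of the gain (roughly 15% less sampler time), and sage attention adds about one more percentage point on top, at least for this one run.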
>>
>>107364944
z image is brilliant, no question about it.
It's polished for realism rn. Flux 2 has so many styles available that you have to prompt them explicitly, otherwise you'll get ‘average’. Gen again with "street photography, taken by an amateur with an SLR camera" or stuff like that
>>
>>107365751
score_x also works on bigasp.
>>
File: ComfyUI_00011_.png (1.51 MB, 1200x1024)
>>107365760
Sims 5: Real Adventures In Guangdong
>>
File: flux2_bf16_c-0207.jpg (342 KB, 1920x1200)
>>
File: 1739142416605842.jpg (880 KB, 2048x1424)
>>107365718
>https://civitai.com/models/2174416/technically-color-z
>sovless vs sovl
very nice!
>>
>>107365779
Left is better, half the prompt is lost on the right
>>
oot, wtf, who the fuck is this tranny bghira guy? he's annoying as shit. found him in zimg discord server lol
>>
turbo loras wont load for me
>>
>>107365816
you'll get some "lora key not loaded:" but it's still working
>>
how censored is z image?
>>
>>107365810
Wasn't he the guy doxing and reporting people for making nsfw LoRA on qwen edit?
>>
>>107365826
It can't into penises or spread leg vaginas but nudity works great.
>>
>>107365826
not censored, just lacks human anatomy knowledge. lora can fix it tho, i think.
>>
>>107365838
yep, that's the guy, I'm still surprised not everyone has blocked this mentally ill schizo
>>
File: file.png (1.38 MB, 768x1280)
Man it's tough deciding between z-image and Chroma. Chroma's main issues are the awful noise in a lot of the gens and the hand and limb anatomy fucking up half the time. Without those issues I think Chroma would be a clear winner. You lose a lot going from Chroma to z-image, especially for degen stuff. Also z-image is crazy fast.
>>
File: 3018303179.png (1.19 MB, 1344x768)
>>107365718
Yeah, I was expecting this hacky method of training a distilled model to be subpar, but it actually just works. Props to Ostris.
>>
>>107365849
i tried genning a woman with her hand inside of her panties and it wouldnt work.
>>
>>107365856
What about Chroma + Z-Image refiner?
>>
>>107365838
ye
>>
>>107365826
it appears to be untrained on nsfw stuff rather than censored
so genitalia are eldritch horrors, stuff like nipples can be hit or miss and it doesn't get the concept of various sex acts
>>
>>107365872
>"When you can't beat them, put them in jail" - bghira
>>
>>107365867
Haven't tried it yet, did a kind anon make a workflow?
>>
do anons run the bf16 z image or fp8?
>>
>>107365872
why hasnt this guy been run out of every ai community yet
>>
>>107365838
I'm honestly starting to believe that he's single handedly the reason for a lot of censorship with local models.
He probably reports every controversial gen he comes across to these teams.
>>
>>107365896
if you can handle bf16, just use it.
>>
oh fuck off WHY DID I UPDATE COMFYUI
WTF IS THIS SHIT UI?
>>
>>107365896
Really just depends on how much VRAM you have, I think 16gigs or more is fine with bf16.
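
Back-of-envelope version of that, as a sketch. It assumes the 6B parameter count mentioned elsewhere in this thread and counts only the diffusion model weights; the text encoder, VAE, and activations come on top, so treat the results as lower bounds. `weight_gib` is an illustrative helper, not part of any tool.

```python
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}  # storage size per weight


def weight_gib(num_params, bytes_per_param):
    """Size of the raw weights in GiB (weights only, no activations)."""
    return num_params * bytes_per_param / 2**30


NUM_PARAMS = 6e9  # assumption: the "6B" figure quoted in the thread

estimates = {name: weight_gib(NUM_PARAMS, size) for name, size in BYTES_PER_PARAM.items()}
for name, gib in estimates.items():
    print(f"{name}: ~{gib:.1f} GiB of weights")
```

That puts bf16 weights at a bit over 11 GiB and fp8 at about half that, which is consistent with 16 GB cards handling bf16 as long as the remaining headroom covers everything else.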
>>
>>107365810
>zimg discord
?
>>
File: 1757281580043112.jpg (1005 KB, 2560x1564)
>>107365779
The difference is huge gaddam, but like that anon said >>107365795 you lost some prompt adherence, let's hope the base model will be able to produce those kinds of styles out of the box
>>
does torch compile work with z image
>>
>>107365928
https://x.com/Ali_TongyiLab/status/1994234720893255859
>>
>>107365872
Him joining the Discord is the biggest red flag ever. He's probably DMing the owners to prevent releasing models without more safety tuning.
>>
>>107365941
>making a discord
what a mistake, the discord trannies will influence those fags and the model will suck, FUUUUUUUCK
>>
File: Flux2Img_00016_.png (2.4 MB, 1440x1152)
>>
the monkey's paw is curling, local had a nice 3 days but it's already over
>>
File: 1266936454.png (1.23 MB, 1344x768)
>>
File: 1758275906084229.png (2.98 MB, 800x2000)
damn if i was young and still had time i would make an entire malazan graphic novel in z alpha
>>
File: local sabotage.jpg (621 KB, 780x1723)
>>107365872
this is the troon that killed pixart btw. they were looking for anime datasets and he told them to train on midjourney slop instead. he should be publicly shamed and run out of that discord before he kills z-image.
warn them while you still can: https://www.reddit.com/r/StableDiffusion/comments/1lsfobb/full_breakdown_the_bghirasimpletuner_situation/
>>
holy shit it sure did load. in this concept's case it definitely followed the prompt just fine kek >>107365779
>>107365936

seriously if a fucking 6b turbo model can handle loras this good, that base is going to be a monster.
>>
>>107365777
mvsterpiece....
>>
>>107365988
FUUUUUCK
>>
File: 1763392444899636.png (883 KB, 1000x563)
>>107365988
>the anon that said that something gay and retard will ruin Z-image base was right
PROTECT THE DEVS
>>
>>107365872
zootjeet should post this to plebbit like he did the noob announcement which quickly spread throughout the chinese community. they need know to axe this little worm before he kills the base model
>>
File: file.png (2.23 MB, 1024x1552)
>>
>>107365872
>>107365941
>>107365988
Why are they making a discord in the first place, them interacting with western troons is a really bad idea, they are good because they aren't influenced by this nonsence
>>
it's so fucking over
SDXL for another 2 years
>>
>>107365988
>warn them while you still can
how would you warn them and truly make them listen?
>>
Do asian women actually like to get blacked?
>>
>>107365988
To say he single handedly tanked Pixart isn't exactly true; it was always _just_ slightly too wonky for mass adoption. That being said, I hope Based Chinamen see through his faggotry.
>>
>testing flux 2
>prompt 1girl in bunny suit
>expect sexy
>generate in mascot suit instead
doa
>>
>>107366034
https://www.youtube.com/watch?v=X27dvuBSyXE
>>
>>107366033
well there are plenty of threads documenting everything he has done to sabotage local, i am sure if multiple people message the staff that might take the hint.
>>
did they just ban him? yo based? wtf?
>>
>>107366044
yup, banned. lmao
>>
incredible things are happening in local diffusion
>>
>>107366044
who was banned?
>>
File: 1734935678283211.png (306 KB, 500x500)
>>107366044
lmaooooooo, they don't miss!
>>
File: alex jones approval.gif (2.12 MB, 177x210)
>>107366044
>>
File: 3881841420.jpg (541 KB, 2688x1536)
>>
Chinks might be addicted to BBC but if they ban censortroons I applaud them
>>
>>107366065
>who was banned?
this schizo
https://www.reddit.com/r/StableDiffusion/comments/1lsfobb/full_breakdown_the_bghirasimpletuner_situation/
>>
>>107366044
someone should tell them he makes alt accounts hes probably going to try to sneak back in https://huggingface.co/lodestones/Chroma/discussions/67?not-for-all-audiences=true
>>
File: LeftRight.png (2.18 MB, 2048x1024)
Right now, my gens look like the game on the left.
I'd like them to look more like the one on the right.
What should I start writing in the prompt to achieve that?
>>
>>107366065
bghira. btw comfy joined
>>
File: ComfyUI_08008_.png (1.29 MB, 944x1280)
>>
>>107366084
He also did the same thing trying to take down Ostris's finetuner because it was better than his schizocoded garbage
>>
>>107366081
>>107366088
based Z Turbo Dicks removed a schizo that ruins open source AI models?
>>
I just looked into Chroma. So it's based on Flux and not its own from scratch model. Why are people acting like it's a unique base model? The dude could just train a new one on Z Image or other future original bases no?
>>
>>107366081
what compels someone to do this outside of severe mental illness?
>>
>>107366088
>btw comfy joined
he'll probably ask them to add more fenec girls on the training keek
>>
>>107366023
prompt, please?
>>
>>107366100
>He also did the same thing trying to take down Ostris's finetune
no way keeeeek what a giant faggot
>>
>>107365988
I'll do the needful and warn them before that faggot tries anything
>>
>>107366118
based saaranon
>>
that technicolor lora adds pussy and breast detail by the way. have fun kek
>>
>>107366085
warm light?
>>
>>107366085
maybe try this? https://civitai.com/models/2049756/shading-slider-bridge-tools-noobaiillustriouspony
>>
>>107366108
Mostly GLM with some edits.

A line of girls, all dressed exactly the same stand barefoot waiting in line before the altar.

A single girl at the front of the line with very pale skin stands before a rectangular stone altar. Her skin has a smooth, cool-toned appearance, almost luminous in the dim light. She is offering a blood sacrifice.

She is leaning over the stone slab of the altar. Her palms are resting far apart on either side of the altar. Blood drips from her mouth, down her chin, and splashes onto an existing large puddle of blood covering the altar.

The blood flows down the altar and into a wide channel on the floor that is filled with blood.

The girl is wearing a loose-fitting, sleeveless dress made of dark, coarse linen fabric that falls to her ankles. Her head is bowed slightly, and her gaze is directed downward at the offering in her hands. Her hair is long and straight, falling over her shoulders.

The altar she stands before is carved from a single block of dark, weathered granite. Its surface is flat and cool, with faint, shallow geometric runes etched around its edges. The altar sits within a large, stone-walled chamber. In the background, massive cylindrical pillars support a high, vaulted ceiling lost in shadow.

The primary light source is a strong, cool beam of moonlight shining down from a high, arched window on the right side of the frame. This light illuminates the girl, her arms, and the altar, casting long, deep shadows to the left. On the distant walls, small flickering torches provide a faint, warm orange glow, creating subtle highlights on the stone pillars but doing little to dispel the overall darkness.

The atmosphere is still and heavy. The color palette is dominated by cool blues, grays, and deep blacks, contrasted sharply by the single point of saturated red from the blood.
>>
Fresh

>>107366147
>>107366147
>>107366147
>>
File: turbosloppa.png (2.19 MB, 1320x1156)
>check for z image style loras
>its all garbage trained on synthetic ponyslop outputs
local is a complete joke
https://civitai.com/models/1506464/hazard-style?modelVersionId=2448505
https://civitai.com/models/2174436/z-image-25d-anime-01?modelVersionId=2448658
https://civitai.com/models/2174309/z-image-illustria-anime-01?modelVersionId=2448521
>>
>>107366133
appreciate it, thanks!
>>
>>107365619
>>107365738
extremely low quality. I'm surprised it's possible to be this bad with AI.
>>107365700
sexooo
>>107366151
>NOOO MY AI SLOP GENERATOR ISN'T SLOPPING RIGHT
>>
>>107366151
some people just like different things than you
we can think its retarded, and we do, but that wont stop them
gotta learn to not care desu
>>
>>107365936
>you lost some prompt adherence
an overfit lora keeps the model from being creative
happens with all models
>>107366033
in short? loud minority with malicious agenda. it is always true.
>>107366038
a slightly tuned pixart sigma i think, or something like that; the last one is on civitai, with some text encoders and whatnot (i had it but lost it when my disk died) - look for it under checkpoints, the author did good work in terms of output quality
>>107366085
besides prompts, comfy has a handful of node authors that are not that popular (except layer style i think) whose nodes help with image effects (as if you had done some photoshopping on it) - search github

overfit does not mean 'burnt' image
>>
>>107365988
He's on the Z-Image discord? Also I don't think Alibaba team would be dumb enough to let a troon like this sabotage them.
>>
>>107366044
If true this alone confirms Z-Image and the team behind it is the future
>>
>>107365988
damn holy shit I remember seeing this nickname like 2 years ago, mentioned as being one of the biggest faggots in sd community
>>
>>107364905
I've not decided yet, i2i is niche anyway.
>>
>>107365398
I'm still on SD 1.5. I don't know if it's the deep web images that leaked into the dataset or what it is, but there's soul to that model.
>>
>got bored of old models
>try new ones
>update all my shit that I haven't touched since sdxl came out
>spend all day sorting out weird shit
>now I'm using more vram for worse outputs


