/g/ - Technology





File deleted.
Discussion of Free and Open Source Diffusion Models

Prev: >>107997567

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/tdrussell/diffusion-pipe

>Z
https://huggingface.co/Tongyi-MAI/Z-Image
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Wan
https://github.com/Wan-Video/Wan2.2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>NetaYume
https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0
https://nieta-art.feishu.cn/wiki/RZAawlH2ci74qckRLRPc9tOynrb

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg
>>
>>107999241
hoooo
>>
>>107999241
bruce lee got buff
>>
File: zib_00013_.png (600 KB, 512x768)
base is pretty good, im having fun
>>
>>107999259
I think comic fags are enjoying it but it shits the bed pretty bad when it comes to realism.
>>
>>107999266
proof?
>>
File: Zimage_base__00344_.png (1.66 MB, 1024x1024)
>>107999259
yeah i love it
>>
Wait you guys are unironically gay and not just prison gay?
>>
>>107999272
prompt: american city
>>
>>107999269
proof: he posted a cartoon gen and said that he was having fun
>>
>>107999276
you want to join?
>>
>>107999279
bad faith
>>
>>107999272
prompt: julien lubimiv in six months
>>
>>107999259
Yep it's fun but sampler parameters are a bit annoying to dial in at higher resolutions if you want a crisp look.
>>
>>107999277
600 lbs too thin to be american
>>
>>107999249
Is this a wan continued video or a i2v ltx2?
It's promising, I thought we were stuck with garbage nsfw animation with the model.
>>
>>107999249
You should try a closeup scene like that. it's harder than it looks.
>>
File: Zimage_base__00346_.png (915 KB, 1024x1024)
>>
File: af.png (7 KB, 223x100)
Does this work on base for you? I see no changes no matter the value.
>>
>>107999308
yea
>>
>>107999308
I don't think it even matters for res2x samplers
>>
>>107999241
>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
>>107999308
It doesn't change much for me so the effect seems minimal on zib.
>>
>>107999308
yeah it makes subtle differences for me
>>
Another month. Another nofinetune. February? Only 30 days in that month, what are the odds we wake up on Febtember 30th and say, hey, no fine tune? No thank tune. Well I don't like it. While other models like Siri or Microsoft Tay leave us in the dust, dust is all that's left in our wallets. Four hundred Namibian dollars per month. Month after month. Inch after inch. Week after week. The Bibble warned us about this...
>>
File: Zimage_base__00348_.png (935 KB, 1024x1024)
>>107999306
shift 3 >> shift 7
>>
File: 4445454564212.png (35 KB, 1121x150)
>ACEStep 1.5 supposed to release a few days ago
>Let me give it to a few influencers first

What part of Chinese culture is this?
>>
>>107999333
nice double trips
>>
>>107999300
continued from a wan gen. I haven't tried i2v, I bet it's shit, but v2v works pretty well
>>
>>107999333
i haven't read a single one of ur spiels
>>
File: Zimage_base__00349_.png (2 MB, 1024x1024)
>>
File: giphy.gif (964 KB, 480x275)
>>107999333
>February 30th
nice digits
>>
File: Flux2-Klein_00047_.png (1.7 MB, 1008x1024)
wtf... can someone make a collage?
>>
>>107999361
whoa, prompt?
>>
File: Zimage_base__00350_.png (1.97 MB, 1024x1024)
shift 3
>>107999355

shift 8 this
>>
interesting. i thought jannies would nuke the thread, instead its just the image
based
faggots are insufferable
>>
>>107999365
3 is better
>>
https://github.com/agwmon/self-refine-video?tab=readme-ov-file

thoughts?
>>
>z image is a mess
>loras make it worse
yikes
>>
>>107999346
OK, I guess it's better than nothing, sad it can't do full good i2v yet, but the pipeline of wan 5s -> ltx2 10+s is at least possible.
>>
File: 1768770368794067.png (424 KB, 600x608)
>>107999362
in fk:
>make him look realistic. keep the proportions, shapes, and colors exactly the same.
>>
>>107999388
lol nice thanks
>>
>>107999388
based
>>
>jannies delete OP image for showing the side of a guys thigh
Uh...
>>
>>107999392
no problem
>>
>>107999376
if it can fix weird physics and face inconsistencies maybe it's worth the slower gen time
>>
>>107999397
Gays are like 4% of the population bud. I know you watch your reddit cartoons that make you think there are more gay people than there are but you need to snap back to reality. Nobody wants to see that gay shit.
>>
File: z-image_00119_.png (1.06 MB, 1024x768)
Oh no no no no ahahaha

Can confirm it's hard to get a kiss with open mouths pressed together. Passionate deep kiss, French kiss, and open-mouthed kiss all fail. Guess I'll try more tomorrow.
>>
>>107999406
who cares? boohoo i dont like different things QQ
>>
File: Zimage_base__00351_.png (1.99 MB, 1024x1024)
shift 4
>>
File: 9714120.png (2.18 MB, 1072x1088)
>>107999397
Was it >>107998157?
>>
>>107999425
yes
>>
File: 1769478638666582.jpg (44 KB, 660x574)
>base is finally working
>requires millions steps
>slow as chroma
>but without the nsfw content
days of waiting for this.......
>>
File: Zimage_base__00352_.png (1.5 MB, 1024x1024)
>>
>>107999397
>oh no, my gay troll pic got deleted
shutup fag
>>
File: 2250464.jpg (1.71 MB, 1072x4352)
>Strictly preserving the composition, subject matter, and having all elements of the original image (Image 1) unchanged, make the image realistic. (Remember he is a frog opening his mouth and has tears of joy from laughing)
>>
>cathy (not a real person please don't ban me saaar!)
>vury good AI character trained on AI images
>>
File: z-image-fp_00003_.jpg (3.27 MB, 1664x2432)
>>
>>107999428
also, becomes a garbled mess at high resolutions
>>
>>107999406
>100% of gens must appeal to 100% of people
What a dull world you must live in
>>
>>107999483
you could just be inclusive and make a collage out of people gens,
you know, inclusive, the shit you faggots been preaching for years
stonewall gays have contempt for you
>>
File: z-image-upscaled_00012_.jpg (3.56 MB, 1664x2432)
>>
>>107999525
cool
>>
>>107999525

jesus christ the feet... fucking wtf
>>
File: ComfyUI_00138_.png (1.89 MB, 1440x800)
Need darker shadows tbqh
>>
>>107999501
Holy shit you are really mindbroken by identity politics. Did seeing a picture of God's creation really trigger you that much?
>>
>>107999532
ever read hp lovecraft? yeah
horrors beyond your imagination
>>
>>107999536
try karras
>>
>>107999537
feeble minds and such
>>
attentionwhoring faggots are insufferable and such
>>
>>107999525
Smeagol's feet
>>
All of the ZIB loras on Civit look worse than I guarantee you a properly trained ZIT version would have anyways. ZIB is just not worth using over ZIT or Distilled Kleins for inference
>>
you already posted it seventeen times
>>
with the makeup off, it's clear that z-image is little more than another chinese failbake. worse than qwen even. you were fooled by turbo shilling into thinking they had some revolutionary architecture but nope, just more garbage
>>
>>107999540
Hewlett-Packard, what?
>>
>>107999626
all about the pentiums baby
>>
File: z-image-upscaled_00013_.jpg (3.73 MB, 1664x2432)
y u bulli me
>>
anyway, as i was saying.
>>
File: 1754278561229532.png (489 KB, 1966x971)
z base is still not out btw, it's just "z-image", but everyone thinks it's base. kek, what a shitshow
>>
File: 544848844545.png (187 KB, 1117x751)
What do you guys think about this retardation? Are these guys sabotaging the project or what? This is from the ACEStep 1.5 discord. Training on Suno's or Udio's synthetic data when sourcing quality MP3's and FLACs with lyrics is a piece of cake is retarded, plus if you want to surpass Suno or Udio you wouldn't use it.

It's already confirmed ACEStep 1.5 at least used Suno's synthetic data in some form during training. My thoughts? Garbage in, garbage out. This model is neat, but it'll not be inherently better than either Suno or Udio. As for LoRA's? A good LoRA would require a decent base model, and that's not what we're getting. Seems like ACEStep 1.5 might be a nice toy, but it's still nowhere near SOTA.
>>
File: hmm.jpg (357 KB, 1528x1020)
I think I'm doing something wrong, z-image doesn't look this bad
>>
>>107999647
cool story bro
>>
>>107999647
It's the non distilled model, and enough to create proper finetunes with.
>>
File: Zimage_base__00359_.png (2.09 MB, 1024x1024)
>>
>>107999648
And for some bizarre reason, China is heavily obsessed with synthetic data, also with their image models and it's so obvious and just neuters the quality of their models incessantly.
>>
File: fuck ldg.png (1.18 MB, 896x1152)
>>
>>107999670
go on, git
>>
>>107999661
no one wants to use anything slower than ZIT anymore though lol, it killed acceptance of full-step models basically
>>
name one celebrity funnier than Kat Dennings.. you can't
>>
>>107999648
there is some huge torrent with all spotify music around too (albeit not at the best quality, but does anyone even train with flac or 320kbps mp3s?) there is no reason to train on synthetic data
>>
>>107999648
>>107999665
Another thing to keep in mind, they might actually curate a quality dataset, but they will simply not open source the model they create with it, and purposely create the open source ones with synthetic data, which is disingenuous because we do get quality open models that are not like this.
>>
>>107999650
Fucked sampler and/or scheduler.
Try ddim for both.
>>
File: ComfyUI_temp_xoops_00001_.png (3.56 MB, 1272x1912)
>>107999650
sampler? scheduler? steps?
Are you using the model loader from multigpu? It's broken and destroys gens after several runs.
>>
>>107999679
your mom
>>
>>107999685
>>107999689
Yeah it was scheduler issue + using sage attention for some reason, thanks anons
>>
File: Zimage_base__00363_.png (1.7 MB, 1024x1024)
>>
File: 76586.png (2.88 MB, 1744x1344)
>weaponized again
at least make your own gay images
>>
File: file.jpg (413 KB, 1120x1440)
z base is literally kino.
>>
File: Zimage_base__00364_.png (1.68 MB, 1024x1024)
>>
>>107999355
>>107999365
shift 1 is best desu
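
For anyone following the shift comparisons, here is a rough sketch of what the knob does. It assumes the value being changed is the timestep shift used by SD3/AuraFlow-style flow-matching samplers; whether the Z-Image sampling node applies exactly this formula is an assumption, and `shift_sigma` / `shifted_schedule` are made-up helper names, not any UI's API.

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """Remap a noise level in [0, 1]; shift > 1 pushes it toward 1 (high noise)."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def shifted_schedule(steps: int, shift: float) -> list[float]:
    """Linear 1 -> 0 noise schedule with the shift applied to every step."""
    return [shift_sigma(1.0 - i / steps, shift) for i in range(steps + 1)]


if __name__ == "__main__":
    # Compare the values argued about in the thread on a tiny 4-step schedule.
    for s in (1.0, 3.0, 7.0):
        print(s, [round(x, 3) for x in shifted_schedule(4, s)])
```

Under that assumption, shift 1 leaves the schedule untouched, while higher values spend more of the steps at high noise (composition) and fewer at low noise (fine detail), which may be why the "best" value seems to change with resolution.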
>>
don't think of them as humans... think of them as Americans *bang*
>>
>>107999721
proof?
>>
>>107999683
Apparently it is a mix of both real and synthetic. The argument I saw some users on there give is that if you want a model like Udio, you'd face copyright infringement lawsuits due to the model's ability to re-create songs it's heard before, but the irony is that China does not concern itself with that.
>>
>>107999721
like come on. no lora needed. local SOTA.
>>
File: Zimage_base__00369_.png (1.22 MB, 1024x1024)
>>
Question, should I be taking security measures if I decide to use SwarmUI since it uses ComfyUI as a backend? Otherwise, do you guys use docker or have a separate offline machine for ComfyUI usage? Also, how do you guys feel about EasyDiffusion? I set it up with SD1.5 (outdated shit, I know) and have largely been getting shit results, so I am likely going to set up a proper UI with an actual model. There's a lot of information, and I've been reading multiple of the different generals, and it seems easy to get tangled up in information. I am not sure what models would be good for an RX 6600. I want to do mostly anime art and occasionally maybe realism (NSFW for anime - will most likely use Wai Illustrious NSFW v14.) Otherwise, I was thinking of using SDXL.
>>
File: Zimage_base__00370_.png (1.67 MB, 1024x1024)
>>
https://www.youtube.com/watch?v=b5dTE00trRY for the racists
>>
>>107999756
it literally doesn't matter for their company because :
- like you wrote good luck issuing anything against their jurisdiction
- there is even less to threaten with if the model is released to the public for free
>>
>>107999782
as soon as python is involved, docker
>>
>>107999804
Now that I think about it, the creator did say that to train a model this big it takes roughly 5 million dollars. That means they would need to get funding from somewhere. And my guess is that their investors are none other than a company that is established here in the States. And who else would invest in music models? Alibaba, who are working on their own. That would explain why they would have the constraint of keeping the model from reproducing songs.
>>
File: Zimage_base__00371_.png (1.49 MB, 1024x1024)
>>
Are edit models a meme? It seems like if someone made one and did it right it could make loras a thing of the past, but they never work consistently and if it's a hybrid t2i/edit model it seems like it fucks up finetuning.
>>
>>107999869
how long has it been since you've used one? klein and qwen do a pretty damn good job
>>
>>107999869
What's your definition of meme? You can make fun slop with klein.
>it seems like it fucks up finetuning.
How does it do that?
>>
>>107999665
Very common misconception, it depends entirely on training methodology.
If you train on inputs directly, yes of course synthetic data is gonna eventually trigger collapse.
There are smarter ways to train already in use.
Consider: No model learns anything outside of distribution. Can't be done, a model will never correctly recall something it hasn't seen.
What it can do is fake generalisation by means of capturing and replaying specific vectors through the model against your desired inputs.
For example it has seen 3000 different golden things, different shapes and sizes, how they interact with light and whatever other physical properties. So if you ask for a "golden burger" or something, it can apply that vector to the burger and wow, it did something "new".
In reality then, what is important is not necessarily real or synthetic data but how well you can capture and associate a concept.
This has been achieved with difference-training, or image pairs. Sad -> Happy, Unfertilised imouto -> pregnant belly and so on. Even with extremely few pairs, the training and generalisation is rapidly accelerated regardless of whether real or synthetic data was used, because ultimately all we're trying to do is trace through the model a path where something is not represented and another where it is. That difference is the value.
So even if both images out of your pair are synthetic, the difference isn't. Further to this, synthetic data is suddenly "more" powerful here, insofar as you can perfectly isolate precise edits. Where taking two photographs IRL would produce innumerable side effects even down to camera sensor noise, synthetic data can be pure.
Now of course all of this only works on a model that can understand the things being represented at all, which itself requires ideally as much real data as possible, but once the foundations are down you can go all-in on synthetic if you want.
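
A toy sketch of the pair idea, not the actual training procedure: the short vectors stand in for whatever internal representation the model has, and `concept_direction` / `apply_concept` are hypothetical names. It only shows why the difference of a pair isolates the edit even when both halves carry the same baggage.

```python
def concept_direction(pairs):
    """Average (after - before) over all pairs; content shared by a pair cancels out."""
    dim = len(pairs[0][0])
    total = [0.0] * dim
    for before, after in pairs:
        for i in range(dim):
            total[i] += after[i] - before[i]
    return [t / len(pairs) for t in total]


def apply_concept(vec, direction, strength=1.0):
    """Nudge an unrelated vector along the learned direction."""
    return [v + strength * d for v, d in zip(vec, direction)]


# Two made-up pairs: the first two numbers are "everything else",
# the third is the one property that differs (say, "golden").
pairs = [
    ([0.2, 0.9, 0.0], [0.2, 0.9, 1.0]),
    ([0.7, 0.1, 0.0], [0.7, 0.1, 1.0]),
]
direction = concept_direction(pairs)
burger = [0.5, 0.5, 0.0]
print(direction)                         # [0.0, 0.0, 1.0]
print(apply_concept(burger, direction))  # [0.5, 0.5, 1.0]
```

In this toy it does not matter whether the pairs came from a camera or a generator. All that matters is that the only thing changing inside each pair is the concept, which is the point being made about clean synthetic edits.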
>>
>>107999895
solid wording
>>
>train z-image on porn images
>horrors beyond your imagination
>>
>>107999899
What do you mean?
>>
>>107999904
i like the words
>>
>>107999908
Makes sense.
>>
>>107999904
word good
>>
File: z-image-upscaled_00017_.jpg (3.87 MB, 1664x2432)
>>
File: 1768981030928583.mp4 (2.42 MB, 704x1080)
>>107999670
>>
the most obvious use case for editing is to take characters from one image and put them in a new scene for easy character consistency but they all fucking suck at that
>>
File: Zimage_base__00376_.png (1.71 MB, 1024x1024)
>>
Working on some LTX gens that will put WAN to shame. you fuckers can keep your static camera angles.
>>
>>107999895
Interesting, thanks oniichan.
>>
>>107999923
how complex is it compared to wan?
if it's just nsfw lora on a normal wf and proper prompting that's nice
>>
>>107999932
He just made shit up. If you really want to know about AI models, you should read published papers on them.
>>
>>107999923
>static camera angles
It's incredibly rare to get the camera locked down in WAN. Even when you think it's still, there's always a tiny bit of wiggle.
>>
File: Zimage_base__00383_.png (2.1 MB, 1024x1024)
>>
File: file.png (47 KB, 529x452)
You can, today, train a concept on SDXL in as low as 30 steps with as few as 1 image pair, the more the better the generalisation but 3-7 is fine.
https://github.com/hako-mikan/sd-webui-traintrain
Believe it if you want, synthetic data is amazing.
It was used to remarkable effect in the Qwen Image Edit camera angle lora:
https://huggingface.co/fal/Qwen-Image-Edit-2511-Multiple-Angles-LoRA
>>
File: Zimage_base__00386_.png (1.93 MB, 1024x1024)
>>
>>107999951
looks correct to me, what's made up?
>>
>synthetic data
Did I travel back to 2024?
>>
>>107999996
sar
i have successfully
generated
1 million
ai images
of woman
for regularization
dataset
>>
>>107999975
>A1111

This UI is no longer maintained. It would make more sense to create a standalone training script.
>>
>>107999975
>synthetic data is amazing
Yes as long as it's applied to finetuning or loras, it's very useful. Not for the base model.
>>
>>108000000
holy
>>
>>108000000
digits confirms sar
>>
File: 5554445645.png (173 KB, 1117x614)
>>107999895
ACEStep creators are aware of the downfalls of overreliance on it and have mentioned it many times as the downfalls of competitor's models, so perhaps you are right. I'm still excited for the model, it can produce good music, especially with instrumental tracks it's quite obvious that the data was HQ, but I do still hear a bit of slop in English tracks (weirdly not quite as slopped as Suno itself though, but just enough to remind me of it). Maybe it has nothing to do with the data and more the model size, and a LoRA would fix all this, we'll see.
>>
>>108000000
kek, also good digits
>>
File: Zimage_base__00387_.png (1.27 MB, 1024x1024)
>>
>>108000000
>1 million portrait close ups with greasy skin
>>
>>107999989
That's not a human-looking skeleton.
>>
File: z-image-upscaled_00018_.jpg (3.71 MB, 1664x2432)
>>
>>>/gif/30182496
>>
>>108000050
Why do I always click..
>>
>>108000050
KEK

it's a shame /vdg/ is unusable with all the tranny spam
>>
File: Zimage_base__00391_.png (1.4 MB, 1024x1024)
>>
>>108000033
its a hobbits
>>
>>108000000
good job.
>>
File: Zimage_base__00400_.png (1.32 MB, 1024x1024)
>>
File: Zimage_base__00401_.png (1.49 MB, 1024x1024)
>>
File: 1751019857254657.mp4 (1.97 MB, 704x1056)
>>107998896
>>
>>108000169
nice one
>>
>>108000169
being limited by 5 seconds and 16fps feels like forever ago now, I hadn't used wan for a while before ltx2 debuted
>>
>>107999996
benchmark to moon bloody sir
>>
>>108000183
generate 5s clean with wan then extend v2v using ltx2
>>
We've come to the point where I can make fully convincing porn videos of me and my waifu (100% actual likeness) onlyfans style...

But... would people pay to watch it...?

(black btw not sure if that matters)
>>
>>108000197
you forgot padding it with interpolation
>>
File: Zimage_base__00405_.png (1.53 MB, 1024x1024)
>>
>>108000203
pay? no
>>
>>108000203
only interested in the waifu, if she's hot and the animation isn't horror body, I'll look
>>
>>108000203
get a job, retard
>>
>>108000211
What if I grift reddit like Z-image devs using bots and make people think my shit is SOTA?
>>
>>108000244
go for it
>>
File: 1746703957542662.png (3.54 MB, 1168x1752)
>>
where the fuck are the klein porn tunes already?
>>
what's the simplest, most pointless crap I can do with torch, just as an exercise
>>
>>108000309
burn down your house
>>
>>108000309
coin flip predictor
>>
File: z_image_0011_.jpg (380 KB, 1440x1120)
not bad... it's not properly in the glass case though
>>
File: 1750575935962196.png (3.39 MB, 1168x1752)
>>
>over 24 hours since z-base
>only discoveries are that it's complete shit for loras
local... what went wrong???
>>
>>107999647
>omni base
>visual quality medium
Heh
>>
>check the pornmaster lora
>trigger word
This hobby is really filled with retards.
>>
File: 1739415452140.png (1.19 MB, 2400x3872)
I used grok to shade my flat image and it's stupidly good. Is there anything like this locally? Prompt-based editing of existing pictures? Inpaint doesn't really do it. But if I fed it my full image it would get denied because muh sensibilities
>>
>>108000357
no. you are finally realizing how far behind local really is.
>>
>>108000357
https://huggingface.co/black-forest-labs/FLUX.2-klein-9B
>>
>>108000357
flux klein
>>
>>108000357
you forgot wet skin in the negs
>>
>>108000357
You can use controlnet, even with an older model like illustrious. Basically it keeps the structure perfectly the same (so your lineart remains the same), and then you can adjust with a percentage how much you want to influence the image.

you're a noob though, you'd have to learn about comfyui and look up workflows on civitai.
>>
>>108000357
Yeah, that is when I realized that local models are toys for kids, just like LLMs. If anyone wants serious, high quality generative image or text work, they should use an API. Local is a simple demo for hobbyists.
>>
>>108000398
>>108000363
can you please just get the fuck out of this thread, it's called "Local Diffusion General", that means you're not supposed to be here
>>
>>108000398
>>108000400
samekek
>>
soo organic
>>
>>108000394
>Basically it keeps the structure perfectly the same
Do not trust his words. Grok did a clean job. ControlNet modifies the lineart, distorts it, adds thickness to the lineart where it does not belong, does whatever it wants, and does not respect the colors. Adding weight to ControlNet only worsens things.
>>
>>108000357
Grok is just too good, I have been using it every day and only use comfyui once or twice a week. And their pricing is very competitive.
>>
File: 1763310657686363.png (3.24 MB, 1168x1752)
>>
>>108000357
Grok has saved my children from a burning building and fought for me in a court case where no one would, I only have eyes for grok now
>>
>>108000430
>lurkers
>>
You are not going to replicate the power of a 128GB model on your 12gb 3060. Grok, Nano Banana, etc are all leagues above local models especially for editing. You may be able to train a specific style transfer lora for qwen that understands one style really well but that's about it.
The only use-case for local is porn, people are willing to put up with worse quality in exchange for porn. This is why majority of local users are still using outdated SDXL hentai models.
>>
god why are you all so fucking POOR?
>>
>3060
lol, poor kid
>>
File: 1750653221175052.png (3.99 MB, 1280x1616)
>>107999241
I've seen people both from here and on Reddit say that ZIB is noticeably worse (not horrible, just worse) at realism than ZIT. Is this the case for anyone else? The main reason I cared about this release is because theoretically lora training should be able to work better if they are trained on the "ancestral weights" (i.e. a person/character lora trained on the base model could work well on the base model AND the turbo model, but a lora trained on turbo probably won't work well on the base model). A theoretical workflow I'm thinking of is
base model ---> train lora off of base model ---> use on turbo model #turbo is the more ideal model to use due to lower step requirements and faster inference
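
A rough sketch of why that workflow could work. It assumes the usual LoRA formulation, where the adapter is a low-rank delta added onto a weight matrix (W' = W + scale * B * A). The 2x2 matrices are toy numbers, `matmul` and `add` are throwaway helpers, and whether the real base and turbo weights are close enough for this to hold is exactly the open question.

```python
def matmul(a, b):
    """Plain nested-list matrix product."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def add(w, delta, scale=1.0):
    """Apply a LoRA delta to a weight matrix: W + scale * delta."""
    return [[x + scale * y for x, y in zip(rw, rd)] for rw, rd in zip(w, delta)]


# Toy "layer" from the base model and a slightly different distilled copy.
w_base = [[1.0, 0.0], [0.0, 1.0]]
w_turbo = [[1.0, 0.5], [0.0, 1.0]]

# Rank-1 LoRA factors (B is 2x1, A is 1x2); pretend they were trained against w_base.
lora_b = [[1.0], [0.0]]
lora_a = [[0.0, 2.0]]
delta = matmul(lora_b, lora_a)

print(add(w_base, delta))   # [[1.0, 2.0], [0.0, 1.0]]
print(add(w_turbo, delta))  # [[1.0, 2.5], [0.0, 1.0]]
```

The same delta lands on different starting weights. If turbo sits close to base, the result is close to what the lora was trained to produce. The further the two have drifted apart, the less the delta means what it meant during training.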
>>
>>108000400
Listen Mr Not Garglin Copium 24/7, being a local means being critical and honest. Maybe you did not realize, because you were refreshing Hugging Face for some new monthly disappointment model, but right now you are operating in a generative AI framework with preschool level quality.
>>
>>108000461
if base model drifted from turbo (extra training) then this wouldn't be the case. this is why they should've released it the same day regardless of how bad it was, because if they made any changes to it at all it's fundamentally no longer the base that turbo was distilled from
>>
File: 1761189516926785.png (3.3 MB, 1168x1752)
>>108000461
>Is this the case for anyone else?
no
>>
>>108000469
Welcome to chinese culture
>>
>>108000461
you waited 2 extra months for extra slop
>>
>>108000469
>>108000484
Then what's the point of releasing a model that requires MORE steps and resources to use? The thing can't even do porn well so there's no point in ANYONE, porn user or otherwise, using the "base" model (they aren't even calling it base, so further training is possibly the case). I wish the local community wasn't so up their own asses, otherwise they could make a better model in less time and with less "#SAFETY" posturing

>>108000469
Do they look generally the same or is base better in any way? If the quality difference isn't night and day, I don't see base worth using since it requires so many more steps.
>>
>>108000461
>turbo model but a lora trained on turbo probably won't work well on the base mode

Zit lora on base generates pure noise.
>>
>>108000488
The cope is that some other finetuner will finetune the base model and that would be the new standard. The problem that no one foresaw is that the finetuners do not know what they are doing.
>>
>>108000501
FinetuneRS, not a single person. That isn't practical even if you're a rich fag. Let's assume you had your own personalized team that answered to your beck and call: how would YOU go about training the perfect model?
>>
>>108000488
The only people genning with base now are /ldg/'s two polar opposites, schizo spammers showcasing how dogshit Z base is, or die hard /ldg/ zealots performing CPR on a corpse after months of edging.
>>
File: 1741941888303498.png (3.4 MB, 1168x1752)
are you really having that much trouble with z?
>>
>>108000488
>Then what's the point of releasing a model that requires MORE steps and resources to use?
so kekstone can spend another $100k melting it even further
>Do they look generally the same or as base better in any way?
They are not significantly better on base because base isn't good. People seemed to misunderstand what base was thanks to the 2 months of hype-shilling. Base is an anatomical mess full of garbled details. Applying a lora to this doesn't fix it any more than applying a lora to SD3 would. The only use for base is large-scale finetuning on expensive enterprise hardware.
>>
>>108000515
Post prompts
>>
>>108000501
>The problem that no one foresaw is that the finetuners do not know what they are doing.
And no one foresaw either that competent finetuners already have an established name and are not going to risk money, time, and failed builds on experimental models that may be replaced months later, e.g. Qwen, ZiT, Klein
>>
>>108000511
I thought it was kinda weird that there wasn't a single richfag that was willing to fund the ultimate local model but then I realized that he'd need a person that was competent in this regard and there's maybe handful of those on this entire planet.
>>
>>108000526
Prompts
>>
>>108000000
Saaaaaaarrrrrr
>>
>>108000540
And probably they are already working for a SaaS model, NoobAI dev working for Grok.
>>
why would anyone waste money on training the perfect fine tune when the perfect base model doesn't exist yet?
>>
>>108000540
if you're rich why pay to fund some localslop when you're already invited to the gentlemen's club of uncensored nano banana pro?
>>
Can someone help me?
>>
>>108000555
With uncensored copyright and uncensored styles?
>>
How NAI did it?
>>
File: nbp.png (1.82 MB, 1408x768)
>>108000567
of course. when the goycattle censored version is already insanely good, imagine what the richfag exclusive one is like
>>
File: 1766413440123577.png (3.85 MB, 1168x1752)
>>108000526
just caption an image you like with whatever language model and go from there
or write a couple of paragraphs by hand
that one was https://pastebin.com/UyBRJr0P
>>
Tbh, I’ve been posting Klein images and naming them zib to see if anyone even noticed and nobody has called me out yet.
>>
>>108000587
retard
>>
>>108000637
proof?
>>
Something fucked with my zimage, when I started I had long gen times, but now, it's like 8 times faster.
>>
>>108000657
yeah its called turbo
>>
>>108000139
nice
>>
>>108000680
can you say that to me too
>>
>>
>>108000692
double nice
>>
>turbo release, everyone celebrates the new age of china
>base releases, turns out it's just the same old slop once again
it's clear that the quality:speed ratio is what matters most, so how2 a model with turbo's quality/speed but base's trainability?
>>
>>108000438
> People are retards; they accept lower quality for local porn.
Yes! Finally, someone is saying it! I use NBP all day long to generate pictures of my dead dog riding a pig or walking on the moon. When I get bored, I generate what my living room could look like and clothes I would never buy because my visual imagination is stunted since I've never read a book. SARS models are also great for generating consistent OF profiles or customizing my dating picture to fuck real got girls - I think it will work. or to produce infinite yt videos to make money with ads. Saaaarrs
Why don't people understand the added value of diffusion models alongside local porn?
>>
localseethe. back to sdxl with you!
>>
My 2c about what I have tried so far

SDXL is still serviceable for some use cases but has an inherent wonkiness to it. Its variants and loras are still decently robust.

With Illustrious I managed to finetune some unique styles by combining loras, and there are so many loras out there that the combinations are near infinite. Still very rewarding to use and play around with, especially using controlnets for more options. If the totality of local models ended here I'd still be satisfied and would likely have stuff to play with for the rest of my life.

Flux I managed to wrangle into giving some decent stuff, using certain cinematic loras combo'd for a specific analog look which looks cool to me. The gens are still stiff and too posed but it's good for key art images.

Z, I feel, tried to reach higher, and hitting the wall is even more evident because of that. Unless a new breakthrough is achieved, the limitations are more frustrating now than the minor advancements are exciting. It's like a slightly better Flux in some ways, but at this point there should be another leap forward and this feels like a lateral step more than anything else.

Haven't tried Klein yet or any video genning.
>>
So it's true... local really doesn't have any use cases besides porn... and it's not even good at that. How do we cope?
>>
>>108000756
however you want, it's a free country
>>
>>108000719
base didn't release thoughverbait
>>
File: 1762543650642270.jpg (609 KB, 1408x1632)
>>
Easiest way to do xy grids for zit?
>>
>>108000775
The ZIT paper has shown a cost-effective way to do this.
Fine-tuning and then distilling a model will only cost us around $200,000–300,000.
Shortly after SD 1.5 was released, we had collected $80,000 on Kickstarter in three days from 0.2% of today's user base for an NSFW fine-tuning before anti-AI homos demonstrated and it was taken down.
One famous AI youtuber would be enough to collect the money for dozens of finetunes
>>
wait come up with new b8 tho plox
>>
local only needs a few million dollarinos and then we might be able to reach 2024 gpt levels. get hype!
>>
File: ZIT_00036_.png (1.77 MB, 1024x1024)
Holy fuck, zimage does not like that huge quality tags essay.
>>
>>108000824
nice try
body horror = flux
reddit school teached me
>>
>>108000810
Eww, you actually like gpt yellow slop...
>>
File: 1743345748174666.jpg (554 KB, 1632x1376)
>>
>>108000835
No this is zimage.
>>
now that we all know how base z-image is, do we switch to klein or sub to saas?
>>
File: 1762351952820727.jpg (806 KB, 1408x1504)
man, generating my shitposts with the 2-pass ZIB->ZIT workflow takes too long (70s per image)
Doing 0.9MP 15 steps res2s bong into ZIT euler/simple 9 steps, in the middle a 1.6x upscale with nomos upscaler and 0.6 denoise.
I want my fast shitposts.
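For anyone wondering where the time goes, here is a rough sketch of the resolution arithmetic for a two-pass setup like the one above. It is not taken from the actual workflow: the helper names are made up, "1 MP = 1024*1024 pixels" is an assumption (some tools use 1,000,000), and so is snapping to multiples of 16.

```python
import math

# Assumption: "1 MP" means 1024 * 1024 pixels; some tools use 1,000,000 instead.
PIXELS_PER_MP = 1024 * 1024


def snap(value, multiple=16):
    """Round to the nearest multiple (16 is an assumed model-friendly step)."""
    return max(multiple, int(round(value / multiple)) * multiple)


def dims_for_megapixels(megapixels, aspect_w=1, aspect_h=1, multiple=16):
    """Return (width, height) close to a megapixel budget at a given aspect ratio."""
    aspect = aspect_w / aspect_h
    height = math.sqrt(megapixels * PIXELS_PER_MP / aspect)
    return snap(height * aspect, multiple), snap(height, multiple)


def upscaled(dims, factor, multiple=16):
    """Return the dimensions after an upscale by `factor`, snapped again."""
    width, height = dims
    return snap(width * factor, multiple), snap(height * factor, multiple)


# Hypothetical square gen: 0.9 MP first pass, 1.6x upscale before the second pass.
first_pass = dims_for_megapixels(0.9)
second_pass = upscaled(first_pass, 1.6)
ratio = (second_pass[0] * second_pass[1]) / (first_pass[0] * first_pass[1])
print(first_pass, second_pass, round(ratio, 2))
```

A 1.6x upscale means the second pass works on roughly 1.6² ≈ 2.5 times as many pixels as the first, so dropping to 1.5x (about 2.25 times) should shave some time off, assuming step time scales with pixel count.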
>>
"anus,"
>>
File: 1757387804726225.jpg (602 KB, 1632x1408)
>>108000862
but fuck ZIB has so much more fucking variety, no more ZIT sameface, and composition/prompt following is better. and doing the 2nd pass on ZIT just gives it the better/refined look. I might put this back to 1.0MP ZIB -> 1.5x upscale, because I see some noisy shit, otherwise im cooking. Eagerly waiting for cnet to complete my wf
>>
File: 1750522714581915.jpg (811 KB, 2016x1120)
>>
>>108000882
It still looks like SDXL really.
>>
File: out.png (2.57 MB, 2432x1920)
>>108000357
>>
is there a recommended negative prompt for z-image I can throw at it? I have bokeh, blurry and blurry background in. do we need the malformed limbs or other such shit in there too?
>>
File: ZIT_00092_.png (1.67 MB, 720x1280)
>>
>>108000943
Powerful
>>
>>108000357
I've had luck using this klein lora for anime.
https://civitai.com/models/2332320
It's really versatile, I just wish it did nsfw... that would make it perfect
>>
File: 1746517445480336.jpg (61 KB, 832x1216)
>>108000593
Z-Image Turbo
>>
File: 1742861499212836.png (1.43 MB, 1024x768)
>>108000593
>>108000970
Z-Image "Base"
>>
what timestep distribution to use for flux klein 9b character loras?
>>
>>108000970
>>108000972
your workflow is poor https://files.catbox.moe/sfs7yt.png
also why did you change the aspect ratio kek
>>
File: 1754151555243839.jpg (56 KB, 736x414)
>>108000000
>>
File: ZIT_00003_.png (3.81 MB, 1440x1440)
Zimage has such soul.
>>
>>108000000
checked and sir'd
>>
>>108000943
>>108001025
Don't let /mwg/ find out
>>
File: 1768556371315953.jpg (1.87 MB, 2147x1344)
>>108000357
klein
>>
>>107999975
I was thinking of this. Train multiple sdxl loras to make side-by-side images for specific edit types. Then split the images into control/output pairs and group all these edit cases for finetuning Klein or any other edit model
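The splitting step could look something like this. It is only a minimal sketch: `split_boxes` is a made-up helper, and which half counts as control versus output is an assumption you'd fix per lora.

```python
def split_boxes(width, height, control_on_left=True):
    """Return (control_box, output_box) crop boxes for a side-by-side image.

    Boxes are (left, upper, right, lower) tuples, the same convention
    Pillow's Image.crop() takes. Which half is the control image is an
    assumption and depends on how the side-by-side lora was trained.
    """
    if width % 2:
        raise ValueError("side-by-side image width must be even")
    half = width // 2
    left = (0, 0, half, height)
    right = (half, 0, width, height)
    return (left, right) if control_on_left else (right, left)


# Hypothetical 2048x1024 gen holding two 1024x1024 panels.
control_box, output_box = split_boxes(2048, 1024)
print(control_box, output_box)
```

Crop each gen with both boxes, save the halves under matching filenames, and you have a control/output pair per edit case.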
>>
>>108000799
well whats stopping you
>>
File: ComfyUI_00437_.jpg (994 KB, 2304x4096)
Chromahybrid is nice
>>
File: jdr4h8.png (1.55 MB, 2048x2048)
>localseethe. back to sdxl with you!
>>
they're posting feet--and you're laughing!
what is wrong with you? for the love of all that's holy! why now?
>>
>>108001093
wait
do that but in an entirely different style. or be destroyed. :)
>>
>>108001081
gigantic resolution damn
>>
File: is_this_real_photo_or_zb.jpg (2.77 MB, 2304x2304)
>>
File: 1761371435199157.jpg (597 KB, 1536x1536)
>>
File: 1743525290308637.jpg (746 KB, 1536x1536)
>>
>>108001140
is this dype or 2pass?
>>
File: r.jpg (181 KB, 1280x1280)
>>108000000
witnessed. congrats on excellent get.
>>
Looks like two major fine-tuners have decided to train Klein instead of Z, mainly because of Flux VAE2
>>
>>108001201
Who? Please tell me someone other than lode...
>>
>>108001078
Am I one of you?
>>
File: as4hh.png (1.31 MB, 2048x2048)
>>>108001093(You)
>wait
>do that but in an entirely different style. or be destroyed. :)
>>
>>108001201
it is worth trying, the model is pretty nice
>>
>>108001183
ultimate sd upscale 4xNickelback deis simple 0.15 dn. dunno if it's optimal but it's what i used for turbo.
>>
feeling the universe behind you
it's like this is real.
>>
>onetrainer klein pr still not finished
what are they doing? it doesnt even work with 4b for me. why am i even paying these low performers?
>>
>>108001189
moar
>>
>>107999428
>>107999481
>be me
>obvious chink-hating pajeet
>rips on output quality of model that stated it's mostly for training
>creators stated it won't have images as good as the previous model
>will rage and cope even harder when finetunes currently baking are released
>>
Calling it now. Base will be forgotten entirely within a few weeks. The real magic of Z image was their distillation techniques. Base is just that, nothing but an expensive slow base model that has a little more seed variance than turbo.
>but it’s made for fine tuning
Okay do it then.
>>
>>108001253
>something unreleased will be forgotten
i doubt that
>>
>>108001253
I'm doing it right now. Seeing the mid-training samples and it's exactly as advertised
>>
>>108001262
Okay. Whatever the piece of shit tongyi released the other day will be forgotten.
>>
>>108001267
goon?
>>
>>108001267
>even the trainer admits they are mid
It’s over
>>
>>108001276
Haha no I actually meant I saw the images while training and they are coming out pretty good for the current step.
>>
File: 1754175844376509.jpg (563 KB, 1376x1664)
2girls
>>
>>108001206
No answer, well I guess he was full of shit. Damn still stuck with the furry...I hate this
>>
>>108001287
inpainting doesnt count
>>
File: 1767330168516870.png (624 KB, 942x1172)
>>108001300
it's actually 2 pass, ZIB 1st, ZIT 2nd (for good looks pr ready for merge)
>>
>>108001313
zit slop
>>
>>108001224
>when finetunes currently baking are released
I hope those can be diffed and meaningfully used on zit, because I ain't waiting for zi to gen.
>>
>>108001354
I'm sure the base will get 4-step loras like every other model, but a turbo-type finetune would be preferable.
>>
>>108001189

foolish coward, come to the lumiverse, hello :)
>>
>>108001354
it's actually really fast on my b200. maybe local users are just poor?
>>
>>108001378
What do you think a 4 step LoRA is? It’s just a distill
>>
>>108001300
yeah! imposter!
>>
>>108001313
i was wondering what the best combination for "totally white hispanic she's catiliostoning or something"
>>
migrate
>>108001415
>>108001415
>>108001415
>>
>>108001421
Make me bitch!
>>
File: r.jpg (141 KB, 1280x1280)
>>108001222
>>
>>108001181
That reminds me I wanted to try genning Ascetic Blaze too. Man, the half-asleep stuff I forget when I'm fully awake...

>>108001140
Z-Image couldn't produce JLH by name for me earlier, so, I'm going with real photo.
>>
>>108001430
You want to be my bitch?
>>
>>108001526
Did I stutter!?


