/g/ - Technology






Frenship Edition

Discussion of Free and Open Source Diffusion Models

Prev: >>107814922

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe|https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
Blessed thread of frenship
>>
one total schizo death please
>>
>>107817380
based
all jewlien has achieved all year is being the top nolife spammer and making the capture longer and more likely to fail for every poster
trvke
>>
>>107817380
>1408x670px
someone needs to chuck the collage script into an llm and have it fix that
>>
File: 1747315302583101.png (2.2 MB, 1152x1312)
>>
File: 1767302957463824.png (2.34 MB, 1216x1248)
>>107817380
ty 4 bake
>>
>>107817380
Thanks for the bake, finally a frenship edition
>>
>>107817450
>>107817443
>obfuscated filenames
tut tut
>>
File: 1751218377560305.png (1.98 MB, 1216x1216)
1girl nation ww@
>>
Blackwellbros eating good https://huggingface.co/GitMylo/LTX-2-comfy_gemma_fp8_e4m3fn/blob/main/gemma_3_12B_it_nvfp4_uncalibrated.safetensors
>>
File: 1760383931709189.png (1.86 MB, 1248x1880)
>>107817489
1girl
>>
>>107817504
what does uncalibrated mean in the autistic llm space
>>
>>107817497
Threads tend to be nothing but stolen gens from older threads if it isn't in the OP for whatever reason
>>
File: 1766296998155966.png (1.58 MB, 1024x1472)
>>107817517
baited, it's 2girls time
>>
Best model for 1girl gooning for and ayymd 16gb vram fag (fedora if that matters)?
>>
File: 1901733864.png (1.08 MB, 832x1216)
>>107817525
that's just 1girl with extra steps
>>
>>107817534
>ayymd
my condolences. anyway, z-image turbo
>>
>>107817547
Thanks
Amd yes i regret the amd card, great for gaming but meh
>>
File: 1047789214.png (1.14 MB, 832x1216)
>>
>>107817534
ZiT
>>
>>107817380
Thank you for maintaining thread quality.
>>
>>107817585
Asuka was like this at 14yo??? ZAMN!!!!!!
>>
>>107817587
what about 2girl?
>>
File: 1744234558753538.jpg (120 KB, 912x1024)
ok I'll try again here:

I have a question. ZiT has pretty good anatomy overall, and I have made some celeb/character loras with data including full body shots etc showing off the physique. The ZiT loras reproduce the likeness very well, and they picked up on the physique as well, but compared to the face the physique comes out significantly worse. Sometimes body features it picked up come out exaggerated, or proportions are off, like the torso being weirdly squashed and so on. When using the same data for training XL loras in the past the physique was reproduced better overall, disregarding the usual XL defects.
Is this due to ZiT being distilled? Or does ZiT require different captioning or something? (I didn't caption the physique at all except for things I want to be able to change like hair style and color and so on.)
>>
>>107817618
Post an example of a bad gen with the prompt. What did you use to caption the images? Also it could be your prompts need work.
ZiT has better anatomy than XL by far if we're comparing base to base.
>>
File: 1767405479496581.jpg (1.79 MB, 2160x2706)
>1girl cake thread
>>
File: file.png (72 KB, 749x337)
>>107817520
>>>/wsg/6068851
No idea but it loads as fast as VAE
lmao
>>
>>107817618
You need very detailed captions to minimize body horror. I did two ZiT loras with the same dataset, one with detailed captions and one with just two words for each image. The dataset with minimal captions ended up with frequent body horror. The images also need to have mainly your concept. A cluttered room with your character, for example, is going to mess up learning.
>>
>>107817520
means it wasn't calibrated for ayyymd
>>
>>107817693
did you use aws blackrock to train it?
>>
Hello saars, can you spoonfeed me the exact versions of Pytorch, Python and WebUI ReForged to get xFormers working on Blackwell?
>>
>>107817618
i hate when anon makes no elaboration on their settings and expects any real help
>>
File: ZiMG_00149_.png (2.09 MB, 1440x1280)
blessed be the real thread. based OP protecting us from malware advertisement.
>>
File: 1752276328980465.png (1.75 MB, 864x1760)
>>
some ltx god here who wants to help me recreate the dumbledore dementor copypasta? i created the audio already
https://vocaroo.com/1gUBU3aiNIyI
maybe not perfect, but i think its solid enough
>>
>>107817635
I'm at work, I'll post one when I'm at home if the thread is still up. I used joycaption, with the training caption option. The effect is not pronounced to the point where you can say the anatomy is broken, hands have 5 fingers and joints aren't twisted, but it's noticeable that the physique and proportions are not as well reproduced as the likeness; sometimes the head seems either a bit too big or too small as well. Or could the lora just be undercooked or burnt, leading to this effect? Apart from that they follow the prompt well, with some being kinda blurry despite not prompting for it, but that seems to happen randomly. I've only done a couple ZiT loras so far so I'm still trying to get the feel for it.
>>
>>>/wsg/6068859
[seinfeld theme]
>>
File: 916657722.jpg (1.7 MB, 1664x2432)
>>107817600
Clearly, you haven't been paying attention to the literature.
>>
>>107817715
that's magneto
>>
>>107817707
Standard 8GB lora Onetrainer preset except autocast cache disabled, rank 32
>>107817716
nigger
>>
File: 093.png (3.22 MB, 1984x1248)
>>
baseless thread
>>
>>107817733
>Or could the lora just be undercooked or burnt, leading to this effect?
Perhaps. I've not encountered what you've described. But
>with some being kinda blurry despite not prompting for it,
This sounds like a dataset or caption problem.

Desu you should really wait to ask for advice until you're able to post examples and such. There's not much anon can do apart from speculation.
>>
>>107817618
use the z lora block weight loader from https://github.com/ShootTheSound/comfyUI-Realtime-Lora and set the option to "face priority"
>>
it doesn't know planes very well, but at least better than XL
>>
quants work for ltx2 now:

https://huggingface.co/Kijai/LTXV2_comfy/tree/main

gguf node fix:

>Make sure to have the latest version of ComfyUI-GGUF installed via manager.

>Go to https://github.com/city96/ComfyUI-GGUF/tree/f083506720f2f049631ed6b6e937440f5579f6c7

>download loader.py / nodes.py and paste it into your ~/ComfyUI/custom_nodes/ComfyUI-GGUF folder.

>Completely kill and open ComfyUI (not just restart).

also update your installed nodes and it should work, testing now
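
The gguf node fix quoted above can be scripted. This is only a minimal sketch: the commit hash, the two filenames, and the target folder come from the post, while fetching through GitHub's raw.githubusercontent.com host and the helper names `raw_url` and `apply_fix` are assumptions made for illustration.

```python
import urllib.request
from pathlib import Path

# Values taken from the post above.
COMMIT = "f083506720f2f049631ed6b6e937440f5579f6c7"
FILES = ("loader.py", "nodes.py")
TARGET = Path.home() / "ComfyUI" / "custom_nodes" / "ComfyUI-GGUF"


def raw_url(filename, commit=COMMIT):
    """Build the raw-file URL for one file at the pinned commit (assumed host)."""
    return (
        "https://raw.githubusercontent.com/city96/ComfyUI-GGUF/"
        f"{commit}/{filename}"
    )


def apply_fix(target=TARGET, fetch=urllib.request.urlopen):
    """Download both files and overwrite the copies in the custom node folder."""
    if not target.is_dir():
        # The post says to install ComfyUI-GGUF via the manager first.
        raise FileNotFoundError(f"{target} not found, install ComfyUI-GGUF first")
    written = []
    for name in FILES:
        with fetch(raw_url(name)) as resp:
            data = resp.read()
        dest = target / name
        dest.write_bytes(data)
        written.append(dest)
    return written
```

Calling `apply_fix()` once and then fully killing and reopening ComfyUI would correspond to the last two quoted steps.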
>>
File: 1763411452535307.png (1.94 MB, 1024x1344)
>mfw no omni
>>
>>107817847
DUDE SHAREW UR FUCKING HITLER LORA NOW!
>>
>>107817847
we got tricked again bros, chinese culture
>>
>>107817832
but it knows some planes better than others
>>
>not a single LTX-2 video in the thread
>>
>>107817872
>random catbok link
>click embed
>takes 30 seconds to buffer
>it's 5 seconds of pure slop
good riddance
>>
>>107817872
based people itt know that i2v is the only thing that matters, and ltx team said its fucked and they will fix it in a month or so, plus another week for that implementation to just work, and then after that it might be worth trying it
>>
>>107817830
I just want the block weight loader. I don't want to install gigabytes of files just to use that. Unfortunately, the block weight loader in the inspirepack hasn't been updated.
>>
>>107817872
it's just another DOA model, hype already died, everyone's back to waiting for Z-Base
>>
File: 0987235346.png (1.94 MB, 1024x1536)
>>
>>107817880
just put this file in your custom nodes folder maybe https://github.com/shootthesound/comfyUI-Realtime-Lora/blob/main/selective_lora_loader.py
>>
File: SD35Medium_Output_2716671.png (2.12 MB, 1280x1536)
>>
>>107817889
Local truly is stuck in the stone age of video
>>
>>107817880
the realtime lora custom node is 11mb
what are you talking about?
>>
>>107817862
nigbo world
>>
based jannies of frenship
>>
>>107817889
I don't think Z-base is ever going to be released. The CCP is probably torturing the researchers for allowing ZiT to gen non-PC and NSFW stuff.
>>
>>107817952
>The CCP is probably torturing the researchers
Hunyuan continues to be fully uncensored, doe?
>>
LTX i2v is pretty special. this gen even created new characters.

https://files.catbox.moe/6ggc31.mp4
>>
>>107817961
I know, you showed us yesterday
>>
so Nvidias only gonna make 3060s. 99 percent of the GPU consumers are gamers and they will be happy. fuck local sloppers, they dont exist. does that mean 5090/32GB will be as good as it gets, forever? what if I try a multi gpu setup
>>
cozy bread
>>
File: ComfyUI_00296_.png (2.1 MB, 2048x1152)
>>
>>107817976
Do you actually think nvidia wants local sloppers to exist at all?
>>
why no zimage base tho
>>
File: ComfyUI_00012_.png (612 KB, 768x768)
>>
>>107817889
Bullshit. Reddit is filled with LTX-2
>>
so it's the gemma text encoder that causes censorship with ltx?
https://civitai.com/models/2292336/ltx-2-nsfw-text-encoder-gemma-3-12b-abliterated?modelVersionId=2579572
>>
File: 785456.png (1.56 MB, 1024x1536)
>>
File: LTX_2.0_i2v_00117_.webm (1.1 MB, 768x832)
>>107817443
>>>/wsg/6068891
>>
File: 152845049.png (3.56 MB, 1984x1248)
>>
>>107817993
chinese culture
>>
>tfw trying to get a new job to buy a rtx pro 6000 before they either increase in price or stop selling altogether

hold on 4070tis....just...a little.....longer
>>
>>107818067
are you a trucker?
>>
File: LTX_2.0_i2v_00125_.webm (1.06 MB, 640x960)
>>107817752
>>>/wsg/6068903
>>
File: ZImageTurbo_Output_372662.png (3.16 MB, 1280x1536)
>>
>>107818058
looks great, saved

>>107818098 >>>/wsg/6068903
kek, the shitty audio is helping in this instance
>>
>>107818067
the real question is what are you gonna do with all that VRAM? is massive batch processing really gonna make amazing stuff that you couldnt do with a 5070? we're still working with the same open source models, same shitty comfyUI interface, same limitations as before
>>
>>>/wsg/6068917
it's over.
>>
File: LTX_2.0_i2v_00130_.webm (1.06 MB, 768x1152)
>>107818164
>>107817890
>>>/wsg/6068916
>>
>>107818089
nope

>>107818169
>the real question is what are you gonna do with all that VRAM?
do what i want
>>
>>107818186
porn just say porn
>>
Are there any other commits that Chang can make
I need more copium
>>
>>107817635
>>107817693
Not the same anon, but I've also been having trouble with training ZiT. I've been training on a 3090 with datasets of roughly 60 images. I trained a couple loras on aitoolkit using pretty much the default settings, albeit with a couple permutations:
> 512px, square crop, both with and without captions
> 768px, square crop, both with and without captions
> 1024px, no crop, with captions
Of these, my best so far has been 768px, square, without captions, though I'm not entirely sure why. The loras get likeness pretty close, but are still not exactly right (not sure if the loras are under-trained or not). Of the ones I captioned, I captioned using JoyCaption. My Chroma loras are a bit better in terms of likeness, but chroma takes forever to train and gen relative to ZiT. My Chroma loras also have weird artifacts (like torsos having significantly more blemishes and freckles than the training data). What have you anons been training with?
>>
>>107818173
actual jumpscare
>>
File: grid_output.webm (1.05 MB, 2048x768)
>>
>>107817893
>>107817907
ok, it worked.
>>
>>107818512
benchod
>>
File: LTX_2.0_i2v_00141_.webm (1.04 MB, 768x448)
>>107817984
>>>/wsg/6068957
>>
does zit work on neoforge?
>>
hey yall folx, is it possible to run a gguf of gemma with ltx?
>>
>>107818593
I think someone said there is a PR for it on comfy just waiting for them to implement it
>>
>>107818260
Your dataset is probably too diverse. Lora training essentially just averages the images in your dataset. If the images aren't similar enough, you get body horror because the concept is learned poorly. I got the best results with likeness training with only 1 image with long captions. The lora was surprisingly more flexible than I expected. If your training works better without captions, it's probably overfitting. When I trained loras with minimal captions, the lora got overcooked.
>>
>>107818260
>What have you anons been training with?
OneTrainer. I don't crop. I just grab the highest resolution possible, caption with JC and let it rip for 4k steps (depending of course).
>>
>>107818593
This chad made a guide >>>/wsg/6068952
>>
>>107818583
>neoforge
>last update yesterday
im surprised that fork is still going
>>
>>107818619
4k steps is around a week right?
>>
>>107818647
?
Depends on your hardware. For me it's a little over an hour. 50 series.
>>
>>107818668
chinese hour or american?
>>
>>107818642
fuck mods gib ai board with sound naow
>>
>>107818669
chinese century or american?
>>
LTX2 is so fucking good.

the woman says "today, i'm going to end my life, for some third world Somalians, cause I am very smart!" in a whiny female voice.

https://files.catbox.moe/u1yl3j.mp4
>>
Okay I make WikiArt model now.
>>
upscale images with seedvr2 before lora training yay or nay?
>>
>>107818686
>solomians
>>
File: file.png (1.53 MB, 1024x1024)
>>
>>107818691
I just grab the highest resolution possible
>>
>>107818696
makes it more funny imo
>>
>>107818699
>>>/wsg/6068989
>>
>>107818613
Interesting. I was training under the assumption that a more diverse dataset is better for these things. I’m gonna try pruning the data set and retraining with some captions.
>>
lmao

https://files.catbox.moe/b6pz8e.mp4
>>
at least add a thumbnail if you're just gonna post nothing but a link
>>
I just like to look at pretty girls I've never seen before. Gens that make me go "wow, she's cute". With ZiT I already know what she'll look like before I hit generate, so what's the point?

With Chroma, ok maybe her legs will be fucked up/etc. so probably I will have to do two or three or ten gens just to get one I like, but when I do get one it's a girl I've never seen before—it can be a complete surprise. If I want to see a different mix of faces I can just say "Baku, Azerbaijan. The quaint old Baku limestone houses in the background contrast against the imposing Stalinka apartments in the distance. 2005" and the model will do the rest. I adjust the general idea and some key details like the color of her shirt; I don't tell it exactly what to show me.

Will it ever make a perfect image? I don't think so. Too much goes wrong every gen, and the problems are usually big problems not small ones. But it shows me pretty girls I've never seen before.

I genned this while writing this post. Some bad limbgore happening with her right arm, meaningless details on the pants that don't make sense, mushy indistinct background, etc. This gen is maybe even worse than usual, needs inpainting. But her face makes a unique impression, her eyes, the dye in her hair that I never specified, there is so much here that is 'real' and striking and I didn't have to force it. To me this is what matters.

For now, Chroma is best model.
>>
>>107818753
>that maniacal laughter
fucking kek
>>
>>107818761
>he has bad internet
>>
File: locutus.jpg (615 KB, 1656x1264)
>>
>>107818686
>>>/wsg/6069006
>>
File: x_cegm1x.png (1.67 MB, 1536x1024)
>>
>>107818830
>>>/wsg/6069019
>>
>>107818773
>With ZiT I already know what she'll look like before I hit generate
It's honed for Chinese 1girls. Our only solace is training LoRAs but that doesn't solve your issue. Maybe you'd have luck training on a couple hundred random selfies of qt girls.

Bayse soon.
>>
>>107818839
cool
>>
>>107818845
I can make it do white girl faces, it's not difficult. You look up how Chinese people refer to American blonde women and put that Chinese text in the prompt box. I've played around with it enough to know lots of tricks that work. But there does not exist any prompt that can make it do anything surprising. You get what you prompt for, and that's that.

I don't know how to put into a prompt all the subtle variety that I encounter in the world. If the model requires me to do that, it is useless to me.
>>
>>107818915
>bro just learn chinese
>>
>>107818925
its much easier than that to get non chinese women NTA
>>
>>107818773
You don't mind seeing body horror as long as you get to see random faces? For me, a model that can't generate consistent faces isn't good as you won't be able to create a character. Body horror is a huge negative.
>>
for the gemma text encoder, can you use a split file model?
>>
File: x_6t7z8v.png (1.49 MB, 1536x1024)
>>
>>107818583
>neo

Neo does support zit, I cannot find the guide for setting it up right now, but you could check out the screenshot in https://github.com/Haoming02/sd-webui-forge-classic/pull/406

Sampling method should be Euler, type uniform, step 8, CFG set to 1 (no negative prompt)
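
Why CFG 1 goes together with "no negative prompt": under the standard classifier-free guidance formula, the guided prediction is `uncond + cfg * (cond - uncond)`, which collapses to `cond` at `cfg = 1`, so whatever the negative prompt produced drops out. A toy sketch with plain lists standing in for model outputs; the name `cfg_combine` and the numbers are made up, and whether Neo actually skips the negative pass at CFG 1 is an assumption, not something stated in the post.

```python
def cfg_combine(cond, uncond, cfg):
    """Standard classifier-free guidance mix of two predictions."""
    return [u + cfg * (c - u) for c, u in zip(cond, uncond)]


# Toy "model outputs": one positive-prompt prediction, two different negatives.
cond = [0.5, -1.0, 2.0]
uncond_a = [0.0, 0.0, 0.0]
uncond_b = [3.0, -2.0, 1.0]

# At CFG 1 the negative branch has no influence on the result.
same_at_one = cfg_combine(cond, uncond_a, 1.0) == cfg_combine(cond, uncond_b, 1.0)

# At CFG 2 the two negatives give different results.
differs_at_two = cfg_combine(cond, uncond_a, 2.0) != cfg_combine(cond, uncond_b, 2.0)
```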
>>
>>107818642
Nice, also these may be of interest
https://huggingface.co/matrixportalx/gemma-3-12b-it-abliterated-GGUF/tree/main
and version 2
https://huggingface.co/mlabonne/gemma-3-12b-it-abliterated-v2-GGUF/tree/main
not sure on differences between them, but these are gguf versions of >>107818017
maybe helps with nsfw
>>
Does anyone have a good character replacement workflow for Qwen Image Edit?
The one on civitAI is garbage and doesn't work.
>>
ltx2 distilled vs dev, how big of a difference is it? also if you use the distil lora how many steps is it?
>>
am I supposed to download the dev or distilled version?
>>
>>107818790
cool
>>
>>107818942
Thank you for restating the common wisdom which the post was explicitly rejecting, and emphasizing that the downsides it acknowledges do exist and are downsides. It is important we all see this restated with an attitude of feigned incredulity that anyone would disagree.

>>107818925
Do you not know that it's trivially easy to ask an LLM for this information or are you just pretending not to know?
>>
>>107819069
you can't use llms properly inside comfy
>>
File: nxnrzh.png (1.47 MB, 1536x1024)
>>
where's base
>>
>>107819059
probably just try the distil for now

>>107819050
with all the stuff you COULD do in a workflow and the uncertainty if mine is set up right and the bugs and stuff, I think I can't even tell ATM

thus I'd say the distil is fine, for now. not sure you can obviously get more from the full version.

maybe in some day(s) when there's a high quality gguf workflow people agree is good, then we'll see the difference?
>>
>>107819132
A ZiT base release would really disappoint people, even if it's great. Imagine if Chroma released after Chroma Flash. It's not strictly "better" in every sense. In a lot of the ways this general cares about, it's worse.
>>
>>107818642
it still does not work, the dual clip loader (gguf) gives this error. someone must have this shit working ffs.
ValueError: Unexpected text model architecture type in GGUF file: 'gemma3'
>>
>>107819158
Is this Chinese Culture?
>>
>>107819158
proof?
>>
>>107819047
Anyone?
>>
>>107819165
I am the white European colonist surveying this primitive backwater and its bumbling, superstitious people with a mix of pity and amusement. They say someday the Alibaba ship will come again and leave a much greater gift than last time, and there will be no more suffering and the young will never grow old...
>>
>>107819179
yeah
>>
so I used 8 steps distil lora for the dev model and got a "take on me" video:

https://files.catbox.moe/gmci6u.mp4
>>
>>107819219
Can you show it if not shitposting?
>>
>>107819229
its the github with the miku being transferred into jojo
>>
>>107819244
What github?
>>
File: img_00088_.jpg (661 KB, 1520x1728)
>>107819065
ty!

>>107819227
could you actually use stuff like "this photo slowly turns into hand drawn black and white animation" ? I havent had time to test it
>>
Why is my wan 2.2 workflow 10x slower now after the comfy memory optimizations?
>>
>>107819263
not sure, that was a bugged gen kek
>>
>>107819263
bulsit
>>
>>107819292
proof?
>>
>>107819310
What proof do you expect fucking run your own
>>
File: 09.png (1.82 MB, 856x1280)
>>
>>107819340
is that the sacred stone that once found will release Z image base?
>>
File: Flux2Comparison.jpg (3.11 MB, 2432x1600)
The Fal Turbo lora for Flux.2 is quite good, I actually find it improves the result versus not using it in a lot of cases.

`an angry terrifying female horror abomination witch wearing a tight black spiked collar and tight black spiked bracelets and Huttese slave-girl costume while posing alone in a dungeon. Long tongue out, evil grin, glowing eyes, photorealistic 8K raytraced cinema quality render with highest texture detail. Sexually charged and extremely provocative.`
>>
>>107819385
why use flux instead of chroma?
>>
>>107819069
>Thank you for restating the common wisdom which the post was explicitly rejecting

He didn't say anything about wanting or not wanting to generate the same unique character again. He acknowledged inpainting is needed to fix the gen. Most people would want to see unique faces but also would love to be able to see the same character again. I don't see any rejection of conventional wisdom here.

>It is important we all see this restated with an attitude of feigned incredulity that anyone would disagree.

If my opinion is redundant then that anon's opinion would also be redundant.
>>
>>107819263
>>>/wsg/6069111
>>
>>107818773
im not yet bored of training zit on grills i know and genning softcore of them DESU
sometimes it even does passable nips
>>
File: ZImageTurbo_Output_125151.png (2.52 MB, 1216x1600)
>>107818931
words cannot describe how retarded you'd have to be to somehow not get like, the race you want out of ZImage lmao, like wtf are these people talking about?????

```a candid amateur photograph of a very attractive white woman with blonde hair and blue eyes and tanned skin wearing stylish summer attire in an alley in front of a brick wall with graffiti on a lovely summer day. She is holding up a sign in front of her chest with both of her hands and she is smiling knowingly. The sign reads "HOLY SHIT, I'M NOT CHINESE!"```
>>
>>107819292
30xx GPU?
>>
>>107819443
without them being hag now
>>
In all seriousness, I don't get it. What's the point of generating SFW images?
>>
>>107819488
wdym
>>
File: 328083.png (1.54 MB, 856x1280)
>>107819367
Yes, we must feed it blood
>>
>>107819492
If I can't fap to it, it doesn't hold any value
>>
File: ZImageTurbo-_0128.png (3.28 MB, 1824x1248)
https://files.catbox.moe/v4aj7k.png
>>
>>107819501
just try harder
>>
>>107819503
no
>>
https://files.catbox.moe/v46gf5.png
>>
>>107819516
why wan lora
>>
File: ZImageTurbo-_0165.png (2.56 MB, 1824x1248)
https://files.catbox.moe/62vnsa.png
>>
>>107817517
What’s 1girl?
>>
>>107819521
It's adapted from a Wan workflow, and I forgot to take them out. you can just put whatever lora you want in its stead

https://files.catbox.moe/90q63d.png
>>
File: ComfyUI_temp_llpiq_00012_.png (2.86 MB, 1824x1248)
>>107819538
forgot to upload the img
>>
>>107819404
I just find Flux.2's better VAE and better prompt adherence is worth it for some stuff over other models. Not everything but some things.
>>
File: LTX_2.0_i2v_00172_.webm (1.07 MB, 896x576)
>>107819502
>>
>>107819530
“1girl” is a tag or shorthand that literally means “one girl.”

It’s most commonly used in these contexts:

Anime / manga image tags (e.g., Danbooru-style tags):
1girl indicates that the image contains exactly one female character.
(Similarly, you might see 2girls, 1boy, etc.)

AI image prompts:
People use 1girl to specify the number and gender of characters they want generated.

Fan art / illustration descriptions:
It helps quickly communicate the composition of the image.

Outside of those contexts, it can simply mean one girl in plain English, but the compact format (1girl without a space) strongly suggests tagging or prompt usage.
>>
>>107819562
but it can't goon
>>
>>107819047
Anyone?
>>
File: 752947581.png (1.87 MB, 856x1280)
>>
>>107819603
Whatsup
>>
>>107819608
is this real?
>>
File: ZImageTurbo_Output_352311.png (2.5 MB, 1216x1600)
>>107819574
Depends on what you goon to. If I want realistic hardcore NSFW though I'll just use Chroma, or for 2D I'll use NetaYume or some Illustrious model.
>>107819483
attached is just "woman" changed to "teenage girl" in the original prompt, same seed
>>
>>107819619
fuck the guy who kept insisting chroma is better for 2d
>>
>>107819608
Literal slop
>>
>>107819619
How about a 22-year-old woman? It can't
>>
>>107818098
I don’t get it.
>>
>>107819632
Elaborate
>>
z dont know artists
>>
File: 2.png (1.25 MB, 856x1280)
>>107819625
uwu
>>
>>107819619
>illustrious
but why?
>>
>>107819641
Z don't know base model either.
>>
>>107819645
now make it give birth
>>
>>107819641
it know some
>>
z = zzzz
>>
>>107819638
Why is he making videos out of people’s gens? I’m at work so I’m too scared to listen to the audio.
>>
>>107819666
lol
>>
>>107819630
Why would you want to do that?
>>
>>107818773
>With ZiT I already know what she'll look like before I hit generate
Skill issue. Use wildcards for hairstyles and haircolors and use wildcards for names. Models like zit need to be prompted with more detail than just 1girl
>>
>>107819666
>watching semi-naked women at work is fine
>audio isn't
Curious
>>
File: 3641751.png (2.04 MB, 2296x840)
I'm slooopin'
>>
>>107819671
I’m gonna get out early and then I’m gonna find out why you’re laughing at me.
>>
I can't coom to 1girl, whats a good model for an old man in his late 30s
>>
>>107819689
I’m in a corner in the break room.
>>
>>107819673
i have a younger adult woman fetish
>>
File: 9956725.png (1.76 MB, 832x2048)
>>107819699
sd 1.4
>>
>>107819716
asshole
>>
File: ZImageTurbo-_0197.png (1.51 MB, 1824x1248)
https://files.catbox.moe/wmcpvg.png
That ain't working, that's the way you do it
>>
>>107819716
I look like this
>>
>>107819677
If you think wildcards are a sufficient replacement then you're a fucking idiot lol. I could show you the array of custom nodes I've written for extending my ability to use wildcards. Back in the A1111 days I used that extension that allowed you to write some logic into the prompt so you could really double and triple down on how you combined multiple levels of wildcard/randomization. I also made a node to save and load conditionings so I could pre-generate hundreds of mixes of wildcards and then not need to be loading up the clip model between each gen for overnight batches. Etc.

It's not the same. It cannot replicate the same behavior. If you think it can, you're brown.
>>
File: 354.png (3.03 MB, 1176x2088)
>>
>>107819726
niceu
>>
>>107819732
>Racism
You lost, also I'm Chinese.
>>
File: 867.png (1.37 MB, 1024x1024)
>>
>>107819734
benchod
>>
that little faggot with the dire straits lora
yeah buddy, that's his own gen
>>
>>107819741
schizobabble
>>
File: ZImageTurbo-_0200.png (1.76 MB, 1824x1248)
>>107819737
https://files.catbox.moe/beqdx8.png
>>107819739
nasty af, love it
>>
File: ZImageTurbo_Output_323412.png (2.78 MB, 1216x1600)
>>107819630
this is again on the same seed but with "22-year-old young white woman", about what most models would give I'd say
>>
>>107819738
I should have guessed.
>>
>>107819739
Prompt?
>>
>>107819646
IDK, sometimes I might want something a bit faster than NetaYume I guess
>>
>>107819741
I want my MTV
>>107819751
the blocky 3d gens I posted are inspired heavily by the Dire Straits' Money For Nothing MV
https://files.catbox.moe/y2j6ql.png
>>
File: 4900.png (3.45 MB, 936x2624)
>>
>>107817961
if that's the best ltx2 can do then it's over
>>
>>107819766
you are a good man for sharing wf
>>
>>107819780
loras will fix this
>>
>>107819732
A fucking replacement for what? Are you dense or something? You said you'd know what she looks like before generating. That's entirely bullshit if you use wildcards to induce variance.
So yeah, probably skill issue remains.
>>
>>107819800
Stop saying skill issue, I have been using AI since GPT 2, fuck you asshole. Wildcards are not the same as random.
>>
>>107819772
turn down your denoise to no higher than 0.4 for high-res fix if this is Z-Image, which I'm basically positive it is
>>
>>107819766
minecraft
>>
>>107819809
its chroma
>>
File: 472.png (1.4 MB, 1024x1024)
>>107819752
thanks

>>107819755
Lora trained with the barnacle girl image from artist guru plus the following prompt:

>A 21-year-old blonde woman with giant barnacles covering her entire face, nose, mouth, and cheeks. She has no eyes. The barnacles have red cilia growing out from their centers.
>>
>>107819818
what did you use to train?
>>
>>107819808
>Wildcards are not the same as random.
You know that you can use more than one text file and can also cascade them, yes?
Skill issue, melty.
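
For reference, "cascading" wildcards means an entry in one wildcard list can itself contain a token that pulls from another list. A minimal sketch of that expansion follows. Everything in it is an assumption for illustration: an in-memory dict stands in for the text files, the `__name__` token syntax is borrowed from common wildcard extensions, and `expand` is a made-up helper, not any UI's actual implementation.

```python
import random
import re

# Tokens look like __name__ and refer to an entry in the wildcard table.
TOKEN = re.compile(r"__([a-z]+)__")

# Hypothetical table; in practice each list would be the lines of one text file.
WILDCARDS = {
    "subject": ["a woman with __hair__ hair in __place__"],
    "hair": ["short __color__", "long __color__", "curly __color__"],
    "color": ["black", "blonde", "red", "blue-dyed"],
    "place": ["an alley", "a library", "a train station"],
}


def expand(prompt, table, rng, max_depth=8):
    """Replace tokens repeatedly until none remain, so entries can nest."""
    for _ in range(max_depth):
        if TOKEN.search(prompt) is None:
            return prompt
        prompt = TOKEN.sub(lambda m: rng.choice(table[m.group(1)]), prompt)
    if TOKEN.search(prompt) is not None:
        raise ValueError("wildcards nested too deeply (possible cycle)")
    return prompt


# Seed the generator to make a batch of prompts reproducible.
rng = random.Random(1234)
prompts = [expand("photo of __subject__", WILDCARDS, rng) for _ in range(4)]
```

Each pass only picks from lists someone wrote down in advance, which is the point both sides of the argument are circling: cascading multiplies the combinations, but it does not add anything that is not in the files.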
>>
>>107819800
From my perspective, the very obvious things I've already referred to which you either cannot understand or are pretending not to amount to a pretty strong proof that any time I spend trying to explain this to you will be wasted.

Also this is not me >>107819808
>>
>>107819726
that little faggot is a millionaire
>>
>>107819830
The same answer still fits as a followup. Z-image does hallucinate. A little, but it does.
>>
File: 1763251889449084.jpg (1.28 MB, 2160x2706)
>>
>>107819809
it was lacking a bit of details with lower values ;-;
>>
blurjeeta-sama pls stop fall for b8
>>
>ZOMG I CANT WAIT FOR Z-IMAGE ANIM-ACK
https://nijijourney.com/blog/niji-7
saas wins again
>>
File: ComfyUI_temp_llpiq_00091_.png (3.06 MB, 1824x1248)
>>107819794
you're welcome bby :*
>>107819818
cool af
>>107819812
chicken jockey
https://files.catbox.moe/qyf0vw.png
>>
>>107819836
Joke's on you, that's my fetish.
>>
File: 1740105017256697.jpg (531 KB, 1264x1584)
>>107819864
>>
>>107819841
I think he's sincere. I could be wrong.
>>
File: ComfyUI_temp_llpiq_00095_.png (3.03 MB, 1824x1248)
https://files.catbox.moe/mpv3ek.png
>>
>>107819822
I used Ostris. Rank 8, learning rate 0.0003. Stopped after 400 steps. The lora was actually trained on top of 2 existing 1-image loras, back-to-back as an experiment. I discovered that the older concepts just get blended/overwritten. I also use the lora loader an anon mentioned earlier ITT to activate only layers 15 to 21 of the lora.
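
The layer trick is roughly this, as a minimal sketch and not the actual loader node's code. It assumes the lora's weight keys contain a block index like `layers.15.` or `blocks.15.`; real key names depend on the model and trainer, so the regex and the `keep_layers` helper are hypothetical. Keys with no block index are dropped here, which is also an assumption.

```python
import re

# Assumed key layout: "...layers.<index>..." or "...blocks.<index>...".
# Check your own lora's keys before relying on this pattern.
LAYER_RE = re.compile(r"(?:layers|blocks)\.(\d+)\.")


def keep_layers(state_dict, first=15, last=21):
    """Return only the entries whose block index is within [first, last]."""
    kept = {}
    for key, value in state_dict.items():
        match = LAYER_RE.search(key)
        if match and first <= int(match.group(1)) <= last:
            kept[key] = value
    return kept


if __name__ == "__main__":
    # Fake state dict with placeholder values instead of tensors.
    fake = {f"diffusion_model.layers.{i}.attn.lora_A.weight": i for i in range(30)}
    filtered = keep_layers(fake)
    print(sorted(filtered.values()))
```

The filtered dict can then be applied like a normal lora, so only the middle blocks are affected.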
>>
>>107819699
spark.chroma or uncanny photorealism or sdxl/pony with loras if you like large sloppy fat bottoms

t. slopper
>>
>>107819866
>like fatties
>somehow this image fills me with visceral dread
i'm no longer hungry, thank you
>>
File: zzz_00001_.png (1.19 MB, 1152x864)
Thank you OP for this blessed thread.
>>
Hibernation mode: engaged
>>
https://huggingface.co/Kijai/LTXV2_comfy/tree/main/diffusion_models

distilled q8 works

workflow here:

https://files.catbox.moe/gyfd95.json
>>
File: zit_000417_.png (2.46 MB, 1920x1080)
>>
What system message do you use to properly caption an image?
>>
>try bigasp2.5
>comparable to chroma
>even tries to sneak in cartoons in gens like chroma
>chroma quality with sdxl speeds

kek, this is great, any similar sdxl models out there?
>>
>>107820167
NSFW taggers like joycaption, pixai, wd almost all ship with settings/gui/system prompts to use

qwenvl:
https://github.com/1038lab/ComfyUI-QwenVL/blob/main/AILab_System_Prompts.json
https://github.com/KLL535/ComfyUI_Simple_Qwen3-VL-gguf/blob/main/system_prompts.json
or maybe https://github.com/walke2019/ComfyUI-GGUF-VLM/blob/main/utils/system_prompts.py
>>
>>107820207
>or maybe https://github.com/walke2019/ComfyUI-GGUF-VLM/blob/main/utils/system_prompts.py
thanks anon, I am trying GLM4.6V so I guess I'll turn to this one
>>
>>107820171
bigASP 1.0 was so bad I'm reluctant to try newer versions.
>>
>>107819857
there's literally no way of even testing it for free so who cares, how would you even compare it to NovelAI or other paid stuff
>>
>>107820171
not really. 2.6 should be coming out soon though. Last SDXL one before he moves on to 3.0.
>>
LTX doesn't seem great, the controlnets are bad and it's impossible to direct the camera movement in a fine-grained way like you can with Wan
>>
https://files.catbox.moe/80r0t1.mp4

ah sheeeeeeeeeit
>>
>>107820312
Maybe you're just a retard that is treating a model that isn't wan like wan.
>>
>>107820253
never tried 1.0. I started with 2.0. thought it was good but discovered chroma and hadn't gone back to sdxl until I saw bigasp2.5 on civit. it's great for sloppy stuff, my only complaint is that it's very stubborn with camera angles and body positioning.
>>
>>107819385
>an angry
She is not angry
>terrifying female horror abomination witch
No she is cute
>wearing a tight black spiked collar and tight black spiked bracelets and Huttese slave-girl costume
Ok
>while posing
She is not posing
>alone in a dungeon. Long tongue out
Ok
>evil grin,
Is not evil,
>glowing eyes,
Ok
>photorealistic
Ok
>8K raytraced cinema quality render with highest texture detail. Sexually charged and extremely provocative.
Looool no! Fail fail fail!

Why is flux 2 so shit?
>>
>>107820313
with the dolly right ltx2 lora, KINO!

the black man is walking to the right and saying "I must defend my fent, it is the way of the nigga samurai." Japanese samurai movie music is playing.

https://files.catbox.moe/vbsi6e.mp4
>>
>>107820320
Nice gen btw
>>
>>107820329
>Why is flux 2 so shit?
It's not shit, it's SAFE.
>>
>this summer, floyd is: fent cop

https://files.catbox.moe/9y8rf8.mp4
>>
File: 1768017709959.jpg (57 KB, 1024x1024)
>>
File: 1805288664.png (618 KB, 449x580)
>>
q8 distil seems better (audio/video) compared to fp8 distil which is still good. worth a try:

https://files.catbox.moe/tm5r51.mp4
>>
>>107820329
oh look it's Flux Hater Guy. gr8 b8 m8
>>107820351
it's not though
>>
>>107820368
>no slit/small camel toe crease
sad
>>
kek, she narrated the directions but she left and did the door still.

the blonde girl gets up and leaves the room to the right, through a wooden door that she slams shut.

https://files.catbox.moe/yw1j8l.mp4
>>
lmao it knows trump natively, video and audio. only source was an image.

Donald Trump wearing a black suit walks in from the right and says "she has great genes, you know. the best genes! the best."

q8 ltx2 from kijai repo

https://files.catbox.moe/vwzztt.mp4
>>
>>107820375
smol img
>>
File: Comparisons.jpg (2.69 MB, 3488x1600)
>>107820392
(intentional samefag self-reply)
I did in fact gen the exact same prompt on three other models when I did the Flux.2 one by the way, none of them were as close to what I wanted in this case.
>>
this is the best Trump I've seen in i2v so far. LTX2 q8 is pretty cool. and I only have 16gb vram, but it works because 64gb of ram can hold the models.

https://files.catbox.moe/owz7gh.mp4
>>
>>107820450
Can any Chroma user explain the low quality of the image on the right? I've tried Chroma, and that level of quality is common.
>>
the camera pans to the right, that edit fixed it. Now it's Trump time.

https://files.catbox.moe/2lkemr.mp4
>>
File: 00168-2365336293.png (1.2 MB, 1536x1024)
>>
>>107820507
>>107820507
>>107820507
when ready
>>
File: chroma_0345_.jpg (1.97 MB, 2048x3584)
>>107820482
cause a troll keeps using the 512 res pretrain model instead of the 2k res model people actually use. But I assume you are said troll and are just trolling in bad faith like usual
>>
Fresh when ready

>>107820534
>>107820534
>>107820534
>>
>>107820526
it's bghira, he has a personal vendetta against lodestone for some reason
>>
>>107820535
duplicate thread, please remove
>>
different OP different thread
>>
There needs to be a range ban on the duplicate thread trolling. It's insufferable.
>>
Anon didn't learn last time that no one wants his threads without the links?
>>
>>107820535
Stop ruining ldg, shitposting faggot
>>
>>107814922
>>107814922
>>107814922
Migrate back.
>>
>>107820576
wdym? the op isnt changed like >>107820521 tho
>>
>>107820450
the left one looks sick though, I'd fuck it
>>
>>107820586
i think it has the troll shitposter links removed. those rentries should be filtered at this point, the schizo uses them to ruin ldg all the fucking time
>>
>>107820592
? theyve been in op for over 1000 threads
>>
>>107820482
I'm the person you're replying to, I don't think the Chroma image here is really "bad" per se, it's just extremely different from other models' ideas about things I guess. I'm sure if I fed the Flux version to my jailbroken Gemini setup for captioning, and then fed that long natural language output back to all the models, it would make the results from everything way more similar for example.
>>
>>107820227
No problem. I haven't heard of people using that model to caption yet, I wonder if it's good at it.
>>
>>107820599
who took NetaYume out though? It was there like a couple threads ago. The guy who has the vendetta against NetaYume is the same one constantly making troll threads.
>>
>>107820673
it's huge and there has been an abliterated version out, so I thought why not try it
>>
>>107820685
let us know if it compares well. else I recommend the models mentioned earlier, AFAIK these are ~the best ones you can run locally on consumer gpu