/g/ - Technology


I'm Getting Litty Edition

Discussion of Free and Open Source Text-to-Image/Video Models and UI

Prev: >>106585705

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2122326
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
>>106589837
you didn't add anistudio to the OP
>>
>>106589848
correct
>>
>>106589837
faggot OP forgot this
https://github.com/FizzleDorf/AniStudio/releases/tag/pre-release

>>106589855
>t. faggot
>>
Blessed thread of frenship
>>
File: ComfyUI_00497_.png (1.91 MB, 1328x1328)
>nunchaku qwen image models released
>deepcompressor last updated 6 months ago
>no quantization code in deepcompressor so you can't quantize your own merged models
they are going closed source
>>
SaaS is here to stay.
>>106589823
The only difference between you and me is how far my GPU sits from my setup.
I'm running local just like you, images save to my local drive in the same folder as my API nodes and local UI workflow.

Or are you telling me you're somehow generating with autonomous electricity and internet access while staying local, retard?
>>
>>106589911
ani is saving local. saas lost
>>
>>106589911
>The only difference between you and me
kek, you are using the fisher price version of AI generation, you are like someone who thinks Facebook is the internet

go back to /saasdg/ and wallow in your misery
>>
File: R9~01.png (512 KB, 855x375)
Do lora files contain metadata? And if so, is there a way to strip it?
>>
File: ComfyUI_00502_.png (1.66 MB, 1328x1328)
>>106589911
>The only difference between you and me is how far my GPU sits from my setup.
>>
File: 00044-3161100875.png (1.86 MB, 1080x1576)
>>106589866
>jumpscares you
>>
>>106589944
Doubt it, since that would have to be placed there by the trainer program, and for what purpose?
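fwiw kohya-style trainers do write their training config into the safetensors header, so it's worth checking; a quick sketch with the safetensors python lib (paths made up):
[code]
# peek at, then strip, the metadata header of a .safetensors lora
from safetensors import safe_open
from safetensors.torch import load_file, save_file

with safe_open("my_lora.safetensors", framework="pt") as f:
    print(f.metadata())  # training args, dataset tags, etc. end up here

tensors = load_file("my_lora.safetensors")
save_file(tensors, "my_lora_clean.safetensors", metadata={})  # re-save with an empty header
[/code]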
>>
>>106589911
>how far my GPU sits from my setup
>my GPU
LOL
>>
>>106589911
>The only difference between you and me is
kek, you literally have a jew standing behind you saying 'no goy, you can't generate that, you can only generate this'

(((SAAS))) shill begone
>>
>>106589837
What is the alternative to buying a GPU to run AI models? Surely all those server farms aren't built off GPUs right?
>>
>>106589978
I think they call them accelerator cards in the enterprise market, but yes they're graphics cards
>>
>>106589978
nah you're right. they use pixie dust and unicorn farts
>>
/ldg/ bross welcome SaaS users too!
Share your gens, don't be shy about your outputs here!
We're all working with generative models, local or cloud based doesn't matter.

/ldg/ accepts everyone, post your creations and let's see what you've been making!!

Pic related generated on my local pc with local nodes with my local billing address, same as you.
>>
*yawn*
>>
>>106589978
>Surely all those server farms aren't built off GPUs right?
NVidia is swimming in endless money, surely those server farms aren't using GPUs, right?
>>
Qwen SRPO waiting room:
https://github.com/Tencent-Hunyuan/SRPO

Waiting for a hero to try this on Qwen
>>
>>106589978
They're not built off gaming gpus, no. That's why nvidia can charge an arm and a leg for tripling your power bill, they don't care about you because gaymers don't financially matter (anymore)
>>
>>106590010
It was tried on Flux and results weren't that good. Flux is more realistic than Qwen by default. It is over.
>>
>>106589987
big if true
>>
>>106589837
>previous thread
>ctrl + f "litty"
>zero results
what did OP mean by this
>>
>>106590019
>Flux is more realistic than Qwen by default
lel
>>
>>106590027
This is a zoomer thread now, he's bussin on god no cap fr fr on ohio skibidi rizzler's gyatt
>>
>>106589911
KYS shizo
>>
>>106589987
>the true state of local
>>
File: 00284-3571758110.png (2.15 MB, 1080x1576)
>>106590040
fr fr ong no cappin' bruhski we
uh, fuck i don't know more zoomer lingo
>>
>>106590040
O.o
>>
>>106590019
Even if it doesn't completely unslop the model, lowering the bias towards 4o garbage would already be a victory, don't you agree?
>>
>>106590038
It's the truth. Krea was no accident. It is based off a non-distilled Flux model. Chroma was no accident, similar. Qwen is not even distilled and it's already slopped.
>>
File: AnimateDiff_00370.mp4 (2.19 MB, 1040x720)
>>106589958
gib catbox for godzilla tits miku
>>
File: ComfyUI_00022_.jpg (1.12 MB, 2048x2048)
>>
>>106589995
Imagine yawning at objective truth.
Imagine yawning when your "local" models were literally trained on LAION scraped by German academics using cloud compute clusters, then distributed by Stability AI (a company).
Imagine yawning when your loras were trained on datasets someone else curated and uploaded to HuggingFace (owned by investors).
Imagine yawning when your "local" inference is running on hardware designed by NVIDIA/AMD and drivers maintained by corpos.

Sure, keep yawning about how clicking download on civitai makes you some kind of digital freedom fighter while SaaS chads just cut out the middleman. At least we are honest about using corpo services instead of larping as tech libertarians.
>>
>>106590063
bro dumped like a dozen catboxes recently FAGGOT GO LOOK FOR THEM IN THE PREVIOUS THREADS
>>
*yawn*
>>
>106590071
holy cloudkek cope
>>
>>106590058
Krea attempted to specifically remove Flux slop, and even that failed.

Flux is THE slop model, nothing else comes close.
>>
>>106590065
i should call her...
>>
Decided to try out flux and apparently my 4090 is not enough lol
>>
>>106590052
https://en.wikipedia.org/wiki/Glossary_of_2020s_slang
some i remember being used when i was a kid and probably before that
quite a few are p funny
>>
>>106590065
Why are you posting pics of my waifu
>>
>>106590090
that's one way to out yourself as a total retard
>>
>>106590063
that just made my dih leak a fluid, sure bud
https://files.catbox.moe/w7lnm4.png
though the background is a bit fried, i think i busted the cfg scale on this one but i forgor

>>106590073
i appreciate the enthusiasm but its fiiine

>>106590095
>99% of this shit is just black/urban simplifications of already simple words
shit's the way the cookie crumbles i guess, personal favorite has to go to "ligma" and its variations though honestly.
>>
>>106590052
Model/catbox? I'm trying to improve my proompting when it comes to realistic gens
>>
File: ComfyUI_00035_.jpg (329 KB, 1328x1328)
>>
>>106589862
>Commercial License
>If you prefer to use this project under a commercial license, please contact us at [your-email@example.com]

kek
>>
>>106590110
>https://files.catbox.moe/w7lnm4.png
no i wanted the catbox of miku, unless that wasn't you
>>
>>106590075
Another yawn from "I downloaded someone else's model" anon?
Let me guess, you're running it on Windows too? Microsoft thanks you for your "local" rebellion.
While you yawn, your GPU drivers get updates from NVIDIA servers.
While you yawn, your Python packages come from PyPI.
Your model checkpoints from Google Drive links posted by randoms?

Please, tell me about your sovereignty while your entire stack depends on corpo infrastructure you have no control over.

Yawn harder,

SaaS brothers welcome! We are /ldg/!
>>
>>106590113
this catbox >>106590110
same model, same prompt, just cfg i'd float around 3.5-4.5 depending on the lora.
though honestly if you have a gpu with more than 8gb of vram, i'd look into doing a refiner pass with an already realistic sdxl model.
the second i pick up the 16gb card i'm looking at, i'm dedicating entire weekends to trying that out. nova animal is good, but it has its weaknesses.
>>
>>106590135
>no i wanted the catbox of miku, unless that wasn't you
oh i thought by the way you worded it, you wanted to TRADE for the catbox of my image.
periods and commas are important my m8. pretty sure that guy just used wan and asked for some titty jiggles, it's not difficult to pull off.
>>
>>106588315
sexo with jenny
>>
>>106590128
Qwen does an okay job with the prompt. The model is superior to Seedream 4.
>>
>>106590129
where does it say that?
>>
>>106590139
Thanks
>more than 8gb vram
I do, but aren't there options to unload the base model if you're using a refiner? After all, the base model is not needed during the refiner pass
>>
>>106590183
reforge as far as i can tell, fully unloads the first model then loads the second, then when you do a new model it does the process over again.
very slow on my near 10 year old card, i'm sure it's near instantaneous on anything rtx 3000 and newer.
with comfyui there's specific nodes you're supposed to use for better memory management, like unloading models and clearing memory. some wan workflows use that for vid2vid passes.
>sorry i can't be of more help i haven't touched anything like this in months again due to aging gpu
>>
>>106590175
sounds like a prompt skill issue
>>
File: FluxSRPO_2.png (1.07 MB, 1024x1024)
>>
out of curiosity, there's no way to monetize genning an animation from an image right? since i assume there's a lot of legal trouble even if it's extremely private

like there's i2v websites but if you were a named individual on like patreon or something, you'd probably get taken down and sued fast.
>>
>>106590190
The prompt is as straightforward as you can get. Works on Chroma. Works on actual decent SaaS models. It doesn't get any better than that.
>>
File: AnimateDiff_00373.mp4 (3.05 MB, 1040x720)
>>106590146
noooo i'm the one who animated it
>>
>see insane outputs on civitai model page
>"dude, why do my outputs always look like ass but everyone else's are fire?"
>download image
>import into comfyui
>at least 30 nodes with multiple groups for upscaling, skin enhancers, hand fixers, face detailers and so on
Do... do you guys actually do all that? Am I just not autistic enough?
>>
javascript:quote('106589609')
Escape from Tokyo with Snakette Pliskenawa
>>
>>106590233
holy fucking booba animation dude

>>106590248
you're not autistic enough for the autism club
>>
>>106590248
>Am I just not autistic enough?
correct
>>
File: 1422449559229.jpg (16 KB, 330x344)
>106590249
>>
>>106590263
that's how chatgpt said I should do it man
>>
>>106590255
>>106590257
Okay then, what do you guys recommend? Are upscalers needed when the default resolution range of sdxl models (so like 768x1280 for example) is fine for me? What about the other groups I mentioned?
>>
>>106590289
If you have to ask, you'll never make it.
>>
>>106589609
Escape from Tokyo with Snakette Pliskenawa
>>
>>106590294
How dare I ask for advice instead of wasting hours experimenting with what works and what's needed, shame on me
>>
>>106590206
Where there's a demand, there's a way, but discussion falls outside of the scope of this site.
>>
>>106590289
>(so like 768x1280 for example) is fine for me?
90% of the wow factor for a gen is how high the res is desu
>>
>>106590289
>>106590359
also it's, im pretty sure, 1216x832, not 1280x768 for some reason.
>>
>>106590359
I don't need 8k images when I'm trying to generate some decent-looking smut, especially since I'm on a 1080p monitor
>>106590371
As long as the total amount of pixels is the same and the dimensions are a multiple of 64
>>
>>106590405
Upscaling+refiner second pass is good for ironing out minor errors and artifacts. And just 1.25x is fine really.
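if you're not on comfy, the same idea in diffusers looks roughly like this (a minimal sketch; model id and strength are placeholders, tune to taste):
[code]
# second pass: upscale the first gen 1.25x, then low-strength img2img to iron out artifacts
import torch
from diffusers import StableDiffusionXLImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

img = load_image("gen.png")
img = img.resize((int(img.width * 1.25), int(img.height * 1.25)))  # 1.25x is plenty
out = pipe(prompt="same prompt as the first pass", image=img, strength=0.3).images[0]
out.save("gen_refined.png")
[/code]
low strength keeps the composition; push it toward 0.5+ only if you want the second pass to start reinterpreting things.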
>>
>>106590289
>Are upscalers needed
you can upscale if you want, adds more detail and is an easy way to fix shit. usually the way people inpaint is by cropping a square around their masked area -> upscaling or downscaling it to their desired resolution (either 1024x or higher) -> denoising the masked area -> stitching the masked area back into the original image (this is done automatically using the 'masked only' option in a1111-like guis or the crop and stitch node in comfyui, rough sketch below). upscaling the original image gives you more pixels to work with and is less lossy. optional though if you know what you are doing
>hand fixers, face detailers
i recommend you skip this stuff and stick to doing it yourself manually
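that crop-and-stitch loop as a rough diffusers sketch, if anon wants the idea outside a gui (box coords, model id and strength are made up):
[code]
# crop a square around the mask -> upscale -> inpaint -> stitch back at original size
import torch
from PIL import Image
from diffusers import StableDiffusionXLInpaintPipeline

pipe = StableDiffusionXLInpaintPipeline.from_pretrained(
    "diffusers/stable-diffusion-xl-1.0-inpainting-0.1", torch_dtype=torch.float16
).to("cuda")

image = Image.open("gen.png").convert("RGB")
mask = Image.open("mask.png").convert("L")   # white = area to redo

box = (256, 256, 768, 768)                   # square around the masked region
crop = image.crop(box).resize((1024, 1024))
mcrop = mask.crop(box).resize((1024, 1024))

fixed = pipe(prompt="a hand", image=crop, mask_image=mcrop, strength=0.6).images[0]

# paste the repaired square back at its original size
image.paste(fixed.resize((box[2] - box[0], box[3] - box[1])), box[:2])
image.save("gen_inpainted.png")
[/code]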
>>
>>106590405
>I don't need 8k images
It's not about being 8k. Doing at least a single second pass amplifies how "good" an image looks.
>especially since I'm on a 1080p monitor
That's the thing about "highresfix", it adds details beyond what we think of as being just a higher resolution. I'm on a 480p screen and I gen images that are at least double if not triple the base res.
>>
>>106590429
>I'm on a 480p screen
crt monitor? or just those TFT dell monitors from like 2006?
>>
>>106590436
It's a MAG 321UP, I just like 480P.
>>
>>106590448
Why would you pretend to be me though.
>>
>>106590448
>4k ultrawide monitor
>uses it in 480p
>>
>>106590248
>do i have to do all those things that make a gen good to make my gens look good?
lots of workflows can be trimmed due to bypassed or disconnected nodes but in general yes
>>
>>106590063
back from the gym, here at catbox, i used wan2gp
https://files.catbox.moe/5s6c7t.mp4
https://files.catbox.moe/4yulk1.png
>>
>>106590452
i like to add to the narrative
>>
>>106590289
look into tiled diffusion
>>
>>
>>106590547
I think people underestimate Wan for being able to deslop poses and composition. Like you can use wan to have the character move to a place with a more appealing background or take a position or look in a direction that makes the piece unique and different from other things you might see from SDXL outputs.
>>
File: 1735962560570417.jpg (772 KB, 1416x2120)
>>
But can I use booru tags in Wan? Huh? Yeah, thought so.
>>
>2.2 vace model is over 32gb

How the hell am I supposed to use this?
>>
>>106590419
>>106590429
Fair enough, I'll give it a shot then. Do you use latent upscaling or a dedicated upscaler model?
>>106590428
Nice write-up, thank you
>>
>>106590588
I use latent upscale. Also if you modify the prompt in the conditioning hook going to the second sampler you can even do minor edits to the original pic with high enough denoise.
>>
>>106590577
get a job, hombre
>>
>>106590595
How low do you typically set the denoise for the second pass if you're just looking to upscale / fix small defects?
>>
>>106590577
Not worth it
>>
>>106590577
did they release the 2.2 vace model or are you talking about "vace fun"? i thought that just plugs into the existing t2v model like a lora
>>
>>106590595
Alright, cool, time to hopefully elevate my 1girl experience to the next level
>>
>>106590577
you mean the vace-fun version? real vace 2.2 is not out yet
>>
>>106590640
Idk if all models share this but on chroma 0.65-0.70 for plain upscale and 0.75+ it starts reiterating things and changing stuff
>>
File: AnimateDiff_00374.mp4 (3.7 MB, 1040x720)
>>106590464
thank the Lord in Heaven

btw 4 days late lol
>>106556008
>how do I prompt for blow kiss but without the hand movement?
>the woman stretches her arms out in front of her as if to give a hug. the camera zooms in on her face. her lips fill the entire screen and she kisses the camera with her lips
that at least keeps them from blowing a kiss with their hands
>>
>>106590723
Thanks man
>>
File: WanVideo2_2_I2V_00380.webm (1.13 MB, 1248x720)
>>
>>106590749
The cigarettes are better
>>
>>106590749
the surprise dick in the mouth are better
>>
>>106590803
Please Furkan, enough with that
>>
>>106590071
>larping as tech libertarians.
Keep crying pajeet, the only reason you are seething is cause you don't have the skill and the hardware, go back to your general /aids/ and /dale/, here is not for brown skin poorfags like you
>>
>>106590784
Fuck off this is the zoomer thread
>>
>>106590825
inb4 lefties claiming local ai generation is white supremacy
>>
File: WanVideo2_2_I2V_00381.webm (751 KB, 1248x720)
>>106590784
>>106590803
Impossible to please I swear.
>>
>>106590888
Now that's more like it
>>
>>
File: ComfyUI_01123_.png (1.78 MB, 1328x1328)
>>
File: ComfyUI_01125_.png (1.94 MB, 1328x1328)
>>
can chroma be prompted to do anime or cartoon illustration?
>>
>>106591089
yes of course but the style won't be consistent between seeds
>>
File: ComfyUI_01126_.png (2.56 MB, 1328x1328)
>>
>>106591035
>>106591072
>>106591111
model? It's not cloudshit / seedream, is it?
>>
>>106591089
yes, pretty extensively.

if you want to prompt specific characters with their standard outfits tho you'll usually find that easier on illustrious or noobai
>>
>>106591123
worse than that, it's qwen, but at least it's not chroma
>>
>>106591123
mindbroken
what a great troll that was
>>
File: ComfyUI_01128_.png (1.56 MB, 1024x1328)
>>106591123
It's just qwen with an awful Winnie the Pooh lora I came across while looking for porn and thought "why not?"
>>
File: AnimateDiff_00378.mp4 (2.42 MB, 720x720)
>>
File: AnimateDiff_00137.mp4 (1.64 MB, 720x720)
Still suffering from horrible colorshifts.
Makes the whole video genning useless really.
>>
>>106591171
Based on everything else here, this seems to be an entirely you issue.
>>
I heard sd3 launched and came to see if it's any good
>>
>>106591171
2.2 fast loras caused this last time I tried
>>
>>106591177
Yeah and Trump became president AFTER like two assassination attempts, bitcoin passed 100k, Thailand and Burma had a brief conflict. Putin Visited the US. Biden has colon cancer. And SD3 is shit.
>>
go to bed debo
>>
>>106591171
how are you specifically struggling so much with video gen. it's the easiest shit in the world
>>
>>106591189
welp, back to cave I guess
>>
>>106590738
>kisses the camera
including this has a 50/50 chance of wan spawning a camera and having her kiss that in my experience
>>
File: AnimateDiff_00138.mp4 (1.24 MB, 720x720)
>>106591172
>>106591188
>>106591196
The lightx2v loras introduce motion to the gen, it seems impossible to not use them.

The others aren't using first frame last frame loops.
>>
>>106591212
I really don't understand how they come out so bad on your end. Maybe switch to Kijai's workflows? They are retard proof.
>>
File: wan22_light21_00782.mp4 (830 KB, 416x416)
>>106591171
some people can't be helped
>>
>>106591212
catbox your workflow and i will tell you how you are fucking up
>>
>>106591211
has not happened to me once
>>
>>106591229
happens to me almost every time i prompt something involving a camera. most recently i tried prompting for handheld camera movement and it just spawned a camera in
>>
File: WanVideo2_2_I2V_00304.webm (1.66 MB, 1248x720)
>>106591235
It only ever happened to me once on a throwaway gen and it's because I specifically prompted ronald to "bat the camera away with his hand" which it interpreted as a camera appearing to be pushed away.
>>
>>106591251
i love that gen. still cracks me up
>>
File: WAN 2.2 I2V_00001.mp4 (1.98 MB, 320x448)
>>106591220
>>106591222
Guess I'm retarded, works fine with kijai.

>>106591227
https://files.catbox.moe/2u1ndd.mp4
>>
>>106591283
I just use kij these days desu. Something about comfy workflows that always come out wrong.
Which kind of makes sense since Kij is literally just the script wan gives you to run the model and all kij is doing is making nodes in comfy that can interact with that script.
>>
File: 1756371294586437.png (1.62 MB, 1328x1328)
ugh bros my 90s candid amateur out of focus grainy gens are so SOVLFUL!!! chromabros we're so back!!!!!!!!
>>
>>106589872
bro just merge the attention layers manually DUH
>>
>>106591283
okay yeah nvm aside from the questionable prompting im not sure there's any glaring issues lol. comfy implementation simply might be worse
>>
File: AnimateDiff_00381.mp4 (3.47 MB, 720x1040)
>>106591235
i don't know what you're doing wrong, using a retarded allinone model, or using a <8 cope quant but it just does not happen to me
>>
It seems that when i use more than 3 regions in regional prompter (forge UI), it breaks and only takes the first region into account. any idea what's the problem? I'm sure I used at least 4 regions in the past.
>>
>>106591328
nope im just using Q8 wan + 2.1 lightx2v.
>>
>>106591341
I can't help myself, the problem is you aren't using comfy UI.
I think my head would have exploded if I didn't at least type that. Sorry.
>>
File: AnimateDiff_00772.webm (2.87 MB, 1872x1088)
>>
File: AnimateDiff_00140.mp4 (689 KB, 480x480)
>>106591295
>>106591310
I should have listened to my own notes. Now that I've swapped back and forth between two different 'Save Video' nodes, the color shift is gone. There has to be some shit going on that glitches it out.
>>
File: AnimateDiff_00141.mp4 (385 KB, 480x480)
>>106591371
Flawless loops now, with lightx2v, REEEE
>>
https://files.catbox.moe/2mxzj8.png
https://files.catbox.moe/mfy6zj.mp4
>>
>>106591442
The consistency is impressive, some small artifacts in the bodypaint but overall it looks like in game footage
>>
>>106591442
ummm metadata?
>>
>>106591111
If only the bears weren't such bearlets.
>>
>>106591474
Winnie the Pooh is a canonical bearlet. It can't be helped.
>>
Is it possible to save each new gen into its own new folder? I realized with png, I can get rid of even more distortion in a loop.
>>
>>106591458
this shit to me a whole 45 minutes to get it decently right. very frustrating trying to do convenient censorship just right enough to avoid triggering jannies with the ban hammer.
this is the original lora if curious.
https://civitai.com/models/1714926/tomb-raider-lara-croft-survivor
>>106591470
drag the image to png info on forge anon. video metadata is in its comments. drag it to wan2gp.
>>
>>106591496
filename_prefix can do https://blenderneko.github.io/ComfyUI-docs/Interface/SaveFileFormatting/ and also you can do folder/subfolder/[...]/filename
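e.g. something like this (widget names depend on your actual nodes, so double check against that page):
[code]
ComfyUI/%date:yyyy-MM-dd%/gen            -> output/ComfyUI/2025-09-15/gen_00001_.png
%date:yyMMdd-hhmmss%/%KSampler.seed%/gen -> a fresh folder per gen (timestamp + seed)
[/code]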
>>
>>106591442
*steals your workflow.*
>>
>>106591550
Thumbs up!
>>
>>106591341
>>106591352
ok it works in a1111 so it's a problem with forge. using comfy is out of the question so i guess i'll try reforge
>>
What samplers do you all use for wan? I've been using res_multistep and while good it makes gens look sorta plasticy
>>
>>106591647
>it makes gens look sorta plasticy
That would be the 4step LoRAs
>>
>>106591652
nunchaku wan never ever
>>
File: kek.png (71 KB, 1745x509)
https://huggingface.co/bytedance-research/UMO
bytedance has released the full model of UMO (not just the lora)
>>
>>106591722
@grok what is this
>>
File: 1729677592019958.jpg (656 KB, 3005x1566)
https://s2guidance.github.io/
Babe wake up, it's time for Alibaba to go for a new cope, "The next replacement of CFG(TM)1!1!!!1!"
>>
>>106591722
I highly doubt ByteDance is going to give local a model like Seedream.
It's either going to be Wan or Qwen that does it. Probably Wan since they don't go for slop evals.
>>
>>106591731
>THING is all you need
Whenever I see this I assume it's trash.
>>
>>106591740
idk alibaba has started to keep some things behind closed doors lately
>>
>>106591759
>lately
>>
>>106591759
>alibaba has started to keep some things behind closed doors lately
what happened?
>>
>>106591770
Qwen has started keeping certain models api-only
>>
>>106591771
>it's beginning
it was bound to happen, they're starting to get models that can be serious rivals to the best API ones, no way they're gonna release SOTA models, no one is doing that, ever
>>
>>
>>106591731
interesting paper, but id like all these bold claims to be backed up by actual code.
>>
File: 1749016234009802.png (286 KB, 1840x1189)
https://xcancel.com/bdsqlsz/status/1967431792992129065#m
His English is rough, but if I understand correctly, in a week we will have a new editing model and another video model.
>>
File: file.png (2.38 MB, 832x1488)
>>106591795
that's why europe got rid of most of the bears
>>
File: OURS.png (182 KB, 1032x825)
>>106591731
that's funny, all the other replacements cope of CFG were actually making it worse than CFG itself
>>
>>106591845
Did they get nostalgic and decide to mass reintroduce wild creatures to assault the local population?
>>
>>106591731
I want to say "nothingburger" but since it's from Alibaba I want to believe, so far they've shown that they are a serious company.
>>
>>106591558
It's saving to a folder, but each subsequent gen goes into the same folder. For png renders that's going to be a mess, hence a new folder for each time you gen.
I'm a complete beginner so I can't make sense of that page.
>>
File: 1740345867628911.png (319 KB, 2051x911)
>>106591731
>We use a De-distilled version of Flux Labs (2024) in our experiments.
excuse me? how did they get that? I doubt they collaborated with BFL so I guess they used this model?
https://huggingface.co/nyanko7/flux-dev-de-distill
>>
File: chroma.png (2.4 MB, 832x1488)
>>106591855
italy and romania and probably others had a number of bear situations where people died without having done anything particularly stupid, yes
>>
>>106591873
Change your "filename_prefix" field from AnimateDiff to %date:yyMMdd-hhmmss%/AnimateDiff
>>
>>106591896
>how did they get that?
Seems they give it out to anyone so long as you're not a filthy local shitter.
>>
>>106591911
>as you're not a filthy local shitter.
that's why you don't have it either :(
>>
>>106591873
the page explains that you can use %node_name.widget_name% or %date:FORMAT% to define the foldername it goes to.

use that for a foldername with the date and seed for example
>>
>>106591910
Hell yeah, that worked. I bring you raunchy frieren and orc wip as thanks: https://files.catbox.moe/ze7qky.mp4

>>106591922
Ah so the date format decides each new folder, because it's counted in seconds. If I were to keep it just to the days, everything I gen goes into that one folder for the day?
>>
File: Chroma2k-test_00087_.jpg (581 KB, 1248x1824)
>>
>>106591731
>https://s2guidance.github.io/
The outputs look a little deep fried to me desu.
>>
File: 1751222080386261.png (1.17 MB, 1024x1024)
>>106591974
it does, every time a "replacement" of CFG comes in, it's always some ultra slopped, ultra fried shit (but it follows the prompt better though !!!!)
>>
>>106591948
>Ah so the date format decides each new folder, because it's counted in seconds. If I were to keep it just to the days, everything I gen goes into that one folder for the day?
yes. and as it says with %node_name.widget_name% you could also use any other information from any other node, such as a seed or other random number from a random number generating node
>>
>>106591982
>Makes the image better by destroying it.

Can't wait to play with it for a few hours and never use it again.
>>
>>106591442
>>106591567
that's a good idea
>>
>>106591989
The widget name would be the parameter inside of that named node, for example %Ksampler:noise_seed%?
>>
File: 1755101377359070.mp4 (860 KB, 448x704)
>>106591997
>webm
That's quite the glowup of her.
>>
>>106591997
Why is she so captivating?
>>
>>106591997
She could have a future in porn
>>
>>106592046
Wasn't her father like very high up at Goldman Sachs?
She never ever has to worry about money. Ever.
>>
>>106591969
>>
>>106592064
>Bomb expert, dual bomb expert to be exact
>>
File: 1741844197478984.jpg (1.31 MB, 1248x1824)
Spent some time with SRPO
Terrible prompt comprehension and medium knowledge compared to Chroma or even Qwen. Half the time it just ignores a chunk of my prompt. Still slops hands regularly. Can only decently do 3d, but is completely incapable of good nsfw. Beyond faster generation, i honestly don't understand what the fuck the point of that finetune even is or why it has been shilled around lately.
Pic related is how SRPO understands OIL PAINTING.
>>
>>106592064
>>106591997
>>106591567
you guys are getting it all wrong, i've been trying tooth and nail to get the camera to do a 360 degree orbit around the subject and not the subject doing a 360 body spin in front of a static camera. This shit is pissing me off to no end.
>>
>>106592086
SRPO seems to be a good method to unslop renders, but doing it on Flux was a retarded move, you can't save Flux it's obvious at this point, can't wait to see them try on qwen image though
>>
>>106592107
Weren't the SRPO guys from Alibaba? If so, why didn't they do it on Qwen to begin with?
>>
>>106592093
damn I forgot about her game, should I buy some lube to play it?
>>
>>106592093
what's your prompt for this? it does seem to ignore "camera orbiting around character" prompt a lot
>>
>>106592114
no, SRPO was invented by Tencent
>>
>>106592003
i believe it is %KSampler.noise_seed% but you got the concept right.

i'm using another node than the vanilla KSampler so I can't definitively check it
>>
>>106592118
Ahh, that makes more sense then
>>
Is it possible to pass an existing video into an img2vid generation with Wan2.2? I try doing it, but it fucks with the colors in a way that regular img2vid generations don't do.
>>
>>106592179
What are you trying to achieve exactly?
>>
>>
>>106592039
Because you're white
>>
File: file.png (2 MB, 832x1216)
>>106591810
Why so much focus on video and imagen now? Not complaining of course.
>>
>>106591997
Can you post the starting pic?
>>
>>106592249
They already have a niche cut out in the llm space. Now they are trying to cut out one in the image gen space. They throw us the failed attempts along the way.
>>
>>106592249
Whatever tech is easiest to improve on for the return will be the focus. LLMs are now in an incremental codemaxxing era while there's a lot to improve on in video gen
>>
File: ComfyUI_00983_.jpg (568 KB, 2048x2048)
>>106592253
>>
>>106592262
>LLMs are now in an incremental codemaxxing era
I hate this. I just want one where its reward training is extracting semen from balls.
>>
>>106592210
I've got a video with glitchy eyes, and I'd like to pass it through the refiner to see if it gets better. It does get better, but it fucks with the colors.
>>
>>106592267
You can't just pass it through T2V at a very low denoise?
>>
File: 00156-2515056960.jpg (148 KB, 1824x1248)
>>106592116
prompt "The camera is orbiting 360 degrees around the woman's showing the viewer her side, back, other side of body before completing the full rotation back to her starting position. The 360 degree camera orbit around the woman's body is fast and smooth. The lighting is cinematic and dramatic, with soft shadows and realistic detail." good luck getting this work, frankly after 6 hours straight with multiple failures, i give up.
here a catbox for a spicy gen attempt.
https://files.catbox.moe/764uam.mp4
>>
After trying video gen for a couple days, my respect for blacked content has increased. It's impossible to get it working right without a shit ton of loras and different strengths etc etc. for regular sex it's so easy, just plug in and go. I wish black girls were more attractive though, because I just want contrasting content during sex.
>>
>>106592256
>They throw us the failed attempts along the way.
feelsbad being a local fag, because this is true, look at bytedance, they failed with UMO so they gave it to us, and once they struck gold with seedream 4.0 they kept it for themselves >>106591722
>>
>>106592272
I've tried that, but it changes the rest of the image too much, even at extremely low denoise, i.e. 0.05.
>>
>>
>>
>>106592288
Okay, and you tried i2v but with the first frame of the video as the input and vae encoded the broken video into the low noise sampler? Did you add noise to the samples?
>>
>>106592284
Genuinely curious as to what makes seedream so good. I'm not sure I've even seen an output from it.
>>
File: lel.png (486 KB, 555x552)
Image models from 2021 (sovlMaxxing) vs image models from 2025 (slopMaxxing)
>>
>>106592323
>Genuinely curious as to what makes seed dream so good.
they released the paper
https://xcancel.com/bdsqlsz/status/1966034419183124527#m
https://arxiv.org/abs/2509.08826
>>
File: WanVideo2_2_I2V_00387.webm (1.47 MB, 1024x1024)
>>
>>106592314
Yeah, and that fucks with the colors, unfortunately.
>>
/adt/ got deleted
>>
>>106592333
Yeah, but I've never seen an output from it.
>>
>>106592323
>I'm not sure Ive even seen an output from it.
>>106577845
>>106578184
>>106576677
>>106576615
>>
>>106592368
Hmm well, they are pretty sharp.
>>
>>106592267
read this, you can use a segment model to mask a face -> crop around it and upscale before sending it to the low model for a 1 step i2i, then paste it back in (rough sketch below)
>https://www.notion.so/bedovyy/WanFaceDetailer-261ce80b3952805f8aaefb1cdb90ec04
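the gist of it as a hedged sketch (the detector weights are an assumption, and i'm subbing a plain low-strength sdxl img2img for wan's low-noise model just to keep it short):
[code]
# per frame: detect face -> crop + upscale -> ~1-step img2img -> paste back
import torch
from PIL import Image
from ultralytics import YOLO
from diffusers import StableDiffusionXLImg2ImgPipeline

det = YOLO("face_yolov8n.pt")  # any face-detection checkpoint
pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

def detail_frame(frame: Image.Image) -> Image.Image:
    for x1, y1, x2, y2 in det(frame)[0].boxes.xyxy.int().tolist():
        crop = frame.crop((x1, y1, x2, y2)).resize((1024, 1024))
        # 4 steps * 0.25 strength = a single actual denoise step
        fixed = pipe(prompt="detailed face", image=crop,
                     strength=0.25, num_inference_steps=4).images[0]
        frame.paste(fixed.resize((x2 - x1, y2 - y1)), (x1, y1))
    return frame
[/code]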
>>
>>106592390
I'll look into it. Thanks for sticking around and giving me a way forward, whether it pans out or not.
>>
File: file.png (220 KB, 1128x1195)
Can someone help me figure out what to try next?
https://files.catbox.moe/fal8ln.mp4
Just want the dude to be black, and unfortunately neg prompts don't work for me with this setup.
>>
>>106592390
That is interesting.
>>
File: 1750525216196686.png (115 KB, 396x385)
>>106592408
>Just want the dude to be black
>>
>>106592408
pajeet you gotta master the english language before you can master prompting, so things like negro, basketball american, breathing simulator 9000 all evoke images of 'black skinned' people in the ai's mind
>>
>>106592306
amazing consistency
>>
File: WanVideo2_2_I2V_00388.webm (2.81 MB, 1024x1024)
>>106592390
I wonder how this actually works for non-anime stuff.
>>
File: file.png (1.99 MB, 832x1216)
>>
>>106592327
>Image models from 2021
there were none
>>
>>106592408
>Just want the dude to be black.
a BBC enjoyer I see
https://www.youtube.com/watch?v=oNsNjMuevXw
>>
>>106592340
That flame, otherwise it's really good
>>
>>106592327
What's the actual difference here, beyond the tan and him getting older?
>>
>>106592466
A man oblivious to the concept of plastic surgery.
>>
>>106592472
>>106592466
he was in a motorcycle accident and needed facial reconstruction.
>>
>>106592466
You can't see his cheeks, lips and chin have been bogged to hell and back?
Is this that face blindness autists talk about?
>>
File: 1730387215144568.png (299 KB, 2963x901)
>>106592459
2021 was the year this paper predicted that the future of image models would be diffusion models.
https://youtu.be/W-O7AZNzbzQ?t=3235
https://arxiv.org/abs/2105.05233
>>
File: 250915-183800.mp4 (1.77 MB, 992x1200)
>>106592408
Wan BLACKED lora when?
>>
>>106592466
>>106592480
>You can't see his cheeks, lips and chin have been bogged to hell and back?
>Is this that face blindness austists talk about?
those are the same "people" who see no problem with Chroma btw
>>
>106592491
obsessed
>>
>>106592480
I can see there being lip filler in 2025, but he already looks bogged in 2021. I don't see much difference in the other features that couldn't be explained by aging, a wider smile, and weight gain.
>>
>>106592486
tried one and it made weird 4 legged, cock and pussy monsters fucking.
>>
>>106591810
Surely they aren't releasing an inferior model to what we have. If they ain't on par with at least Qwen edit and Wan 2.2, nobody is gonna use them, kinda like omnigen 2, Hunyuan i2v and so on
>>
File: 1749259584572343.png (446 KB, 495x619)
>>106592476
>he was in a motorcycle accident and needed facial reconstruction.
it was in 2013 though, he looked fine before the 2020s, he started looking like bog way after that
>>
>>106592264
Thanks
https://files.catbox.moe/a9w5zo.mp4
>>
>>106592503
they don't care that they don't compete with local SOTA, they just want some good boi points and be treated like the "nice guy company", optics are important, especially for investors
>>
>image2image
>starting image is a white man on the bottom of the woman.
>10 lora setup
perhaps your settings and start image are bad to begin with.
>>
File: is this bait?.png (146 KB, 640x354)
>>106592466
>What's the actual difference here
>>
>>106592446
can't believe basketball player worked...
almost there but the penis is all wrong
https://files.catbox.moe/mus9ye.mp4
>>
>>106592466
Left is normal, right is someone using img2img on his face
>>
>>106592466
left is training a model with real data, right is training the same model with synthetic data
>>
>>106592571
try using "bestiality" next time in the prompt to get that skin
>>
>>106592509
>in an AI thread
>can't tell when something is obviously using AI and/or photoshop
grim
>>
>>106588114
Can I get a box for this?
>>
>>106592390
I can't be assed to download all of the segmentation models for this right now, but as far as I can tell, it just takes an input video, segments and boxes the faces, upscales them and denoises them at a higher detail, then pastes them back over the video, right?

I think the most interesting takeaway is that he uses causevid for the lowpass.
>>
>>106592613
unfortunately, that's the real face of Zac nowdays, what a waste...
>>
>>106592624
Did you train the lora on Mickey Rourke's face
>>
>>106592605
K i'll try it next.
Hell, if dog knots are easier than black guys I'll go that route instead
>>
>>
File: AnimateDiff_00325.mp4 (3.32 MB, 1280x720)
>>106592282
you have to be autistic with prompting with wan
mention all four angles step by step, mention how the background moves step by step
>>
File: ComfyUI_01771_.mp4 (1.02 MB, 1280x720)
>>106590560
That's how I keep consistency for characters in my project. I started with one good image and then do gens with her changing positions, then use that as a starting image. It's not perfect but better than the gacha of trying to gen the exact same outfit/appearance again. Sometimes I'll run the final image through img2img in an image model to refine it. The main drawback is the resolution is a bit low, so if the character isn't up close you can lose some details.
>>
what lora can i use in wan2.2 to pull their top down and expose breasts?
>>
>>106592886
>The woman pulls down her shirt, exposing her breasts

No lora needed.
>>
>>106592886
FLF
>>
>>106592889
i had
>she exposes her large breasts at the start
and nothing. I'll try this
>>
File: ComfyUI_temp_vusjp_00079_.png (3.38 MB, 1824x1248)
https://files.catbox.moe/2tyce3.png
>>
File: 00028-3933361757.jpg (269 KB, 1728x1344)
>>106592811
Just turned off my pc due to hearing weird noises from my 5090 and i gotta sleep. Please post the entire positive and negative prompts. I suffered seven hours straight with multiple failed gens and feel very burnt out. Help an anon out please :'). Please do a 360 orbit of pic related.
>>
>>106592971
prompt
>The camera is orbiting 360 degrees around the girl showing the viewer her left side, the background moves showing the left side of the room, wall. Then the camera continues to rotate showing the viewer her back, the background moves showing the back side of the room, audiences. Then the camera continues to rotate showing the viewer her right side, the background moves showing the right side of the room, wall. Then the camera continues to rotate showing the viewer her front, the background moves showing the front side of the room.
>>
https://files.catbox.moe/gzd7ba.png
>>
>>106592971
>>106592985
also use first and last frame and set the two frames to the same image
>>
File: ComfyUI_temp_dazrk_00031_.jpg (499 KB, 1248x1824)
https://files.catbox.moe/v5c1s5.png
>>
File: ComfyUI_temp_lcqvx_00004_.png (3.65 MB, 1824x1248)
lmk if such type of gore is allowed here, idk how to spoiler images https://files.catbox.moe/n9fkqr.png
>>
>>106592985
>>106592997
Will test this out later, thank you very much anon.
>>
File: ComfyUI_temp_vusjp_00053_.png (3.43 MB, 1248x1824)
https://files.catbox.moe/kpdgky.png

Captcha: YGANG
>>
File: 6132621603a42.jpg (20 KB, 359x325)
>Chroma-DC-2K-T2-SL4
These niggas will train anything except the qwen text encoder for it. Someone stop them
>>
>>106590405
>As long as the total amount of pixels is the same and the dimensions are a power of 64
I'm pretty sure SDXL isn't trained on 1280x768. It means a lot what specific resolutions it's trained on. SDXL is trained on multiple different resolutions. within the 2048 pixel plane meaning 1024x1024 and 1216x832 and some others i believe; you can look it up.
You're going to get the best results if you stick to the exact dimensions it was trained on.
It's the same way where you get very bad results with SD1.5 if you make it any other dimension than 512x512.
The training dimensions are kind of hardcoded into the model, and when you don't follow them, you kind of warp the vector space and it's associations with the pixel space.
You can generate at the native resolutions, and then upscale to a high resolution and then you can downsize and crop the images later. That's the way you'll get the best qualitative outcome with the current models.
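for reference, the commonly cited SDXL training buckets, plus a snap-to-nearest helper if you want to sanity check a resolution:
[code]
# commonly cited SDXL aspect buckets (all ~1 megapixel, sides in 64px steps)
SDXL_BUCKETS = [(1024, 1024), (1152, 896), (896, 1152), (1216, 832),
                (832, 1216), (1344, 768), (768, 1344), (1536, 640), (640, 1536)]

def nearest_bucket(w: int, h: int) -> tuple[int, int]:
    """Snap a requested size to the trained bucket with the closest aspect ratio."""
    target = w / h
    return min(SDXL_BUCKETS, key=lambda b: abs(b[0] / b[1] - target))

print(nearest_bucket(1280, 768))  # -> (1344, 768)
[/code]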
>>
guys which joycaption nodes to use?
>>
>>106592889
>>106592605
Thanks! it works way better than what I had originally. still need to work on the legs, either the guy has legs coming out of his hips to the side, or the girl is missing the bottom half of her legs
https://files.catbox.moe/ke3kv9.mp4
>>
File: 1743304472428653.jpg (1.2 MB, 1536x2304)
1.2 MB
1.2 MB JPG
>heh I haven't updated comfy in weeks
>update
>now all my gens come out a blurry mess
I'm fed up
>>
>>106593201
turn off fast optimizations retard
>>
>>106593211
There are no fast optimizations, retard.
>>
File: ComfyUI_temp_vusjp_00184_.png (1.38 MB, 1824x1248)
>>106593226
Turn off your eyeballs

https://files.catbox.moe/ct8jyn.png
>>
If i want to continue a video, do i use a different prompt? I used the same prompt as the initial and it just. slowed down and barely moved.
>>
I enjoy Chroma.
>>
i enjoi haveng extra chromasomee..
>>
>>106592825
do you think ani is on the right track for making a game engine with diffusion mechanics built in? you are the only anon I know that's making a game
>>
>>106593279
ymmv, generally the same prompt shouldn't do LESS on average

maybe you have too many speedup things enabled that interfere with motion, or not enough steps, or maybe you want to try the hps/mps reward or movement lora, or other things
>>
>>106593149
>joycaption
>nodes
lol ur gay
>>
>>106589978
>Surely all those server farms aren't built off GPUs right?
Yes they are, but it's not the same GPUs you can buy to put in your PC. If you watch some of the nVidia presentations, you'll see that they're super huge, and they're making them bigger and bigger each time.
nVidia's market is no longer really focused on consumer GPUs. They're more so in the business of designing custom systems for big businesses that need datacenters and software solutions for training and analyzing all kinds of stuff with AI.
>>
>>106593149
Just get taggui
>>
File: Landscape_2.jpg (278 KB, 719x723)
Specs: 32GB RAM, 12GB GPU, i9-12900K. Running ComfyUI with SDXL realisticslop

Switched from Forge to Comfy recently.
Two questions:

Is Comfy actually faster for gen/checkpoint loading than Forge or just me?

Anyone else notice Comfy outputs seem slightly softer/less sharp/quality?
>>
>>106593343
this anon just mentioned it >>106593201
>>
>>106593349
I have to move back to Forge?
>>
>>106593149
probably decide between some of the most recently updated
>>
>>106593201
>he pulled
>>
>>106590090
You know you can use quantized versions right?
>>
>>106593334
I just love noodles, what can I say
>>106593338
id rather not have YET another conda enviro PLEASE
>>106593362
I wanted to go to a 'mostly' generic route using a generic LLAVA wrapper which uses llama-cpp-python, but the generic vision nodes were not updated recently, the other nodes have descriptions all written and chink and id rather xi not see what im captioning
>>
so different seeds can give you a completely fucked unusable gen? is there any way to know how bad the gens will be? I did 2 gens and one had a random mystery guy added behind the girl and the other was perfect.
>>
>>106593426
xi has a "few" chinese that can write english - but the chinese sometimes write in their language.

just ignore it as long as you can use it.
>>
>>106593510
depending on model and prompt sure

you can't know in advance unless you use a very special model type but you can generate preiews with like TAESD on each step as it's crunching the tensors... on most models at least
>>
>>106593529
yeah i have previews on but by the time i can tell if it's fucked or not it's too late to abort
>>
File: Qwan_00014_.jpg (825 KB, 1984x2976)
Qwen SRPO when.
>>
File: wan22_light21_00786.mp4 (610 KB, 640x464)
>>106590560
>my boy discovering that wan is the best edit model
>>
new
>>106593668
>>106593668
>>106593668
>>106593668
>>
File: ComfyUI_02291_.mp4 (657 KB, 1280x896)
>>106593319
I don't know the specifics of what he's doing, is it just allowing the devs to create a prompt along with parameters to generate images in game? I think using image/video gen at runtime will be pretty common eventually, so starting something like that for games to come out a few years from now is probably a good idea. I don't think it's super viable right now because it's too slow or just not possible on the average consumer's hardware. Players won't like waiting a minute for a scene to generate. There are also issues with discontinuity, but maybe they'll be solved in future models or players just won't care.
>>
>>106593537
i don't see how there could be a way that lets you see it even earlier
>>
>>106593319
i think trani (read, you) should kill himself immediately


