/g/ - Technology






Pruned Edition

Discussion of Free and Open Source Text-to-Image/Video Models and UI

Prev: >>106642301

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2122326
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
cope thread of vramletniggers and mental cases
>>
all the rage against the KJ machine aside, i am getting utterly blown the fuck out by sharing this benchmarking page given they directly recommend his nodes. so. lol.
>>
>>106647244
has anyone done a good comparison using official vs kijai? how much movement loss are we talking about?
>>
What the fuck is this new movement copying tech, what did I miss, where are the workflow jsons?
You guys need a News section in the OP like /lmg/
>>
>>106647244
are they retarded? so far kijai has always been the far heavier workflow.
>>
Blessed thread of frenship
>>
>>106647244
>Native version (the so-called official workflow)
Kek, Comfyui is such a shitshow
>>
>>106647273
I wonder what model they use to parse previous threads for news, migus, etc
>>
>>106647244
> the kijai version is better because it uses kijai nodes

holy fucking esl. whoever wrote this is a room temp mango.
>>
will comfyorg pat the h1b premium to keep comfy, the chinks and the jeets in the US?
>>
>normies waking up to the idea that AI can generate children
Oh no...
>>
>>106647305
lmao, trump killed the org
>>
>>106647300
the original is in Japanese, that's an automatic translation
>>
shitposting aside, is flux kontext actually better than qwen edit at anything? i'm cleaning out old models and i genuinely can't think of a reason to keep it around. loras for kontext are slim compared to qwen too.
>>
File: elongate her hair.png (2.3 MB, 1536x1280)
>>106647325
Better non-destructive smaller changes, when it works
>>
>>106647325
I keep it for very niche use cases
>flux has better controlnets than Qwen right now
>processing hundreds of images quickly with nunchaku + lightning lora
>generating chibi manlet versions of characters without having to prompt it
>>
>>106647305
They are Canadians anyway.
>>
>>106647347
>>106647344
i see. wasn't aware kontext had controlnets (nor do i know what i would use them for i suppose)
>>
>>106647356
Asians that live in Canada are rich, no reason to live in slum like San Francisco
>>
>https://github.com/FizzleDorf/AniStudio/releases
New releases, y'all.
>>
>>106647436
The only release anyone cares about is the police release statement of your ACKing, sis
>>
>>106647436
>4 days ago
Not new, tourist.
>>
>>106647436
I thought Comfy was insufferable but that dev takes the cake
>>
File: file.mp4 (1.95 MB, 480x720)
>>106647436
congrats
>>
>>106647305
they will send comfy to bangladesh by mistake. it will be so funny
>>
>>106647533
holy shit this is so hot
>>
>>106647305
comfy is an immigrant?
>>
is there a good sdxl checkpoint to go for if you're looking to gen stuff that's NOT 1girl nsfw?
>>
>>106647570
he's from quebecistan
>>
>>106647294
>model
he does it manually
>>
>>106647347

how do you generate chibi manlet version of character?
>>
>>106647593
Explains why he is so insufferable
>>
>>106647628
lurk more
>>
>>106647582
"NOT 1girl" is a very broad category. Care to elaborate?
>>
File: 1523319044083.png (207 KB, 422x362)
Anyone done grid comparisons between identical loras trained on different resolutions?
>>
>>106647627
By using kontext. Will do that naturally.
>>
>motion is significantly better in wan 480p than 720p
wut duh, so the loras DEFINITELY need to be trained on the specific checkpoint to actually use them? dammit.
>480p did my prompt perfectly but gave her two nipples on each tit
>720p has near nonexistent physics and kinda stiff motion
>>
>>106647436
Pedo
>>
>>106647689

is the prompt just "generate chibi manlet version of character"?
>>
>>106647680
I'm just looking for something to play around with for a wide range of stuff and all these 1girl checkpoints feel a bit limited

I just need something different
>>
>>106647759
SDXL is a shit model if you want to stray beyond 1girl.
>>
>>106647724
No, just have it outpaint or even change poses. Chances are they will end up a manlet. It's a fault of kontext, not really a feature lol
>>
Just use Seedream. Most versatile model, no censorship, and insane base res
>>
>>106647582
unless a checkpoint is literal ass it should be able to do non 1girl fine
>>
>>106647796
>no censorship
post futa oneesans pegging a shota
>>
>>106647796
Diffusing... locally... ComfyUI... API... nodes...
>>
>>106647796
Post vagina or gtfo
>>
>last new thing didn't pan out
>anon back to replying to bottom of the barrel b8
Next new thing when
>>
File: x.mp4 (2.24 MB, 480x720)
>>106647567
yes. either go and save endangered regions of asia from depopulation or make more with wan
>>
>>106647796
>no censorship
come on bruh, no need to lie like that, the model is a great SFW model, that's it
>>
>coomberboomers seething over seedream
China won, the west lost.
>>
>>106647852
what are you using for img gen?
>>
File: 1741747423404945.mp4 (787 KB, 796x480)
kek it masked both after the cut
>>
>>106647863
Those shoulders nigga lmao
>>
apparently seedream can't actually do 4k and it just uses esrgan upscaling. they just hide that behind the api
>>
>>106647875
duh
>>
>>106647875
Anyone could have told you that with how blurry the damn model is
>>
>>106647863
Wtf I was literally just in the tdkr /wsg/ thread looking for a vid because I was about to do something like this
>>
apparently seedream can do native 4k in under 40 seconds, openkeks can't figure out how because they cleverly hide it within comfyui's API
>>
File: 1729290283441060.png (244 KB, 395x437)
>>106647863
>troonku
What would happen if I took the rope around your neck off?
>>
apparently my cock is in your ass right now
>>
>>106647928
it would be extremely painful

also i'm getting the hang of this but this shit is gonna be absolute gold for memes, like VACE on steroids.
>>
>>106647935
Oh that's what that was...barely felt it
>>
>>106647947
also, at the same time you could do a qwen edit swap, or shoop, then i2v with wan. but this has openpose and so on so it can be a direct swap.
>>
>>106647947
You're a big guy.
>>
>>106647952
yeah because your hole is so loose
>>
>>106647201
I so want a Mayli LoRa, I would generate her for years if I got one
>>
File: 1749549704780568.mp4 (1 MB, 832x480)
the anime girl is talking and holding a silver briefcase.

gonna try with a full body miku. but, it works.
>>
>>106647962
FOR YOU
>>
>>106647967
just take any photo and use qwen edit + clothes remover lora. viable alternative if no lora.
>>
>>106647615
Would be cool if it were true
>>
File: 1751248114853771.mp4 (1006 KB, 832x480)
>>106647973
yep, full body source is the way.
>>
>>106647211
>vramletniggers
>>106637352
>mental cases
>>106643240
>>
>>106647852
please don't stop
>>
File: 1737167160575550.mp4 (1.02 MB, 832x480)
>>106648000
now we're talking
>>
>>106647989
>just take any photo and use qwen edit + clothes remover lora. viable alternative if no lora.
I'll have to try that. She's so gorgeous. I'm glad she did porn once I just wish it wasn't facialabuse. Then again, maybe abuse and humilation is something she deserves.
>>
File: 1730369007488444.png (29 KB, 904x195)
Anyone else was able to use qwen image nunchaku?
I wanted to test it but it just crashes comfy silently for me.
It's just the standard qwen image wf with nunchaku model loader.
>>
>>106648063
but what if my penis can't handle it?
>>
File: file.png (397 KB, 1962x1385)
hi new learner here, i want to add some loras to a premade image to video workflow that i found following a guide.

the tldr is i want to use the breast bounce lora. how do i fit another lora into a premade workflow? is it as simple as adding another lora and chaining it or is there something else I have to do?
>>
>>106648094
Yes chain them in series or just use a multiple lora loader node which does the same thing.
>>
>>106648081
only one way to find out
>>
>>106648105
thank you. does it matter which order the loras are in, or is that more of a trial and error thing?
>>
File: 1743559105288766.png (813 KB, 1288x808)
the anime girl is wearing a black mask like bane from batman the dark knight rises, the background is white. remove the text and other elements aside from the anime girl.

qwen edit made a neat persona miku
>>
>>106648114
Order doesn't matter, only weight does.
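That checks out on paper: a LoRA is just an additive weight delta, so stacking is commutative. A toy sketch (plain numpy matrices standing in for the real low-rank B @ A factors):

```python
import numpy as np

rng = np.random.default_rng(0)
W  = rng.normal(size=(8, 8))   # base weight matrix
dA = rng.normal(size=(8, 8))   # LoRA A's delta (B @ A in a real LoRA)
dB = rng.normal(size=(8, 8))   # LoRA B's delta

# chained loaders just add each scaled delta, so order cannot matter
ab = W + 0.8 * dA + 0.6 * dB
ba = W + 0.6 * dB + 0.8 * dA
assert np.allclose(ab, ba)
```

The weights (0.8, 0.6 here) are the only knobs that change the result.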
>>
>>106648128
No, only height matters.
#ItsOver2025 #ItNeverBegan2025
>>
File: x.mp4 (2.11 MB, 480x720)
>>106647861
wan 2.2 t2v, no image reference. adult women with glossy clothes in a disco. the "fuck me" text also is prompted

recommending trying stuff that wan can do physics with 'cause it's fucking cool
>>
>>106648155
post workflow? how are you going above 5 seconds without issues?
>>
the problem with animate wan is it's a lot of work and you could just use wan 2.2 after an edit with qwen edit/kontext then prompt.
>>
>>106648201
interpolate and then run it a little slower
>>
File: 1730130659708132.png (746 KB, 1024x320)
change the position of the anime girl to a side profile.
>>
>>106648210
aw well thats not interesting thats just wasting my time with no new information. thanks anyways
>>
>>106647244
so native is for professionals and kijai is for vramlets, simple as
>>
>>106648255
what a weird response, would you rather just get no answer?
>>
>>
>>106648261
are you mentally ill? why did you interpret my response as hostile? i was just upset it wasn't any actual new frames since i don't need longer gens, i need more information because only 5 seconds of mommy worship is usually not enough
>>
>>106648155
>lightsticks not in the training data, yet
which group would you want to see first?
>>
>>106648298
>why did you interpret my response as hostile?
>thats just wasting my time with no new information
gee I wonder
>>
File: 1741797072671104.jpg (138 KB, 960x1706)
https://files.catbox.moe/fhdlss.mp4
https://files.catbox.moe/k1hor8.mp4
https://files.catbox.moe/d2dbuy.mp4

https://files.catbox.moe/6bc02e.mp4
https://files.catbox.moe/3a1ktg.mp4

https://files.catbox.moe/aq276p.mp4
https://files.catbox.moe/m3gn41.mp4
https://files.catbox.moe/u9tffg.mp4

https://civitai.com/models/1818841/wan-22-workflow-t2v-i2v-t2i-kijai-wrapper
https://civitai.com/models/1952945/wan-22-massage-tits-by-mq-lab
https://civitai.com/models/1874153/oral-insertion-wan-22
https://civitai.com/models/1923528/sex-fov-slider-wan-22
>>
>>106648078
OK I found out why, the wheel downloader node was downloading the wrong torch version for some reason (2.9 instead of 2.10).
>>
>>106648310
ok now it's time to be actually rude, because you're a fucking retard with no reading comprehension

did i say YOU'RE wasting my time? no fucking retarded monkey, I said that interpolating slower wastes my time because it takes longer to watch the video and there's no actual new information. holy shit kill yourself you fucking mongoloid
>>
>>106648317
>https://civitai.com/models/1952945/wan-22-massage-tits-by-mq-lab
crazy this wasn't blasted by some retard yet
>>
>>106648336
anon, absolutely no one would read it as that
you're autistic
>>
>>106648351
you have proven that you're retarded so why would anyone listen to your opinion on "absolutely no one"? kill yourself subhuman. even brown people know to apologize and shut the fuck up when they're wrong, so you must be a mutt with 1-3% nigger DNA floating around inside of you
>>
File: 1742946959392761.png (1.24 MB, 1288x808)
>>106648240
change the position of the anime girl to a rear profile.
>>
File: 1758393611321771.png (1.02 MB, 800x1280)
/lmg/ anon here. Haven't used imagegen since flux came out and was wondering what the meta is today. Looking to mimic the AI figure stuff /v/ has got going on using gemini. Anyone have recommendations for frontends and models?
>>
>>106648375
you're clearly a very smart and level headed person, anon, I bow down to you
>>
File: Q8 vs Int4 nunchaku.jpg (3.76 MB, 9600x2250)
>>106646564
So adding on to discussion from the previous thread.
I downloaded Q8 and Int4 nunchaku versions of Krea and experimented a bit.
Some images turn out fine. Noticeable changes but coherent without major defects. Some though indeed get noticeably degraded.
However, to be fair to it, it's running 4 to 6 times faster than the Q8. Can you not make the argument that it is better to gen more images in the same time frame and pick the best seed? The quality of this is much higher than an LCM distill while providing comparable speed up.
Or maybe I am indeed a coping VRAMlet I dunno.
They seem to have only released 128 rank versions for Qwen, which I also want to test, but it would take a while to experiment on my system.
>>
>>106648383
noob/illustrious for anime (wai v15 is good)

wan 2.2 for video

flux/qwen for realism

you could use a lora for anime figures to get something like that.
>>
>>106648387
I'm testing right now with the prompt from the example comfy wf.
bf16 vs q8 vs svdqfp4.
>>
>>106648386
dude actually kill yourself, i literally said "that's not interesting" do you think I was referring to YOU being not interesting you egotistical fucking retard? of course not, i was referring to the method. and then the second "that's" obviously was referring to the same subject as the first "that's" because that's how it works in the english language you retarded fucking ESL. holy shit kill yourself I refuse to let you accept that you are in the right here
>>
>>106648383
Flux Kontext and Qwen Image Edit if you want to give a reference picture of a character and turn it into a figure. (No idea how well they work for this though)
If you don't intend to use a reference image and just prompt the character of your choice with a figure/toy lora. Someone out there is bound to have trained one for a model.
>>
>(wai v15 is good)
rope
>>
>>106648417
For Qwen (128)? Oh yeah post the results later then.
>>
>>106648317
Impressive.
>>
>>106648419
of course anon, you're right as always and you're generous in your interactions with others
again I bow down to you, this time a little further down
>>
>>106648433
Yeah qwen image, with the following prompt :

"A vibrant, warm neon-lit street scene in Hong Kong at the afternoon, with a mix of colorful Chinese and English signs glowing brightly. The atmosphere is lively, cinematic, and rain-washed with reflections on the pavement. The colors are vivid, full of pink, blue, red, and green hues. Crowded buildings with overlapping neon signs. 1980s Hong Kong style. Signs include:
"龍鳳冰室" "金華燒臘" "HAPPY HAIR" "鴻運茶餐廳" "EASY BAR" "永發魚蛋粉" "添記粥麵" "SUNSHINE MOTEL" "美都餐室" "富記糖水" "太平館" "雅芳髮型屋" "STAR KTV" "銀河娛樂城" "百樂門舞廳" "BUBBLE CAFE" "萬豪麻雀館" "CITY LIGHTS BAR" "瑞祥香燭莊" "文記文具" "GOLDEN JADE HOTEL" "LOVELY BEAUTY" "合興百貨" "興旺電器" And the background is warm yellow street and with all stores' lights on.
>>
File: 1755717817232087.png (2.11 MB, 1536x1536)
>>106648404
for example I used this:

https://civitai.com/models/656994

prompt: masterpiece, best quality, amazing quality, hatsune miku, waving, <lora:Figure:1> Figure, Figma, pedestal

if the eyes aren't perfect just do an adetailer pass.
>>
>>106648317
that is a really long penis
>>
>>
>>106648387
it doesn't mean a thing if there's not a fp16 comparison too, we don't know which one is closer to the real deal
>>
>>106648454
whats with the incredibly cursed starting image
>>
wtf is wrong with miku normalniggers?
>>
>>106648459
Aren't Q8s very close to fp16 the overwhelming majority of the time? I don't recall ever seeing any counterexamples.
But I can make a quick fp16 vs int4 test with the SDXL version they have recently released if you insist.
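For intuition on why Q8 tracks fp16 so closely: with 8 bits, the worst-case rounding error per weight is a fraction of a percent of the weight range. A simplified sketch (symmetric per-tensor int8, not the exact block-wise GGUF Q8_0 scheme):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=4096).astype(np.float32)

# symmetric per-tensor int8 round-trip
scale = np.abs(w).max() / 127
q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
w_hat = q.astype(np.float32) * scale   # dequantized weights

# worst-case error is half a quantization step, i.e. scale / 2
rel_err = np.abs(w - w_hat).max() / np.abs(w).max()
```

rel_err lands around 0.4% here; the real Q8_0 format can do better still because it keeps a separate scale per 32-weight block instead of one for the whole tensor.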
>>
File: x.mp4 (1.9 MB, 480x720)
>>106648305
lightsticks just seemed appropriate to try

>group
no preference. ideally checkpoints would have a range of clothing like freckledvixon/ruanyi/[...] loras.
>>
>>106648452
that's the average yellow fever enjoyer's penis
>>
>>106648404
Thanks! Will try out wai first.
>>106648422
I assumed it would be as simple as image to image using forge or something. I used to have an older lora for SDXL that did figures pretty well but its been a long time so I'm trying to get back into it.
>>
>>106648094
>bounceV
the fuck is this?
>>
>>106648305
>>106648490
hey ESLs, the term is "glowstick" not lightstick lol. have you never played Terraria?
>>
>>106648497
>image to image
That won't work, it would also destroy the character, you can't simply transfer style with just de-noising. The model needs to also know the character and the concept of figures. (Either by itself or by using loras)
>>
File: ComfyUI_00308_.mp4 (724 KB, 640x640)
>>
>>106648502
Seed Bouncer Vendradium or SBV. it bounces your seed through a Vendradium Entropic Variation algorithm. Essentially, it prevents predictable image duplication.
>>
>>106648551
Thanks. I assume something like Flux Kontext and Qwen Image Edit work just fine in forge, right?
>>
File: 1753675123870102.png (1.55 MB, 1024x1024)
make the image in an avant garde style. keep the same pose.

any art anons? I just tried a random style I vaguely remember (qwen edit)
>>
>>106648586
https://en.wikipedia.org/wiki/Black-figure_pottery
>>
>>106648572
I don't know.
I abandoned Forge months ago due to lack of support for major models (video diffusion).
Flux Kontext possibly does because base Flux works on it. But no idea about Qwen.
The auto111/forge ecosystem kinda died while you were away.
>>
>>106648636
Well shit. Are we all using comfy now?
>>
>>106648636
>>106648640
neo forge is looking promising because it actually has an active maintainer but the memory management that illya added years ago is outdated for newer models and is just causing worse memory issues than comfy has.
>>
>>
>>106648640
Well I moved on to the spaghetti yes.
There has been some drama around it recently, maybe eventually the community moves on to something else but for now it is the most prominent all rounder.
>>
File: 1745387859700216.png (1.71 MB, 1024x1024)
>>106648586
same prompt
>>
File: moy3_00079.webm (3.4 MB, 720x800)
>>
>>106648692
i genuinely lmao'd. might be a decent way to get them to stop stealing shit from here
>>
>>
File: 1750908381933371.png (978 KB, 1024x1024)
can you solve the puzzle?
>>
>>106648450
My tests show something weird : it seems like svdquant fp4 r128 (the one for blackwell), is non deterministic.
I get different outputs when I retry using the same parameters.
wtf.
I'll retry with int4 instead, too bad because fp4 is twice as fast as q4 on a 5090.
>>
File: 66854432.mp4 (1.83 MB, 912x1296)
>>
https://www.liblib.art/modelinfo/99d2d7a0bf0e41bd9275bdbc9a84995d?from=feed&versionUuid=5a5b4e055ed4485db884d26a440eb018&rankExpId=RVIyX0wyI0VHMTEjRTM3X0wzI0VHMjUjRTM4

china wins again
>>
API nodes bros, another source to use Seedream 4 for free? I'm not feeling very powerful...
>>
>>106648786
>completely impossible to download without an account
yay
>>
File: 1734040956718259.png (19 KB, 646x431)
>>106648757
Come on, this is ridiculous.
Thankfully I have a 3090, but it's so dumb.
>>
File: file.png (1.81 MB, 832x1216)
What do you actually want to run for generating images with 24GB of VRAM? Is there a workflow that makes use of that or is a faster GPU only better for speed? The rentry post doesn't seem very thorough, so I've stuck to asspull loras and checkpoints and random workflows from civitai.
>>
>>106648786
does it do actual nudity?
>>
>>106648807
Any workflow is fine. More vram just allows you to offload less and run unmolested model quants/full sizes. Also faster lora training
>>
>remove the yellow censorship bar and restore the nipples on the anime characters breasts.
like magic. some minor touching up needed but it was a big yellow bar, before. great example of what the edit model can do (plus a lora).
https://files.catbox.moe/lcdu3y.png
>>106648829
sure can!
>>
>>106648486
Well I can't do it today because they haven't released nodes for SDXL lol.
>>106648757
I doubt this. Are you certain you are using a converging sampler?
>>
>>106648155
oh dang I haven't tried t2v but maybe I should

>>106648530
I think they were going for kpop lightsticks which are these things the fans in the audience sometimes have
>>
File: 1733089853123320.png (45 KB, 468x60)
>>106648851
and the original image (ad from here)
>>
>>106648851
no, do it with real people, full body
>>
>>106648786
can you share the lora?
>>
>>106648869
yes, that works. sample from a gravure shot or w/e:
https://files.catbox.moe/78jqz9.png
basically removes the sfw filter from the model, more or less.
>>106648871
sec, upload is slow
>>
File: 1731695793473781.png (29 KB, 544x555)
>>106648861
>I doubt this. Are you certain you are using a converging sampler?
No, I didn't use an ancestral sampler or anything of the sort.
Try it if you have a blackwell card, I catboxed the result :
https://files.catbox.moe/pbg4t8.png

Gen one time, then close comfy, open then gen again.
>>
>>106648885
what about genitals?
>>
>>106648900
yeah that works too, havent tested it a lot but it can do that apparently.
>>
File: 1748354436755231.png (790 KB, 1136x912)
the cartoon frog is sitting at a computer wearing a blue shirt, and red shorts, and sandals. keep their expression the same. a white CRT monitor is on the computer desk.
>>
>>
huggingface/civitai aren't allowed to host it (where I originally got it), because...it can lewd.

so here's a mirror of the same lora: https://limewire.com/d/sBUPu#GclImNhwoG
>>
>>106648925
same reason you have to change a line of code on reactor/face swap stuff to remove nsfw restrictions, rules are stupid
>you can download this photo software to lewd
>but this extension is bad!
>>
>>106648925
>limewire
blast from the past
>>
>>106648925
Thanks anon.
>>
>>106648966
I googled fast file upload and got https://www.file.io/ as the first link

oddly enough, it's a part of limewire.
>>
anybody manage to go past 5 seconds with animate at 1280x720? if i try to go longer than 5s shit just ooms.
https://github.com/comfyanonymous/ComfyUI/issues/9937
about to say fuck it and try this.
>>
>>
god DAMN wan is good.
once we get a model that does this shit in 10 seconds instead of 40+... i don't know what will happen to me
>>
>>
>>106648692
Catbox PLEASE
>>
File: 1753221957813677.png (3.15 MB, 2263x1281)
>>106648897
OK found out why I think :
https://nunchaku.tech/docs/nunchaku/faq/usage.html#why-do-the-same-seeds-produce-slightly-different-images-with-nunchaku

>This behavior is due to minor precision noise introduced by the GPU’s accumulation order. Because modern GPUs execute operations out of order for better performance, small variations in output can occur, even with the same seed. Enforcing strict accumulation order would reduce this variability but significantly hurt performance, so we do not plan to change this behavior.

The difference in picrel is less than what I see, but I think it's because I generate bigger images (2048x2048) with a more complex prompt, so the variations are bigger.
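The accumulation-order thing is ordinary float non-associativity, nothing exotic. A minimal demonstration with IEEE-754 doubles:

```python
left  = (0.1 + 0.2) + 0.3   # 0.6000000000000001
right = 0.1 + (0.2 + 0.3)   # 0.6
assert left != right        # identical terms, different grouping, different result
```

A GPU kernel reducing thousands of partial sums in whatever order its threads happen to finish is doing exactly this at scale, so bit-exact repeats would require enforcing a fixed reduction order, which is what nunchaku says they won't do for speed reasons.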
>>
>>106648998
nah fuck that
we need a model that can actually do 10 second videos instead of 5.
>>
>>
>>106648897
>>106649046
I see.
Would keep this in mind.
>>
>>106648990
damn that workflows a mess now that i look at it
>>
>>106649143
yeah the comfy wiki guy barely understands the program so im not surprised
>>
>>106648433
>>106648450
Here it is :
https://imgsli.com/NDE2NzEz

I used :
euler/simple/40steps 2048x2048

Gen times :
BF16: 260s
Q8: 230s
Q4KM: 216s
SVDQUANT FP4: 102s!

Keep in mind the size and complex prompt is probably a worst case scenario.

Quality wise, Q8 is fine, nunchaku is slightly worse but honestly too different to compare, and Q4 is really bad.
Speed wise everything was more or less the same with the 5090, but man fp4 is fast, insane speed compared to the rest.

My conclusion: I don't really need svdquant for imagegen for qwen (or flux) on a 5090, but I'd trade the weird non deterministic behaviour of svdquant to have faster speed, and gen way more things using wan. (Image gen using nunchaku is probably more worth it for people on 40xx and 30xx cards.)
The best use case being wan is very annoying, because they keep making everything except it.
>>
File: Body Made For Sin.jpg (316 KB, 1250x1506)
>>106649034
>MP4
Reminded me of this.
>>
>>106649159
Thanks anon.
It seems to just double speed for Qwen.
I will probably keep not bothering with it on my budget setup, but it's good to know.
I would probably use bf16 if I were you with a 5090.
>>
>>106649153
yeah kijais is fucked for me at 720p can't go past 5s even with everything set to offload.
https://github.com/kijai/ComfyUI-WanVideoWrapper/issues/1267
glad i'm not the only one having issues tho.
>>
>>
>>106649277
>24fps
feels like 8
>>
File: 1730794196706335.png (2.05 MB, 1328x1328)
various ice cream containers in an ice cream shop. (qwen image)

looks pretty good desu
>>
>>106649306
Can you use an add grain node so it doesn't look so plasticky?
>>
>>106649210
For fun I did a speed test on my other 3090, and the speed :
svdq int4 : 434s
Q4: 635s
(I had to disable sage attention on the 3090 because it gave me black screens, so it didn't help.)

>I would probably use bf16 if I were you with a 5090.
Yeah, I only care for svdquant for their future (hopefully) wan model. And only if they also release support for loras.
>>
File: 1740522833937596.mp4 (1.18 MB, 640x640)
>>106649306
a man puts the vanilla ice cream into a bowl with an ice cream scoop. (wan 2.2)

should have said cone, but it works
>>
>>106648990
>seedrean
How did you do it?
It's free?
>>
>>106649367
the bloody bastard is mixing the flavors
>>
>>106649367
yeah that scooping animation made me salivate a bit
>>
File: 1744202958759152.mp4 (745 KB, 640x640)
>>106649367
the ice cream in the ice cream shop melts into a liquid.

wan is such a neat model.
>>
File: 1753976029829233.jpg (60 KB, 953x531)
>>106647593
>>
>>
>>106649393
they're almost a week old back when it was on lmarena. don't really have anything to post since /g/ doesn't allow vids with audio and the file size limit is ass anyways.
>>
File: IMG_1475.png (2.49 MB, 1024x1024)
>>106647201

I wanna do extracurricular studies with Hatsune Miku and her baloonbies…
>>
File: 1733231526802618.mp4 (607 KB, 640x640)
anime girls made out of ice cream, come out of the ice cream bowls on the table.

again, wan is pretty cool.
>>
>>106648807
this i would actually believe is a real picture if someone used it on a dating app or something
>>
>>106648807
model?
>>
https://github.com/kijai/ComfyUI-WanVideoWrapper/issues/1262#issuecomment-3314926799
curious to see what this is all going to look like once he's finished. it's not that bad already if you don't use the speed lora.
>>
File: 1750354258879825.mp4 (1.12 MB, 640x640)
>>106649455
Miku Hatsune grabs the vanilla ice cream bowl and walks out the door of the ice cream shop.

need more time but grab success
>>
File: 1750567010692711.jpg (1.09 MB, 1416x2120)
>>
File: 1729796245947617.jpg (1.09 MB, 1416x2120)
>>106649548
>>
>>106648240
Which model is this?
>>
>>
File: 1730492012366368.mp4 (825 KB, 640x640)
used qwen edit to make the sign edit, wan to move it
>>
File: output.png (2 KB, 120x183)
Why do my chroma loras always end up being the same size no matter the dataset size or training settings?
>>
>>106649567
qwen edit (q8)
>>
>>106649592
It depends on your --network_dim and --network_alpha settings
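Those are kohya-ss sd-scripts flags; the file size is fixed by dim (rank) and the base model's layer count, not by dataset size, which is why every run comes out identical. A hedged sketch of where the flags go (the model/dataset/output paths are placeholders, and real runs need more arguments than shown):

```shell
# hypothetical kohya-ss invocation; paths are placeholders
accelerate launch train_network.py \
  --network_module=networks.lora \
  --network_dim=16 \
  --network_alpha=16 \
  --pretrained_model_name_or_path=/path/to/model.safetensors \
  --dataset_config=/path/to/dataset.toml \
  --output_dir=/path/to/output
```

In OneTrainer the equivalent knobs live in the LoRA settings tab as rank and alpha.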
>>
File: 1738127570946971.mp4 (821 KB, 640x640)
>>106649587
>>
>>106649592
shut up ran
>>
File: 1756261203593003.mp4 (1.01 MB, 640x640)
>>106649630
the possibilities are endless for the burger king.
>>
File: 1754414689287972.jpg (865 KB, 1416x2120)
>>
>>106649628
I don't see it anywhere in Onetrainer. Would giving it more space produce a higher quality result or is that not how it works?
>>
>>106648452
4 her
>>
>>106648383
>sabamen
Extremely based
>>
File: 1745762958159370.mp4 (758 KB, 640x640)
the anime figure of miku hatsune rotates 360 degrees.
>>
File: 1750147496515263.jpg (1.01 MB, 1416x2120)
>>
File: bunnybutum.mp4 (997 KB, 848x954)
>>106649630
>>106649633
what was his BK order though? tendies? :3
>>
>>106649703
GOT DAYUM
>>
File: 1754273488831422.png (1.39 MB, 1024x1024)
you know qwen edit is a pretty capable model not just for slight changes, it can take an image and do an entire scene.
>>
>>106649662
It's in the lora settings tab

The higher the rank, the more of the underlying model will be affected by the lora, but contrary to what you may think this does not automatically mean better quality; the best results are had by scaling rank according to the number of images and their resolution.

For example, let's say you have 30 images of a person and train a lora on those: going over rank 16 won't get you a better result, most likely a worse one, and you could probably even go down to rank 8 and have as good a result.

If you have 500 images or you are training several people / concepts at once, you should look to increase rank to 32-64.

Alpha is this rather stupid extra parameter which affects the strenght of the LR, it just adds pointless complexity, my suggestion is keeping it the same as your rank, so rank 16, alpha 16, which essentially nullifies its effect.

Also when training a Chroma lora, I would suggest using the 'blocks' preset in the lora tab, for me it gives the best results.
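The rank/alpha interaction can be sketched numerically (toy dimensions, not any trainer's actual code): the low-rank update B @ A is scaled by alpha / rank before being applied, so alpha = rank gives a scale of exactly 1.

```python
import numpy as np

def lora_delta(out_dim, in_dim, rank, alpha, seed=0):
    # LoRA adds a low-rank update B @ A to a weight, scaled by alpha / rank
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((rank, in_dim)) * 0.01
    B = rng.standard_normal((out_dim, rank)) * 0.01
    return (alpha / rank) * (B @ A)

d16 = lora_delta(8, 8, rank=16, alpha=16)  # scale alpha/rank = 1.0
d1  = lora_delta(8, 8, rank=16, alpha=1)   # scale 1/16: a 16x weaker update
```

Same seed, same A and B, so the only difference between the two deltas is that alpha/rank multiplier, which is why alpha behaves like a hidden LR knob.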
>>
File: 1737298950199767.jpg (1003 KB, 1416x2120)
1003 KB
1003 KB JPG
>>
>>106649570
Catbox pls.
>>106649662
Higher dim can also turn it into overfitted turd.
Optimal Dim number depends on what the lora is about and training dataset quality.
>>106649704
Generally yeah but you can experiment with lower values for some loras.
>>
>>106649724
>my suggestion is keeping it the same as your rank, so rank 16, alpha 16, which essentially nullifies its effect
In OT the default value is 1.0 which I guess is a ratio to the rank?
>>
>>106649767
NTA, and I think you should set it to half your rank or lower, or experiment with different values yourself. But no, 1 is the minimum value, the exact opposite of setting it equal to your rank...
>>
>>106649499
https://civitai.com/models/1950841/intorealism-ultra
+
https://civitai.com/models/573152?modelVersionId=2155386 setup as a refiner

I have no idea what I'm doing tho, just working off of someone else's existing node setup switching things around
>>
File: 1746805585751336.jpg (864 KB, 1416x2120)
864 KB
864 KB JPG
>>
>>106649710
>>106649703
>"she wiggles her bunny tail at the camera, she remains facing away, she is smirking, playfully wiggling tail back and forth"
>>
>>106649767
Yes, you can leave it at that, or use the same value as your rank to nullify its effect, as in rank 16 / alpha 16, rank 32 / alpha 32.

Overall it doesn't really matter as long as you stick with whatever you decide; if you keep switching this option you will have a harder time figuring out which LR to use, since alpha affects the effective LR.

Kohya and Diffusion-Pipe use alpha equal to rank by default, OneTrainer uses alpha 1.0 by default; I don't know the defaults of other trainers.

I use OneTrainer, but I set alpha equal to rank because I want to remove that extra variable from my training.
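As a quick numeric illustration of the two defaults (toy arithmetic, assuming the usual alpha/rank scaling of the LoRA update):

```python
rank = 16

# effective update scale = alpha / rank
kohya_scale = rank / rank  # Kohya / Diffusion-Pipe default: alpha = rank
ot_scale = 1.0 / rank      # OneTrainer default: alpha = 1.0

# at alpha = 1 the update is rank-times weaker, so a run tuned with
# alpha = rank needs a roughly rank-times higher LR to behave the same
print(kohya_scale, ot_scale)  # 1.0 0.0625
```

This is why an LR that works in one trainer can seem broken in another if you don't check what alpha defaulted to.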
>>
File: 1745492187024829.png (1.15 MB, 1024x1024)
1.15 MB
1.15 MB PNG
>>106649721
the anime girl is on an album cover playing a teal colored rock guitar.
>>
>>106649455
they look so confused lol
>>
Queuing up a batch of low-CFG gens and letting the sampler just go completely hog wild is a real trip.
Mostly you end up with a bunch of insane tentacle hands and Megaman 1 box art tier proportions and faces but every now and again it drops a surprisingly decent and creative gen.
>>
File: thecomfyuiexperience.webm (2.83 MB, 460x754)
2.83 MB
2.83 MB WEBM
>>
File: file.png (730 KB, 2443x1838)
730 KB
730 KB PNG
i've come to the realization that my prompt doesn't do anything
it keeps generating the same thing
can you tell me what's going wrong from this screenshot, or do you need more info?
>>
>>106649858
authentic indian video
>>
>>106649859
fixed seed on second sampler?
>>
File: 1751228170036479.mp4 (435 KB, 640x640)
435 KB
435 KB MP4
>>
>>106647201
y does she laugh like tht
>>
>>106649886
Only the first stage generates the starting noise, the second stage denoises the output of the first.
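A toy model of why the second sampler's seed is irrelevant (assumed two-stage setup where only the first stage injects noise, as in the standard Wan2.2 workflow; the hash stands in for deterministic denoising):

```python
import hashlib

def sampler_stage(latent, seed, add_noise):
    # toy stand-in for a KSampler stage: deterministic given its input;
    # the seed only matters when the stage injects fresh noise
    if add_noise:
        latent = f"{latent}|noise:{seed}"
    return hashlib.sha256(latent.encode()).hexdigest()[:12]

first = sampler_stage("init_latent", seed=1, add_noise=True)
a = sampler_stage(first, seed=99, add_noise=False)
b = sampler_stage(first, seed=42, add_noise=False)
assert a == b  # second-stage seed changes nothing when it adds no noise
```

So if the output never changes, look at the first stage's seed (or a fixed-seed node feeding it), not the second.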
>>
File: ComfyUI_01787_.png (1.19 MB, 816x1232)
1.19 MB
1.19 MB PNG
>>
Can any of these new Chinese image models generate seamless repeating/tiling/texture images? It's crazy to me that last I checked, A1111 + SD1.5 was still the gold standard.
>>
>>106649891
im not sure why a pepe appeared on the screen, I just said "the text LDG appears on the CRT monitor"
>>
>>106649859
share your workflow
>>
>>106649893
baker is spiteful\insane
>>
So how do I get rid of the retarded horizontal scanlines in every Chroma gen I make?
>>
>>106649914
teh baker made tht? woaw
>>
File: ComfyUI_01790_.png (1.33 MB, 816x1232)
1.33 MB
1.33 MB PNG
>>
I need someone with the specs to make lewd videos of Anna P on the beach.
>>
File: 1728649005740526.mp4 (603 KB, 640x640)
603 KB
603 KB MP4
the green cartoon frog walks to a nearby water cooler and fills his cup with water.

uhh, not quite...
>>
File: 1743535918375437.mp4 (1.06 MB, 640x640)
1.06 MB
1.06 MB MP4
>>106649954
better:
>>
File: ComfyUI_01799_.png (1.37 MB, 816x1232)
1.37 MB
1.37 MB PNG
Not what I prompted at all but I like it.
>>
>>106649973
when will you stop pulling the trigger early on the shit gens and just post the better one?
>>
>>106649920
These are typically the result of:

Training on bad images, as in noisy ones with artifacts such as lines
Using low-quant models (the same thing caused a grid pattern with Flux)
Overly long prompts
Using a lot of different loras at the same time (3 or more)

It also depends on the base model you are using; Chroma1-HD seems to be largely immune
>>
File: 1749989447185813.jpg (1.02 MB, 1552x1944)
1.02 MB
1.02 MB JPG
>>
>>106649954
The funny kind of AI slop nonsense.
>>
File: 1733185280634430.mp4 (475 KB, 640x640)
475 KB
475 KB MP4
cute!
>>
>>106650035
stop yapping
>>
>>106649978
Catbox?
>>
File: asianTikTOKgiiirl.mp4 (838 KB, 576x956)
838 KB
838 KB MP4
>>106650038
even with talking\singing in the neg-field she still does it often ;_;
>>
>>106649991
Looking at it again it might actually be that my monitor has something burnt into lmao. Thanks for the help though.
>>
File: 1731303870874773.mp4 (1.04 MB, 640x640)
1.04 MB
1.04 MB MP4
>>106650035
>silent
>waifu appears smiling
>happy conversation
Nah wan got it right that time
>>
>>106648419
Anon with respect, this is classic autism. You had no insight into how your message would read to someone other than you (lack of theory of mind) and are now sperging out because it was interpreted other than how you intended it. They are entirely correct that it is pure autism on your part.
Getting mad about it after being annoyed that THEY got a bit mad about it is just double down 'tism.
You can be as mad about THIS as you like, it's still true.
>>
>>106649982
I liked the first one better, I had never seen a monitor do that, the second was boring, prompt adherence is overrated.
>>
>>106648419
least hostile response
>>
>>106650047
I am not Brazilian or Portuguese btw, just copied the prompt from Sora.
https://litter.catbox.moe/t5cdeilmgwt4jeb4.png
>>
>>106649913
it's the "Wan2.2 14B I2V Image-to-Video Workflow Example" from here https://docs.comfy.org/tutorials/video/wan/wan2_2#wan2-2-14b-i2v-image-to-video-workflow-example
>>
File: hornet and the knight.png (1.44 MB, 816x1232)
1.44 MB
1.44 MB PNG
>>106650119
And no translating didn't help, how proficient t5 is at Portuguese doesn't seem to be the problem.
Still amusing image though.
>>
>>106650068
SEXO!
>>
File: 1738331183279469.jpg (1.14 MB, 1552x1944)
1.14 MB
1.14 MB JPG
>>
I wonder how many languages qwen and wan understand outside of English, Chinese.
>>
>>106648671
kek
>>
>>106648671
>>106650182
:c
>>
File: ComfyUI_01803_.png (1.49 MB, 816x1232)
1.49 MB
1.49 MB PNG
>>106650175
>wan
>UMT5 is pretrained on the an updated version of mC4 corpus, covering 107 languages: Afrikaans, Albanian, Amharic, Arabic, Armenian, Azerbaijani, Basque, Belarusian, Bengali, Bulgarian, Burmese, Catalan, Cebuano, Chichewa, Chinese, Corsican, Czech, Danish, Dutch, English, Esperanto, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hmong, Hungarian, Icelandic, Igbo, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Kurdish, Kyrgyz, Lao, Latin, Latvian, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Nepali, Norwegian, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Samoan, Scottish Gaelic, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Sotho, Spanish, Sundanese, Swahili, Swedish, Tajik, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, West Frisian, Xhosa, Yiddish, Yoruba, Zulu.
The quality might vary but it should at least somewhat follow along what you are prompting in these languages.
>qwen
>The dataset is primarily composed of Chinese and English data, with supplementary multilingual entries to support broader linguistic diversity.
I am guessing it only excels at English and Chinese but curious how it would fare against a German or French or any other major language prompt.
I am too much of a VRAMlet but maybe someone else can test.
>>
File: 1730589189494388.jpg (660 KB, 1552x1944)
660 KB
660 KB JPG
>>
>>106650239
Interesting, thanks anon.
>>
>>106649982
There should be a rule that you can't post gens that are less than six hours old. Well, not a rule since it would be unenforceable, more of a gentlemen's understanding. It would cut down on the number of people who post a dozen minor variations on the same prompt.
Delayed gratification, people. It's what separates men from animals.
>>
>generated wan video (nsfw) with fp8 scaled 5/7 steps 97 frames 800x1200
->618s
>generated wan video (nsfw) with q8 5/7 steps 97 frames 800x1200
->852s
I don't know about generations prior to the 5000 series, but on the 5000s fp8 (e4m3fn scaled) is way faster than q8. I also didn't see any visible difference in quality, so q8 isn't worth it for me, especially with such a speed penalty.
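Putting those two timings into numbers (just arithmetic on the runs above):

```python
fp8_s, q8_s = 618, 852  # seconds per gen, from the runs above

q8_penalty = q8_s / fp8_s - 1        # q8 takes ~38% longer
fp8_saving = (q8_s - fp8_s) / q8_s   # fp8 shaves ~27% off the q8 wall time
print(f"{q8_penalty:.0%} {fp8_saving:.0%}")  # 38% 27%
```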
>>
File: 1758372023132249.mp4 (395 KB, 640x640)
395 KB
395 KB MP4
>>106650119
Danke
>>
>>106650333
There is definitely a quality degradation with respect to Q8 and fp16 if you test it enough.
But yes, on Blackwell fp8 is noticeably faster due to hardware acceleration (which previous generations lack).
One can argue whether it is worth it or not.
>>
>>106650359
I like the interface effects
>>
>>106650333
q8 technically has better quality, but it's slower than fp8 on newer-generation cards.
>>
>>106650333
>I also didn't see any visible difference in quality
gen it with the same settings and seed
>>
>>106650371
>>106650392
Yeah 30% faster is worth it for me if I don't see any horrible degradation.
Looking at it again the only thing I noticed was slightly different lighting, but nothing even bad, just a tad bit different.

>>106650403
That's what I did of course.
>>
>>106650433
>That's what I did of course.
i guess you didnt gen anything that needed anatomic consistency or movement clarity
>>
>>106650448
It was a nude woman walking forward, and it was fine, she didn't get another limb, her face wasn't weird, and her movements were ok.
I would expect bad anatomy from lower quants, but not from what I tested.
>>
>>106650460
try anything with hand movement
>>
Where are the AI NSFW videos?
>>
>>106650464
OK will try later, once my big queue has cleared.
>>
>>106649858
nice cope anon, but that's the GNU/Linux car! We don't use macbook cars when we have that!
>>
File: woman 1.png (1.25 MB, 816x1232)
1.25 MB
1.25 MB PNG
>>106650486
Have you looked at /gif/?
>>
File: 1737568201597981.mp4 (777 KB, 640x640)
777 KB
777 KB MP4
>>
>>106649548
nice
>>
File: garden of roses.png (1.66 MB, 1024x1024)
1.66 MB
1.66 MB PNG
>>
>>106647325
kontext is better at detecting and editing text
>>
>>106648201
>girl dancing
you can easily just increase the length for that
>>
man, can we get real time video gen mixture of experts models already?
>>
>>106650633
Accurate
>>
File: 1757024945511436.mp4 (1.66 MB, 640x640)
1.66 MB
1.66 MB MP4
the anime girl throws the green leek vegetables in her hands towards the camera.

qwen edit to make the leeks, wan to animate
>>
Being a face guy is suffering. It is so hard to get facial variety and even harder to actually control facial features. I just don't want everybody to look the same. Qwen can't even handle different body types, nevermind facial features.
>>
File: 1753138391235730.mp4 (1.89 MB, 640x640)
1.89 MB
1.89 MB MP4
neat, actually worked

the anime girl transforms the green leek vegetables in her hands into a green rock guitar.
>>
>>106650840
imo qwen edit is better for transforming people despite being an edit model primarily, take a base figure or image then prompt them with diff traits.
>>
>>106650840
Chroma has quite a few different faces. I suppose you could want all facial features tagged and that isn't the case, but the diversity is there.
>>
>>106650840
thats why you dont use the model with the least seed variety that qwen is or some giga slopped model like base flux
use chroma hd/2k for realism and noob/illustrious for tranime
>>
File: shotgun girl.png (1.15 MB, 1024x1024)
1.15 MB
1.15 MB PNG
>>
>>106649703
cute
>>
that was easy bait for the chroma shills
>>
>>106650916
>>
>>106650840
That's the price you pay for almost always perfect anatomy, they get this by massively overtraining on a small pool of faces and human poses

There's really no workaround at the moment, you quickly notice when you train loras on models like qwen where you use other people and poses, then the almost always perfect anatomy starts degrading
>>
>>106650914
>npc gets triggered as he hears his activation phrase amongst multiple recommendations
textbook rent free
>>
too easy
>>
>>106650869
>>106650871
Chroma is way better in this regard, but it is hard to go back when it can't do fine details and Qwen can. These patterns aren't perfect, but they are way better than what Chroma can do. Chroma is a very viable choice and I won't shit-talk it, but there's a lot I like about Qwen.
>>106650856
I'll have to give this a shot.
>>
File: pathetic.jpg (465 KB, 896x768)
465 KB
465 KB JPG
>>106650840
Another face guy here. Use chroma. It randomly creates unique features with very little to no prompting. I was in the exact same position a few months back where anon suggested chroma because I wanted to make wonky unique features. Haven't used any other models since.
>>
The token bleed in chroma is kind of fucking insane, it's pretty fucking bad and it can even force you to have to make loras stronger
>>
>>106651004
beautiful
>>
>>106650885
Is it seedream girl?
>>
>>106651024
Train a fuckhuge 'face types' Lora to serve as basis vectors for face guys like him: >>106650840
>>
>>106648990
You are meant to chain the WanAnimateToVideo nodes using the frame offset connection: tie a ksampler to each one, decode the latents and feed them into the continue motion connection of the next ksampler, and also batch the images of each before sending to video combine.

The idea here is that you can use a different prompt for each chunk, and probably change character midway through the video, and use different shift values or loras.
>>106649241
KJ nodes oom? no surprise there...
>>
>>106649241
That's a real photo
>>
>>106651083
Flux Krea girl actually.
>>106651024
Examples?
>>
File: IMG_4896.png (2.31 MB, 1284x2778)
2.31 MB
2.31 MB PNG
By the way if you want to detect AI you can use tellif.ai

Its free and it works really well
>>
>>106652478
wow it's not plastic, it's AI!
>>
File: IMG_3631.jpg (864 KB, 1290x1868)
864 KB
864 KB JPG
>>106652478
Seems to work, but why only 78%?
>>
File: 1731007986549670.png (993 KB, 868x851)
993 KB
993 KB PNG
>>106652478
kek
>>
File: 1757880369384343.png (1.68 MB, 1024x1024)
1.68 MB
1.68 MB PNG
>>106652478
>80%. Likely created by human. Natural imperfections detected.
>>
File: IMG_4911.jpg (1009 KB, 1284x1441)
1009 KB
1009 KB JPG
>>106652679 idk, for me generated
>>
File: 1757364511599811.png (85 KB, 806x213)
85 KB
85 KB PNG
>>106652738
Guess it's just retarded then
>>
>>106652679
Dalle-2 had soul
>>
>>106651485
>Flux Krea girl actually.
Total local victory
>>
>>106651350
I wrote this, read it back, and see that I am very shite at explaining things because autism, and it really does irritate me that people want everything on a silver plate. They complain there is no basic native workflow; well, there is, but it's shite because it does not use the WanAnimate model to its fullest.

I don't like writing walls of text and do usually share workflows, but I can answer specific questions about how to use the node properly.

For starters (and importantly), you need to resize and pad the reference image to the exact size of the control video frames (the original video); the reference can be any person in any background. If you don't get this right, the controlnet poses won't align properly and you will end up with body horrors and other glitches as the model tries but fails to line things up. You DON'T resize the frames of the control video: pass them directly into the dwpose node, and the controlnet images then go into the control video connection on the WanAnimateToVideo node. Connect the resized reference image to the reference image connection on WanAnimateToVideo; it must be the same resolution as the original video or it won't work! I use the ResizeAndPadImage node for this, link the source video width and height from the video info node, and use lanczos interpolation for best quality.

pic related to this post >>106651350
shows roughly the layout of the nodes. I will probably share the workflow later once I tidy everything up, but the image should be good enough for reference on how to set it up in native, along with the speed loras I'm using.

BTW this model is fucking insane when you get it working right: you only need to set it up like I've tried to explain and it just works, it takes care of everything. So fuck all those whiny ass bitches wanting everything on a silver plate; they are as stupid as they are fucking lazy. FUCK EM!

I wouldn't change much of anything about the native nodes or the model; it works very well as is.
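The resize-and-pad step is pure geometry, and can be sketched like this (a toy helper, not the actual ResizeAndPadImage node): fit the reference inside the control-video resolution while keeping aspect ratio, then center it on a padded canvas.

```python
def pad_layout(ref_w, ref_h, target_w, target_h):
    # fit the reference inside the target while preserving aspect ratio,
    # then center it; the leftover border is filled with the pad color
    scale = min(target_w / ref_w, target_h / ref_h)
    new_w, new_h = round(ref_w * scale), round(ref_h * scale)
    off_x, off_y = (target_w - new_w) // 2, (target_h - new_h) // 2
    return new_w, new_h, off_x, off_y

# e.g. fitting an 832x1216 reference into 720x1280 control frames:
print(pad_layout(832, 1216, 720, 1280))  # (720, 1052, 0, 114)
```

The returned size and offsets are what a resize node plus a paste-onto-canvas step would use; the point is the output canvas exactly matches the control video resolution.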
>>
>>106653158
So I'll continue explaining. I pad the reference image with white borders, though any image works as long as it's resized; its background will heavily change the video, so be aware of that, be creative, etc. Control the frame rate on the video combine node using the source FPS so that the audio stays in sync. Oh, and the shift value I'm using is 1.00, which seems to give nice results.

I use math nodes to calculate the total frames from the video load node: chunks of 77 frames * 3 samplers = 231 frames. You could indeed extend with enough ksamplers to remix an entire video. I use 5 continue motion frames and drop those frames because they are always garbage, which makes sense: when you drop them you don't get jump cuts, only a smooth video. I use math to drop the frames (77 - 5 = 72) and select them from batch at index 4 for a length of 72.
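The chunk arithmetic can be recomputed in a few lines (a toy sketch, assuming each of the 3 samplers generates a 77-frame chunk and the 5 overlap/continue-motion frames are dropped from every chunk after the first):

```python
chunk, samplers, overlap = 77, 3, 5

generated = chunk * samplers                       # 231 frames sampled in total
kept = chunk + (chunk - overlap) * (samplers - 1)  # 77 + 72 + 72 output frames
print(generated, kept)  # 231 221
```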



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.