/g/ - Technology

File: AniStudio-00029.png (1.58 MB, 1024x1344)
Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>106929651

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP
AniStudio: https://github.com/FizzleDorf/AniStudio

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2203741
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>
File: 1730755191583949.mp4 (858 KB, 480x704)
once again I am vouching for the new lora + 2.1 combo

high: 2.2 kijai MoE lora 1 str + 2.1 lightx2v 3 str

low: wan 2.2 lightning low 1 str, 2.1 lightx2v 0.25 str
>>
ani sex
>>
Blessed thread of friendship
>>
>>106934841
is that for i2v or t2v or both?
>>
/ldg/ is healing
>>
File: 1760835110219827.png (16 KB, 1273x576)
>>106934583
As I said last thread I use TensorArt because it is way faster. Pic shows my usual settings.
The only thing not visible is Restarts set to 3. I never use repeats, just set it to 1 and bump epochs instead for more consistent results.
Batch size shows as nothing but TA sets it to 1 for Qwen behind the scenes. They use Gradient Accumulation Steps instead of actual batch size to avoid OOM errors on their servers.
So if you train Qwen elsewhere with normal batch size, you would set batch size to 4 and Gradient Accumulation to 1. TA does the opposite. Hope that is clear.
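To make the equivalence concrete, the arithmetic is just this (my own sketch, not TA's actual code):
[code]
# effective batch size is what matters for the weight update
batch_size = 1          # what TA actually runs per step to dodge OOM
grad_accum = 4          # gradients accumulated over 4 micro-batches
effective_batch = batch_size * grad_accum
print(effective_batch)  # 4 -> same update as batch_size=4, grad_accum=1
[/code]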
>>
Guys, this is pathetic.
>>
File: ComfyUI_13014.png (2.21 MB, 1080x1440)
>>106934841
Four LoRAs!? What's it look like without?
>>
In my old comfy install I had a tagger that would autocomplete artists, but the new install and add-on can't do that.
Any recommendations?
>>
>>106934868
i2v, t2v might work ok but these are i2v loras, with the new i2v one (kijai 2.2)
>>
>>106934878
how long does it usually take for it to finish for you?
>>
>>106934888
it's just the old 2.2 fix (2.1 lora at 3 strength) but with the new 2.2 lora high/low. 2.1 at 3 strength was a way to fix the slow motion.
>>
>>106934882
Fast moving threads. Keep discussion focused on local diffusion, no drama; it's the only winning move.
>>
File: 1760825739717824.jpg (690 KB, 2048x2048)
I got ComfyUI and Wan 2.2 going but my container crashes on this last step using the ti2v workflow. Is it not enough to have 16GB of RAM available? I have a 4080 Super, this shouldn't crash yeah? I have CUDA and everything installed in the container.

>got prompt
>Using pytorch attention in VAE
>Using pytorch attention in VAE
>VAE load device: cuda:0, offload device: cpu, dtype: torch.bfloat16
>Requested to load Wan
>VAE loaded completely 8961.125 1344.0869674682617 True
>Requested to load WanTE
>Model loaded completely 9.5367431640625e+25 10835.4765625 True
>CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cuda:0, dtype: torch.float16
>model_type FLOW
>Requested to load WAN22
crash
>>
>>106934914
huh, why would you repost my post? I got answered last thread
>>
>>106934893
ah ok. hard to keep track of all the lora releases. ty.
>>
File: 1760751573402015.mp4 (759 KB, 640x832)
haven't had time to find a way to set up Wan on my laptop yet but I found out I have been getting a lot of Tensor credits due to sharing LoRAs lmao.
So this is my first try with Wan2.2 img2vid.
btw, 21 credits for 5 seconds of video is a bit expensive.
>>
Does the order of descriptions in the prompt matter? (Example: Background, then Lora for character, then character description, then Lora for style, then style description, etc) Or can I just keep adding random shit I think about between generations?
>>
>>106934946
i have no idea about SaaS / IaaS pricing but generating wan videos is like doing many dozens of SDXL images, even at half of SDXL's native resolution.

you can probably set it up on your laptop if it has an nvidia GPU and some system RAM, but it might be very slow. IDK.
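rough numbers on why one clip costs so much, assuming the usual wan defaults (16 fps, 5 s):
[code]
# back-of-envelope: one wan gen denoises all frames together in one latent
fps, seconds = 16, 5
frames = fps * seconds + 1  # the common 81-frame setting
print(frames)               # 81 latent frames vs 1 for a single SDXL image
[/code]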
>>
File: 1760584407407851.png (1.71 MB, 1382x1075)
>>106934965
it depends. using the t5 encoder, position is very important. using clip, position importance is ui-specific (forge will apply relative token weighting while comfy will use absolute weighting)
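toy illustration of the difference (just the idea, not the real implementations):
[code]
# comfy-style absolute weighting: the number is used as-is
weights = {"red dress": 1.4, "1girl": 1.0, "garden": 1.0}
absolute = dict(weights)

# forge-style relative weighting: emphasis rebalanced against the whole prompt
mean_w = sum(weights.values()) / len(weights)
relative = {tok: round(w / mean_w, 2) for tok, w in weights.items()}

print(absolute)  # {'red dress': 1.4, '1girl': 1.0, 'garden': 1.0}
print(relative)  # {'red dress': 1.24, '1girl': 0.88, 'garden': 0.88}
[/code]
so the same (token:1.4) drifts further from the base prompt in comfy than in forge.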
>>
File: 1742831963234277.mp4 (962 KB, 480x704)
the japanese girl does a backflip high into the air.

was worried for a sec, but she made it!
>>
>>106934965
it depends on which ui and prompt encoder configuration you have. some comfyui prompts are not very sensitive to position at all, many other prompts (including some on comfyui with certain models) are. for some prompt fields it also starts to matter more as you get near or above the recommended token amount

basically just assume it does matter at least a bit unless you know it doesn't
>>
File: 1760631265235378.png (241 KB, 501x502)
I haven't touched image generation in a while. I would like to modify textures for a game; those are flat 2D atlases (like picrel) which are applied to a 3D model.
Can image generation help me with that? For example, adding some details to a texture, changing some clothing, upscaling them or making (mostly color) variants?
>>
>>106934999
I've never seen someone try this but it seems like it would be possible. qwen-image-edit is the current premier edit model so I'd recommend trying that and seeing how it works for this use case
>>
>>106934999
>>106935010
here is the quantized model and there are some usage instructions in there (although a bit obtuse)
https://huggingface.co/nunchaku-tech/nunchaku-qwen-image-edit-2509
>>
File: 1760564971924874.mp4 (863 KB, 480x704)
>>106934841
another example worked well:

the japanese girl holds up a sign saying "hi LDG" and points at the sign.
>>
>>106934981
very nice,

>>106934999
I'm guessing you could do some stuff with qwen image edit, but most tooling / models probably aren't specialized enough to do most of what you want quickly and well.
>>
did that avatarfag really rush a bake with their avatar and link their vapourware wrapper in the OP?
>>
>>106935023
>>106934981
What changes did you make between the first and second video besides the prompt?
>>
>>106935029
Ran... please stop you can't control everything.
>>
>>106935016
Thanks, I also saw that there was some GGUF quantization: https://huggingface.co/QuantStack/Qwen-Image-Edit-2509-GGUF
Which quantization should I use?
>>
>>106935023
what do you use to make these kind anon?
>>
>rush a bake
last was on page 9 nigga
>>
File: 00077-3627413677.png (1.01 MB, 896x1152)
>>
>>106935029
Meds. We're here to discuss local tech, not drama. Take it to another thread if you want to argue. Staying on topic is the only way this works.
>>
>>106935035
none, same lora combo + weights. shift is 8 for both.
>>
ran lost...
>>
>>106935029
yes
>>
>>106935044
wan 2.2

use the comfy template workflow, works well: then just grab the new loras

https://huggingface.co/Kijai/WanVideo_comfy/tree/main/LoRAs/Wan22_Lightx2v
>>
>>106935040
>Which quantization should I use?

With your 12GB vram you'll want to use either of the Q3 versions to keep image generation times fast. Using anything bigger will overflow into system ram (or just straight up crash from OOM errors) and push gen times into minutes instead of staying under one.
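back-of-envelope math, assuming qwen-image-edit's ~20B params and approximate bits-per-weight for each quant:
[code]
# weights-only VRAM estimate; text encoder, VAE and activations come on top
params = 20e9
for name, bpw in [("Q3_K_M", 3.9), ("Q4_K_M", 4.8), ("Q5_K_M", 5.7), ("Q8_0", 8.5)]:
    gb = params * bpw / 8 / 1024**3
    print(f"{name}: ~{gb:.1f} GB")
# Q3 at ~9 GB is the only one leaving headroom on a 12GB card
[/code]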
>>
File: 1747153743093937.mp4 (880 KB, 480x704)
>>106935023
>>
>>106935061
I thought you needed these with high and low noise:
https://huggingface.co/lightx2v/Wan2.2-Distill-Models/tree/main
>>
>>106935040
I use Q8 which is like 20/21gb with a 4080 (16GB) and my gens are like 30 seconds, depends on how much RAM you have. So you don't necessarily need a model below your VRAM count
>>
>>106935069
I think regular 2.2 wan + separate loras works better, with these loras/weights. also you have more options cause you can adjust weight to preference (ie: 2.1 at 3 strength helps fix slow motion).
>>
see anistudio baked the thread
"oh great, probably gonna be a troll bake"
click in expecting chaos
actually well organized
good quality posts
people giving helpful advice
no drama whatsoever
pleasantly surprised
>>
Hello ldg, I have a couple of questions for you.
I've been using SuperGrok for porn over the last week or so and it's been fucking amazing. Now that they've kicked up the content moderation, it's basically impossible to make anything.
I want to keep making videos for my own use and I am willing to pay money to do so.

I have a 12700K, 64GB of system RAM, and an RTX3080 10GB. I assume this is not sufficient for video generation.
I can afford to buy a second-hand 3090 24GB. Would this be sufficient for video generation when paired with my 12700K and 64GB of system RAM? Are there better options available?

Thank you very much for your time.
>>
File: whitemane success.webm (2.47 MB, 1024x800)
>>106935027
I can finally make AI
>proud of myself for struggling with this docker setup for 2 hours
>>
>>106935029
It will be removed
>>
>>106935097
Yes, it would be sufficient, but you'll have to use quants and expect longer wait times. 24GB is the minimum. Read the WAN 2.2 guide in the OP, it covers this topic in depth.
>>
>>106934820
who is this creature made for plapping?
>>
>>106934946
umm...why are you lying? thats the same gen i posted yesterday
>>
File: 1760644430047634.png (28 KB, 495x439)
trying this combo for wan with the new lora, seems to work decent:
>>
>>106934999
ahh! real monsters!
>>
>>106935097
It is "possible" but don't expect amazing results since you will have to use heavily quantized models plus Wan is not on the level of the SaaS models out of the box yet
You will have to use the MultiGPU node on Comfy and offload the text encoder and the models themselves to different devices (you will need as many gpus as you can afford). In your case you won't be able to offload the low and high noise Wan 2.2 weights to different gpus since your other one is 10gb, so you will have to wait for the backend to swap the models on the same GPU
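conceptual sketch of what that offloading amounts to (stand-in modules, not ComfyUI's actual code):
[code]
import torch
import torch.nn as nn

# stand-ins for the real text encoder and the two Wan 2.2 experts
text_encoder = nn.Linear(4096, 4096)
high_noise = nn.Linear(4096, 4096)
low_noise = nn.Linear(4096, 4096)

main = "cuda:0" if torch.cuda.is_available() else "cpu"
aux = "cuda:1" if torch.cuda.device_count() > 1 else "cpu"

text_encoder.to(aux)                        # text encoder on the second gpu (or cpu)
high_noise.to(main); low_noise.to("cpu")    # only one expert in VRAM at a time
# ... high-noise steps run ...
high_noise.to("cpu"); low_noise.to(main)    # backend swaps when the sampler switches
# ... low-noise steps run ...
[/code]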
>>
File: 1755782191135282.mp4 (1.01 MB, 704x480)
>>
>>106935172
now I want migu to transform him into a cockroach lool
>>
>>106935140
if you have sufficient RAM you can use Q8 qwen edit/wan even with 16gb, I do that with a 4080 despite q8 edit being like 21gb. and gens are like 20-30 seconds with the 4 step lora.
>>
>>106935188
>21.8 GB
>>
File: 1751660097428126.mp4 (980 KB, 704x480)
>>106935201
in any case if the majority loads into vram and a bit on system ram, it still works fast. no issue with wan or qwen edit, with models above 16gb.

there is also the option to use the multigpu node, either or.
>>
https://www.reddit.com/r/StableDiffusion/comments/1o8v95b/wan_22_sampler_scheduler_comparison/
what do you bros usually use? i've never tried linear quadratic with wan before
>>
>>106935172
>>106935206
Wan is incredibly shit for "violent" scenes. Every time I tried to gen a subject punching or pushing someone, the scenes were mild most of the time. Someone must train a lora specifically for these things
>>
>>106935107
Why don't you let Comfy fuck you? You are begging so hard for that big spaghetti cock.
>>
>>106935220
true also with grok and veo...
>>
File: 1732921804362029.mp4 (968 KB, 704x480)
lmao

not the initial hit I wanted but still funny
>>
File: icantbackflip.mp4 (1.73 MB, 640x1096)
>>106935097
the 24gb gpu is of course better, but i'd recommend trying it as-is first for now

>>106935103
yea, a few more adjustments and I think this will work ok
>>
>>106935123
He is the actual representative of local models. Comfy has already crossed the line from being a local UI to becoming a corpo. It makes no sense to even talk about him, InvokeAI is more local than Comfy at this point.
>>
How do I replace the road in a scene with the detailed texture like picrel?
>>
>>106935261
I tried kontext and qwen edit and it doesn't work
>>
>>106935261
maybe you can directly tell qwen image edit to replace the road's texture in image1 with image2

i can't tell you it'll work for sure tho
>>
File: 1748841644754655.png (305 KB, 500x375)
>>106935250
based, hit that cockroach with a bat!
>>
>>106935261
1 - Paint the road in the original scene in a very different/contrasting color like pink
2 - Use a Qwen-Edit 2509 flow, feed both images (the original with the pink mask and the new road as image 2)
3 - Prompt "replace the pink mask in image 1 with the road from image 2"
>>
>>106935261
with qwen edit, take a road in the first image node (image1) and say "replace the road texture with image2" or something, with that in image2.
>>
File: ComfyUI_01459_.png (2.13 MB, 1024x1024)
>>
>>106935308
*also use qwen edit 2509 (second version) it's a big step up over v1. and the new 2509 lightx2v lora is good too at 4-8 steps.
>>
He will get better results if he manually paints the target element with a single color (makes it a mask). Source: I changed several images with Qwen-Edit that way and it always worked best
>>
https://strawpoll.com/3RnYXaLOzye
>>
>>106935289
Thanks for the tip about the pink mask. Before editing models I was doing it with inpaint + redux but the results were meh
>>
File: 1748485958930200.mp4 (1.27 MB, 704x480)
Hatsune Miku wearing her regular outfit walks in from the right and hits the man with glasses in the face very hard with a baseball bat, causing him to fly off his chair onto the floor. the dog on the left is walking around.

getting closer to the ideal result!
>>
super dumb noob question but.. are these models censored? Like if I type in big tits, do I actually get real big tits or have these models like WAN been fucked with to prevent AI porn generation?
>>
File: 1741878956886742.mp4 (1.04 MB, 704x480)
>>106935337
Hatsune Miku wearing a business suit walks in from the right and hits the man with glasses in the face very hard with a baseball bat, causing him to fly off his chair onto the floor. the dog on the left is walking around.

now we're talking.
>>
>>106935347
large breasts or very large breasts will give you big boobs. there are nsfw loras if you want very lewd stuff also.

noobai/illustrious does anime lewds without loras needed. qwen/edit can do lewds, clothes remover lora makes the outputs even better.
>>
>>106935347
depends on which. most of the ones we like aren't censored much, but unfortunately that doesn't always mean they know every hentai/porn nsfw concept either

some were neither censored much nor trained much on nsfw
>>
>>106935337
>>106935352
weird how the cia is just as clueless about genning as everyone else is, but that doesn't stop you from trying
>>
>>106935359
>there are nsfw loras if you want very lewd stuff also.
I'm all ears, anon. What's the best for image2video or video2video that lets you lewd it up?
>>
>>106935371
the easiest way to do video lewds is to use i2v (wan) with a starting image made with qwen/flux (realism) or noobai/illustrious (anime), to get your starting frame. then tell wan what to do with it.

wai v15 is good for anime in general, qwen/qwen edit for realism. wan can use the images from both well.
>>
>>106935392
>i2v (wan)
before I start downloading gigabytes of diffusion model files, is there a specific one you recommend that's not censored for lewds?
>>
>>106935219
nice
>>
>>106935347
in terms of breasts, wan will let you do like D-cups or something like that. for much larger breasts you may need to do i2v (generate with illustrious or chroma or such first). or a lora. or perhaps another wan finetune (smoothmix is one)
>>
>>106935412
Thanks
>>
File: 1743235033743467.mp4 (1.2 MB, 704x480)
shocked dog gets his revenge

lmao

prompt: the dog on the left puts a black shock collar on the neck of the man with glasses, causing a huge lightning bolt to hit the man, who falls on the floor while emitting lightning.
>>
Why is ran letting his general fail? Is he admitting defart?
>>
>>106935410
wan 2.2 (use comfy template, it has files/links) is fine, with i2v you can do essentially anything. if wan vanilla won't do it, just add a wan NSFW lora after the model from civitai.
>>
>>106935441
hopefully killed himself already. fuck the drama posting and narcissistic niggardry
>>
>>106935439
lmaoo, that one is really good
>>
>>106935441
ran didnt make this general i did and im tired of him getting the credit for my work
>>
>>106935461
the new lora has improved motion this is better than my old meme gens. (the kijai 2.2 MoE one for high noise)
>>
>>106935439
Hey, you'd probably know. Is there any magic WAN phrase to make electricity flow around something and actually keep flowing around it instead of immediately disappearing? I can feel the literacy leaving my body every time I try to rephrase what I want in some other way.
>>
>>106935469
same with the anon that invented victorian cat girls prompt. stolen valor
>>
new
>>106935484
>>106935484
>>106935484
>>106935484
>>
>>106935441
What he created formed a mind of its own and rejected him. He thought that if he birthed a child, it would be fated to love him. Imagine your own child rejecting you?
>>
>>106935493
Uhh
>>
File: never.jpg (98 KB, 588x815)
You ever stolen a prompt /ldg/?
>>
File: file.jpg (175 KB, 800x1120)
>>106935479
how does one use the Lo(r)a?
>>
>>106935513
>I invented the day to night cycle
we are in the presence of god himself
>>
>>106935523
I was expecting Jesus to come back, and we got god instead, meh I'll take it
>>
>>106935519
you just add a load lora node after the high noise one and select it

low noise is using the wan 2.2 lightning low noise lora at 1 strength.
>>
File: 224737.png (654 KB, 915x1182)
any suggestions on forcing extreme perspective/foreshortening? controlnet openpose/depth can't seem to disambiguate foreshortening from proportion.
>>
>>106935439
KEK
I especially like how after the dog puts it on, he just sits there for a second with a smile before getting shocked kek
>>
>>106935513
prompt autists:
>ur stealing my "work"
>>
>>106935533
pictures would be extremely helpful. when I drag in the load lora node it's undefined, even though I see I have loras in my library
>>
>>106935493
Nah, we're discussing technical stuff. Not interested in the drama or which jpg displays in the thread, I'm not a zoomer.
>>
>>106935571
use the comfy wan 2.2 template, it uses lightx2v

change the default lora to the new 2.2 one for high, low still uses 2.1.
>>
this is what AI was made for

https://files.catbox.moe/iin8ki.mp4
>>
Fact: 99% of sloppers quit right before they press go on their most godly gen
>>
>>106935705
truth
>>
https://files.catbox.moe/penfpw.mp3
>>
>>106935705
what if my gens are on a 24/7 cycle?
>>
how do I make something like neurosama with my gens?
>>
do people still use flux? what for?
>>
>>106935740
what
>>
>mfw no highlights
>>
>>106935752
chroma
>>
Who's the best genner here?
>>
File: radiance.png (3.39 MB, 864x1488)
>>106935740
stuff like this?
https://github.com/kimjammer/Neuro
https://github.com/moeru-ai/airi
https://github.com/Open-LLM-VTuber/Open-LLM-VTuber
https://github.com/fagenorn/handcrafted-persona-engine

it's more on the TTS/LLM side tho since no one has the processing power and VRAM to run realtime wan/sora/whatever with longer term consistency yet
>>
>>106935780
me, check it
>>106935727
ace step is BACK BABY

(nothing has changed yet)
>>
sheesh ace step can do some things that suno and udio can't lol
>>
>>106935801
>ace step can do some things that sunno and udio can't
being good?
>>
>>106935864
lol
>>
lads and gooners give it to me straight. How do I make nsfw videos with this new technology? I'm retarded, assume the worst.
>>
Reminder: do not trust the Ace Step shill. It's an irredeemably shit model.
>>
you don't make videos. you make ace step hatespeech.
>>
Ani, advice from a boomer who doesn't care much about these dramas: Don't do this anymore. Good products get known through word of mouth from satisfied users.

Yes, I know Comfy used this general as a platform to promote itself and continues to do so with weekly self-promotion, and that its main competitor Invoke is mysteriously not in the OP.

But doing this kind of thing doesn't make you look good either.
>>
>main competitor
>>
who's the best AI tuber?
>>
>>106935942
idk but listen to miku:
>>106935727
>>
>>106934841
you could bring some examples,
not just this one sample of a stupid 1girl standing.
something with high dynamics and many details,
like that retard's 1girl missing with a sword and getting punched
>>
>>106935942
Neuro-sama
>>
Look at this:
>>106934981
>>106935035
>>106935051

And this
>>106935891

Funny how an anon uses Asian 1girls to show his WAN gens, conveniently an Asian model.

We should set a limit of 3 gens per thread that do not contribute anything new and are just posted for the sake of posting while constantly shilling a specific model.

I am not just seeing this with WAN, I am seeing it with other models too, and this is when I start to suspect that we do not just have innocent users here but actual employees AKA shills.
>>
File: ComfyUI_20213.png (3.21 MB, 1200x1800)
>>106935752
I still prefer Flux to the others. Qwen isn't very imaginative and strays very little from your initial image, Chroma has a billion versions of itself and they all look like SD 1.5 (not a bad thing!), SDXL/SD3 was always too sloppy looking to play with, and SaaS is for gaybos. Flux on the other hand works well with my gacha gen style, and it's low on body horror (it's definitely still there though). There's also a lot more snake oil to play and experiment with, so it's also more mature than the rest.

If I genned porn, it would probably be different (Chroma).

>>106935780
Well, I gen the cutest girl, so I'm easily top five!
>>
Fuck you trannni, fuck you troonnffy
>>
>>106934971
> but generating wan videos is like doing many dozens of SDXL images even at 1/2 lower resolution than SDXL native.
you have to denoise a latent all at once
>>
>>106935978
looks like a who from whoville
>>
Is Wan not able to do a dolly zoom?
>>
>>106934820
I need to see this cat spreading her holes for me
>>
>>106934820
Where can I find women like this IRL
>>
>>106935554
image to latent
>>
>>106936058
engrish too hard for chinaman
>>
>>106936231
you mean a basic i2i? that's tough if you don't want any characteristics from the input image except the pose/perspective.
>>
>>106936300
i mean i2i or i2v
gen with no foreshortening, edit, then pass it as an input
or try loras
>>
Why is Img2Img slow?
>>
>>106936339
because it has to encode the image first
and maybe you forgot to resize
>>
in Ace Step, how does lyrics strength work?
>>
ace step sometimes produces stuff so busted it's hilarious.
>>
>>106936363
By using Suno instead.
>>
>>106936375
Can't. suno doesn't do nsfw
>>
File: 1749558892175006.mp4 (1.09 MB, 704x480)
the dog on the left hits the man with glasses with a large steel hammer very hard, causing a lightning strike to hit him in the face, and he falls to the ground while emitting lightning.
>>
>>106936452
We call her Kaya Thor!
>>
>hyperfixation retard is back
>>
>>106935882
start with the info in OP.
Go from there.
You're going to need to set up comfyui to begin with and then get the model & loras you want to use and maybe a workflow template if you're new and want to start generating quickly to learn as you go.
>>
>>106936600
this is where I'm stuck anon, what loras do I even use? I think it would be best to train my own based on my own research, so I think ai toolkit, yes? then feed it into like wan 2.2 5b
>>
>>106936606
dunno, whatever you want to generate I guess?
I would play around with wan first and then if you aren't getting the result you want for something specific look for a lora or look into training your own
I think the /b/ general has the celebrity loras floating around
>>
I'm ooming with the fp16 models, does comfy not flush the high noise model out, even from ram, when loading the low noise?
How am I ooming with 32gb vram and 64gb ram, on a 480p gen?
>>
>>106936670
Comfy keeps models loaded even after you exit the program; you have to fully shut down the computer just to flush it if you're using Comfy.
>>
>>106936675
They're in my walls too?
>>
File: 1729656591236610.mp4 (1.02 MB, 704x480)
the dog on the left jumps and punches the man with glasses very hard, causing a lightning strike to hit him in the face, and he flies in the air to the right through a wall, as the camera tracks him.

kek camera tracking worked
>>
>>106936675
Not sure if this is a shitpost but this actually does happen to some extent on my machine, latest update
>>
>>106936692
thats fucking great
>>
>>106936708
the new lora is definitely an improvement for high noise.

https://huggingface.co/Kijai/WanVideo_comfy/tree/main/LoRAs/Wan22_Lightx2v

was updated recently.
>>
Why is Chroma so much better than Flux?
>>
>>106936692
>>106936721
it's still slow mo, sad
>>
>give the new double light lora setup a try for nsfw
>turns it into a cheerful winter wonderland song
https://litter.catbox.moe/osgzmqf0l865tt2g.mp4
>>
File: QwenEdit_00152_.png (1.15 MB, 1024x1024)
tendies
>>
File: Jinn.jpg (115 KB, 900x1191)
Have any of you heard of the tale where the prophet of Allah had a riding encounter with twelve black Jinns, otherwise known as genies? Fascinating story that. I wonder if AI could recreate it.
>>
File: 1734267499818920.mp4 (1.22 MB, 704x480)
the snyder cut:
>>
File: lol.mp4 (2.38 MB, 752x416)
(different anon)
the angry dog stands up and transforms into an impossibly muscular hulking bipedal monster. it then angrily runs at the man very swiftly and forcefully lifts him up and throws him directly into and through a wall to the right side of the frame. The camera consistently tracks both the man and the dog.
>>
How can Sora2 be surpassed?
>>
>>106937015
when china learns to not slop the dataset
>>
>>106937015
>How can Sora2 be surpassed?
the day China has enough balls to put pop culture in it (and stop adding synthetic slop in their dataset)
>>
File: cat.jpg (203 KB, 1200x630)
Any helpful prompt to get a late 1900s kind of look like this with Flux?
>>
Why are there two generals?
>>
>>106937259
jews ultimately
>>
File: 1744279994197755.mp4 (1.43 MB, 704x480)
>>
>>106937420
post in the other thread dumbass
>>
>>106936670
> does comfy not flush the high noise model out, even from ram, when loading the low noise
It does occasionally. Most of the time you have to unload manually or restart comfy.
>>
>>106936670
>>106936675
He fucked something up in the last one or two releases. I'm getting a shit-ton of issues as of late (VAE problems, models not unloading, etc) on WFs that worked fine last week... and often work until they don't now.

I might have to start fresh again for the zillionth fucking time.

>>106937163
Try adding a digital camera model name and a filename they might have used.
>>
How do I use kohya lora training script (sdxl_train_network.py) without dreambooth style naming for the image folder (repetition_activationword)? The "--train_data_dir" param that shows up in the help page seems to demand it.
I am asking this because Easy Training Scripts wrapper is six months out of date at this point so I am just trying to use newest kohya for better results.
And I know that there has to be a way since ETS doesn't require that during lora training despite being kohya under the hood.
>>
>>106937873
INFO     Using DreamBooth method. 

This shows up in terminal when running sd-scripts directly.
It doesn't with ETS.
What is causing it?
I more or less converted my ETS toml into kohya equivalent as I understand.
>>
should I use qwen or wan for single photoreal image?
>>
>>106937917
chroma
>>
>>106935063
yeah but how much lower quality is Q3 compared to even Q4 or Q5?
>>
>>106937924
nice joke
>>
File: 1696742705068164.jpg (81 KB, 640x881)
Can someone be a nice fren and just give me a workflow that just werks for generating lewd videos from a sample image? My videos end up severely deformed in the face.
>>
>>106938100
For Wan 2.2? And how much VRAM do you have? If it's less than 24GB, don't @ me
>>
File: file.jpg (20 KB, 350x323)
>>106938131
It's exactly 24GB, fren.
>>
File: NYVZdxP.png (18 KB, 128x119)
>>106938136
>https://files.catbox.moe/0qell0.json
I'm the anon who made the old wan 2.1 rentry. This is basically a modified version of the lightning workflow from there, made to work with 2.2. Super simple and works.
I use the redditor's lightx2v combo. I tested a bunch of other combinations, including NVIDIA's rCM LoRA. The redditor's gives me the best motion for the simple slop stuff I make, even though it's still slo-mo half the time. I just speed up videos in post.
It's set to auto-scale the image to 720p, so feed it images at or above that res. Easy enough to set it lower if you want 480p or whatever. I also set steps to 10. I wouldn't go lower than 4 high/4 low personally, otherwise there's a noticeable quality drop.
>>106923658 example outputs, compressed for 4ch.
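since I mentioned speeding things up in post, this is roughly what I run (assumes ffmpeg is on PATH; the 1.25 factor is taste):
[code]
# re-time the video ~25% faster; -an drops audio (wan gens have none anyway)
import subprocess

subprocess.run([
    "ffmpeg", "-i", "slow.mp4",
    "-filter:v", "setpts=PTS/1.25",
    "-an", "fast.mp4",
], check=True)
[/code]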
>>
>>106934820
cftf?
>>
>>106938347
Are the clips for the high and low noise lora loaders supposed to be connected? They're not in this wf
>>
>>106938373
No.
>>
>>106937873
>>106937901
Wasted 1.5 hours of my life.
You need to pass a valid --in_json argument to prevent it from using dreambooth. (class, captions combined into a single file. I can't do that because I don't understand class shit.)
I tried to bypass that by editing the script, setting use_dreambooth_method to False.
I get INFO Training with captions. but then it errors out:
voluptuous.error.MultipleInvalid: expected str for dictionary value @ data['datasets'][0]['subsets'][0]['metadata_file']

I am done.
I will just use repeat+trigger word folders.
Still no idea how ETS bypasses this shit though.
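for the archive: the cleaner escape hatch seems to be --dataset_config, which lets you set repeats without the folder-name convention. I haven't verified this against the latest kohya, so treat the toml as a sketch of the schema rather than gospel:
[code]
[general]
caption_extension = ".txt"

[[datasets]]
resolution = 1024

  [[datasets.subsets]]
  image_dir = "/path/to/images"   # no "10_trigger" naming needed
  num_repeats = 10
[/code]
then pass it with --dataset_config config.toml instead of --train_data_dir.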
>>
Not enough coom bait itt desu
>>
>>106938347
If I want better generation of a model I have, should I train my own LoRA to use? Would that benefit accuracy in terms of face etc.?
>>
>>106938466
If you mean a specific person, then yes. Character/people LoRAs for Wan don't need to be trained on videos either. Images are fine. Last time I made one, I used 150 images. I'd get 1/3 from different angles, the rest can be front facing. Use Joycaption to caption them.
>>
>>106938486
Good stuff, I'm gonna try this with AI toolkit and see what the results are with your workflow posted above.
>>
File: 1733179522733672.png (6 KB, 200x104)
>>106936675
>>106936701
you can do it manually until it's unbroken
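if clicking is too much work, recent comfy builds expose the same thing over the API (assuming the default 127.0.0.1:8188, and that your build has the /free route; older ones may 404):
[code]
import json, urllib.request

# ask comfy to unload models and free cached memory, same as the button
req = urllib.request.Request(
    "http://127.0.0.1:8188/free",
    data=json.dumps({"unload_models": True, "free_memory": True}).encode(),
    headers={"Content-Type": "application/json"},
)
urllib.request.urlopen(req)
[/code]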
>>
does anon use gpt to refine your prompt? how much improvement is there to gain?
>>
>>106938928
Sometimes.
t5 tends to like those AI slop novels. Grok 4 fast seems to perform better than ChatGPT in my experience for this task.
Use a similar image that you like and caption it with joy caption > change details as needed is another option.
>>
>>106934820
OMFG, can you give me more info on this picture? How was it generated? Who generated it and do they have a twitter or something i can browse for more?
>>
Anyone know how to feed a text prompt to 'Clip Text Encode (Prompt)' in Comfy?

Apparently you used to be able to enable a Text input in the node, but that's gone, also you could use a 'Text to Conditioning' node but that's also gone...
>>
File: .png (35 KB, 1480x505)
>>106938977
>>
>>106938625
nta, but that doesn't always work. Most of the time, it'll keep some residual shit in your VRAM, even if you flush it. It only fully clears if you exit/restart Comfy.
>>
>>106939033
My man thanks, that circle was practically obscured by the prompt box, perhaps a Firefox rendering bug
>>
File: 1.png (34 KB, 527x568)
>>106939065
Comfy hides node variable inputs until you try to drag an output into them from another node. You can also see the hidden input if you hover your mouse just left of the variable box, pic related.
>>
>>106939082
Thumbs up!
>>
Can wan 2.2 animate be used to upscale videos and/or inpaint areas?
>>
File: SS012653.png (54 KB, 617x834)
>>106938347
To add to this, I came across a method to fix those annoying brightness/exposure increases when using lightx2v with more than 4 steps. You usually see it as a gradual increase in brightness as the video progresses. Added it to my workflow, if anyone's interested in copying the method.
>https://files.catbox.moe/jpyej5.json
>>
File: juli.jpg (157 KB, 800x800)
DL my failed UI
>>
why are there two threads
>>
what's a good workflow to extend a video? basically add another scene to it after it's done playing
>>
>avatar of a literal pedophile as OP
/ldg/ has fallen
>>
>>106939397
it's the retard himself in a poor attempt at hijacking the thread and promoting his shitty GUI
>>
>>106939397
anon, 90% of /g/ are into loli
you are the outsider here.
>>
>>106939408
when he wrote literal, I assume he didn't mean "into loli", or this place is even more retarded than I thought
>>
blessed thread of ignoring the troll and focussing on tech



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.