/g/ - Technology




File: 1756811738957524.jpg (775 KB, 2220x1529)
Discussion of Free and Open Source Diffusion Models

Prev: >>107791088

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2485296
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe|https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg
>>
Now that we know for sure ltx is shit how about get back to
HunyuanVideo-1.5
>>
fish'n'chips
>>
>>107792316
kek
>>
Blessed thread of frenship
>>
>>107792316
so you were just trolling then, got it
>>
is the Kandinsky pron a joke? or is it with LoRA, asking for a fren
>>
File: file.png (1.94 MB, 1824x1674)
Should I use Stability Matrix??
>>
>>107792347
>>107792087
>>
>>107792347
that's real. when a slavic nigger says it's uncensored, it actually is
>>
>>107792351
nobody asked schizo
>>
>>107792359
>you can make 10 secs of porn if you let your 6000 bake for an hour
might as well hire a hooker, cheaper too
>>
>>107792352
if you think you are dumb and need handholding at the cost of disk space then sure
>>
>ask anon for workflow
>he actually gives me
>get home and pop it on comfy
>its the default template and obviously not what he used
>>
File: 1743662667268594.png (2.21 MB, 1344x1152)
This was genned in comfyui!
>>
>>107792395
>posts real photo
>says it is ai
boring
>>
>>107792384
ranfaggot doesn't code and he didn't make anistudio
>>
Kijai fp8 Kandinsky pro works perfect with comfyui wf and CFG1/8steps... anon need to read

https://huggingface.co/Kijai/Kandinsky5_comfy/tree/main/fp8_scaled/Pro/T2V
>>
>>107792393
And the person who defends its creator 24/7 (totally not its creator!) spams this thread and routinely gets his posts nuked, often 100 posts in individual threads.
>>
File: 1767765755990533.jpg (68 KB, 493x612)
>>107792305
can someone give the full resolution pic?
>>
>>107792384
>>107792393
>software made by a doxer
>ANISTUDIO is literal malware
Proofs?
>>
>>107792399
its clearly the default zit girl thougheverbeitdoe
>>
>>107792406
oh? they released distilled versions? Maybe speed wont be nearly as bad then. How are they?
>>
>>107792416
proof is the author is a retarded frenchoid sneething for years about comfy and derailing threads to shit on comfy/prop up his ui
>>
>>107792367
>>107792235
>>
>>107792423
do you have the death threat post lmao, that was fucking GUCCI, wwhat a loser
>>
>>107792423
doxxing implies he did drop info. this doesn't prove anything. do you have proof of anything you are claiming?
>>
anon can post more NSFW from Kandinsky pro with Kijai FP8?
>>
oh dear. ranschizo should probably leave forever for her safety. she will be missed
>>
>>107792467
>You don't need proof
lmfao. way to come out as a clown
>>
File: o_00044_.jpg (556 KB, 1264x1552)
>>
>>107792467
seems to me there's some sort of agenda. are you perhaps falseflagging trying to make ani seem unhinged, ran?
>>
>>107792470
nta but even I hope you get hit by a bus. You are insufferable
>>
>>107792482
model anon?
>>
File: file.png (9 KB, 353x83)
Getting this when I try the ltx template, what do I do?
>>
>>107792492
Your mental gymnastics is malware for the brain
>>
File: ComfyUI_05704.png (3.2 MB, 2048x1536)
How do you turn off Comfy offloading to memory? It's leaving 7GB of my VRAM unused for some reason and completely tanking the performance.
>>
>>107792497
check if the height is connected? reconnect the height input?
>>
File: o_00029_.jpg (713 KB, 1264x1552)
>>107792495
chroma dc 2k t2 sl4
>>
works rather nice with qwen edit 2511 gens too:

41 seconds for this gen, try this workflow:

https://github.com/Lightricks/ComfyUI-LTXVideo/blob/master/example_workflows/LTX-2_I2V_Distilled_wLora.json

vid: https://files.catbox.moe/kqtusp.mp4
>>
>>107792532
what the actual fuck is wrong with you?
>>
people just let this pussy drama faggot put whatever he wants in the op. disappointed in you anons
>>
File: o_00048_.jpg (520 KB, 1264x1552)
>>
>>107792543
Mods only care if you upload or even talk about pedo stuff. They don’t care about threats of violence because it doesn’t hurt their advertiser dollars.
>>
the black man says "man, I cant wait to do some fent and eat some mcdonalds." After he finishes speaking, he holds up a bag of white powder and says "shiiiiiiiiet!"

kek, 58 seconds for 9 seconds with audio

https://files.catbox.moe/e0thma.mp4
>>
File: file.png (68 KB, 1103x674)
>>107792513
I did and that fixed it, but now I see this one here, why is my shit all broken, this weird node is passing null values
>>
File: doxthreats.png (83 KB, 737x433)
It is of vital importance to the local diffusion community that newcomers know to stay safe from this security threat. When someone spends their entire time defending a no-name dev and also makes dox threats against other posters, you must NOT run their code on your computer.
>>
awesome news for a ltxv lora training
https://www.reddit.com/r/StableDiffusion/comments/1q6asqd/ltx2_lora_training/
apparently works on 5090, its fast and audio training works as well for voices
>>
>>107792570
try this workflow, it's fast and it fixes the stupid enhancer nodes that were fucking up

https://files.catbox.moe/y6y37b.json
>>
>kandinsky 20B took 10 mins for this https://files.catbox.moe/uljslp.mp4

oof
>>
>>107792580
thanks king
>>
lmao only thing I need is a better prompt than canned laughter: using the workflow from >>107792580 btw, 60 seconds, 9s/240 frames.


https://files.catbox.moe/5r47u2.mp4
>>
File: 1654012964004.gif (1.98 MB, 400x300)
Something is fucked up with ltx2 for me, I get one gen off, then anything after that is taking forever to generate.
>>
>>107792604
not enough RAM I guess. Use a VRAM debug node to offload before wherever it hangs up for you. Though then you gotta reload models, which takes forever. 64GB is barely enough with all fp8
>>
>>107792604
try the workflow above, I have 16GB and 64GB ram and it works fine, no OOM

also try --reserve-vram 4 in startup settings/flags
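
A minimal sketch of what passing that flag looks like, assuming ComfyUI is started through `main.py` from inside its own folder. `build_launch_command` is a hypothetical helper for illustration, not part of ComfyUI; the number after `--reserve-vram` is in GB.

```python
import sys

def build_launch_command(reserve_vram_gb=None, extra_flags=()):
    """Build the argv for starting ComfyUI (hypothetical helper, not a ComfyUI API)."""
    # Assumes the current working directory is the ComfyUI folder.
    cmd = [sys.executable, "main.py"]
    if reserve_vram_gb is not None:
        # Amount of VRAM in GB to leave free for the OS and other software.
        cmd += ["--reserve-vram", str(reserve_vram_gb)]
    cmd += list(extra_flags)
    return cmd

# Show the part that would go after the Python executable.
print(" ".join(build_launch_command(4)[1:]))
```

If you launch through a script instead, the same flag would be appended to the line in that script that starts `main.py`.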
>>
>>107792612
*this is with fp8 distilled btw, which works fine
>>
>>107792611
>>107792612
I have a 5090 and 192gb of ram. There has to be a bug.
>>
yea kandinsky 20B sucks, its prompt adherence is horrible and its ungodly slow
>>
im impressed how emotive the gens are desu:

https://files.catbox.moe/yomurf.mp4
>>
>>107792622
try --reserve-vram 4 to see if comfy is not trying to steal the last of your vram or something while something else is using it
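
One way to check whether something else is already holding VRAM before comfy starts, sketched under the assumption that an NVIDIA driver with `nvidia-smi` is on the PATH. `parse_vram` and `vram_in_use` are names made up for this example; the values come back in MiB.

```python
import subprocess

# nvidia-smi query that prints "used, total" per GPU as bare numbers (MiB).
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]

def parse_vram(csv_text):
    """Turn nvidia-smi CSV output into a list of (used_mib, total_mib) per GPU."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows

def vram_in_use():
    """Run nvidia-smi and return the parsed usage (needs an NVIDIA driver installed)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_vram(result.stdout)
```

Calling `vram_in_use()` with comfy closed shows how much the desktop, browser and so on are already taking, which is roughly what the reserve value has to cover.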
>>
>>107792622
try updating comfy in the update folder and in the manager, has to be a bug of some type
>>
>>107792512
--highvram or --gpu-only startup arguments, but it'll use your gpu as the sole allocating device. You MAY get away with manually setting up offloading with distorch multigpu nodes but I'm not sure
>>
>>107792636
this is only advisable on rtx 6000 or H100
>>
>>107792612
>>107792617
nta but i tried this and it doesnt work, get unet unexpected errors and "no CLIP/text encoder weights in checkpoint, the text encoder model will not be loaded", crashing comfy instantly, 50gigs in the trash lmao
>>
File: 1767763971592431.png (1.66 MB, 1120x1400)
which anime was this?
>>
yea I see why people didn't bother with kandinsky
https://files.catbox.moe/cd0lbx.mp4
>>
File: 1751123333037761.png (52 KB, 1216x603)
>>107792649
get the gemma 3 it safetensors file

then, go here and get all the small files NOT the safetensors and put them in text encoder folder:

https://huggingface.co/google/gemma-3-12b-it-qat-q4_0-unquantized/tree/main

I had an error till I did that, another anon suggested it, these are my files:
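
The "all the small files, NOT the safetensors" rule can be sketched as a filename filter. The listing below is illustrative only, not the real contents of that repo, and `small_files_only` is a hypothetical helper. If you download with the `huggingface_hub` library instead of the browser, its `snapshot_download` function takes an `ignore_patterns` argument that should do the same job (the repo still needs a logged-in account).

```python
from fnmatch import fnmatch

def small_files_only(filenames, skip_patterns=("*.safetensors",)):
    """Keep every file whose name matches none of the skip patterns."""
    return [
        name for name in filenames
        if not any(fnmatch(name, pattern) for pattern in skip_patterns)
    ]

# Illustrative listing; the actual file names in the repo may differ.
listing = [
    "config.json",
    "tokenizer.json",
    "tokenizer.model",
    "model-00001-of-00005.safetensors",
]

for name in small_files_only(listing):
    print(name)
```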
>>
>>107792649
your comfy is not up to date: git pull, activate the venv and pip install -r requirements.txt in the comfy folder and then in the ltxv custom node folder
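
Those steps, sketched as a script. Assumptions: the venv is already activated (so `sys.executable` is the venv's Python), and the LTX nodes live in `custom_nodes/ComfyUI-LTXVideo`, a folder name guessed from the repo name; adjust it to whatever yours is called. `update_commands` and `run_update` are hypothetical helpers.

```python
import subprocess
import sys
from pathlib import Path

def update_commands(comfy_dir, node_dir_name="ComfyUI-LTXVideo"):
    """Return (working_dir, command) pairs for the update, in order."""
    comfy = Path(comfy_dir)
    node = comfy / "custom_nodes" / node_dir_name  # assumed folder name
    pip_install = [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"]
    return [
        (comfy, ["git", "pull"]),  # update ComfyUI itself
        (comfy, pip_install),      # ComfyUI requirements
        (node, pip_install),       # custom node requirements
    ]

def run_update(comfy_dir):
    """Execute each step, stopping at the first failure."""
    for cwd, cmd in update_commands(comfy_dir):
        subprocess.run(cmd, cwd=cwd, check=True)
```

`run_update("path/to/ComfyUI")` would run all three steps; printing `update_commands(...)` first is a cheap way to check the paths before anything is executed.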
>>
>>107792512
jenny my beloved...
>>
it's so fast once the kinks are worked out, 55 seconds, even on a 4080:

https://files.catbox.moe/1joxow.mp4
>>
seems like you can trick gemma into doing sort of nsfw. We will need an abliterated gemma to work with it, it seems.
https://files.catbox.moe/1sql2m.mp4

https://files.catbox.moe/8ydf0z.mp4
https://files.catbox.moe/r6kz3e.mp4
https://files.catbox.moe/lju7xd.mp4
https://files.catbox.moe/jd1biy.mp4
>>
>>107792675
im on latest master and all requirements satisfied
>>
File: LTX-2_00003-audio-1.mp4 (3.75 MB, 1024x1024)
>hitler speech audio with trump

Such a cucked model, holy shit..

>>107792633
Seems that was it, I only updated through the folder earlier.
>>
holy shit it actually worked. this model is crazy fast and actually works. wan 2.5 has a lot of work to do.

https://files.catbox.moe/imb2yw.mp4
>>
>>107792695
and you have the correct text encoder, the full gemma folder of files with the safetensors inside of it? If so then something must be fucked. Remove all your other custom nodes, maybe one is conflicting. That has happened before
>>
>>107792556
>people just let this pussy drama faggot put whatever he wants in the op
You wanna know what I think? I don't give a fuck about the OP so long as it has the right name in the name field.
>>
>>107792706
sometimes it does a panning effect, I got that with the floyd stuff too now and then. it works.

try prompting "the man is singing" for the audio.
>>
>>107792717
>he fell for it
>>
>>107792712
i didnt get the gemma jsons yet since i dont have a huggingface account at this moment
>>
>>107792728
ah, yea you need all that, then put the safetensors in it.
>>
Don't get what I'm doing wrong, no matter what I do she doesn't want to lipsync
https://files.catbox.moe/6gdphw.mp4
>>
>>107792583
It's t2v or I2V?
>>
File: 1762734357182929.png (2.45 MB, 1344x1152)
>>
>>107792708
anon help me pretty please
https://files.catbox.moe/lvjye3.mp4
>>
>>107792744
try prompting "the girl is singing with passion"
>>
>>107792766
I2V
>>
>>107792774
for the audio input workflow, what worked for my floyd gens was "the man is singing (with passion/loudly/etc)"
>>
>>107792779
I tried, but I think my toy is broken fuck https://files.catbox.moe/ohjpw5.mp4
>>
>>107792792
for audio workflow use the kijai one he posted:

https://files.catbox.moe/f9fvjr.json
>>
https://files.catbox.moe/7zs4se.mp4
>>
>>107792809
its funny that wan 2.5 was beaten to the punch by a faster model, I love wan 2.2 but sound is the missing element.
>>
kek

https://files.catbox.moe/7ifa8v.mp4
>>
>>107792831
is this real
>>
These custom nodes are a bitch and comfyui manager won’t install them no matter what i do. Is there a place I can get them so I can manually install them?
>>
>>107792831
that is actually great lol
>>
>>107792849
use stability matrix
>>
File: ComfyUI_05780.png (3.51 MB, 2048x1536)
>>107792636
>but it'll use your gpu as the sole allocating device
So you can't turn it off... goddammit! It's doing this for the VAE because it's roughly 400MB, but it's wasting so much fucking time setting that up when it could just dump that in VRAM (again, there's 7GB free!) and leave it there, you know, like it used to.

Guess I'll just have to live with the extra 15-20s gen times because I don't think it's something he'll fix. FUCK!
>>
>>107792855
That is what I’m using.
>>
>>107792831
Look how dangerous this shit is, your average facebook boomer would eat it without thinking twice, that's why we can not allow something like this to work for porn
>>
>>107792859
comfyui manager should tell you the packages github if you click install missing nodes
>>
>>107792862
AI porn could make onlyfans obsolete, it's a net good for society.
>>
>>107792884
what do you think happens if you make ethos lose their jobs?
>>
>>107792714
>You wanna know what I think?
no. anyone complacent with trannies shitting on their lap is not worth listening to
>>
>>107792891
sitting*
>>
>>107792888
they are forced to get jobs that aren't degrading for income.
>>
>>107792891
>>107792894
*sharting
>>
>>107792903
yeah like there are plenty of these that pay as much
>>
>>107792636
>>107792856
Highvram still loads into RAM, just using different logic. Give it a try.
>>
lmao

a group of jewish rabbis wearing a yarmulke walk in from the right and grab the man in the middle wearing glasses. The rabbis say "THIS IS A SHOAH!" before dragging him off camera to the right.

https://files.catbox.moe/gjbkzh.mp4
>>
training a lora on a video with audio legit voice clones them, this is crazy.
>>
>>107792931
like better than eleven labs level I mean. the ltxv trainer uses deepspeed so it can work with low vram as well
>>
>>107792927
did you add the music?
>>
>>107792574
>29gb vram with int8 quanto
bro was training in italian
>>
>>107792954
no lol thats what made it even more funny
>>
>>107792934
deepfakes and scams are gonna fucking take on a whole new dimension. This will make new laws for sure
>>
>troonlien thread
shan't be posting
shan't be using trannystudio
>>
>>107792517
How many Chroma versions are there? Which ones are actually good?
>>
File: LTX-2_00013-audio-1.mp4 (3.75 MB, 672x704)
We have come so far.

https://files.catbox.moe/rk3ul0.mp4 (loud)
https://files.catbox.moe/svuige.webm (loud)

Now we just need SVI tech to work with ltx2.
>>
>>107792962
you could easily get people fired with stuff like this as most normies dont know shit about AI.

https://files.catbox.moe/g8hn6n.mp4
>>
>>107792979
all, new ones are just constantly being put out as its trained
>>
>>107792974
There's nothing to post anyway, it's just mindless ltx spam. I wonder if people who praise it are just some sort of shills or are just as blind and retarded as the migufloyd spammer.
>>
>>107792996
>fast
>video and image
>can do audio + image to video, or i2v
best model since qwen edit, of course it will be used/discussed.
>>
>>107793001
*correction, qwen edit, and zimage.
>>
>>107792988
dog?
>>
File: o_00066_.jpg (591 KB, 1672x1264)
>>107792979
>Which ones are actually good?
I use the ones I've trained loras for.
>>
>>107793001
Drop the sales pitch. Guess it's shills.
>>
Saars, please stop dumping everything to catbox, use /wsg/ or /gif/ threads for dumping and post links
>>
File: 1756811738957525.mp4 (390 KB, 1024x1024)
can ltx do this?
>>
>>107793017
no shilling, wan 2.2 is good but those kikes are holding wan 2.5 hostage for API only, and this is open source.
>>
>>107793026
>he thinks israelis won't sell out
>>
>>107793026
We have WAN 2.2.
>>
Is it true that the developer of AniStudio runs a tranny grooming discord focused on children?
>>
>>107793035
proof?
>>
apparently the smudginess when there are fast movements is because the temporal upscaler is not being used, but comfy does not have support for it yet
>>
holy shit, it does a perfect trump with no lora.

https://files.catbox.moe/q7bd6t.mp4
>>
>>107793060
that's literally perfect wow. ltx is just really good
>>
>>107792921
Nope. It OOM'd without doing anything on first launch and didn't get past the ksampler when I close the error window and hit run again. It did use all my VRAM that time though!
>>
with wan 2.2 we needed 6-8 steps instead of 4. what about ltx? 8 isn't enough, that's where your shitty results come from
>>
File: image (29).jpg (79 KB, 1370x508)
git pull comfy then set it up like this, temporal upscaler is needed along with the spatial upscaler, latents were not lining up correctly, that is causing the issues
>>
can you tell if this is AI?

https://files.catbox.moe/uxdz1z.mp4
>>
temporal upscaler is here
https://huggingface.co/Lightricks/LTX-2/blob/main/ltx-2-temporal-upscaler-x2-1.0.safetensors
>>
>>107793133
nah no fucking way. it's so real
>>
>>107793022
crickets
ltxfags in shambles
>>
>>107793124
where node from
>>
>>107793133
this one has even better expressions:

https://files.catbox.moe/sb1aqd.mp4
>>
>>107793150
Wowww okay I think we're onto something guys... this is revolutionary
>>
File: file.png (196 KB, 496x516)
>>107793150
too much quality I kneel
>>
>>107793133
How is the audio so bad? I used some random website a year ago to generate AI audio clips with just 30 second voice samples and they sounded extremely realistic.
>>
>>107793172
turns out trying to make a single model do multiple things isn't smart
>>
wise words, Frieren...

https://files.catbox.moe/z3vech.mp4
>>
>>107793180
deep voice did the trick, imo

even though subs > dubs

https://files.catbox.moe/iqmy7j.mp4
>>
>>107793180
this is south park levels of animation
>>
>>107793189
something a bit more wholesome:

https://files.catbox.moe/ahud98.mp4
>>
you might have to manually pull ltxv nodes then reinstall requirements.txt for it for the spatial and temporal upscalers
https://github.com/Lightricks/ComfyUI-LTXVideo/commit/63c8a9285c5c17bcd19c7088da8a6597719d336e
>>
Has anyone been able to do regular 2D cartoon animation with LTX-2 yet? I haven't seen any examples and I don't have time to test it myself right now.
>>
>>107793227
hold on things were broke >>107793225
>>
>Vram shit the bed and have to send card in for RMA
Reeeeeeeeee
>>
>>107792856
Post that Jenny lora. We're family here.
>>
well, now you can shit on other fanbases with i2v and audio.

https://files.catbox.moe/93fci7.mp4
>>
>>107793251
so close, just need a good toss now.

https://files.catbox.moe/inhzb2.mp4
>>
>>>/wsg/6067201

Finally got this shit working
>>
>>107793185
How do i make i2v with this
>>
>>107793268
why does it make it so cringe
>>
>>107793270
shutup benchod i deleted it
>>
this shit is so fucking bad
>>
fuck, didn't change sampler
https://files.catbox.moe/2hohuv.json
>>
>>107793292
mf make it i2v
>>
>>107793276
anon you can literally do anything. like so:

https://files.catbox.moe/dlcydq.mp4
>>
>>107793292
that should look MUCH better now. No more smudging from fast movement. ComfyuiWF did not use the temporal upscaler which has to be used alongside the spatial upscaler otherwise the latents dont line up
>>
>>107793288
say that again >>>/wsg/6067209
>>
File: file.png (131 KB, 339x291)
>>107793314
>mfw ltx
>>
>>107793323
i fucking kneel, this is a masterpiece
>>
>>107793331
did you know...

https://files.catbox.moe/8oojbb.mp4
>>
File: file.png (104 KB, 285x256)
>>107793342
who these 13%
>>
>>107793311
you just plug in a image instead of a empty image
>>
>>107793355
benchod
>>
kneel to uncensored AI, sora 2 can't do this:

https://files.catbox.moe/g7xf14.mp4
>>
>>107793372
sora 2 can say it as long as you don't type it
>>
Sorry to be that guy, but are there any examples of LTX2 image to video/audio or text to video/audio NSFW (realistic, not cartoon bullshit). I don't really use this website much. Can't be bothered setting up comfy etc but keen to see what the latest open source looks like these days. Any other boards good for this specifically?

Kind regards
>>
>>107793292
why this shit not connected to anything
>>
>>107793380
you have to trick sora into doing edgy stuff, this is up to you (plus loras)

also, same prompt but diff image: neat sky imo

https://files.catbox.moe/5zgepu.mp4
>>
here fixed I2V https://files.catbox.moe/jwmw2j.json
>>
>>107793403
who are you
>>
>>107793398
god damn it, I should stop rushing
https://files.catbox.moe/20yi3n.json
>>
>>107793394
It just came out, dumbo. It's a censored, baseline model. You need to wait for LoRA's to be trained.
>>
lmao I forgot to adjust the image setting so it cropped jotaros head off and didnt prompt anything about jotaro.

https://files.catbox.moe/45vulb.mp4
>>
brehs this is it, we are inside the singularity now
>>>/wsg/6067213
>>
>>107793429
benchod moment
>>
son of a bitch the I2V still had a disconnected node https://files.catbox.moe/b3mwgz.json
>>
>>107793452
Stop being retarded.
FUCKING STOP IT.
>>
cozy bread
>>
File: 8.webm (371 KB, 640x752)
>>107793467
>>
>>107793407
nta but at a quick glance this is a wf with the temporal upscaler included too.
>>
https://files.catbox.moe/l3wumz.mp4
>>
>>107790806
blessed thighs
>>
Do you think Julien treats the proxy abuse as a business expense?
>>
The jews... won?
>>
File: 1746774088208660.png (11 KB, 85x507)
i h8 this inconsistency so much, just randomized seeds, everything else all the same, even down to resolution and aspect ratio on the pics
>>
Am I just wasting my time trying to get Wan 2.2 to do the Nazi salute? Is it the guardrails?
>>
>>107793452
kinda far from fastest model now
>>
>>107793524
Ask yourself how many videos of nazi salutes are out there. Then ask yourself how many do you think made it into Wan's dataset.
This is something you need to specifically train into it via a LoRA.
>>
>>107793452
>Allocation on device This error means you ran out of memory on your GPU.
do i need to pass some args to comfyui? sorry just woke up so i'm out of the loop. i have a 5090
>>
>>107793526
the latent was half as long as it was supposed to be, still very fast
>>
it works very well, look at the mouth/lips/teeth:

https://files.catbox.moe/8g1vgb.mp4
>>
>>107793524
"reaches arm out in front of him with fingers straight forward and palm down"
>>
>>107793551
--reserve-vram 4
>>
>>107793551
ah, the I2V WF might be too high res / length for your gpu, I had it really high
>>
>>107793520
Watch RAM and SSD usage when genning, I bet sometimes it correctly offloads everything, sometimes it spills to a pagefile
>>
>>107793556
that should do it with offloading as long as you have enough ram, though yea, windows does not manage memory well
>>
>>107793452
why am i getting 16 s video with length 200
>>
>>107793556
i can try this, ty

>>107793560
yeah it was set to full HD. does LTX2 not support offloading like Wan? or will it offload when i pass that reserve vram 4
>>
>>107793556
6 or 8 is better
>>
>>107793574
Depends on the res though, right? 4 is enough for 1 megapixel up to like 250 frames/10 seconds @ 24gb vram
>>
finally. need an applause lora though.

https://files.catbox.moe/552o0c.mp4
>>
I swear this is the final fix for the I2V one, I accidentally only set one sampler to res_2 https://files.catbox.moe/h6v3f9.json
>>
>>107793566
200/12≈16.7
>>
he did it, he said the thing.

https://files.catbox.moe/6qrc0x.mp4
>>
File: file.png (21 KB, 485x207)
>>107793584
what's going on here mate? i think you still fucked up
>>
>>107793584
play with image strength though, it might need to be a little higher to stick to the given image but that also might hurt motion a bit
>>
>>107793607
>>107793584
benchod fix your script or are you trolling?
>>
File: adasfas.png (40 KB, 615x873)
>>107793607
? do you not have the checkpoints in the checkpoint folder?
>>
>>107793619
>>107793622
nvm i refreshed comfyui and it's fine now. ignore me
>>
and roll credits:

the blonde man driving the car says "I am LITERALLY me.". he then exits the car to the right and closes the door. 1980s synth pop music plays in the background.

https://files.catbox.moe/38mcc8.mp4
>>
>>107793624
benchod fucker
>>
>>107793624
you could also just get rid of the combo and put it directly in the loader, I forgot I dont need that anymore
>>
File: o_00084_.jpg (597 KB, 1344x1728)
>>
question for the elders, can ltx quality be fixed to be at least wan 2.2 level?
>>
File: file.png (19 KB, 413x214)
did comfy break auto increment? it's set to increment before generation but it just doesn't.
>>
kneel to Todd:

https://files.catbox.moe/y297tw.mp4
>>
so what are the advantages of ltx2 over wan2.2 beside the sound part?
>>
>>107793670
fidelity, knowledge, speed, and audio is huge. And training it will be MUCH easier since there's only 1 model
>>
>>107793670
Faster
I2V and T2V in the same model
No high noise/low noise split
>>
>>107793670
It can gen beyond 5 seconds out of the box, without SVI? Everything else looks worse, especially the video quality.
>>
>>107793683
>>107793681
>faster
It's not if you want same quality as wan
>>
>schizobake
>>
>>107793686
oh that too, it does best at 15 secs but can do like 50 ive seen without losing too much coherence
>>
>>107793627
I don't care about audio, just wish we had this degree of control with wan 2.2
>>
>>107793704
this as well, its prompt following is crazy good
https://ltx.io/model/model-blog/prompting-guide-for-ltx-2
>>
>>107793342
can it do jp audio too or only english/chinese?
>>
>>107793694
50 steps with this is still faster than 50 with wan and looks better + audio
>>
>>107793725
ive seen people do Italian, Spanish, Russian, Japanese and Chinese
>>
>>107793721
havent tried yet
>>
>>107793712
>prompt following is crazy good
Btw, for best prompt following set CFG for FIRST STAGE to 4 ish instead of 1, 2nd stage is fine at 1
>>
>>107793725
but 50 steps with this is like 15 steps with wan
>>
>>107793746
speed? yes. quality? you're lying
>>
anyways goodbye
>>
>>107793584
OOMs on a 24GB, while anon's WF from one of the last threads can do the same res but twice as many frames on my setup and still not OOM.
>>
>>107793686
>Everything else looks worse, especially the video quality
The quality is so much worse that it makes it not worth it. Are people just blind to how bad it is?
The sound is awful as well, it's unnerving. Is prompt-adherence really worth it when shit looks fucking atrocious?
>>
>>107793452
Pro tip for this one, if the upscale is ooming you, switch off the meme res2 sampler. (you can just do more steps if you absolutely must)
>>
you're a big guy

https://files.catbox.moe/is9px7.mp4
>>
>>107793786
The sound, you can get around by genning the voice with a dedicated model like VibeVoice and using that as the reference, but the output quality... yeah, it's pretty fucking bad.
>>
>>107793802
>>107793786
I don't think the quality is nearly as bad as what you are saying. A lot of portrait stuff looks downright real.
>>
>>107793802
you can get perfect cloned voices if you use the workflow with the audio input + image input.

not related, this is just i2v

https://files.catbox.moe/hxq5mv.mp4
>>
>>107793584
oom's, fuck nigga
>>
>>107793826
>perfect cloned voices
So you're actually trolling, right?
>>
Show me something cool
>>
Are we pretending wan 2.2's slow motion 16fps videos look good now?
>>
>>107793840
You didn't see the Jensen Huang one?
>>
>>107793840
another anon did it too, or you can just clone a voice fast and use that as the audio input and the video will match the audio.
>>
>>107793847
No, but this one sucks all of the detail out of the input image. Skin goes ultra smooth, for example. It looks like liquid shit.
>>
>>107793854
heres a sample just with swapped audio:

https://files.catbox.moe/1a4t0s.mp4
>>
>>107793855
sour grapes
>>
>>107792305
Why is AniStudio not in OP? It's a local UI and should be in OP
>>
Don't bite, anons. You're better than that.
>>
>>107793876
thissss, ltx is so good, vramlets are fucking seething. keep using your paycuck saarsshit loool
>>
>>107793884
it isnt even that demanding. I am using fp8 distilled with 16gb (4080) and 64 physical ram. faster than wan, also.
>>
File: 1755986211256060.png (2.07 MB, 1024x1472)
Questionable quality
with static camera lora
https://litter.catbox.moe/0b8nl1lzt51asy01.mp4
>>
cia guy does persona:

https://files.catbox.moe/872zwe.mp4
>>
>>107793904
is this ai?
>>
>>107793925
no its real
>>
>>107793904
and one more with miku: didn't prompt a female singer but there you go.

https://files.catbox.moe/z7dj2x.mp4
>>
>>107793931
you saying it recognised the sex of the singer, determined that the image contained no subjects of the appropriate sex and so conjured one to fill the role?
>>
>>107793584
There's definitely some fuckery going on with that temporal upscaler. Obviously it's way more demanding on vram. Not a big deal. But no matter what length you put in it stretches the video to 10 seconds. A lot of weirdness going on there.

Are they even supposed to be used at the exact same time together? You can't run one then the next?
>>
>>107793939
it works for anyone I just used a basic prompt so it probably inferred female singer.

also, ltx2 knows Trump natively:

https://files.catbox.moe/3j4wda.mp4
>>
>>107793946
use spatial, I havent had issues with vids of various frame lengths
>>
>>>/wsg/6067234
Is there no way to hook the upscalers to the KJ's audio reference workflow?
>>
>>107793949
this would trick 99% of boomers btw:

https://files.catbox.moe/rz0nck.mp4
>>
Yep he's trolling.
>>
if I'm being honest, the best results come from just not fucking with the upscalers and genning natively. The time rounds out the same due to model loading and offloading anyway.
>>
>>107793991
true but that is 4x the processing time, there may be a middle ground
>>
>>107793946
sorry, you need to increase the fps from 24 to 48, I forgot to change that in the WF
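
Why the fps has to change, as a quick sketch. It assumes the x2 temporal upscaler outputs exactly twice as many frames, which is a simplification (the real count may be off by a frame or so), and the 121-frame clip is a made-up example.

```python
def duration_seconds(frame_count, fps):
    """Playback length of a clip: frames divided by frames per second."""
    return frame_count / fps

frames_in = 121          # hypothetical clip length before upscaling
frames_out = frames_in * 2  # assumption: x2 temporal upscaler doubles the frames

# Leaving the WF at 24 fps plays the doubled frames at half speed,
# so the clip comes out roughly twice as long as requested.
print(duration_seconds(frames_in, 24))
print(duration_seconds(frames_out, 24))

# Doubling the fps as well brings the duration back to the original.
print(duration_seconds(frames_out, 48))
```

That would match the "stretches the video to 10 seconds" symptom: about 5 seconds of frames doubled and still played at 24 fps.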
>>
>>107794018
Yeah, go fuck yourself.
>>
>>107793980
thank u mr president
>>
how come sometimes it ooms, and sometimes it shuts down completely with no word
>>
we need to save the price of memory guys:

https://files.catbox.moe/r0drv0.mp4
>>
File: LTX-2_00010_-1.mp4 (2.63 MB, 704x960)
ltx is fucking weird.
>>
>>107794018
Okay so it's basically interpolating the frames to avoid smearing? Does it really need to run on the same pass as the normal upscale then?
>>
>>107794028
that is legit the issue there
>>
>nesting subgraphs inside other subgraphs
Kill yourself if you do this
>>
>>107794018
are you a troll or do you just not test your own workflow before posting 7 different variations of it on here?
>>
File: 1741296243616293.png (2.17 MB, 1024x1440)
yeah back to the 1girl plastic factory for me
>>
this model is incredible.

the ugandan man says "why are you no generatin one girls? are you gay?". he has a heavy ugandan accent.

https://files.catbox.moe/4y5xau.mp4
>>
>>107794057
subgraphs are completely retarded
>hurr duur let's just hide all the settings you want to change and constantly tinker with in a random maze of arbitrarily nested bolognese
nice humiliation ritual
>>
What's the best workflow around? I've tried two from here and they were both shit.
>>
>>107794075
I've noticed people using subgraphs as an excuse not to clean their shit up.
>>
>>107794074
Kekd. Low res image makes the shit audio quality not seem so out of place
>>
American Jammer:

https://files.catbox.moe/5ltpq5.mp4
>>
Meh, audio on an easier to setup model is a nothing burger, this is the same level of hype as SD 1.4 or something, and I'm bout 156,022 gens in, wake me up when it can do 1080p technicolor with sound without face glitches
>>
>>107794112
not even sora 2 can do that so I guess wait 100 years for nvidia to give us 1TB GPUs
>>
chinkoid vram monster when
>>
we are getting there. most of us really should be watching our time and using it more wisely, none of this is considered fomo. I predicted the ram and gpu hysteria 18 months before it hit and now have 2 4090s and 384 gb ram, and I'm also just chilling waiting for something good. some of these are funny but I still won't whip out my setup to run them, though I'll hit up ZIT in a month or so, seems good.

TLDR don't waste time, nothing here is fomo, unless you anticipate loras and models being deleted, then you are trapped forever
>>
we will soon have the best of both worlds for some people, kijai is working on a video to audio WF so people could feed it wan2.2 videos
>>
>>107794146
>nothing here is fomo
I think you're using that term wrong.
Maybe you mean "nothing here is worth fomoing over."?
>>
Chat if I don't generate an epic meme right the heck now I fear the humanity might go to waste
>>
>>107794160
true but I don't think those are at all corollary
>>
My main issue with LTX's lipsyncing is that it really exaggerates the mouth and jaw like it's a cartoon, so it looks a bit uncanny for real life photos. Every time something is spoken it's done with the full extent of the mouth, no subtle movements. Sometimes it looks like the mouth+jaw is just too large to be real.
>>
so can ltx do goon stuff yet?
>>
>>107794216
for some reason it's far better when given audio to work with like https://www.reddit.com/r/StableDiffusion/comments/1q627xi/kijai_made_a_ltxv2_audio_image_to_video_workflow/
>>
>>107794224
Depends entirely on what you goon to. But for what I assume the majority goons to? No.
>>
>>107794224
Yes and no. Depends on what you're into. It'll run with whatever image you feed it, but it knows nothing about sexual actions yet
>>
>>107794237
k guess i'll wait for a few days until everything is optimized
>>
>>107794249
You're looking at weeks to months until there are decent LoRAs.
>>
>>107794254
nah, there's an official trainer and people are already training AND it's a 1 model setup so it's 10x easier than wan2.2 to train https://www.reddit.com/r/StableDiffusion/comments/1q6asqd/ltx2_lora_training/
>>
>>107793513
What confuses me is that whoever is doing this does it for so long that he crashes for a full day only to start up again. I think this is a combo of both pastebin schizos
>>
>>107794314
no shit?
>>
File: 1759619741145791.png (1.98 MB, 992x1504)
last 1 girl mayhaps?
>>
File: ComfyUI_temp_cnzpx_00023_.png (2.72 MB, 1600x1000)
>>
>>107794333
chroma + lora?
>>
>>107794327
Many anons forget, they think it's only the dev. The other one is even more schizo and is happy to see the second one become like him. You know who used to be sort of normal and he's becoming a clone of the first one.
>>
So if the provided training code has deepsneed, it can use multi gpu out of the box to train, right?
>>
>no bake
AI is kill
>>
File: 1757894739925344.png (1.66 MB, 1472x1024)
>>107794333
im sorry I said LAST 1 girl
>>
kinda underwhelmed by ltx2 for now, but imma wait a few weeks and reserve my judgement until people figure out workflows and train some loras, etc.
i remember early Wan, people were giving up, before we had the speedup loras, cause you had to wait 40 minutes to generate a 5 second video
>>
>>107794362
we're still page4
>>
File: ComfyUI_temp_cnzpx_00024_.png (2.1 MB, 1600x1000)
>>107794333
Yes, chroma, ever since comfy optimized the fp8 performance of comfy + chroma fp8, it has been running really good for me.
I don't know why people don't like it, it's basically flux nsfw uncensored, wish the furry author kept training it or someone made a proper finetune for it
>>
File: breh.mp4 (937 KB, 384x704)
no cap this is too good, ltx is amazing
>>
>>107794362
mfw we find out that it was just the two schizos baking in tandem all along and no one else has the willpower
>>
Anyone else never even read the OP and just get annoyed when two schizos get into arguments over some formatting or some shit?
>>
File: 1761239394666896.png (6 KB, 93x290)
>>107793563
afterburner graphs showed ram & vram being 99% maxed at one point so it definitely must have spilled into the pagefile. weird but it just went away after using the reboot function from within comfyui. same launch parameters and everything and it's back to normal speeds
>>
>>107794395
yep, i've never clicked a single link from the OP, the only valuable thing is the collage image
>>
File: 1752905501519626.png (1.68 MB, 1472x1024)
my wife
>>
>>107794379
I have
>Chroma1-HD-fp8mixed-final
downloaded, but I forgot to try it. Have to give it a go later
>>
>>107794411
mutated feet yummy
>>
>>107794379
he moved on to z image it seems
>>
File: 1764976400381705.png (2.05 MB, 1472x1024)
>>107794420
mutated feet? here you go!
>>
File: 1755030985955151.gif (10 KB, 127x128)
>>107794432
>>
With temporal upscale
https://files.catbox.moe/4xmhy8.mp4

Spatial upscale only
https://files.catbox.moe/sapzgh.mp4

No upscale at all.
https://files.catbox.moe/40s59z.mp4

Who wins?
>>
>>107794444
no upscale of course is gonna be better but it's also 4x as expensive. The point of it is to make it faster
>>
File: 1739067694169691.png (2.12 MB, 1024x1472)
more poses for this thot
>>
>>107794453
I don't think the time savings are there. They were all pretty comparable. But the temporal upscale resulted in fucky movement, a huge uptick in vram use and weird pacing that wasn't fixable even by doubling the framerate.
>>
>>107794476
hmm, maybe it's still broken in how comfy is handling it then, cause that is the entire point of downscaling 0.5 and using the upscaler for the 2nd stage
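
Back-of-envelope for the 4x figure, assuming cost scales roughly with pixels per frame (a simplification, attention cost grows faster than that). The resolution is borrowed from a file posted in this thread and the helper name is hypothetical.

```python
def pixels_per_frame(width: int, height: int, scale: float = 1.0) -> int:
    """Pixel count of one frame after scaling both dimensions by scale."""
    return int(width * scale) * int(height * scale)


full = pixels_per_frame(704, 960)             # native resolution
half = pixels_per_frame(704, 960, scale=0.5)  # first stage at 0.5 downscale

print(full / half)  # 4.0
```

Halving both width and height leaves a quarter of the pixels for the first stage, which is where the "4x as expensive" for native gens comes from.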
>>
>>107794495
Probably. I don't think there is anything wrong with the upscaler itself. But I don't think it's being used right.
>>
File: file.png (24 KB, 533x174)
Had to reinstall comfy and now I don't have the free vram button anymore, help pls, top is how it used to be, bottom is now
>>
File: WanVideo2_2_I2V_00477.png (1.28 MB, 1248x704)
https://files.catbox.moe/5v9yju.mp4
>>
File: ComfyUI_06016.png (2.51 MB, 1280x2048)
>>107793249
I don't know... would you keep it classy?
>>
>>107794516
Reinstall comfyui manager
>>
when ready migrate
>>107794552
>>107794552
>>107794552
>>
>>107792395
what class is she
>>
>>107796010
healer
>>
>>107796026
how does the healing procedure look?
>>
>>107796036
she points her pointy staff at the recipient, chants a spell and light/energy begins to flow
>>
>>107796053
thank (you) and (her)




All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.