/g/ - Technology


Thread archived.
You cannot reply anymore.




Breaking Their Hardware Edition

Discussion of Free and Open Source Diffusion Models

Prev: >>107789820

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2485296
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe|https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
hahaha, this is pretty good. used a meme image.

the man says "you are fired you dumb nigger, get out of my office" in a Donald Trump voice.

https://files.catbox.moe/e2a82t.mp4

96 seconds, on a 4080/16gb with 64gb ram.
>>
>>107791088
what a shitty collage lmao
>>
https://files.catbox.moe/ly773b.mp4
>>
Total 1girl Victory
>>
>>107791088
>>Maintain Thread Quality
>https://rentry.org/debo
>https://rentry.org/animanon
this needs to get updated with the test anon and ranfaggot
>https://rentry.org/ranfaggot
>>
>>107791098
lmaooo, the kino is off the charts
>>
File: ComfyUI_temp_tpsca_00036_.png (2.71 MB, 1824x1248)
>>
Coombrains lost, LTX chads won
>>
>>107791104
i thought this was going to be a jumpscare, can it do that? https://www.youtube.com/watch?v=GMgsFZ4rkEI
>>
>>107791111
quads of truth. no idea why anons that don't post here are even mentioned. just makes the author look like a jaded and spiteful schizo
>>
>>107791088
this is 100% the neta yume bunny/spiderman spammer
>>
>>107791117
too late this thread was already claimed by the coomers >>107791106
>>
>>107791115
why are you spamming the same image over and over?
>>
>>107791129
no im not
>>
File: ComfyUI_temp_ipaor_00077_.png (2.08 MB, 1120x1400)
>>
>>107791138
same obsession and anti-fun behavior with models you dont like. you started this even before the model dropped too. kys
>>
sheeeeeeeeeeeeeit!

https://files.catbox.moe/8osuj1.mp4
>>
>>107791158
fuck off benchod
>>
>>107791164
funny post
>>107791152
coomer braindead incel coded post
>>
File: ComfyUI_temp_cnzpx_00055_.png (1.94 MB, 1120x1400)
>>
>>107791164
kek, that's good
>>
>>107791164
sheeeeeeeeeeit x2

https://files.catbox.moe/ye1f6v.mp4
>>
>>107791152
>>107791176
wheres the ass tho
>>
>>107791181
we wuz chemists and sheet
>>
>>107791181
gonna try this with more frames.
>>
>unfunny floyd spammer
>unfunny anti-ltx2 most generic ever 1girl spammer
we're only missing the other schizo randomly saying that comfy is shit and we won bingo.
>>
File: ComfyUI_temp_ipaor_00081_.png (1.82 MB, 1120x1400)
>>107791171
try breathing some fresh air anon, you're watching too much youtube and reading too much 4chan, don't turn your whole personality into that
>>
>>107791197
>comfy is shit
you mean the ani is shit schizo right? anyone shitting on comfyui isn't schizo
>>
>>107791199
>don't turn your whole personality into that
says the guy making his hate towards edgy memes his whole personality btw
>>
220 frames, I can tell I was close to max ram/vram cause system was slow-ish

but it works!

https://files.catbox.moe/xlp0fb.mp4
>>
>>107791205
comfy nodes 2.0 will never be good, ani did this
>>
>>107791197
you forgot about the nigger troon baker that ban evades and won't fuck off after being told to many times just like the floyd spammer
>>
>>107791197
its funny how the most uncreative are often the biggest faggots
everytime
>>
>>107791197
at least the floyd spammer isn't screaming "STOP HAVING FUN" towards others lol, the 1e-thot spammer is more obnoxious
>>
https://files.catbox.moe/84ucai.mp4
>>
File: ComfyUI_temp_cnzpx_00059_.png (2.13 MB, 1120x1400)
1girls always make chuds seethe because they can't generate them
>>
>>107791223
sometimes the world doesn't need to know you are having a good time
>>
https://files.catbox.moe/wbcavs.mp4
lol it beeped the swear
>>
File: 1767161505052182.png (207 KB, 1877x842)
>>107790378
The video works but the audio seems to just use the real audio then silence. How do I make a latent audio from 49 video frames + 200 frames empty latent?
>>
>>107791236
sometimes the world doesn't need to know you are having a bad time
>>
>>107791249
proof?
>>
>>107791011
Thank you for reminding me of this album. Worlds and Adventure. Good times.
>>
>someone actually mad at floyd
cope more nigga:

https://files.catbox.moe/lcm17t.mp4
>>
>>107791268
lmaoo
>>
>>107791205
like clockwork
>>
>Latent previews when?
>Full suite of kijai workflows when?
>LoRA training when?
>>
>>107791099
theres been much worse
>>
>>107791268
try to go for a real song kek
https://www.reddit.com/r/StableDiffusion/comments/1q627xi/kijai_made_a_ltxv2_audio_image_to_video_workflow/
>>
>>107791284
2 more weeks
>>
the LTXV2 WFs are MUCH better but my god does its clip loader take forever. It takes like 20 minutes for it to load. Better hope you have the ram to not have to offload it
>>
https://files.catbox.moe/pt91gq.mp4

Oh uh... okay.
>>
>>107791294
yeah im gonna try that stuff next. got a funny output from this though:

the black man lying on the ground takes out a bag of white powder with his hand and says "sheeeeit gotta get high", in the voice of a black rapper. rap music plays in the background.

https://files.catbox.moe/zpncbt.mp4
>>
>>107791299
I wonder if I could force it into my second 3090.
>>
>>107791299
>my god does its clip loader take forever. It takes like 20 minutes for it to load.
yeah for some reason it's slow as fuck, but I'm even more unlucky since I have this error right after (when trying to do prompt rewriting)
>RuntimeError: Expected all tensors to be on the same device, but got index is on cpu, different from other tensors on cuda:0 (when checking argument in method wrapper_CUDA__index_select)
>>
>>107791300
yeah sometimes I got that as well, like it creates a new character when ultimately there's already one waiting to do the shit, that's weird
>>
I cracked the code!!! lol
https://litter.catbox.moe/vd41ik7sqpxhw4pj.mp4
>>
>>107791319
it works for me but I think I saw a fix for that somewhere. It might have been turning off previews in comfy with --preview-method?
I might be wrong.
>>
oh btw, supposed to be using this detail lora as well apparently:
https://huggingface.co/Lightricks/LTX-2-19b-IC-LoRA-Detailer
>>
>>107791346
why do they want us to use loras, if they wanted their model to be more detailed why couldn't they just finetune it
>>
My fucking god, the quality of this general fell off a cliff after ltx-2 release
>>
>>107791357
harder to put a virus in the model itself
>>
Can someone help me understand the phenomenon of people being upset when a new model gets released and people inevitably play with it?
>>
>>107791357
same reason why z image lacks all diversity of outputs. Training in "everything is detailed" makes everything detailed and does not let you create gens where they are not. With loras its just more controllable.
>>
>>107791376
you benchod be using model that don't make goon instead of shitting on it
>>
>>107791376
it happens with every heavy model. Same with wan, then flux 2.
>>
>>107791381
>Training in "everything is detailed" makes everything detailed
if they can't make the model non biased if they decided to add one additional style they fucking suck lol
>>
>>107791376
severe mental illness. Don't pay attention.
>>
>>107791376
it's just vramlets jealous they can't play with that new big model, so they cope by pretending that it's gonna be bad anyway
>>
>>107791391
You can't. These models are literally "bias" predictors. Either you make a model that looks great every generation but can only look great like z image, or you make a model that is absurdly creative but also gives garbage half the time like chroma. You sharpen the distribution
>>
kijai workflow with audio + image input actually works, lmao. gonna try to clip the audio better then post results.
>>
File: Lan Party.png (590 KB, 650x432)
>>107791088
>>burns local(host)
>>doesn't mention pinokio

why?

pinokio.co
>>
>>107791403
>you will never experience this
breh
>>
>>107791402
if you gotta clip then its not working maroon
>>
>>107791399
i haven't seen a single ltx-2 gen that is good though. the sound is fucking atrocious as well. I know you need to justify spending all that money but this shit is worse than wan
>>
https://files.catbox.moe/dlhac7.mp4
>>
>>107791412
>shit is worse than wan
Either you are trolling or you have not been looking
>>107790899
>>107790820
>>107790043
>>
>>107791406
I did.
My feet bloated from sitting down for 3 days with like 1 hour of attempted sleep, I couldn't fit my shoes on when leaving.
I slept for 47hours, waking up once to eat dinner and fall asleep again instantly.
>>
>>107791425
those were the ones I referred to
>>
>>107791425
>Either you are trolling
obviously he's trolling
>>
to fix the long clip load times in the LTXV WF just use the LTXV audio text encoder loader instead
>>
>>107791411
im using a song. got an output where it was synced to the song
>>
>>107791428
benchod...
>>
Lately I feel the schizos have upped their game. I spent a good deal of time last night trying to verify that ggufs of gemma wouldn't work in the native loaders, and a schizo spent all of their time faking screenshots of JSON files to prove they did.
>>
>>107791425
Do localfags really pay 7k dollars to generate this trash?
>>
>>107791445
>7k dollars
a 3090 costs 7k dollars in your third world shithole? damn
>>
>>107791444
they werent fake you are the schizo
>>
>>107791445
>7k dollars
works on my 4090 and its way faster than 4090 and its far cheaper than paying for sora 2 / veo 3 would be. and wan2.2 does not even compare
>>
why are all of you so excited about an sd 1.5 level video model
>>
so I think I'm absolutely sure the only ones left here are the worst /ldg/ posters. where did all the good ones go?
>>
>>107791455
No you don't understand. One of the outputs was just a slideshow panning image.
>>
>>107791445
at least we can generate por-*checks LTX-2* uhmm yeah
>>
>>107791455
>its way faster than 4090
woops, way faster than wan I mean
>>
>>107791455
>my 4090 is way faster than a 4090
oh you are one of those chinese hackers right?
>>
>>107791461
I'm right here benchod
>>
cope and seethe floyd lover: example with audio file + image, kijai workflow.

prompt: the man is singing with passion.

workflow: https://www.reddit.com/r/StableDiffusion/comments/1q627xi/kijai_made_a_ltxv2_audio_image_to_video_workflow/

https://files.catbox.moe/jnqa1w.mp4
>>
>>107791461
da 'cord
>>
>>107791444
that was (also) me lmao, city still hasnt implemented goofs for gemma XD
>>
on top of that veo / sora 2 are censored to shit. This can do copyrighted stuff and eventually nsfw when loras come out. The non-distilled dev one actually does decent nudity out of the box
>>
>>107791477
>having to rely on reddit
its over
>>
>>107791461
the actual devs don't come here anymore. ani left forever ago and comfy's last words were I think you guys are retarded
>>
>>107791481
comfydev already implemented it himself, just plug the gguf into the loader
>>
>>107791477
do more you arsehole, i need more
>>
>>107791492
cumfart never does ggufs right
>>
I put the starting image strength too low but here, for the 1girl gooner
https://files.catbox.moe/jb0em0.mp4
>>
File: file.jpg (744 KB, 3520x3520)
>>107791477
how is this better than wan?
>>
>>107791480
link? I'm done with this shit especially the catjak daggotry that's been going on for over a year
>>
https://files.catbox.moe/kk5zo2.mp4
Not bad for the default workflow and 840x480, is there a way to run 720p without ooming? I have a 4090 and 64gb of ram
>>
>>107791497
in due time anon, I need to clip some miku songs and other stuff in audacity
>>
File: file.png (273 KB, 637x463)
>>107791503
amazing quality
>>
>>107791503
One thing I've noticed about ltx is how well it does believable wobbly flesh. All the wan LoRAs that address that turn the women into elastic putty.
>>
>>107791510
omg so much better than wan! insane quality!!! nobody can compete!!!!
>>
File: file.png (9 KB, 522x74)
>>
>>107791507
There are people not genning at 720? Did you use --reserve-vram?
>>
>>107791520
>rips and tears your nvme automatically
>>
>>107791516
That is the big advantage of 25 fps, wan's 16 fps sucks for realistic stuff
>>
>>107791516
oh it makes more than flesh wobbly alright
>>
>>107791507
i2v seems incredibly hungrier than t2v for me, it craps out at barely 5 seconds of 480p and that's with the upscaling lora
>>
>>107791461
im training rn srry
>>
Is wan 2.2 just still the best coom video generator? Just feels like half the thread is outdated
>>
>>107791536
2.5 is way better
>>
I am tempted to try this because I have wanted to make ai generated fart porn for the longest time, and now we have a model that can add sound clips. I have tons of fart sound clips from real women. What does it sound like when you try to gen a person farting with a fart sample? Can someone test it out with one of those generic fart sound effects?

Plz, just once.
>>
for 1girl anon
https://files.catbox.moe/gxk6f2.mp4
>>
>>107791523
I did not. I opened the workflow, hit run, got an oom, dropped the res, posted and hoped I'd get spoonfed while I was searching for the solution

>>107791523
>>107791535
Weirdly enough I tried setting the res back to 720 and now it's running fine, ~4s/it
https://files.catbox.moe/mhw7xg.mp4
>>
>>107791535
nice try. no good poster would still be posting in this trainwreck thread
>>
File: 1757601531829906.mp4 (637 KB, 640x640)
https://litter.catbox.moe/5qrgd03xes9sfkkb.mp4

it's fast as fuck, cant wait for more optimizations
>>
Can you guess which model is which?
https://files.catbox.moe/1fe2ho.mp4
https://files.catbox.moe/dcffn3.mp4
>>
>>107791571
ltx because it looks and sounds like shit
>>
https://files.catbox.moe/mfyi2m.mp4

Hmm
>>
>>107791571
gee anon i wonder if the one where the girl becomes a blur while moving is ltx
>>
>>107791587
>slowly becomes uglier
>>
this shit is fucking amazing
https://files.catbox.moe/blvkhg.mp4
https://files.catbox.moe/pkwnql.mp4
https://files.catbox.moe/alxtxw.mp4
https://files.catbox.moe/n8uj9n.mp4
https://files.catbox.moe/sn0234.mp4
https://files.catbox.moe/x12icb.mp4
https://files.catbox.moe/v4yfo2.mp4
>>
>>107791583
forgot to say top is the shit one (ltx)

>>107791605
stop eating shit it isn't good for you
>>
I really hope the slopspam ends by tomorrow morning
>>
LMAO, the singing prompt still worked. but, different audio. you'll see.

the man is singing with passion.

https://files.catbox.moe/cdvjd3.mp4
>>
File: file.jpg (494 KB, 5000x2435)
>>107791605
watched all, its garbage lmao
>>
>>107791605
wow no h*ck*n w*y this is ai!!!!!
>>
>>107791611
>stop having fun without me
reeeee!!!
https://files.catbox.moe/cfqh9j.mp4
https://files.catbox.moe/tmjtzr.mp4
https://files.catbox.moe/kqwagj.mp4
https://files.catbox.moe/rwrjnq.mp4
https://files.catbox.moe/i7fkiu.mp4
>>
>>107791620
got the full quote. 200 frames, 54 seconds.

for you.

https://files.catbox.moe/7ntqiw.mp4
>>
The highs of Z Turbo release have finally been reconciled by the lows of LTX2
The universe is in balance once again
>>
File: b.webm (3.92 MB, 632x456)
>>107791510
>>
https://www.nvidia.com/en-us/geforce/news/rtx-ai-video-generation-guide/
Comfy won...
>>
>>107791652
this made me realize
>>
Have 2 4090s not touching this shit until hacking isn't needed
>>
>>107791671
>one argument when launching
>hacking
>>
>>107791677
you forgot the file editing
>>
>>107791679
That's already been fixed.
>>
>>107791652
always believe the opposite of what njewdia says
>>
>>107791687
i dont see any updates for comfy
>>
I'm gonna spend far too long remaking memes with this
https://files.catbox.moe/yl77ty.mp4

>>107791690
you have to git pull
>>
>>107791702
>just get unstable updates and break your shit
yeah nah
>>
Waiting to see a video good enough to convince me I need to try it
>>
File: file.png (656 KB, 1194x1092)
do diffusion models talk too?
>>
kek, it works with ANY audio.

the black man wearing a black sweater is singing with passion.

https://files.catbox.moe/namgmn.mp4
>>
>>107791712
its broken how you have it. He has several fixes merged
>>
>>107791702
i'd just save it for a better model. the other thing I am noticing is the fried look. this one just looks uncanny after a few seconds too. the subtle morphing on gene's face is just disturbing and the audio is still terrible. memes are supposed to be funny, not endured
>>
>>107791461
mr Number One Ani Defender spammed the thread until everybody left so he can gaslight some newfags
>>
>>107791716
Look for Sora 2 videos on youtube.
>>
>>107791727
prob would look better if I used more steps but I think its good enough for less than a min to gen each. And I could also just do a 2nd pass later
>>
>>107791605
pretty cool
>>
>lora super slopped
>even the 1k steps version is giga fried slop
fuckk
>>
>>107791734
>hey I want some demo of how good your shitbox is
>ah just look for ferrari videos saar
>>
>>107791742
Well you wanted something "good enough". Obviously local vidgen is not on that level, it's a shit-tier toy.
>>
>>107791719
lmaoo this is good
>>
>>107791746
wan is better
>>
>>107791088
The bottom girls face looks very familiar, like some tiktok thot
>>
>>107791746
>you
That's a different anon. I didn't reply.
>>
>>107791760
Prove you aren't him
>>
>>107791754
show me gens even close to them please. I have yet to see anyone back that up and dont show me something that requires some lora
>>
*yawn*
>>
>>107791765
Would if I could
>>
>>107791731
it's believable people would stick up for ani. nobody defends niggerjak kek
>>
>>107791767
see >>107791647 vs >>107791510
or >>107791571
>>
>>107791782
except you admitted you made a thousand posts defending ani >>107536391
>>
>>107791716
signs point to not worth it
>>
>>107791789
i dont even know who ani is
>>
>>107791783
Ive seen better made by both.
>>
>>107791795
moving the goalpost saar?
>>
>>107791789
it's funny you still just attack ani but can't even excuse your own shit behavior. nobody wants to defend you, not even yourself
>>
https://files.catbox.moe/t1pnqg.mp4
do you think it can happen?
>>
>>107791790
it is faster than wan AND has audio. it is absolutely worth trying.
>>
>>107791798
you showed one bad gen as proof compared to the probably 100 shown that did not have those blurry issues
>>
>>107791803
you made them barely move so they don't blur didn't you? kek
>>
>>107791792
>>107791800
*yawn*
>>
>>107791808
The audio is awful and the videogens are inferior to WAN. Don't waste your time.
>>
>>107791811
prove me wrong you can't
>>
I will prove to you right now ltx2 is worth it. with the kijai audio + i2v workflow he posted:

prompt: the man is singing with passion.

https://files.catbox.moe/uyhrgg.mp4
>>
>>107791808
great, you can make shittier quality videos with the worst audio of any model that can do so. congrats
>>
>>107791808
>>107791821
>Sure wan can make good videos, but lpootx can generate shitty video AND shitty audio at the same time
>>
>>107791824
wake me up when its the full song
>>
>>107791822
>n-no you!
You can't even run any of these models. Got it.
>>
>>107791837
well it'd have to be in 10-15 second segments then stitched...
>>
>>107791842
i made all the videos i showed you, in fact most ltx gens in this thread are mine
>>
>>107791824
he's the best poster on /ldg/ Idc what anyone says lmao
>>
>>107791847
did I stutter?
>>
prob wont happen but:
https://files.catbox.moe/jeyo2b.mp4
>>
File: 1763718017999330.mp4 (1.11 MB, 640x640)
https://litter.catbox.moe/xd2j7kjau9222dyu.mp4
lol
>>
>>107791824
>it's shit but it can sing
Who the fuck cares. Try better faggot
>>
File: 1747422240381463.png (1.49 MB, 1280x720)
>>107791857
-> >>107791803
>>
File: troll.png (6 KB, 498x179)
>>107791848
the fuck is this retard level trolling?
>>
>>107791803
>>107791857
nice repost
>>
>>107791869
he is successful at that since you can't stop taking the bait lol
>>
>>107791869
1. that's not me
2. i didn't link those posts
3. you are a retard
>>
>>107791769
*farts in ur mouth while you suck in the gas during your yawn*
>>
File: 2025-10-01_18-35.png (1.12 MB, 1108x864)
>>107791115
everyone is looking at her tits and/or hatsune miku and I am wondering how that text outline was made
>>
>>107791869
gens are actually mine, so fuck off
>>
this kijai workflow might be even more optimized, could do 280 frames without vram issue or ram, 78s.

my pc has 16 vram (4080) and 64 RAM.

lmao it did a transition, not bad:

https://files.catbox.moe/01bfvp.mp4
>>
>>107791861
jesus. can 4070 ti super 16gb handle this?
>>
>>107791890
George shakes his head as if he was hinting he wouldn't be the one singing it kek
>>
>>107791893
im on a 4080 so definitely yes
>>
>>107791890
where is that workflow?
>>
>>107791902
I forgot to set the audio duration above 10, thats why it ends abruptly. still, it's amazing how we have a local audio + video model now that works.
>>
Can someone please just do a regular photo 1girl walking toward the camera in a sexy manner so I can see whether this new model is good or not?

I don't have enough VRAM to try it out myself, but the absence of any kind of guidance-like parameter is a bad sign
>>
>>107791890
>shakes head declining and forces a transition
Floyd is too powerful
>>
>>107791915
https://files.catbox.moe/f9fvjr.json
>>
>>107791929
thanks benchod
>>
>>107791923
>in a sexy manner
impossible mode kek
>>
321 frames, 140 seconds. but I got a full loop of the song!

https://files.catbox.moe/q8k4fm.mp4
>>
>>107791923
>girl
>sexy
>ltx
next you will want us to mix oil and water
>>
File: file.png (285 KB, 486x604)
>>107791946
freaky
>>
>>107791957
>when the fent hits
>>
https://files.catbox.moe/zny460.mp4
>>
I have a theory. the Jews want you to have a taste of vid+audio gen so you subscribe to oai for sora because any idiot can see this model sucks ass in every regard except speed
>>
>>107791978
is that what you're gonna do?
>>
>>107791985
not going to bother with either of those. they are both cucked and nobody is going to train this dogshit for nsfw
>>
also if you use the audio + image workflow, if you use a second or two of the actual person talking, it will basically clone their voice for the rest of the prompt.
>>
>>107791994
yea, this shit is insanely good at voice cloning. I said it before but it will 100% get people arrested
>>
>>107791989
that puts the kibosh on your theory
>>
>>107792004
people are for sure gonna do porn deepfakes with it to a level never before seen. New legislation is gonna be forced out due to it
>>
>>107791989
Even if it can’t do nsfw, if it can simply add audio to existing ai videos it is still useful.
>>
>>107792004
is there a way to use the source audio to clone them without using 1-2 seconds of the audio in the original gen? like, if you want a new statement without any other words.
>>
KINO! even works for miku/anime.

the girl with teal hair is singing with passion.

https://files.catbox.moe/vfwmdy.mp4
>>
https://files.catbox.moe/nmlvyq.mp4
>>
>>107792033
kek, good one jerry
>>
>>107792025
clip it out? You can prompt entire scene changes and such https://ltx.io/model/model-blog/prompting-guide-for-ltx-2
>>
>>107792020
then why package it together in a fuckhueg model file? seriously, multimodal is a dogshit practice. people make hacks to use different models together all the time, why the fuck does everything need to do everything if it can only do one thing well?
>>
>>107792037
you sure it wasn't kramer?
>>
>>107792043
multimodal is how it actually understands how expressions / faces / body language work. If it wasn't trained alongside video it would be useless at the task
>>
can someone just post something good?
>>
>>107792004
But what is the workflow to continue a voice clone? That workflow needs full audio.
>>
>>107792054
that is literally just the video diffusion model and its encoder. that has nothing to do with the audio. the audio probably uses the same tokenization but you can ship it separately like the diffusion model and encoder already are
>>
>>>/wsg/6067150
>>>/wsg/6067150
>>>/wsg/6067150
Migrate.
>>
I wanted to be more than just a coomer, so I started learning about the topic.
Do others do that too?
If so, how's it going anon?
>>
>>107792032
Is this what counts as "works" here nowadays?
>>
Anon, tested Kandinsky 5 Pro T2V and NSFW is fucking real

https://litter.catbox.moe/3nfp26m9b68gxx9g.mp4

{
"prompt": "A POV shot from the perspective of someone lying on a bed; a beautiful woman creates a silhouette against a bright window covered by sheer white curtains; she is straddling the viewer's phantom lap, bouncing gently, her head thrown back in ecstasy, heavy breasts swaying with the motion; natural rim lighting outlines her curves, highlighting the sweat on her neck and the explicit motion of her hips; captured with a wide-angle lens to enhance the feeling of immersion, high dynamic range preserving the details of her body against the backlight.",
"negative_prompt": "Static, 2D cartoon, cartoon, 2d animation, paintings, images, worst quality, low quality, ugly, deformed, walking backwards",
"width": 1280,
"height": 768,
"video_duration": 10,
"seed": 3071
}
>>
Why is this dude so mad at ltx?
>>
>>107792083
anything with migu in it werks for him
>>
>>107792071
No.
>>
literally me:

https://files.catbox.moe/n1qirw.mp4
>>
>>107792068
they were clearly all trained together or else it would have no idea how to line up the audio so well

https://files.catbox.moe/m3tt74.mp4
https://files.catbox.moe/wunip1.mp4
https://files.catbox.moe/eea5wn.mp4
https://files.catbox.moe/k29y60.mp4
https://files.catbox.moe/b1l6jc.mp4
>>
>>107792088
I can't be mad at a diffusion model. if I don't like it I don't use it. I am mad at the apparently deaf, dumb and blind anons making literal garbage and praising it like it's the second coming of christ.
>>
>>107792094
oof it slopped the image hard on that one
>>
>>107792091
But... it has audio...
>>
>>107792087
you stole my post. Kandinsky is just too slow to use to be worth. That is why im excited about ltxv2. It has the fidelity of it but is fast as fuck. In a few months we will have nsfw loras
>>
>>107792110
This. Not really mad, just disappointed.
>>
>>107792113
I'LL BITE YOUR FACE, CUNT
>>
>>107792110
I agree that the guy spamming floyd/erika whatever is obnoxious as fuck, but we're having fun with the new model, so dont be such a faggot.
>>
>>107792110
Your survivorship bias is blocking out the sheer number of absolutely ass Wan 2.2 gens.
>>
https://files.catbox.moe/vgne7f.mp4
>>
>>107792099
https://arxiv.org/abs/2601.03233
literally the second page of the paper. you don't need them bundled together you stupid faggot. they just share the same encoder
>>
>>107792110
>>107792118
>>
>>107792136
you have no clue what I'm even talking about on how its all trained as one together
>>
>>107792087
Is there a good comfyui workflow?
>>
>>107792114
I knew it was you anon !!
>>
>>107792126
WAN is still not great compared to actual sota vidgen, but its passable and there are good gens out there. Every single ltx gen looks fucking awful and the audio doesn't sell it at all. It's not even close to veo or sora.
>>
actually kino: got cinematics even

https://files.catbox.moe/iuxmam.mp4
>>
>>107792142
can someone qwen edit this but the people are just shitting all over the room and the guy shouting at them is playing the video game?
>>
>>107792153
lmao, nice
>>
>>107792153
no way that's ai
>>
do we really have a chinese shill trying to fud ltxv or something? its for sure better than wan. And it being so hopefully means they also are forced to drop 2.6
>>
What are the odds someone will make a workflow/node to let us use a proper audio model with ltx2 in post?
>>
File: 1758341632131438.jpg (191 KB, 1920x1080)
>>107792166
source image is this, plus the song.
>>
>>107792153
>open embed
>cia meme the test spammer already ruined months ago because he couldn't help himself beating the same memes to death
>close embed
>>
File: file.png (338 KB, 564x536)
>>107792032
>just photoshop some random mouth
>>
>>107792167
Literally one person who's replying and agreeing with himself, so yeah. There's no way you can try LTX and be "meh". The potential is huge.
>>
>>107792146
T2V kandinsky WF from comfyui BUT replace the model with the PRO one

https://huggingface.co/collections/kandinskylab/kandinsky-50-video-pro
>>
>>107792186
Thanks. I don't know if I can run it but want to try.
>>
Can LTX do this? Yeah it can't, you lost. >>107792087
>>
>>107792167
Why would this force them to open source anything when comfy collaborates with them to sell the api instead?
>>
>>107792189
H100 took like 40 mins for >>107792087
>>
>>107792151
>Every single ltx gen looks fucking awful and the audio doesn't sell it at all.
Don't be a retard. There are some phenomenal ones out there.
>>
>>107792177
wow
>>
>>107792195
no it didn't
>>
>>107792198
post one? and no floyd singing isn't it
>>
>>107792198
Can you post 1-3 examples?
>>
>>107792204
>inb4 a list of dogshit
>>
>jump over a bar that I'll keep raising
nah, go be a faggot on your own
>>
>>107792202
pro at full res / steps at full fp16? yes it did
>>
>>107792204
>>107792211
>and no floyd singing isn't it
Alright I concede
>>
>>107792212
more like we kept lowering, fuck off
>>
>>107792204
>>107792210
No because no matter what I post you will just say it's not good enough. Why would I even go out of my way to prove it?
>>
>>107792195
>H100
Fucking kek. Oh well
>>
>>107792221
Just admit defeat
>>
>we
>>
I'm worried models are regressing after this. what a shitty model
>>
>>107792221
So you're just trolling, got it.
>>
Kandinsky pro using Comfyui template is about 10 min on a 5070ti for 241 frames.

Don't use github Kandinsky nodes !!! that's fucking slow and quality is the exact same
>>
>>107792223
could work with less with fp8 and offloading, I was just giving you an idea of how slow it is
>>
here is ltxv2 nsfw, will for sure need loras https://files.catbox.moe/n1ttzt.mp4
>>
this is why RAM is $2000 btw:

https://files.catbox.moe/c2apb3.mp4
>>
>>107792235
we only use ltx around here
>>
>>107792192
am downloading SD 2.1, anon all lost
>>
>>107791088
All built 4 bbctech
>>
>>107792254
do it again with a better starting image wtf are those nipples
>>
>>107792263
benchod stop the mockery
>>
>>107792268
that is T2V, like I said will need a lora
>>
>>107792114
>It has the fidelity of it but is fast as fuck
what the fuck? are you actually implying that ltx is anywhere close to kandinsky?
>>
>>107792277
oh that's crazy it actually genned nudes
>>
>>107792282
Stop being a bigot, he is blind can't you see
>>
>>107792282
at the same settings? fp16 with full 50 steps? Yes. And kandinsky needs 40+ mins for that, ltx will be like 10 mins. I posted one at that fidelity before

https://files.catbox.moe/55tof3.mp4
>>
>>107792294
you're right
>>107792295
i'm sorry anon. this looks great
>>
>>107792295
this is just like real life wow
>>
>>107792305
>>107792305
>>107792305
>>
and then of course the whole audio thing
https://files.catbox.moe/vrzlt0.mp4
>>
new bread
>>107792310
>>107792310
>>107792310
>>
>>107792312
based, thanks anon
>>
>>107792309
thx for baking anon
>>
>>107792312
duplicate thread, remove please
>>
>>107792309
Thank you.
>>
>>107792305
>>107792305
>>107792305
earlier bake


