/g/ - Technology


Thread archived.
You cannot reply anymore.




Turbo Whores Edition

Discussion of Free and Open Source Diffusion Models

Prev: >>107787932

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2485296
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe|https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
>>107789820
>>Maintain Thread Quality
>https://rentry.org/debo
>https://rentry.org/animanon
who let the monkey out of the cage? she baked the off topic garbage in the OP again
>>
LTX2 question : can you use a bigger gemma 3 model (27B for example) or it really doesn't matter or change anything?
>>
Blessed thread of frenship
>>
>>107788437
very nice, what did you prompt for the retro style? so tired of instagram face
>>
>>107789833
Why would you want an even bigger model?
>>
>>107789820
thanks for the bread anon
>>107789518
>RuntimeError: Expected all tensors to be on the same device, but got index is on cpu, different from other tensors on cuda:0 (when checking argument in method wrapper_CUDA__index_select)
sigh... I guess it won't work on LTX Studio's workflow if you use --reserve-vram right?
>>
>>
>>107789833
no you can't change the text encoder, it's like asking a human to understand a completely new language on the first try
>>
>>107789839
Bigger = better accuracy, I am very dumb

>>107789857
Makes sense, thanks anon
>>
>>107789842
its working for me with reserve vram set at 4 and using a 4090 + 128GB ram. Make sure you have latest everything and disable previews in settings
>>
>>107789842
even without the --reserve-vram flag I have this error, how do you even run this workflow
>>
>>107789879
>128GB ram
how much of that does it use?
>>
use anons workflow it's 10x better than the shit comfy template: https://files.catbox.moe/lclc9t.json

the anime girl runs towards the camera and says "miku miku miku" in a cute anime style voice.

sticky anon's workflow cause the comfy template one sucks dick and this works, also this model is pretty fast, 16GB 4080 64GB ram and it works just fine.

proof it works: https://files.catbox.moe/u8r61w.mp4
>>
>>107789837
why do you say this shit when you troll and dox people constantly? when are mods going to permanently get rid of you?
>>
>>107789879
what files did you download anon? I went for this for the config files
https://huggingface.co/unsloth/gemma-3-12b-it/tree/main
and this for the model
https://huggingface.co/Comfy-Org/ltx-2/blob/main/split_files/text_encoders/gemma_3_12B_it.safetensors
>>
>>107789890
>>107787923
>>
>>107789885
like 70GB but I also watch youtube at the same time
>>
The schizos in this thread would be rejected as batman villains for being too whacky and over the top.
>>
>>107789900
that's ok, for the fp8 I guess?
>>
>>107789888
>it doesn't have her voice
sad :(
>>
>>107789899
so you're off your meds? you should probably log out then and get a lot of rest. get well soon
>>
https://files.catbox.moe/v97wl9.mp4
>>
>>107789911
If this model has any success the second thing people will do is to create an easy way to add the correct voice to any workflow.
First being bouncing boobs.
>>
how are you supposed to chain the distil lora, and what exactly is it doing

I wish these retards could write basic documentation
>>
>>107789916
holy kino...
>>
can we get a rentry for comfyanon up? I think he's the most schizo of all
>>
>>107789908
yes, including using ltx2's nodes to use the FP8 encoder properly
>>
>>107789921
>the distil lora
if you want to use the model on distilled mode, don't go for the lora, download directly the distilled model
>what exactly is it doing
makes shit faster, obviously the quality and prompt adherence is a bit worse but it's totally worth the time save
>>
>>107789937
so its prob closer to 80-85GB with full size gemma. For FP16 model as well its gonna be like 150GB+
>>
>>107789937
OK
I'll add 64GB of ram this weekend, might as well try it then
>>
anyone compared the ltx2 repo fp8 vs nvfp4 quality?
>>
>>107789925
this
>>
>C drive goes from like 100gb to 30 during this process
keep in mind it eats a lot of temporary memory

in any case. use this workflow: https://files.catbox.moe/lclc9t.json

KNEEL to ltx2: 129 secs on a 4080 (16gb) and 64gb ram.

https://files.catbox.moe/xstmpb.mp4
>>
>>107789951
if nvfp4 was good I think it would've been known at that point
>>
>>107789958
>https://files.catbox.moe/xstmpb.mp4
based
>>
>>107789958
also this is fp8 distilled model, works great.
>>
>>107789958
>>107789971
are you using comfy's workflow or ltx's workflow?
>>
I don't know why people are breaking their hardware just to use this garbage. none of the videos are particularly good
>>
its insane that we can now do 20 sec gens in less time I used to do wan 2.2 5 secs
>>
the police officer in the blue shirt stands up and starts dancing to jazz music.

96 seconds, lmao. need to tweak things but, it does in fact work.

https://files.catbox.moe/iknez6.mp4
>>
>>107789980
it's better than wan 2.2, faster, and can do sound, it's currently the best local video model, obviously people are gonna try it
>>
>>107789980
its really fucking good actually. Its just half broken in comfyui currently and people need to use Res_2s instead of euler AND they need to write proper prompts >>107789518

https://ltx.io/model/model-blog/prompting-guide-for-ltx-2
>>
>>107789979
the comfy workflow is a piece of shit, the anon workflow is working and has no errors, ill use this for now.

https://files.catbox.moe/lclc9t.json
>>
>>107789958
can this work with 32gb ram? how big should my page file be?
>>
>>107789999
>AND they need to write proper prompts
hopefully someone can get the prompt enhancement node working with the uncensored abliterated gemma, it's night and day better with big prompts like:
EXT. SMALL TOWN STREET – MORNING – LIVE NEWS BROADCAST
The shot opens on a news reporter standing in front of a row of cordoned-off cars, yellow caution tape fluttering behind him. The light is warm, early sun reflecting off the camera lens. The faint hum of chatter and distant drilling fills the air.
The reporter, composed but visibly excited, looks directly into the camera, microphone in hand.
Reporter (live):
“Thank you, Sylvia. And yes — this is a sentence I never thought I’d say on live television — but this morning, here in the quiet town of New Castle, Vermont… black gold has been found!”
He gestures slightly toward the field behind him.
Reporter (grinning):
“If my cameraman can pan over, you’ll see what all the excitement’s about.”
The camera pans right, slowly revealing a construction site surrounded by workers in hard hats. A beat of silence — then, with a sudden roar, a geyser of oil erupts from the ground, blasting upward in a violent plume.
Workers cheer and scramble, the black stream glistening in the morning light. The camera shakes slightly, trying to stay focused through the chaos.
Reporter (off-screen, shouting over the noise):
“There it is, folks — the moment New Castle will never forget!”
The camera catches the sunlight gleaming off the oil mist before pulling back, revealing the entire scene — the small-town skyline silhouetted against the wild fountain of oil.
>>
File: 1744970358406613.png (7 KB, 366x188)
>>107790002
>how big should my page file be?
well on windows it grows automatically, just be sure you have a lot of free disk space
>>
>>107790002
that is going to be PAINFULLY slow constantly swapping from disk, like going from sub 2 mins to 30+ mins or something. Even 64GB is not enough
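
Rough arithmetic behind the page file question, as a sketch only. It assumes the "like 70GB" RAM figure reported earlier in the thread for the fp8 setup (with a browser open) is the peak, and it ignores OS overhead. The helper name is made up for illustration.

```python
def pagefile_shortfall_gb(peak_usage_gb, physical_ram_gb):
    """GB that would have to spill to the page file (0 if everything fits in RAM)."""
    return max(0.0, peak_usage_gb - physical_ram_gb)

# Assumption: ~70 GB peak, taken from the report earlier in the thread.
PEAK_GB = 70

for ram in (32, 64, 128):
    spill = pagefile_shortfall_gb(PEAK_GB, ram)
    print(f"{ram:3d} GB RAM -> ~{spill:.0f} GB spilling to disk")
```

Under that assumption a 32GB machine pushes more than half the working set to disk, which fits the "PAINFULLY slow" warning. A 64GB machine only spills a few GB.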
>>
>>107789997
>it's better than wan 2.2
the video quality is worse

>it's faster
I agree

>can do sound
the audio quality is absolute garbage
>>
>>107790019
>video quality is worse
nta but that's debatable and often up to the subject matter.
>>
the police officer in the blue shirt stands up and starts dancing to jazz music.

KINO, actual jazz music this time.

https://files.catbox.moe/8hgfs5.mp4
>>
>>107790019
>the video quality is worse
delusional
https://files.catbox.moe/mpd5u9.mp4
https://files.catbox.moe/lvnhqk.mp4
https://files.catbox.moe/htkh8y.mp4
https://files.catbox.moe/55tof3.mp4
https://files.catbox.moe/jgj4md.mp4
>>
>>107790017
but anon with 64gb said it took him 129 seconds..
>>
>>107790019
>the video quality is worse
no its not.
https://files.catbox.moe/mpd5u9.mp4
https://files.catbox.moe/lvnhqk.mp4
https://files.catbox.moe/htkh8y.mp4
https://files.catbox.moe/55tof3.mp4
https://files.catbox.moe/jgj4md.mp4

for audio supply your own or prompt it, non prompted audio is terrible
https://files.catbox.moe/eea5wn.mp4
https://files.catbox.moe/wunip1.mp4
https://files.catbox.moe/m3tt74.mp4
https://files.catbox.moe/k29y60.mp4
>>
>>107790037
stealer of posts
>>
>>
also they said they would have a fix for generated audio out soon, that it didn't make release
>>
Why are people comparing this to sora when it's much much closer visually and behaviorally to veo3?
>>
>>107790065
oh so it's broken?
>>
>>107790037
>>107790043
all of these are shit
>>
68 seconds for 120 frames, this shit is fast AND has sound.
>>
>>107790070
I haven't seen anyone compare it to sora on this thread, and yeah it's more of a veo 3 copycat, a pretty good copycat if you ask me
>>107790065
source?
>>
>>107790079
compared to what? sora 2? It blows away everything else including some cloud models, it's far better than wan, seedance and hailuo
>>
File: wutthismeans.jpg (63 KB, 2426x657)
>>107789958
I'm trying your workflow but get this error, what this shit means?
>>
>>107790017
use this workflow: >>107790000

add --reserve-vram 4 to launch options
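
A minimal sketch of what "add --reserve-vram 4 to launch options" amounts to, assuming ComfyUI is started from its checkout with `python main.py`. The helper name and its default are illustrative and not part of ComfyUI. Portable or launcher installs put the flag in their own startup script instead.

```python
import sys

def comfy_launch_command(reserve_vram_gb=4.0, python=sys.executable):
    """Build the argv list for starting ComfyUI with some VRAM held back."""
    if reserve_vram_gb < 0:
        raise ValueError("reserve_vram_gb must be non-negative")
    # --reserve-vram takes an amount in GB that ComfyUI leaves free
    # for the OS and other software.
    return [python, "main.py", "--reserve-vram", f"{reserve_vram_gb:g}"]

if __name__ == "__main__":
    # Print the command; pass the list to subprocess.run() to actually launch.
    print(" ".join(comfy_launch_command(4)))
```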
>>
>>107790094
I have the same thing :( >>107789842
>>
>>107790088
I'd have to find the screen cap, it was posted several threads ago, from their discord
>>
>>107790034
it's the mouth. the blurring around the mouth is really noticeable. other than that the motion quality is also kind of bad. it can't do anime and it can't do nsfw.
>>
is it as cool as wan was with single image t2i
>>
>>107790091
>compared to what?
kandinsky 3
>>
>>107790108
it can do anime but not with euler, nsfw will be trainable same as wan's nightmare dicks were, the blurring is an issue half fixed by more steps and using Res_2s and I think can be fully fixed by lowering the max shift values but I need time to play with it
>>
>>107790122
>kandinsky 3
I have seen a lot of posts saying that it can do nsfw, but curiously no one provided a single video output from this model
>>
>inb4 having to refine using wan 2.2
>>
>>107790133
it can and I did before, it can do full on porn, the issue is that kandinsky takes fucking a HOUR on a H100 per gen, its simply too slow to use. LTXV's speed is simply too good to pass up
>>
>>107790094
did you update comfy?
>>
>>107790145
>it can do full on porn
do you have a single of those videos to showcase?
>>
>>107790138
the other way actually, I was giving some wan gens audio with higher res
>>
can you train sound loras for ltx?
>>
>>107790131
>it can do anime
where is the example?
>>
68 seconds, the speed is crazy good desu, depends on your resolution but still.

https://files.catbox.moe/in8t0d.mp4
>>
>>107790145
>it can do full on porn,
I've seen examples, and unless I missed something, they look like shit.
>>
>>107790156
go to the nsfw section on the bandicoo discord
{
"prompt": "A POV shot from the perspective of someone lying on a bed; a beautiful woman creates a silhouette against a bright window covered by sheer white curtains; she is straddling the viewer's phantom lap, bouncing gently, her head thrown back in ecstasy, heavy breasts swaying with the motion; natural rim lighting outlines her curves, highlighting the sweat on her neck and the explicit motion of her hips; captured with a wide-angle lens to enhance the feeling of immersion, high dynamic range preserving the details of her body against the backlight.",
"negative_prompt": "Static, 2D cartoon, cartoon, 2d animation, paintings, images, worst quality, low quality, ugly, deformed, walking backwards",
"width": 1280,
"height": 768,
"video_duration": 10,
"seed": 3071
}

This is T2V:
https://files.catbox.moe/3icuv9.mp4
again, its just too fucking slow to be useful
>>
Should I bother testing LTX2? I didn't see anything very motivating so far.
I could probably run it in full precision, maybe that would be better.
>>
>>107790186
holy shit... that's actually impressive, looks like only Russia has the balls to train their models with porn
>>
more kandinsky 20B nsfw
https://files.catbox.moe/6pdai4.webp
https://files.catbox.moe/ixbint.mp4
https://files.catbox.moe/s6rqoi.webp
>>107790186
>>
Anybody here raw dogging with sdcpp ?
>>
lmao

bumped length to 200 frames. it fit the additional dialogue.

135 seconds for this one.

https://files.catbox.moe/1brdht.mp4
>>
>>107790213
using a 4080 (16gb), 64gb ram, --reserve-vram 4 in launch options, and anon workflow.

it's fast and works fine. no 5090 necessary.
>>
File: 00127-1089922999.png (1.23 MB, 1112x728)
>>
>>107790197
too bad they are probably too broke to try again with a better architecture. Though also Kandinsky main weakness is that its prompt following was shit as well and being ungodly slow. They clearly captioned very badly
>>
>>107789518
>The LTXV WFs are WAY better btw
are you using the prompt enhancer that's on that WF? it gives me an error when I'm on that part
>>
lmao, need to tweak some settings but this is amazing, 79 seconds WITH AUDIO

https://files.catbox.moe/dgtftd.mp4
>>
why do people use gemma3? 30% of its size is because it has built-in diffusion
>>
>>107790303
glad to know that the new shock collar doesn't hurt kaya that much kek
>>
>>107790303
You can provide a video and continue it so it mimics his voice and stuff properly too.
>>
>>107790303
>"why would israel do this?"
>an israel model has made this kino edit
for once he's not wrong
>>
HAHAHA

the dog on the left jumps towards the man and knocks him off his chair, and the man then says "WHY WOULD THE STATE OF ISRAEL DO THIS?" in an angry tone.

didnt get the knockdown but the audio.

https://files.catbox.moe/1154xx.mp4

>>107790314
cool. finally we have a video + audio model, beat wan 2.5 to the punch. AND IT IS FAST (66s, on a 4080 even).
>>
>Guys use res_2s
>literally doubles the generation time

This is just like heun, isn't it?
>>
https://github.com/Comfy-Org/ComfyUI/pull/11681
>Disable comfy kitchen cuda if pytorch cuda less than 13
what is comfy kitchen, and is it worth the hassle of upgrading to cuda 13 to use it?
>>
>>107790303
It can copy voices crazy good if you feed it the audio + video for a few seconds
The first 5 secs are real, after that is AI:
https://files.catbox.moe/46y2ar.mp4

This will get people arrested
>>
>>107790330
use res3m
>>
>>107790340
>The meta changes again mid gen

damn it.
>>
omg le meme video model is awesome!!!!!1 now I can post and grift my slop video with audio crap on social media too yay
>>
>>107790330
use res4m
>>
>>107790335
how are you using audio input as a source? is there a video to video option?
>>
you need the RES4LYF sampler pack btw to use what I'm saying
>>
>>107790335
>feed it the audio + video for a few seconds
workflow us up
>>
>>107790349
res3m is just slightly worse than res2s but so much faster
>>
>>107790369
replace the image input with a video and a encoder then feed that as the input instead. I didn't save it before I switched off of it. https://files.catbox.moe/f9fvjr.json
>>
>>107790369
>workflow us up
fake
>>
>>107790358
the concatAVlatent might have something to do with it
the latent video doesn't have to be generated, it can be anything
>>
File: 2k37jJ6[1].jpg (80 KB, 720x960)
https://files.catbox.moe/do2dsn.mp4
>>
>>107790378
and you have to continue from it as if it was a latent from the first sampler node, just replace the first section, not just encode and sample the whole thing
>>
the desperation to use this terrible model is weird as hell
>>
Make sure you turn off preview in comfy if you get errors about mat multiply dimensions being wrong. Search settings for "Live preview method"
>>
>>107790396
>terrible
its better than wan 2.2 and better than some cloud models, no one is comparing to sora 2
>>
>>107790396
probably LTX staff that is posting here and on reddit, BFL staff also posted here once defending their shitty model
>>
>>107790147
Crossing this to the real thread
>>
https://www.reddit.com/r/StableDiffusion/comments/1q627xi/kijai_made_a_ltxv2_audio_image_to_video_workflow/
>>
>>107790411
>its better than wan 2.2 and better than some cloud models
I still haven't seen any examples that prove this
>>
>>107790396
>>107790419
what? it's better than wan 2.2 and wan 2.2 is really popular, so why wouldn't that superior model be as popular as well?
>>
>>107790396
cope, hasan you are going to get memed on even worse in 2026:

https://files.catbox.moe/mlwizk.mp4
>>
You'll never make me update.
>>
File: ComfyUI_temp_cnzpx_00020_.png (1.49 MB, 1600x1000)
>>
>>
>>107790335
Do you think it's possible to upload a video with the audio and concat the last frame to be your intended img2vid to get a proper img2vid with the correct voices?
>>
>>107790439
>>107790043
show me wan gens without loras that even come close
>>
File: 1743886313317067.png (1.65 MB, 1280x720)
>>107790441
did he sub zero kaya or what? that poor dog is not moving at all :(
>>
File: ComfyUI_temp_cnzpx_00021_.png (2.27 MB, 1600x1000)
>>107790440
wan 2.2 wins by default because you can generate videos of women
>>
>>107790419
this post was written by wan staff
>>
112 seconds, 200 frames

haha, still havent tweaked stuff yet but it's a meme generator for sure.

https://files.catbox.moe/wf1uzz.mp4
>>
>>107790462
the movement is so good you can't even notice it anon
>>
>>107790419
>BFL staff also posted here once defending their shitty model
I remember the dead baby.
>>
>>107790440
it's not, it's still plagued with body horror and impossible movement, but it still has merit
>>
>>107790468
this was written by kandinsky staff
>>
Need I remind everyone about Chinese culture when engaging with the people who are seemingly salty about people enjoying a model that isn't wan?
>>
I fucking hate github with passion, they're moving namespace of comfyui and fetch/pull isnt working at all. even issues are fucked (showing right now only up to dec 2025)
fuck GH
>>
>>107790460
this model can't even use loras, its a dead end
>>
>>107790475
wan 2.2 can only do slo mo shit, and if you're lucky enough to get a fast movement it shits the bed quickly, way more than ltx 2
>>
scraping this broads social media in awe of the fact that shes posted over 1k images of herself online in todays imggen climate
>>
>>107790465
most people don't care about that coombrain
>>
>>107790465
huh?
https://files.catbox.moe/imqnxt.mp4
https://files.catbox.moe/v0bprl.mp4
https://files.catbox.moe/mbbl3x.mp4
https://files.catbox.moe/yxd3yo.mp4
>>
>>107790470
>irrelevant eceleb drama
>pick me pick me! attentionslop
>constantly praising the shitty outputs
yep. this is the migu test anon making forgettable trash again
>>
>>107790486
you know it came out with official loras right? give it a month or two like wan
>>
>>107790497
It took you this long to realize it?
>>
>>107790497
>forgettable
you won't forget it though since it rents free in your head lol
>>
>>107790488
>>
>>107790494
that's via the api, you didn't use the local
>>
>>107790497
>forgettable
half a year later people are mocking the dog zapper, cope
>>
>>107790508
go ghen more floyd garbage retard
>>
>>107790520
they didn't pick your video because it's dogshit :^)
>>
>>107790522
oh you didn't forget that one as well kek
>>
File: 1747689978047692.png (30 KB, 860x115)
fp8, 16gb vram, 64 physical ram, 200 frames, working fine and fast:
>>
>>107790516
why the fuck would I pay $200 for like 200 gens instead of generate in like 2 mins on a 4090?
>>
>>107790520
>half a year
has it been that long?
>>
>>107790526
>they didn't pick your video because it's dogshit :^)
stop noticing!
>>
File: ComfyUI_temp_cnzpx_00022_.png (2.21 MB, 1600x1000)
>>107790492
nor people care about e-celebs faggots, 3d cartoons and shitty racist memes but here we are, are we?
>>
>>107790535
>i gen 100 videos per minute on my 4090
I accept your concession
>>
>>107790488
lmk if you find a decent insta scraping method
>>
the American man with glasses stands up and fires a lightning bolt with his eyes at the dog on the left, which explodes into smoke and fire.

youtube poop tier result

https://files.catbox.moe/cwt8jf.mp4
>>
>>107790536
>has it been that long?
yeah, and he's still getting that meme spam on his twitter lmao
>>
>>107790532
can you please post your workflow?
>>
>>107790555
it's this one:

https://files.catbox.moe/lclc9t.json
>>
>>107790547
https://github.com/mikf/gallery-dl
return the favor by suggesting another profile to scrape. this girl is too big for my preference desu
>>
so now that comfyorg is completely in control of the project. who do we support now considering the project is doomed to get worse?
>>
make sure to prompt it right
https://ltx.io/model/model-blog/prompting-guide-for-ltx-2

EXT. SMALL TOWN STREET – MORNING – LIVE NEWS BROADCAST
The shot opens on a news reporter standing in front of a row of cordoned-off cars, yellow caution tape fluttering behind him. The light is warm, early sun reflecting off the camera lens. The faint hum of chatter and distant drilling fills the air.
The reporter, composed but visibly excited, looks directly into the camera, microphone in hand.
Reporter (live):
“Thank you, Sylvia. And yes — this is a sentence I never thought I’d say on live television — but this morning, here in the quiet town of New Castle, Vermont… black gold has been found!”
He gestures slightly toward the field behind him.
Reporter (grinning):
“If my cameraman can pan over, you’ll see what all the excitement’s about.”
The camera pans right, slowly revealing a construction site surrounded by workers in hard hats. A beat of silence — then, with a sudden roar, a geyser of oil erupts from the ground, blasting upward in a violent plume.
Workers cheer and scramble, the black stream glistening in the morning light. The camera shakes slightly, trying to stay focused through the chaos.
Reporter (off-screen, shouting over the noise):
“There it is, folks — the moment New Castle will never forget!”
The camera catches the sunlight gleaming off the oil mist before pulling back, revealing the entire scene — the small-town skyline silhouetted against the wild fountain of oil.

using this style of prompts + the Lightricks WFs instead of comfy's is a lot better. Change sampler to res_2 though
>>
>>107790587
>res_2
res2s or res2m?
>>
File: ComfyUI_temp_cnzpx_00025_.png (1.92 MB, 1600x1000)
someone use this image with ltx and try to make her a 1girl youtuber
>>
>>107790587
So basically describe the opening, use regular text as guidance on what to do and give a character's name and (optionally emotion): "Before their speech like this"
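
That structure can be sketched as a small prompt builder. This is only an illustration of the layout used in the example prompt in this thread (scene heading, opening description, action lines, then "Speaker (emotion):" before each quoted line). The function name and argument shapes are made up here and are not part of any LTX tooling.

```python
def build_prompt(slug, opening, beats):
    """Assemble a screenplay-style prompt.

    beats is a list whose items are either an action string, or a
    (speaker, note, line) tuple where note is an optional emotion or
    delivery hint such as "grinning" (use None to omit it).
    """
    lines = [slug, opening]
    for beat in beats:
        if isinstance(beat, str):
            lines.append(beat)  # plain action / camera direction
        else:
            speaker, note, line = beat
            header = f"{speaker} ({note}):" if note else f"{speaker}:"
            lines.append(header)
            lines.append(f'"{line}"')
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt(
        "EXT. SMALL TOWN STREET - MORNING - LIVE NEWS BROADCAST",
        "The shot opens on a news reporter standing in front of a row of cordoned-off cars.",
        [
            "He gestures slightly toward the field behind him.",
            ("Reporter", "grinning",
             "If my cameraman can pan over, you'll see what all the excitement's about."),
            "The camera pans right, slowly revealing a construction site.",
        ],
    ))
```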
>>
>>107790540
>nor people care about e-celebs faggots
>>107790601
>someone use this image with ltx and try to make her a 1girl youtuber
lmao
>>
Should I bother with this stuff if I only have 64gb of ram? What if I only want to add sound to existing ai videos..
>>
>>107790595
res_2s
you may need the RES4LYF custom node for it to show up for the Lightricks WF options
>>
File: ComfyUI_temp_cnzpx_00027_.png (2.12 MB, 1600x1000)
>>107790610
your e-celeb worship isn't the same as 1girl slop anon
>>
>>107790571
im pretty sure insta will flag that as botting
theres no good way to bypass their shit

theres soooooo many girls that u can scrape from bro dont you use the app at all
do you have a type
>>
>>107790635
1girl is different from 1FemaleYoutuber (e-celeb slop)
>>
>>107789980
TELL ME ABOUT THE GARBAGE! HOW DOES IT BREAK THE HARDWARE?
>>
File: ComfyUI_temp_cnzpx_00028_.png (2.19 MB, 1600x1000)
>>107790640
Ok I get you got offended by the e-celeb comment, sorry to burst your lil zoomer bubble
>>
>>107790638
>im pretty sure insta will flag that as botting
ive scraped tens of dozens of accounts no problems here
>do you have a type
yes but the other factor to consider is how many anon will also know her and give me sweet sweet (you)s
>>
>>107790649
>I get you got offended
says the libtard losing his shit when he sees his hero George Floyd getting meme'd here byw
>>
File: ComfyUI_temp_cnzpx_00029_.png (2.01 MB, 1600x1000)
>>
>>107790626
>>107790628
It's fucking twice as long now you stupid nigger
>>
>>107790670
I suspect res2 is very similar or almost exactly the same as Heun and has the practical outcome of just doubling the steps.
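
A toy sketch of the Heun comparison, not a diffusion sampler. It integrates dy/dt = -y with Euler (one function call per step) and Heun (two calls per step) and counts the calls. It assumes res_2s behaves like a generic two-stage method, as the "twice as long" reports suggest; its actual internals are not shown anywhere in this thread. In a sampler each call would be a full model forward pass.

```python
import math

def integrate(f, y0, t0, t1, steps, method):
    """Integrate dy/dt = f(t, y); return (estimate of y(t1), number of f calls)."""
    h = (t1 - t0) / steps
    t, y, calls = t0, y0, 0
    for _ in range(steps):
        k1 = f(t, y)
        calls += 1
        if method == "euler":
            y = y + h * k1
        elif method == "heun":
            # Second evaluation at the Euler-predicted endpoint, then average.
            k2 = f(t + h, y + h * k1)
            calls += 1
            y = y + 0.5 * h * (k1 + k2)
        else:
            raise ValueError(f"unknown method: {method}")
        t += h
    return y, calls

def decay(t, y):
    return -y

if __name__ == "__main__":
    exact = math.exp(-1.0)
    for name, steps in [("euler", 8), ("heun", 8), ("euler", 16)]:
        y, calls = integrate(decay, 1.0, 0.0, 1.0, steps, name)
        print(f"{name:5s} steps={steps:2d} calls={calls:2d} abs_err={abs(y - exact):.5f}")
```

On this toy problem Heun at 8 steps costs the same 16 calls as Euler at 16 steps but lands closer to the exact answer, so "same as doubling the steps" holds for cost and not necessarily for accuracy. Whether that carries over to LTX-2 output quality is a separate question this sketch cannot answer.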
>>
71 seconds for a default 126 frame gen with audio

the speed is crazy, this is actually amazing desu. 5090s are probably generating in 30s. this is on a 4080.

this is a comedy goldmine, because now you have the audio.

https://files.catbox.moe/h0nany.mp4
>>
>>107790668
howd you make the hips so wide
>>
File: ComfyUI_temp_cnzpx_00030_.png (2.39 MB, 1600x1000)
>>107790663
wut? again sorry you live in a bubble, but the world doesn't revolve around your stupid nation politics lmao
>>
>>107790690
american website
>>
>>107790690
what country is she from that the outlets are so small? is she from MICRONESIA?
>>
>>107790683
67 seconds:

https://files.catbox.moe/s7x92a.mp4
>>
>>107790650
ok cool. that saves me so much time.

>youmaxxing
sophia diamond ig. ive seen her posted around
itsnekochan for the paag
>>
>>107790670
and its more than twice as good. Much more accurate sampler, its worth it
>>
is gemma smart enough to remember characters? that's useful in flux to just lay out the characters first then do what you wanna do with them, but I kinda doubt ltx will keep it coherent
>>
File: file.png (226 KB, 1280x720)
Spent a couple hours downloading ltx 2 and messing around with the ltx 2 lip sync workflow trying to figure out a good prompt. I liked this the most (out of like 30):

https://files.catbox.moe/wjdesf.mp4

It has the same problem as Wan 2.2 where it wants to make the artwork way too clean for my use case. The built in audio is awful. Each 353 frame gen took about 100 seconds on a RTX 6000 pro.
>>
>>107790644
people are swapping using their nvme which is raping it
>>
>>107790670
then reduce the steps you mouthbreathing faggot
>>
Some anons' issue is that FP8 does not work with a lower cuda version than cuda 13

https://github.com/Comfy-Org/ComfyUI/pull/11681
>>
>>107790730
14s? how much vram do you have? 5090?
>>
>>107790739
>FP8 does not work with a lower cuda version than cuda 13
what, I'm on cuda 12.8 and I can run ltx2 fp8 distilled just fine
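
For reference, the linked PR title describes a gate on the PyTorch CUDA version ("Disable comfy kitchen cuda if pytorch cuda less than 13"). A hedged sketch of that kind of check follows. It is not the PR's code, the helper name is hypothetical, and what ComfyUI does when the gate is off is not stated in this thread.

```python
def cuda_at_least(cuda_version, required_major):
    """True if a CUDA version string like "12.8" has a major version >= required_major."""
    if not cuda_version:  # None or "" (for example a CPU-only PyTorch build)
        return False
    major = int(cuda_version.split(".")[0])
    return major >= required_major

if __name__ == "__main__":
    # The 12.8 setup described above would not pass a "CUDA 13 or newer" gate.
    for version in ("12.8", "13.0", None):
        print(version, cuda_at_least(version, 13))
```

With PyTorch installed, the string to pass in is `torch.version.cuda`.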
>>
https://files.catbox.moe/swvw9c.mp4
>>
File: ComfyUI_temp_cnzpx_00031_.png (2.53 MB, 1600x1000)
>>107790698
1girl country
>>
my old google-era boomer intuition regarding prompting is only to include relevant information such as
>1 girl, tribal, wide hips, body paint, tanlines, large natural breasts, micro bikini

but better prompters seem to include more natural language, bordering on narratives. can you explain why including natural language is better for prompting and what kind of checklists you go through in your head when doing it?
>>
>>107790742
96 gigs (RTX 6000 pro)
>>
>>107790683
>this is a comedy goldmine
>click video
>didn't laugh or even crack a smile
i see
>>
JUSTICE!

https://files.catbox.moe/bvikpi.mp4
>>
>>107790748
must depend on what GPU you are using
>>
>>107790748
it might be casting to fp16?
>>
res2 is cope and has the same outcome as increasing the steps.
>>
I think the best part of the model is we can choose not to look at a single frame of test anon's uninspired slop anymore
>>
File: ComfyUI_temp_cnzpx_00032_.png (2.39 MB, 1600x1000)
>>
>>107790768
>casting
Ayo you sayin we got wizards in our 'putas and shit?
>>
>>107790777
that would be a really cool streamer setup
>>
>>107790739
>cuda 13
oh damn do I really have to upgrade to that one? :(
>>
>>107790773
cope, dog zapper
>>
>>
>>107790790
nobody cares about hassan anymore. only you
>>
>>107790798
>nobody cares about hassan anymore.
because he killed his career by shocking his dog lmao
>>
File: ComfyUI_temp_cnzpx_00033_.png (2.52 MB, 1600x1000)
>>107790780
If only there was a video model that allowed us to generate video with audio...
>>
>>107790773
Hmm, will I do an Hasan or a Miku today? Perhaps I will be outgoing a gen a Ryan Gosling with a Deus Ex just to spice things up a little. Hmm yes hmm.
>>
>>107790790
the cultural zeitgeist left you and Hassan behind. you are forgotten
>>
z-video when? this Jew model is fast but it sucks
>>
>>107790798
kek he is memed on by every streamer daily. no one likes him.

https://files.catbox.moe/f52did.mp4
>>
>>107790807
Better generate a george floyd racist video to spice things a little bit anon
>>
>>107790807
hmm will I complain, or complain, and gen nothing? hmm.
>>
>>107790805
that was two months ago anon. new memes only last a week now before you move onto new meme
>>
this is super impressive though
https://files.catbox.moe/k29y60.mp4
>>
>>107790820
every single time you say this and post a link it's to utter garbage. you aren't fooling me this time
>>
File: ComfyUI_temp_cnzpx_00034_.png (2.65 MB, 1600x1000)
only millennials who watch cartoons like LTX-2
>>
>>107790818
>new memes only last a week now before you move onto new meme
tell me you are a zoomer without telling me you are a zoomer
>>
File: z-image-fp_00071_.png (2.83 MB, 1024x2048)
>>107790810
soon
>>
>>107790829
smells like sour grapes in here
>>
>>107790820
damn that's pretty good
>>
>>107790831
t rex arms ass thot

PROPORTIONS! THEY MATTER!
>>
>>107790831
why are you spamming the same youtube e-celeb slop over and over, be more creative the fuck?
>>
>>107790820
how did you mix it with real music? that's orangecaramel right?
>>
>>107790835
dude, you are the one bothering to try and bring zoomie eceleb memes in here. if it's something attached to irrelevant edrama I immediately disassociate
>>
>>107790829
nta, but your scale of what's impressive seems to have been thrown out of whack. We basically got veo 3 at home and you're salty.
>>
>>107790859
https://www.reddit.com/r/StableDiffusion/comments/1q627xi/kijai_made_a_ltxv2_audio_image_to_video_workflow/
>>
>>107790862
>if it's something attached to irrelevant edrama I immediately disassociate
you don't, since you can't stop talking and seething about it. dissociation means you ignore the post
>>
Kinda crazy just how many frames you can generate with ltx.
>>
>>107790878
I am seething that someone who cares to post zoomer garbage also can't let go when it isn't in vogue anymore
>>
>>107790864
kijai boss does it again
>>
it gets extremely detailed with more steps
https://files.catbox.moe/55tof3.mp4
>>
File: ComfyUI_temp_cnzpx_00036_.png (2.3 MB, 1600x1000)
>>
File: 1740909719167933.png (516 KB, 1832x1019)
we cookin
>>
if it cant do goon stuff idc
>>
>>107790892
>I am seething
no one cares you're upset about it, the world doesn't revolve around you
>>
>>107790899
Maybe I'm crazy but I can't see anywhere in any of the subgraphs to edit steps, it tells me 8 is enough and to suck a fat cock.
>>
>>107790899
this is non distilled btw, for sure looks a lot better
>>
88 seconds. fast + audio, 160 frames (can do more)

https://files.catbox.moe/klu1g7.mp4
>>
>>107790899
>>107790909
I'm noticing the motion is still shit. almost all of these clips are boring to look at
>>
File: gadfasfdsad.png (127 KB, 1144x801)
>>107790907
I use the light2x WF with the full model https://github.com/Lightricks/ComfyUI-LTXVideo/blob/master/example_workflows/LTX-2_T2V_Full_wLora.json
>>
>>107790920
the prompt was literally "The camera zooms in on the spinning blue fireball held by the frozen fox woman in slow motion." I and others have shown plenty of ones with tons of motion
>>
>>107790920
it's just two anons posting the same boring clips over and over
>>
>>107790937
says the guy posting the same "1 female e-celeb" over and over, you're totally not boring by doing that lol
>>
>>107790937
imagine only being able to fit z image on your shitty 4GB gpu and crying over people having fun with video
>>
>>107790937
>the most uncreative fucker is calling others boring
ohh the irony
>>
File: ComfyUI_temp_cnzpx_00041_.png (1.94 MB, 1120x1400)
>>107790945
>>107790946
>>107790950
ouch, it seems my post hurt your feelings, sorry anon, you keep posting your little fun videos
>>
>>107790957
need her
>>
>>107790957
keep posting your almost exactly the same looking jailbait teens then with plastic skin and fake looking faces
>>
Hey ltxfags, can you animate this? >>107790635
No you can't and never will
>>
>>107790423
>>107790147
Remove the VAE connection to the positive prompt and instead use VAE encode + Reference Latent
>>
>>107790957
those videos are way more interesting and fun than your e-celeb 1girl slop that's for sure
>>
File: ComfyUI_temp_cnzpx_00042_.png (2.11 MB, 1120x1400)
>>107790963
kek, I just know they downloaded the gen and failed miserably trying to generate a video
>>
rip crunchyroll dub actors:

https://files.catbox.moe/fgostf.mp4
>>
https://files.catbox.moe/ce70aa.mp4
>>
>>107790971
why would anyone animate your boring slop, there's plenty of cool images to work with in the first place
>>
why are you retards so combative? i dont learn anything from your faggy bickering. cease
>>
>>107790957
>>107790971
need her
>>
so I guess there is no audio censorship:

https://files.catbox.moe/dop814.mp4
>>
File: ComfyUI_temp_cnzpx_00043_.png (2.22 MB, 1120x1400)
>>107790976
it's ok anon you're right, giving incel coomers more power is just not right, look what happened on X with the edit model
>>
>>107790986
why would there be
or better yet, how could there be? just remove the phonemes ni and ger?
>>
File: 1763146010134031.png (191 KB, 400x400)
>>107790979
>noo I want to fuck around and not find out
lol
>>
>>107790992
idk what if it was a chinese model and it banned tiananmen square from prompts and created gibberish
>>
>>107790994
wrong. i exude pure love energy, even when i'm angry

i'm care bears
>>
>>107790989
>giving incel coomers more power is just not right
this is true because an incel like you is now spamming "e-thot, sitting" over and over like the braindead coomer you are
>>
reminder that you need to pip install comfy-kitchen to get FP8 speedups
>>
>>107791002
it'll only work if you're on cuda 13 though >>107790332
>>
the anime girl with white hair wearing a black dress and white panties stands up and points to the camera and says "hey anon, how's it going?" in a soft voice.

oh man it's almost too good at dub voices being the same

https://files.catbox.moe/t3qhpb.mp4
>>
File: ComfyUI_temp_cnzpx_00046_.png (2.47 MB, 1120x1400)
>>107791001
good thing you learned another word anon, now keep using it along with e-celeb slop
>>
me be like
https://files.catbox.moe/kfusxy.mp4
>>
>>107791010
Concession accepted, incel.
>>
>>107791011
THEY BROKE THE WALLS WE GUARDED
WOOOOAAHOH

BUT THEY DONT CARE ABOUT IT
WOOOOOAAHOH

WE'LL FINISH WHAT WE STARTED
WOOOAAAAHOH

SO PROMISE ME THAT
>>
File: ComfyUI_temp_cnzpx_00048_.png (1.83 MB, 1120x1400)
>>
the jews pay some shills for the thread, because why someone would posting a DOA and anti cunny model? Also the gens are totally shit
>>
man I'm not liking this shitty model, instant OOM when I try to use the camera control loras (even reducing from 10 to 6 seconds lmao)
>>
>>107790979
>why are you retards so combative?
you expected different from a 4chanz thread?
>>
>>107790964
Not really sure in what order I should connect that?
>>
oh and 2nd reminder, to use the official sampler with the lightx2 WF you need this:
https://github.com/ClownsharkBatwing/RES4LYF
>>
>>107791047
do you have a video comparison or something? how can you be so sure it's better than euler?
>>
>>107791040
it wasn't always like this, people are overly argumentative for the sake of it now
>i accept your concession
type posting. it's not fun or interesting
>>
>>107791054
1. It's the default in the LTX-2 team's node
2. Testing
>>
>>107791055
What would you expect from some zoomer kid that watches asmongold daily and made his whole personality about US politics
>>
>>107791066
>t. triggered leftist
>>
>>107791055
this specific thread i agree
its the end of a shiny new model day thodesu
>>
>>107790332
>is it worth the hassle of upgrading to cuda 13 to use it?
for those who want to build new sageattention wheels with cuda 13 you can go for this (it worked for me)
https://github.com/thu-ml/SageAttention/issues/262#issuecomment-3484630336
>>
fresh when ready
>>107791088
>>107791088
>>107791088
>>107791088
>>
File: ComfyUI_temp_cnzpx_00052_.png (1.97 MB, 1120x1400)
>>107791091
Nice OP :)
>>
>>107791047
they actually only use that for the FULL NON DISTILLED flow, for the distilled one it's euler.
>>
ltx-2-19b-dev can do nudity unlike distilled, huh


