/wsg/ - Worksafe GIF



File: 1739548272015919.mp4 (150 KB, 720x720)
Miku Edition

Discussion of Free and Open Source Diffusion Models

Prev: >>>/g/107791088

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2485296
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe|https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
>>
wrong board
>>
BASED
>>
File: 1742250089513206.webm (2.02 MB, 1080x1920)
>>
File: 1758852517981157.mp4 (280 KB, 576x1024)
>>
>>6067153
this is nicer and nobody will ask me to install gentoo
>>
File: LTX-2_00001-audio.mp4 (1.08 MB, 576x832)
>>
File: LTX_2.0_i2v_00020_.mp4 (850 KB, 448x832)
>>
File: LTX_2.0_i2v_00021_.mp4 (832 KB, 448x832)
>>
>>6067724
lul
>>
File: 1760276029449595.mp4 (1.73 MB, 1184x800)
I still don't get why /g/ doesn't let you upload video with sound, it's retarded
>>
File: LTX_2.0_i2v_00073_.mp4 (2.15 MB, 1280x736)
>>
File: LTX_2.0_i2v_00027_.mp4 (1.84 MB, 448x832)
>>6067725
>>
>>6067729
can you try and see if ltx 2 can do some ytp kino like sora 2
>>
>>6067731
>ytp kino
qrd
>>
>>6067732
>>6067731
I mean in the context of sora
>>
File: LTX_2.0_i2v_00032_.mp4 (1.42 MB, 448x832)
>>
>>6067734
this model sure loves to do some powerpoint shit, I wonder if going for an abliterated version of gemma 3 could fix it
>>
File: LTX_2.0_i2v_00038_.mp4 (654 KB, 448x832)
>>6067740
Oh for sure, I'm getting a fuck ton of powerpoints and posting the least bad ones
>>
File: LTX_2.0_i2v_00043_.mp4 (1.06 MB, 448x832)
>>
File: 1763374024836227.mp4 (3.08 MB, 960x960)
When base
>>
>>6067746
they're saying boo-urns
>>
File: sad.mp4 (1.22 MB, 960x960)
>>6067746
>When base
if they don't release it before Chinese New Year (Feb 17, 2026) it's definitely over
>>
File: LTX_2.0_i2v_00049_.mp4 (813 KB, 448x832)
fucking powerpoints
>>
File: LTX_2.0_i2v_00050_.mp4 (1.5 MB, 960x960)
>>6067746
>>
>>6067750
I give you the original image input so that you can get a better result kek (we can't upload images on this place? this sucks wtf)
https://files.catbox.moe/1jwczb.jpg
>>
File: LTX_2.0_i2v_00053_.mp4 (2.41 MB, 960x960)
>>6067756
thanks, already had this one
>>
>>6067750
>>6067757
absolute kino, love those ending transitions
>>
File: 1741358928983775.mp4 (4.1 MB, 1920x1080)
https://www.reddit.com/r/StableDiffusion/comments/1q6zb57/comment/nycrhpl/
seems like it's working better on Wan2GP
>>
File: output.webm (3.88 MB, 960x960)
migu left :(
>>
>>6067773
lmaooo, I guess you tried to stitch the videos together by going for the last frame but it's getting more and more horrific for each iteration kek
>>
>>6067774
got so bad miku left the video and made me end it, svi when
>>
>>6067773
>migu left
catch her back! without the sacrifice we won't get Z-image turbo!
>>
>>6067768
i'm having a tinker with it, I'm upping the res and frames each time but the resource usage never moves, could it be infinite?
>>
https://github.com/modelscope/DiffSynth-Studio/commit/0efab85674f2a65a8064acfb7a4b7950503a5668
Oh, looks like we'll finally get it!
https://files.catbox.moe/lney3m.JPG
>>
File: WERE BACK.gif (2.73 MB, 498x498)
>>6067795
>https://github.com/modelscope/DiffSynth-Studio/commit/0efab85674f2a65a8064acfb7a4b7950503a5668
oh shit it's from Modelscope, finally something is happening
>>
File: LTX_2.0_i2v_00060_.mp4 (720 KB, 512x640)
waow
>>
>>6067795
I thought they would've released it right before Chinese New Year, but if it's sooner than that I'll definitely take it, gimme gimme gimme
>>
File: LTX_2.0_i2v_00076_.mp4 (3.21 MB, 1280x704)
>>
did I do something wrong?
>>
>>6067807
sounds correct to me
>>
File: kek.mp4 (2.17 MB, 1312x704)
>>6067795
>>
>>6067803
>>6067811
that powerpoint zoom/de-zoom is killing this model, without it the model would be way more fun to play with
>>
File: 1753819783513759.mp4 (1.46 MB, 960x1280)
>>6067803
>>
File: LTX_2.0_i2v_00086_.mp4 (3.86 MB, 1280x736)
>>
>>6067789
>could it be infinite?
there has to be a resource usage increase, but maybe they found some tricks to make it minimal, this is a huge deal desu
>>
File: LTX_2.0_i2v_00084_.mp4 (3.28 MB, 1280x736)
>>
>>6067823
seed lotto or is this a good prompt
>>
>>6067845
starting to notice this music in a lot of videos
>>
File: LTX_2.0_i2v_00089_.mp4 (4.17 MB, 1280x736)
>>6067863
Probably the most generic suspense sounds all mashed into one homogeneous suspense slop.
>>
>>6067859
My first try. I am using qwen 8b to enhance the prompt, other than that it's the standard comfy flow for the distill model.
>>
File: LTX_2.0_i2v_00090_.mp4 (3.43 MB, 1280x736)
>>
>>6067789
it happened
looks like the limit is 960x960_240, or more frames for fewer pixels and vice versa
pretty good, especially considering comfy won't even try at 832x480_121
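
A rough way to reason about the "more frames for fewer pixels" trade-off reported above. This is only a sketch: it assumes the ceiling behaves like a fixed width x height x frames budget taken from the single 960x960_240 data point, which nobody in the thread has confirmed. `max_frames` is a hypothetical helper, not part of any UI.

```python
# Assumption: the observed 960x960_240 ceiling is a fixed pixel-frame budget.
BUDGET = 960 * 960 * 240


def max_frames(width: int, height: int, budget: int = BUDGET) -> int:
    """Largest frame count that stays inside the assumed budget."""
    return budget // (width * height)


# Compare the reported limit with a smaller resolution.
print(max_frames(960, 960))
print(max_frames(832, 480))
```

If the real limit is not linear in pixel count (very possible), the numbers will be off, so treat this as a starting guess before testing.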
>>
>ltxv2
>input picture of woman
>prompt her to say something and do a simple action
>every single gen it hangs on the static input image for several seconds while audio plays then the last second it cuts to show an unrelated woman doing the action I prompted (while also being garbled slop)
what the fuck gives?
>>
is chroma better than lumina?
>>
>>6067970
Side grade
>>
>>6067970
lateral step
>>
File: LTX_2.0_i2v_00061_.mp4 (1.7 MB, 704x384)
>>
File: LTX_2.0_i2v_00062_.mp4 (1.51 MB, 1408x768)
>>
File: LTX_2.0_i2v_00063_.mp4 (2.54 MB, 1408x768)
>>
File: LTX_2.0_i2v_00064_.mp4 (2.58 MB, 1408x768)
>>
>>6067964
>what the fuck gives?
they censored the model, so we're getting the API cuck treatment, but in local!
>>
File: LTX_2.0_i2v_00065_.mp4 (3.01 MB, 768x1216)
>>
File: LTX_2.0_i2v_00066_.mp4 (2.45 MB, 768x1216)
bruh
>>
File: LTX_2.0_i2v_00067_.mp4 (2.58 MB, 768x1216)
good enough I guess
>>
>>6068165
>>6068167
it's terrible when the movement is fast, not a big fan of the blurry shit lol
>>
File: 1756895147045515.mp4 (1.59 MB, 832x1088)
gens really shouldn't look this fake in year of our Lord 2026. Even on good rolls everything always goes a bit blurry. Colours change. Weird motions.
Is it comfy's fault?
>>
>>6068170
its ai slop but the sound makes it funny
>>
>>6068171
>gens really shouldn't look this fake in year of our Lord 2026.
I agree, Z-image turbo showed that you can make good and small models, the others need to learn a thing or two from Tongyi
>>
>>6068171
>dried cum moving when she moves tummy
>>
File: LTX_2.0_i2v_00068_.mp4 (2.56 MB, 1088x832)
>>6068171
I mixed height with width, grim
>>
File: LTX_2.0_i2v_00069_.mp4 (2.56 MB, 832x1088)
>>6068179
eh
>>
>>6068171
>>6068181
are you using the upscaler? if yes, remove that shit and go for a vanilla render with more pixels (like 0.9 megapixels)
>>
>>6067150
d*bo status?
>>
>>6068189
you can't post images there so you can't be an avatarfag, we're safe from those fuckers lol
>>
>>
File: LTX-2_00011_.mp4 (328 KB, 832x448)
>>
I want to try this out even if I'm a 16+32 ramlet

What's the best UI to pick up?
>>
>use LTX-2 to create the audio
>then use Wan 2.2 S2V with the audio for better video quality
I'm too lazy to set it up but someone should try this.
>>
>>6068462
wangp
>>
>>6068462
Pinokio + wan2gp if you are lazy and/or have no idea what you are doing. ComfyUI for more speed, but you need to learn a few things first.
>>
File: LTX_2.0_i2v_00070_.mp4 (1.13 MB, 768x1344)
>>
>>6068165
lmao
>>
File: LTX-2_00002-audio.mp4 (1.2 MB, 512x736)
breh why
>>
File: LTX-2_00004-audio.mp4 (1.8 MB, 512x736)
i give up
>>
File: LTX-2_00005-audio.mp4 (958 KB, 704x704)
ltx hates migu
>>
>>6068647
>>6068641
>>6068639
>>6068625
wtf 4chan supports audio now?
>>
>>6068650
Only chad boards like /wsg/ do
>>
>>6068651
i thought i was on /g/ lol
>>
File: LTX-2_00007-audio.mp4 (1.52 MB, 704x704)
Not what I asked, but kinda cute ngl
>>
File: LTX_2.0_i2v_00134_.mp4 (1.64 MB, 1664x960)
>>
File: 4.mp4 (2.55 MB, 768x1344)
cozy bread
>>
>>6068656
is this real?
>>
>>6068639
SONGIK
>>
>>6068655
moar
>>
>>6068655
Would watch.

Hiroshima is a greedy gook.
>>
File: LTX-2_00008-audio.mp4 (2.93 MB, 1056x608)
Attempt 1
>>
File: LTX-2_00009-audio.mp4 (2.27 MB, 1056x608)
>>6068655
Attempt 2
>>
File: LTX-2_00010-audio.mp4 (2.32 MB, 1056x608)
>>6068655
>>
File: ComfyUI_00005-audio.mp4 (1.98 MB, 640x1024)
>>
>>6068655
Ok this is awesome.

>Glad you could bake it, Uther.
>>
File: LTX_2.0_i2v_00071_.mp4 (1.11 MB, 1152x768)
>>
File: LTX_2.0_i2v_00006_.mp4 (1.29 MB, 704x384)
>>
File: LTX_2.0_i2v_00072_.mp4 (1.6 MB, 640x1024)
>>6068675
>>
File: LTX_2.0_i2v_00073_.mp4 (4.02 MB, 768x1344)
>>
File: LTX_2.0_i2v_00074_.mp4 (1.03 MB, 832x1216)
thanks for the powerpoint
>>
Just heads up that if you aren't using the Q8 ggufs yet, you might want to consider it.
>>
>>6068696
link?
>>
File: LTX_2.0_i2v_00147_.mp4 (1.54 MB, 960x512)
>>
>>6068698
https://huggingface.co/Kijai/LTXV2_comfy/tree/main/diffusion_models
>>
>>6068698
https://huggingface.co/Kijai/LTXV2_comfy/tree/main/diffusion_models
>>
>>6068702
>>6068700
thank
>>
>3 different threads
bruh
>>
>>6068704
this is the shelter from schizos plus we got audio
>>
File: LTX_2.0_i2v_00075_.mp4 (983 KB, 832x1152)
>>
desu, I'm finding having the audio ready and genning the i2v over it gives some pretty awesome results.
>>
File: LTX_2.0_i2v_00148_.mp4 (3.14 MB, 1280x704)
>>
>>6068702
>>6068696
Getting " LTXVEmptyLatentAudio 'VAE' object has no attribute 'latent_frequency_bins'" when I run it, any idea what's going on?
>>
>>6068721
Looks like you are trying to use a GGUF model?
It's not fully supported yet, the GGUF model loader node needs an update.
Either wait patiently until it's out in the public version, or use git to pull the PR.

Although since it's a VAE error, maybe you connected something wrong with the VAE loaders (if you already have the PR update for the GGUF model loader).
But that's beyond my "pay grade", not sure why, will leave that for the experts ;-)

https://github.com/city96/ComfyUI-GGUF/pull/399
>>
i just got my 5090 last week and am paranoid about those 12vhpwr issues so i run it at 70% power the whole time. it keeps it dead silent at least. need to look into undervolting
>>
>>6068709
ear torture
>>
>>6068733
you did buy the asus with pin monitor right? if not you better get the der8auer cable that does it, else goodbye gpu
>>
>>6068736
no, that one was out of stock and almost a $1000 more expensive
i am literally just YOLOing it
>>
File: LTX_2.0_i2v_00076_.mp4 (913 KB, 832x1152)
powerpoint again gee
>>
File: LTX_2.0_i2v_00077_.mp4 (563 KB, 256x192)
>>6068737
>>
>>6068739
RIP finety-ninety
>>
>>6068739
that would be just my luck and wouldn't surprise me desu.
>>
>>6068736
>debaur cable that does it
It's a pretty cool product for a problem nvidia imposed on itself, but it doesn't support FE cards, sadly.
>>
>>6068744
grim
>>
>>6068738
3 things that help :
- the compression node LTXVPreprocess + node that changes resolution Resize Images by Longer Edge
- the node LTXVImgToVideoInplace
- long and detailed prompts
>>
>>6068745
It's fine, mine is running almost every day and there was no problem last I checked a month ago when adding ram.
It's capped at 460W though, which gives roughly the same performance while consuming 20% less.
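
Quick arithmetic on those two numbers, as a sketch. The 460 W cap and the "20% less" figure come from the post; the assumption that performance really stays about the same is the poster's claim, not something measured here.

```python
capped_watts = 460   # power cap quoted in the post
saving = 0.20        # "consuming 20% less"

# Uncapped draw implied by the two figures above.
implied_stock_watts = capped_watts / (1 - saving)

# Assumption: performance is roughly unchanged at the cap, so work done
# per watt improves by a factor of 1 / (1 - saving).
perf_per_watt_gain = 1 / (1 - saving)

print(round(implied_stock_watts))
print(round(perf_per_watt_gain, 2))
```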
>>
File: LTX_2.0_i2v_00078_.mp4 (2.79 MB, 640x1088)
>>
File: LTX_2.0_i2v_00080_.mp4 (3.14 MB, 832x1088)
>>6068171
>>
File: 1747771178489730.mp4 (5.07 MB, 1280x704)
What causes powerpoints?
>>
>>6068774
doesn't happen on their api
>>
File: LTX_2.0_i2v_00082_.mp4 (1.89 MB, 576x576)
>>
>>6068784
i undressed this bitch yesterday
>>
File: LTX_2.0_i2v_00083_.mp4 (1.28 MB, 640x1024)
dunno why the dramatic pause at the start
>>
>>6068790
isn't it obvious?
>>
File: LTX_2.0_i2v_00084_.mp4 (1.44 MB, 576x576)
sloppa
>>
File: LTX_2.0_i2v_00088_.mp4 (905 KB, 576x576)
lmao
>>
quick test
that's one feature down I was really hoping for, now just video continuation with audio encoded
>>
File: LTX_2.0_i2v_00098_.mp4 (1.73 MB, 640x896)
>>
File: LTX_2.0_i2v_00102_.mp4 (1.17 MB, 704x1088)
>>
you're really loving this inspector gadget shit despite ltx having piss poor special effects
>>
File: LTX_2.0_i2v_00104_.mp4 (1.67 MB, 704x1088)
>>6068821
True story
>>
File: LTX_2.0_i2v_00108_.mp4 (1.1 MB, 704x1088)
>>6068821
prompt adherence too, unless the person is facing the camera they won't speak most of the time, at least that's my experience as you can see in the previous and this one
>>
>>6068828
yes, getting the correct person to speak the correct lines has been the bane of my gens
I wish I knew the trick because sometimes all it takes is 'the person on the right' and others it needs their social security number and mothers maiden name to specify them
>>
>>6068829
ltx regional prompting
>>
>>6068699
this is fucking sick
>>
File: LTX_2.0_i2v_00111_.mp4 (888 KB, 448x896)
wish it could do good 2d, maybe some lora will save us
>>
File: LTX_2.0_i2v_00115_.mp4 (3 MB, 768x832)
>>
File: ComfyUI_00004-audio.mp4 (2.97 MB, 544x960)
>>6068675
Just in case someone misses on /g/, this quant of gemma3 is stupid fast on 50xx series https://huggingface.co/GitMylo/LTX-2-comfy_gemma_fp8_e4m3fn/blob/main/gemma_3_12B_it_nvfp4_uncalibrated.safetensors
>>
>>6068851
will try thanks king
>>
File: ComfyUI_00003-audio.mp4 (3.01 MB, 544x960)
>>6068851
shid
wanted to post a better one
>>
see what I mean about the sitcom vibe
>>
>>6068774
>What causes powerpoints?
they censored their model, desu it ruins the fun for me, I feel like an API cuck who's told what's acceptable to generate or not
>>
>>6068851
>fp4 on a text encoder
bruh... using fp8 on the text encoder is already a bad idea but fp4 is on another level lmao
>>
>>6068702
>https://huggingface.co/Kijai/LTXV2_comfy/tree/main/diffusion_models
you have to activate this PR to make it work right?
https://github.com/city96/ComfyUI-GGUF/pull/399
>>
>>6068851
my shit is not working with that, can you share your workflow?
>>
>>6068859
is this real?
>>
>>6068870
ye
>>
File: output.mp4 (5.39 MB, 1536x832)
>>6068851
Pretty decent considering its 1/3 the size.
Left is og, right nvfp4
>>
https://civitai.com/models/2292336/ltx-2-nsfw-text-encoder-gemma-3-12b-abliterated?modelVersionId=2579572
looks like the censorship/powerpoint shit can be removed if we go for the uncucked version of gemma 3?
>>
>>6068895
but it powerpoints even normal shit
>>
>>6068897
>it powerpoints even normal shit
that's false positives, the model is so censored it sees anything as NSFW
>>
>>6068898
lmao if true
>>
File: LTX_2.0_i2v_00127_.mp4 (1.07 MB, 640x960)
>>
>>6068903
is that ace-step?
>>
>>6068906
nope, its ltx2 disobeying my prompt and making music lol
>>
File: LTX_2.0_i2v_00132_.mp4 (2.57 MB, 768x1152)
>>
>>
https://github.com/city96/ComfyUI-GGUF/pull/402
there's a way to go for gguf's of the text encoder now
>>
>>6068733
same, but watercooled. I run it at 80% and have a max of 500w. Thinking about going lower.
>>
>>6068928
why
>>
File: LTX_2.0_i2v_00133_.mp4 (3.33 MB, 768x1152)
>>
>>6068929
I get 95% performance for - 100w.
>>
>>6068870
>https://github.com/city96/ComfyUI-GGUF/pull/399
>>6068925
>https://github.com/city96/ComfyUI-GGUF/pull/402
to get both PRs into your local checkout, run this inside the ComfyUI-GGUF custom node folder

git checkout -b temp-test-branch main
git fetch origin pull/399/head:pr-399
git fetch origin pull/402/head:pr-402
git merge pr-399 -m "Merge PR 399"
git merge pr-402 -m "Merge PR 402"
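
If you end up doing this for more PRs later, here is a small sketch that builds the same command sequence for any list of PR numbers. `pr_merge_commands` and `run_in_repo` are hypothetical helpers written for this post, and the branch names are just the ones used above. Nothing is executed unless you call `run_in_repo` yourself.

```python
import shlex
import subprocess


def pr_merge_commands(pr_numbers, base="main", branch="temp-test-branch"):
    """Build the git commands from the post above for the given PR numbers."""
    cmds = [["git", "checkout", "-b", branch, base]]
    # Fetch every PR head into its own local branch first...
    for n in pr_numbers:
        cmds.append(["git", "fetch", "origin", f"pull/{n}/head:pr-{n}"])
    # ...then merge them one after another into the temp branch.
    for n in pr_numbers:
        cmds.append(["git", "merge", f"pr-{n}", "-m", f"Merge PR {n}"])
    return cmds


def run_in_repo(repo_dir, cmds):
    """Run the commands inside repo_dir, stopping at the first failure."""
    for cmd in cmds:
        subprocess.run(cmd, cwd=repo_dir, check=True)


# Dry run: only print what would be executed.
commands = pr_merge_commands([399, 402])
for cmd in commands:
    print(shlex.join(cmd))
```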
>>
>>6068877
u might need to upgrade to cu130 mby? ive not tried ltx ever but needed it for ZIT nvfp4 to work properly otherwise it would output a bunch of geometric shapes
>>
File: grid_output_with_audio.webm (3.43 MB, 2048x768)
>>
>>6068936
what is each video supposed to represent though? what are you comparing?
>>
>>6068937
just seeds
>>
>>6068936
does i2v not care about portrait dimensions?
>>
File: ltx2_i2v_00070_.mp4 (1.24 MB, 512x704)
fun stuff

this has ruined WAN for me, I need audio now
>>
>>6068939
no
>>
File: grid_output_with_audio.mp4 (5.92 MB, 2048x716)
>>
>>6068932
you also have to open ComfyUI\custom_nodes\ComfyUI-GGUF\loader.py and change this line
>sd, arch = gguf_sd_loader(path, return_arch=True, is_text_model=True)
to
>sd, arch, metadata = gguf_sd_loader(path, return_arch=True, is_text_model=True)
and you're good to go, you'll be able to use GGUFs for both ltx2 and its text encoder

and don't forget to download tokenizer.model
https://huggingface.co/unsloth/gemma-3-4b-it/blob/main/tokenizer.model
and put it in ComfyUI\models\text_encoders
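
Why that one-line change matters, as a minimal sketch: if a function starts returning three values and the caller still unpacks two, Python raises a ValueError. The two `fake_loader_*` functions below are stand-ins invented for this example, they are not the real `gguf_sd_loader`, and the assumption that the merged loader returns a third `metadata` value is inferred from the edit above.

```python
def fake_loader_old():
    # Stand-in for a loader that returns two values.
    return {"weights": "..."}, "gemma3"


def fake_loader_new():
    # Stand-in for a loader that also returns metadata (assumed behaviour).
    return {"weights": "..."}, "gemma3", {"note": "metadata"}


def unpack_two(loader):
    sd, arch = loader()  # old call site
    return sd, arch


def unpack_three(loader):
    sd, arch, metadata = loader()  # edited call site
    return sd, arch, metadata


# The old call site breaks against the new return shape.
try:
    unpack_two(fake_loader_new)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
print(unpack_three(fake_loader_new)[1])
```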
>>
>>6068952
thank you
>>
File: LTX_2.0_i2v_00140_.mp4 (1.21 MB, 768x448)
>>
>>6068932
>>6068870
>>6068952
it's amazing how much better at memory efficiency GGUF is compared to .safetensors, before I was peaking at 130 gb of ram usage, now it doesn't go over 80gb
>>
So we need kijai and gguf repos to update?
>>
>>6068967
what do you mean exactly?
>>
>>6068940
I just can't go back to 10 gens for slo-mo tiity bounces. LoRas for LTX can't come fast enough
>>
>>6068970
*10 min gens
>>
>>6068968
my mom will be mad at me if I use unmerged commits, so do i just wait till those two I mentioned merge?
>>
File: 1755929021370559.mp4 (2.69 MB, 960x960)
https://github.com/modelscope/DiffSynth-Studio/commit/0efab85674f2a65a8064acfb7a4b7950503a5668
https://xcancel.com/bdsqlsz/status/2009566444632334479#m
>>
>>6068973
if you don't want to use unmerged commits you'll have to wait yeah
>>
>>6068974
zimage
>>
File: LTX_2.0_i2v_00142_.mp4 (569 KB, 384x576)
>>6068974
>>
File: LTX_2.0_i2v_00143_.mp4 (1.7 MB, 832x832)
>>6068974
>>
>>6068989
kek, the music was pretty decent on that one
>>
>>6068989
was THAT ace-step?
>>
>>6068993
Just ltx, I don't even have ace installed
>>
File: 1753831695494216.mp4 (4.66 MB, 1280x736)
>The woman throws her phone on the ground and then runs to jump into a swimming pool.
its prompt adherence is kinda garbage, I guess it's because we use the distilled model, but going for the regular one + 20 steps is gonna take too long just for some shitpost desu
>>
Where ace step 2 btw?
>>
>>6068998
i forgot
>>
>>6068974
>https://github.com/modelscope/DiffSynth-Studio/commit/0efab85674f2a65a8064acfb7a4b7950503a5668
looks like only Omni Base will be released first, they also promised "Z-image" which has the SFT finetune to it
https://github.com/Tongyi-MAI/Z-Image?tab=readme-ov-file#-model-zoo
>>
File: LTX_2.0_i2v_00146_.mp4 (2.33 MB, 768x768)
bro what
>>
>Error running sage attention: Input tensors must be in dtype of torch.float16 or torch.bfloat16, using pytorch attention instead.
Do you also have that when making ltx2 videos? can this be fixed at all?
>>
>>6069007
i do
>>
>>6069006
qrd? is this image from some news or something?
>>
>>6069012
i think its the woman who got killed by ice the other day
>>
File: LTX_2.0_i2v_00148_.mp4 (1.22 MB, 768x576)
>>
>>6069007
>>6069011
https://github.com/kijai/ComfyUI-WanVideoWrapper/issues/1192#issuecomment-3259588742
>That error is not coming from these nodes as it mentions falling back to pytorch attention. It's probably from some other model in the workflow, not the model itself that reports it.
>It's probably from the audio encoder and it should not matter.
>>
>>6069015
thanks king
>>
File: 1747135557633964.mp4 (1.23 MB, 768x768)
>died for nothing award
>>
File: LTX_2.0_i2v_00149_.mp4 (1.32 MB, 768x512)
>>
File: 1738738969054071.mp4 (2.21 MB, 1280x736)
>The woman is playing with her guitar
I'm tired of those fucking powerpoint shit, fuck this model
>>
>>6069018
just saw the video, goddam she's so fucking retarded
https://xcancel.com/EricLDaugh/status/2009682363077509453#m
>>
>>6069022
welcome to the club
>>
File: LTX_2.0_i2v_00150_.mp4 (1.54 MB, 1024x576)
>>6069022
>>
>>6069027
this is so fucking bad lmao, they said they were going for a 2.1 version, but I hope they realize how much improvement this model actually needs, it's a really rough model at the moment
>>
File: LTX_2.0_i2v_00151_.mp4 (1.45 MB, 1024x576)
>>6069022
>>
>>6069030
loras will fix this in two weeks
>>
From what I’ve seen of this model it absolutely loves Loras and gobbles them up. My hopes are high.
>>
>>6069033
do you have that lora anon posted for nsfw? the download got deleted
>>
>>6069034
My dude I just woke up like ten minutes ago.
>>
File: LTX_2.0_i2v_00154_.mp4 (774 KB, 640x832)
>>
https://github.com/huggingface/transformers/pull/43100/files
>glm_image.md
>This model was released on 2026-01-10 and added to Hugging Face Transformers on 2026-01-10.
interesting, maybe that's why the Tongyi fucks started to wake up, maybe they see GLM image as a threat
>>
>>6069044
is it?
>>
Mom, the chinks are making fun of us on discord :(
https://files.catbox.moe/s6h9l2.mp4
>>
>>6069049
it do be funny
>>
File: LTX_2.0_i2v_00155_.mp4 (2.02 MB, 576x896)
>>
>>6069022
>I'm tired of those fucking powerpoint shit,
maybe going for the abliterated version helps
https://huggingface.co/mlabonne/gemma-3-12b-it-abliterated-v2-GGUF/tree/main
https://civitai.com/models/2292336/ltx-2-nsfw-text-encoder-gemma-3-12b-abliterated?modelVersionId=2579572
>>
>>6069052
im feeling abliterated
>>
File: LTX_2.0_i2v_00159_.mp4 (886 KB, 640x512)
>>
>>6069051
>Me when someone brings up z-image base
>>
File: LTX_2.0_i2v_00163_.mp4 (1.32 MB, 512x960)
>>
>>6069062
You become a hot devil woman? please be in london
>>
File: 1746575631860416.mp4 (4.28 MB, 1896x720)
>>6069022
>>6069052
it barely changed anything, sad
>>
>>6069071
she moved a finger
>>
File: 1747135557633965.webm (942 KB, 1106x720)
>>6069071
Try a different seed it works for me
>>
>>6069071
What model are you using?
>>
>>6069078
ltx2?
>>6069077
the point was to try to remove the powerpoint seeds, to not have this shit anymore
>>
>>6069080
>he thinks he can fully undo mossad censoring by changing the encoder
lol
>>
>>6069082
:(
>>
>>6069080
>ltx2?

No like, quant, gguf, fp8 or fp8 distilled
>>
File: 1736695881162023.mp4 (2.6 MB, 1280x736)
>>
>>6069083
wait for loras
>>
>>6069085
Q8 + distilled
>>
>>6069088
use the q8 distilled instead
>>
>>6069088
Must be the prompt then. The only times I've gotten powerpoint slides was when I didn't prompt enough actions to fit the time.
>>
>>6069088
are you using negatives?
>>
>>6069090
with ablit?
>>
>>6069088
>Q8 + distilled
>>6069089
>use the q8 distilled instead
?

>>6069091
how can I use negatives it's distilled it's at cfg 1
>>
>>6069092
>with ablit?
Ima be real. I have no idea if ablit does anything or not.
>>
>>6069095
gazebo
>>
File: LTX_2.0_i2v_00153_.mp4 (1.83 MB, 1344x768)
>>
File: LTX_2.0_i2v_00166_.mp4 (210 KB, 128x64)
>>6069086
uhmm my prompt was to make the planet gain a face
>>
>>6069099
>128x64
lul
>>
>>6069101
each pixel assembled with care
>>
File: LTX_2.0_i2v_00168_.mp4 (1.52 MB, 1344x768)
>>6069098
>>
>>6069098
sounds different to how I remember it
>>
>>6068169
Wow. ltx 2 is really bad. Is that a super quantized distilled fp1 version?
>>
>>6069109
you wish lol
>>
File: LTX_2.0_i2v_00171_.mp4 (1.26 MB, 704x832)
>>
>>6069111
ahah, it do be like that mr stancil
https://www.youtube.com/watch?v=l_oaNrAl83Q
>>
>>6069112
kino
>>
File: 1758967761478263.mp4 (1.43 MB, 704x704)
I tried adding voices from VibeVoice but it seems to struggle with multiple people speaking in an audio clip. It kept making different people speak in the same voice.
>>
>>6069119
regional prompt when
>>
File: LTX_2.0_i2v_00172_.mp4 (1.9 MB, 896x576)
>>
File: 1764587840376236.mp4 (1.61 MB, 704x704)
This was the worst one kek
>>
>>6069124
lol
>>
>>6069124
i watched this 4 times, too funny
>>
which gemma 3 12b am I meant to download?
>>
>>6069131
all
>>
>>6069131
the normal one, I tried the "uncucked" versions (abliterated and heretic) and it didn't do shit >>6069071
>>
>>6068925
>>6068932
problem is now it won't load the ltx2 gguf ugh... I need both clip and checkpoint gguf to load, but the unet loader is giving:
ValueError: too many values to unpack (expected 2)

can anyone get both the model and clip in gguf format to actually work?
>>
>>6069138
maybe
>>
>>6069136
the 24gb one? Is there a smaller version that will work with the LTXV Audio Text Encoder Loader?
>>
>>6069142
use https://huggingface.co/GitMylo/LTX-2-comfy_gemma_fp8_e4m3fn/blob/main/gemma_3_12B_it_nvfp4_uncalibrated.safetensors
>>
>>6069138
>the unet loader is giving.
>ValueError: too many values to unpack (expected 2)
it shouldn't, did you do that? >>6068952
>>
>>6069142
>>6069145
performance was the same in my tests
>>
>>6069146
i didn't do this to merge both >>6068932
I'm a bit hesitant to run those git commands because I don't know what directory I need to be inside of, i don't want to break my custom nodes directory.
>>
>>6069150
you aren't using claude code?
>>
>>6069150
>I don't know what directory I need to be inside of
those git commands have to be done inside the gguf custom node folder, and if you want to go back to the "normal" branch again you do this
>git checkout master
>>
>>6069156
error: Your local changes to the following files would be overwritten by merge:
loader.py
nodes.py
Please commit your changes or stash them before you merge.
Aborting

how do?
>>
>>6068828
kek
>>
>>6069160
git stash (and LLMs exist, use them if you have a coding problem)
>>
File: 1764670478453452.mp4 (1.83 MB, 704x704)
I guess LTX just sucks at interpreting new audio because it can understand this stupid prompt better when using the model's own audio.
>>
>>6069167
what happens if you include the dialogue from the audio in the prompt?
>>
>>6069169
nta, but I've found it to be helpful when it just doesn't get it.
>>
>>6069169
That's what I did earlier but it still got the actors confused every time or they weren't speaking to the right person
>>
>>6069145
Missing weight for layer gemma3_12b.transformer.model.layers.0.self_attn.q_proj
>>
>>6069160
>>6069164
you have to do the merge shit BEFORE modifying the script by yourself
>>
>>6069174
fixed, I had to update comfy...again
>>
File: LTX_2.0_i2v_00164_.mp4 (1.8 MB, 832x480)
This key framing with audio shit is finicky but I'm getting close.
>>
>>6069175
DualCLIPLoaderGGUF

Unexpected text model architecture type in GGUF file: 'gemma3'

Back to the original problem, fuck it I'll wait.

I guess I can't have both...
>>
>>6069167
Try raising the denoise strength on the encoded audio latent so that the model thinks that it's generating it.
You will get better results that way.

The "real" way to do it is to reverse sample on the latent like with RF-Edit, but it's too complicated to do right now.
>>
File: LTX_2.0_i2v_00180_.mp4 (1.74 MB, 832x480)
>>
>>6069206
not bad, but can you make a 3d version? i hate live action
>>
File: 1738563506105046.mp4 (709 KB, 896x768)
ltx is pretty fun, trying the kijai Q8 distil model, works on a 4080 (16gb) and 64gb ram just fine:

Donald Trump wearing a black suit walks in from the right and says "she has great genes, you know. the best genes! the best."
>>
File: 1755951650517168.mp4 (2.13 MB, 896x768)
>>6069211
also, this gen didn't work as planned but it's still hilarious, cause it captures his emotes and appearance so well:
>>
>>6069206
Make a Tiktok account and just post these with "Warcraft if real"
>>
>>6069214
I used to have a youtube back when harry potter balenciaga was a thing, even got monetized within three days, then the dickheads at youtube banned me from earning on my account for no reason. "Reuploading content" they said. IT WAS MY CONTENT.
>>
File: 1737909175910826.mp4 (1.21 MB, 896x768)
there we go, prompt edit fixed it.

the camera pans to the right is what I needed to add.
>>
File: 1763010791057296.mp4 (1.21 MB, 640x896)
lmao

the camera pan is 10/10
>>
>>6069218
Maybe a lot of jealous people report everything. There's money involved after all.


