/g/ - Technology


Thread archived.
You cannot reply anymore.




File: collage.jpg (481 KB, 1200x900)
Discussion of Free and Open Source Diffusion Models

Prev: >>107794552

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2485296
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>Illustrious
https://rentry.org/comfyui_guide_1girl
https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe|https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
try the audio + i2v workflow kijai made, that and the regular i2v is all you need:

https://files.catbox.moe/yv1b19.mp4
>>
Blessed thread of frenship
>>
can ltx at least do nudity from i2v?
>>
>>107798362
even better:

https://files.catbox.moe/uafbqj.mp4
>>
I'm not saying the model is terrible. But a lot of the praise for LTX2 does not feel organic. I have no proof other than vibes. Something just isn't right.
>>
>>107798348
Nobody can, it doesn't work. That's why you use the official comfyui workflow released by comfyui, not LTX.
>>
>>107798373
lol
>>
We are healing, bros, one step at a time
>>
>>107798376
one more time but with a better audio sample and more frames (260). all you have to prompt is "the man in the blue shirt is singing with passion."

Bateman but Miku (audio input workflow + image):

https://files.catbox.moe/sktqzs.mp4
>>
>>107798332
>>Maintain Thread Quality
>https://rentry.org/debo
>https://rentry.org/animanon
why are these still in the OP? I thought we were healing?
>>
What's the max length you guys have been able to get an LTX 2 video to go? I got 20 seconds on a 4090. I tried 30 seconds multiple times but comfyui just fully crashes when the video completes.
>>
>>107798424 >>107798362
the workflow for what model where?
>>
File: fuckingtextencoder.png (221 KB, 2082x1147)
Any fix to this yet?
>>
>>107798450
https://www.reddit.com/r/StableDiffusion/comments/1q627xi/kijai_made_a_ltxv2_audio_image_to_video_workflow/
>>
>>107798428
We always heal because anons need this thread, so they'll always return, not like your fake discord center.
>>
>>107798362
>audio + i2v workflow kijai made, that and the regular i2v
What's the difference between "audio + i2v" and "i2v"?
i2v should already have audio as output
>>
a big post, for you:

https://files.catbox.moe/b198c9.mp4
>>
File: 4291417907.png (924 KB, 896x1152)
>>
>>107798464
can we kick out the schizo that is terminally online putting off topic drama in the OP and constantly doxxing ani then complaining ani wants to dox him when it's clearly a false flag?
>>
File: 1747650983670953.png (112 KB, 1749x425)
yippie no more vram spillings on tall images. grok tokens well spent
>>
>>107798495
can you seethe in silence?
>>
the truth about wan 2.5:

https://files.catbox.moe/k5pqz7.mp4
>>
>>107798566
better audio:

https://files.catbox.moe/xdp83r.mp4
>>
>>107798495
Shhh, stay quiet, we are healing.
Please keep yourselves outside /g/.
Thanks.
>>
>>107798495
How about stop talking about it like a gossiping school girl and talk about all the cool ai shit like this thread is intended to. Goddamn.

>>107798566
>yuden
kek
>>
Man, LTXV2 is so good. The only thing that would be better is if Alibaba dropped Wan 2.5....
>>
>>107798493
Nice
>>
>>107798612
it would be good if it had nsfw
>>
>>107798566
>>107798576
lmao
>>
>>107798468
You use audio to drive the video instead of it being generated together.
I don't know how this community isn't putting two and two together (not you >>107798468)
Maybe you guys' brains really are fried

>tts models
>songs, sounds effects (stable audio open, tango)
>image gen on any model
>qwen, multiple angles
>20 second gens in under 2 minutes
>vid2vid with wan, controlnets, or use vace for better quality
>etc....
>>
>>107798674
Oh this is pretty cool indeed
>>
>>107798458
i'm still trying to find out why, i'm checking obvious places but no one seems to be talking about it. it leads me to think something is wrong with the download or we have something missing from our environment.
>>
>>107798458
boom found something
https://www.reddit.com/r/StableDiffusion/comments/1q6om6c/ltx2_gemma_error_no_cliptext_encoder_weights_in/

google is a fucking cunt to use these days.
>>
>>107798072
now post them fucking
yeah you cant
>>
>>107798373
no
>>
File: 1742335141696994.png (32 KB, 794x315)
So what do you use there for the best motion, anon? So far, disabling it completely kills any motion from LTX2, so you have to use it, but I have no idea what would be a good compromise.
>>
>>107798740
so what happens if your starting image is a woman with her tits out?
>>
>>107798332
why is only animeslop on the collage fuck off you got your own general
>>
>blurry, out of focus, overexposed, underexposed, low contrast, washed out colors, excessive noise, grainy texture, poor lighting, flickering, motion blur, distorted proportions, unnatural skin tones, deformed facial features, asymmetrical face, missing facial features, extra limbs, disfigured hands, wrong hand count, artifacts around text, unreadable text on shirt or hat, incorrect lettering on cap (“PNTR”), incorrect t-shirt slogan (“JUST DO IT”), missing microphone, misplaced microphone, inconsistent perspective, camera shake, incorrect depth of field, background too sharp, background clutter, distracting reflections, harsh shadows, inconsistent lighting direction, color banding, cartoonish rendering, 3D CGI look, unrealistic materials, uncanny valley effect, incorrect ethnicity, wrong gender, exaggerated expressions, smiling, laughing, exaggerated sadness, wrong gaze direction, eyes looking at camera, mismatched lip sync, silent or muted audio, distorted voice, robotic voice, echo, background noise, off-sync audio, missing sniff sounds, incorrect dialogue, added dialogue, repetitive speech, jittery movement, awkward pauses, incorrect timing, unnatural transitions, inconsistent framing, tilted camera, missing door or shelves, missing shallow depth of field, flat lighting, inconsistent tone, cinematic oversaturation, stylized filters, or AI artifacts.

"Just do it"?
that's some very weird negative in the ltx2 default wf lol
>>
>>107798373
maybe it'll work with some elaborate workarounds, like nipple pasties
>>
So I've done some tests and LTX 2 can comfortably do 601 frames at 24 fps. 25 second videos locally and so quickly is wild, definitely a model worth finetuning
>>
>>107798738
>noooo, I must see my 1girl getting fucked hard! It's not enough that there are millions of tons of porn material on the internet. IT MUST BE MY 1GIRL GETTING FUCKED BY TYRONE OR THIS IS 1984!
>>
File: maxresdefault.jpg (68 KB, 1280x720)
>>107798749
it gens a still powerpoint in most cases but some anons have somehow managed to jailbreak it and aren't exactly sharing how. So it would appear we just have to figure out the magic prompting style; I know from experience that gemma, the text encoder, does not exactly help with NSFW stuff. Plus the ltx team trained in some censorship for a month to foil our efforts, so maybe it's as simple as tricking it into flipping those censorship weights.

Where there's a will there's a way, no censorship ever stopped this place :))
>>
>>107798820
gemma is notoriously censored and the abliterated version is retarded.
>>
>>107798791
5 cents have been deposited
>>
>>107798833
There is a less retarded version :
https://huggingface.co/p-e-w/gemma-3-12b-it-heretic/tree/main
>>
>update comfy
>workflow that worked on an 8gb card now OOMs on a 16gb card
bruh
>>
File: 1758461025169264.png (152 KB, 290x250)
>>107798738
Here you go. You need to be heterosexual to see the full picture, unfortunately.
>>
>make lora using AI toolkit
>want to try it in forge neo
>it vomits out horrid burned images
>try to lower weights
>doesn't help
>remove lora, verify it generates fine with same prompt
>it does
>decide it's just a neo issue, try cumfart
>cumfart suddenly isn't generating ANY z-image content
>just gets this
Can a retard get some help? I tried reinstalling from the top menu, but nothing changed. The same z-image gguf works fine in neo. I don't know what happened. It worked a few weeks ago when I last used cumfart.
>>
>>107798857
chinese culture'd
>>
File: 1746241227771625.jpg (1.64 MB, 1248x1824)
>>
Is there any comfy z-image implementation of those image enhancing stuff like token perturbation?
>>
File: o_00155_.jpg (666 KB, 1728x1264)
chroma fp8 is actually pretty decent
>>
File: 1757129956940954.jpg (1.51 MB, 1248x1824)
>>
File: 1740293792995864.jpg (1.4 MB, 1248x1824)
>>
File: 1756938158077519.jpg (1.44 MB, 1248x1824)
>>
OK I found why I was constantly OOMing with my 5090 on LTX2 (and wan), I had to use --reserve-vram 3 and all works now.
Apparently block swapping is deprecated.
I can see my ram being used properly now anyway.
https://files.catbox.moe/argx29.mp4
>>
>>107798887
>>107798903
>>107798913
This is very cool
>>
>>107798899
Rule34 of this please
>>
>>107798866
It's certainly not a neo issue when a lora fucks up the model. There are loras that mess up even on low settings.
As for comfy, it looks like it stopped after step 1; check the error log.
>>
>>
File: 1757699214843200.jpg (1.07 MB, 2016x1152)
>>107798931
Thanks
First time doing my own lora
>>107798947
Sorry, i only have culinary brutality.
>>
>>107798833
>>107798842
abliterated version is what I thought too, but I don't think it's going to be enough to break something that's hardcoded into the model's weights through training it with captions designed to give a powerpoint slideshow whose only motion is zoom-in. So we'd need to figure out how to trick it; I thought that perhaps the negative prompt might be able to flip them, but I haven't fully tested that theory since I'm still battling with the model actually loading properly.
>>107798857
I got you.
>>107796905
my somewhat retarded short write-up on how i stopped all my issues with memory. btw, according to gemini, comfyui aggressively tries to allocate more memory than it needs, causing your kernel to have a fit...

gemini he is a cool guy.
>>
>>107798833
>my somewhat retarded short write up
cheers, will test tomorrow
>>
>>107798977
>with captions designed to give a powerpoint slide show who's only motion is zoom in
I think it just means "I don't know what to do with it" from the model, see : >>107798744
If you disable that node, most prompts, sfw or risqué alike, come out as just powerpoint slide zooms.

So an abliterated/derestricted text encoder + nsfw loras would probably work fine.
>>
>>107798917
>Apparently block swapping is deprecated.
has been for a while I think, and you used the correct launch flag. Imagine it like this.

GPU
[---stuff---][---inference---][---other proc---]
So you need enough for the initial load, then some for the inference to actually start + some left over for everything else running on your system to not OOM. Plus it worth using --lowvram so that comfyui offloads models to either the system RAM or swap, if that still gives you problems then make a large pagefile or swapfile.

Then consider my previous post for other options to stop the madness.

TL;DR
comfyui is a piece of trash.
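putting the flags from this post together, a sketch of a launch line (the flag names are real comfyui options, but the 4 GB reserve is just one anon's value — tune it for your own card; treat this as a config sketch, not gospel):

```shell
# Hypothetical ComfyUI launch combining the flags discussed above.
# --lowvram        : aggressively offload model weights to system RAM
# --reserve-vram N : keep N GB of VRAM free for the display server and other processes
python main.py --lowvram --reserve-vram 4.0
```

if it still OOMs, bump the reserve value up before touching anything else.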
>>
>>107798963
None of the LoRAs I tried worked on neo, I downloaded a few to test with my own. I'm doing another reinstall of cumfart, because it didn't stop after step 1, it went all the way through. I saw the noise generated move around in the previews and didn't see any errors in the log when I checked before reinstalling. I'll check again if the full clean reinstall fails.
>>
What's the best UI for a beginner vramlet (16gb) who just wants to do meme/lewd images and videos?

I see most people on 4chan use Comfy. Should I just install that?
>>
>>107799029
Yes, it works perfectly fine for me the way i'm using it right now, but I'm using a local headless linux.
>>
>>107799039
I've never had an issue in neo where it doesn't work at all. Loras fuck up, yes, especially when using more than one; zimage is quite bitchy with loras.
Lower the weight to 0.5 or even 0.3 and it should work, then increase until you find a balance between the desired effect and where it starts to mess things up.
>>
Which LLM has the best knowledge of booru tags?
>>
what's the deal with the framerate in LTX2? The workflows I check have it on 25. Is it recommended to stick to 25?
>>
>>107799050
forge-neo gives quicker results with less hassle. Comfy only makes sense when you need complex workflows or the latest algorithm or model. However forge-neo does zimage and qwen as well as flux besides old stable diffusion 1.5 and XL.
>>
>>107799027
wtf does that node even do? I disabled it while debugging OOM errors last night; i disabled the up-scaling crap also. KJ's workflow actually worked best, i just modified it as I went. so are you saying I would need that compression node for the model to actually work?

Well cheers, i will actually try it and see if it works, because I was thinking it was a problem with that clip error the other anon and I are seeing.
>>
>>107799073
Illustrious.
>>
>>107799073
Captioners.
>>
>>107799088
yeah, humans can only see up to 24fps anyway
>>
>>107799050
Either Comfy or Swarmui
Swarmui also uses Comfy, it just gives it a nice GUI with a set of base beginner functions and the second best inpainting interface after krita + comfy.
>>
File: xcjvxclkjv.png (2.36 MB, 1023x1024)
>>107799072
Lowering down to .02 and working my way up was the first thing I tried in Neo. Zimage works fine, the loras do not, even when going through them one at a time. Maybe it's because I'm using the Q8 gguf, I don't know. They all generated picrel levels of bad in neo, regardless of weight (unless it was past 1.0, I didn't go above 1...) The AI toolkit sample images came out fine, and I'm sure the LoRA would work in cumfart, if fucking z-image didn't shit itself there for whatever reason...
>>
>>107799091
>wtf does that node even do?
From what i understand it compresses the input image to give something looking like movement artefacts so the model can use it to help make motion.
In my tests, disabling it gave me almost always the powerpoint slide zoom effect.

I also disabled the upscaling stuff, I don't know why the wf by default is overcomplicated this much.
>>
>>107799100
>>107799096
Uh thanks guys but I have my reason for asking specifically for an LLM. I would like to be able to take a rough or vague description of an image I want, e.g. "A sci-fi woman wearing something eclectic", and have it spit out a structured list of booru tags. Or say "I want her to be leaning against a wall, supporting herself with her elbows" and have it convert that into the relevant tags.

I'm not familiar enough with captioners to know if they can interact with text prompts this way. If I've got it wrong, my bad.
>>
>>107799070
and besides, if it's LTX 2 then I think it's not the same as wan. wan needs to do all frames at once; there are reports that LTX 2 works with almost no vram because it's doing it frame by frame, so it will still be able to manage. Wan shills have been seething ever since LTX 2 was released because LTX2 is clearly better, it just needs some testing to find the best settings.
>>
>>107799118
>I'm using the Q8 gguf
Why tho.
>>
>>107799088
No, use 24. 25 would work fine too, but the number of frames has to be a multiple of 8 (+1 if i2v), so you might as well do 24.
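that multiple-of-8 (+1 for i2v) rule is easy to fumble by hand; a tiny sketch assuming the rule exactly as stated above (valid_frames is a made-up helper, not a ComfyUI node):

```python
def valid_frames(seconds: float, fps: int = 24, i2v: bool = True) -> int:
    """Round a requested clip length to a frame count the model accepts:
    a multiple of 8, plus 1 when starting from an image (i2v)."""
    n = round(seconds * fps / 8) * 8  # nearest multiple of 8
    return n + 1 if i2v else n

# 5 s at 24 fps -> 121 frames for i2v; 25 s -> 601, matching the long-gen reports above
```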
>>
>>107799130
Because I'm on a 3060 12gb w/ 32gb sysram because I'm poor.
>>
>>107798428
>>107798367
>>107798495
Lol suffer
>>
>>107799125
I like wan a lot but it's clearly annoying how much time it needs to gen vs ltx2.
LTX2's current issues for me are the lack of loras and the fact it seems more censored by default; hopefully it will catch up.
>>
>>107799124
I'm not familiar enough either but as far as I know they are just multimodal llms. Also another option: TIPO https://github.com/KohakuBlueleaf/z-tipo-extension
>>
File: 1749435765168363.png (40 KB, 399x399)
ltx2 is like the lottery. It works on random PCs, lol. Two months for this result...
>>
>>107799143
Did you try BF16 anyways?
>>
>>107799121
how many frames are we talking? I upped the frames once i was comfortable with the stability to at least 121 frames and i get motion.

plus i think it filters certain words.
>>
>>107799166
I'm playing around with 241 frames in i2v and it works fine.
I probably can go even more but for now I'll keep it that way.
>>
>>107799186
i think it would go forever now that i figured out the memory problems, it actually took less time on my 3060 upping the frames lol. Its starting to grow on me just when I thought local was doomed forever.
>>
The more I use LTX 2 the more I hate it. It's only good for slop memes.
>>
Just need a speed up for wan that generates in seconds. Like cache dit speeds but it retains the quality
>>
>>107799203
I'm not sure how it works but it's for sure not like wan architecture where every frame was computed at the same time meaning that more frames = more memory used.
Just this means that we can finally have long gens, and fast too.
>>
>>107799212
it will never be this good anon, not on my hardware at least, and nvidia already hints at producing more 3060 RTX cards, which is probably all we are gonna get access to. the problem with wan is it needs to do all frames at once, it uses 2 separate models to do it, and it's only trained on 5 seconds at 16 fps, 81 frames. This thing will gen >>107799186
probably on an 8GB card ffs, using one model, with audio. it is the game changer that most here still can't see yet. Btw LTX bodies look way nicer, the curves, the gluteal crease, they got it right.
>>
>>107799207
for now, but the technical base is good
>>
>>107799147
At the end of the day, anitroons always lose. That hatred of 3D always masks a great weakness, like any mental illness.
>>
>>
>>107798895
as a test I will animate yours with LTX2 anon because I can't be arsed genning an image, i hope you do not mind.
>>
>>107799294
enjoy the single frame powerpoint presentation kek
>>
>>107798895
have you tried radiance?
>>
>>107799300
it won't, trust me, i jailbroke the fucker.
>>
File: 00003-3051912297.jpg (1.19 MB, 2888x4000)
>>
File: o_00169_.jpg (430 KB, 1376x1072)
>>107799294
physics test please

>>107799303
nope, are they done training?
>>
>>107799310
top of the feet > soles
>>
res2s looks really good, but man is it SLOW
>>
>>107798895
>>107799294
I will give it a shot too, why not.
>>
>>107799090
everyone is genning with Wan2.2. is forge neo very limited?

>>107799112
Interesting. Which one are you using?
>>
File: 00019-3460350771.jpg (433 KB, 1344x1728)
>>
>>107799255
>Btw LTX bodies look way nicer, the curves, the gluteal crease, they got it right.

Dunno, haven't seen a single sexy ltx2 gen yet. As a sloppy fat goth enjoyer, wan 2.2 creases are pretty top notch.
>>
File: file.png (1.38 MB, 1069x1057)
LTX-2 being 19B, how hard would it be to finetune it for nsfw?
>>
>>107799344
I don't do video at all. As for images, it can do enough. Currently I am only using zimage and I can't imagine a (txt2img) workflow that would force me into using Comfy.
>>
>>
>>107799344
Wan2.2 video is supported
https://github.com/Haoming02/sd-webui-forge-classic/tree/neo
>>
>>
idea: make a simple website with an api that lets people submit great prompts for different models , with secondary tags describing what it contains. you can then use some comfy node to fetch random prompts based on your preferred tags.
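the fetch side of that idea is trivial to prototype before anyone builds a website; a toy in-memory sketch where the storage, field names, and function are all made up for illustration:

```python
import random

# Toy stand-in for the proposed prompt-sharing API: prompts stored
# per model with descriptive tags, fetched randomly by preferred tags.
PROMPTS = [
    {"model": "z-image", "tags": {"1girl", "photoreal"}, "text": "..."},
    {"model": "wan2.2",  "tags": {"dance", "i2v"},       "text": "..."},
]

def random_prompt(model, wanted_tags):
    """Return a random stored prompt for the model containing all wanted tags."""
    pool = [p for p in PROMPTS
            if p["model"] == model and set(wanted_tags) <= p["tags"]]
    return random.choice(pool) if pool else None
```

a comfy node would just do the same thing over HTTP against the site's api.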
>>
>>107799405
Sounds easy to abuse by submitting garbage prompts. Then you'll have people who believe that their trash ChatGPT generated prompt is the gold standard. So you would need some kind of ranking or rating system as well.
>>
File: 00165-1816457581.png (992 KB, 1112x728)
>>
>>107799405
>gets flooded with cunny hc
>>
>>107799422
>So you would need some kind of ranking or rating system as well.
yeah there definitely has to be a ranking system to sort out the bad stuff.
>>
File: LTX.mp4 (839 KB, 704x704)
>>107798463
ty, wouldn't have found that one

sometimes he won't sing and I seem to have very poor control over what else the character is doing, but this is certainly quite fun:
https://litter.catbox.moe/hvjj5534tls0gpqu.mp4
https://litter.catbox.moe/348nfcye0i56qrt8.mp4
https://litter.catbox.moe/30vurf8dfzr9zo7e.mp4
>>
can ltx do big titty women in lingerie or bikinis?
>>
lmao

I have to regenerate to fix the audio but holy shit, what a model.

https://files.catbox.moe/w1m2xj.mp4
>>
kek, the end

https://files.catbox.moe/1sbhtk.mp4
>>
File: ComfyUI_00001_.png (3.95 MB, 1216x1664)
>>107799161
I'm going to try it right now. I did a full reinstall and it's still generating pure noise like picrel.
>>
all of these itx videos suck ass.
>>
lmao, now we are talking.

https://files.catbox.moe/6q5746.mp4
>>
>>107799522
you really should have figured that out after 2 or 3, no need to watch the rest
>>
File: 00094-1886089240.jpg (188 KB, 832x1216)
>>
>>107798731
>boom found something
>https://www.reddit.com/r/StableDiffusion/comments/1q6om6c/ltx2_gemma_error_no_cliptext_encoder_weights_in/
Not really any additional info in that
>>
>>107798895
Obviously not the best, due to me only using 512 x 512 for prototyping; no silly compression node needed. physics test? not really, but at least it animates.

https://files.catbox.moe/4s2w2x.mp4
audio is crap but i think it's due to the low res. she was not prompted to talk; it made some music though.
>>
https://files.catbox.moe/0icomt.mp4
>>
>open my basic t2i workflow
>the default seed node is broken
How does that even happen? It's just a random number generator
>>
>>107799573
up the frames? the clip warning is normal according to sources i've read. use that anon's workflow from the previous thread, not that it suppresses the message... but it's working for me now once i went to 121 frames, no idea why or how. Just follow the instructions, like put the models where they tell you, and it will work.
>>
>>107799573
just so you know, I only have that file used from gemma and it all works for me :
.\ComfyUI\models\text_encoders\Gemma 3 12B\gemma_3_12B_it.safetensors
>>
>>107799614
horror teeth.
>>
>>107799627
that is where i also have it. the message is nothing, it can be ignored.
>>107799573
just like those ledditors said, you can ignore the error, it's still working. change your prompt and up the frames to at least 121.
>>
>>107799614
yes
>>
>>107799614
Indeed
>>
File: 1549902781136.jpg (153 KB, 1920x1080)
>can't generate a video longer than 60 seconds with only a rtx 3070
Man.
FUCK
I figured it would take forever instead but comfyui won't even let me generate long ones because it's saying 8gb of vram isn't enough
>>
>>107799631
Yeah I think I'd need a bigger resolution to sort that.
>>
https://files.catbox.moe/d2d6fv.mp4
posting this for my own sanity, it did work at least once today, and you can all attest to that
>>
File: 1745814833060676.webm (3.91 MB, 2048x1166)
>>107798947
>>107798976
>>
>>107798866
This may sound retarded, but did you try a longer more descriptive prompt?
>>
>>107799704
If this video was posted 6 months ago we'd be losing our shit and now everyone is acting like tlx sucks.
>>
>>107799695
what is the error, i swear you people are just shilling now.

--lowvram --reserve-vram 4.0

try that. if you're not trolling or shilling for wan or whatever, let me know how it goes, i have more aggressive ways to make it work for you anon. But do not expect speed; speed does not really matter though, because unlike wan this thing can follow your prompts exactly.
>>
>>107799452
https://litter.catbox.moe/4oils3e14dh3v3am.mp4
>>
Posting some boosts, enhancements and looking for the best possible mix to balance quality and speed (both video and image welcome). If there's some you know of, let me know and I'll bookmark them too.

https://github.com/xlite-dev/comfyui-cache-dit

>Super fast, works with almost everything, quality hit. I only tested with wan 2.1 and 2.2. On 2.2 the quality seems better than on 2.1 despite the hit; suppose that's all of the slopped loras I have (lightx2v i2v 480p both high and low, kj MoE lightx2v distill high, Seko lighting lightx2v low, SVI v2Pro both high and low)

https://github.com/philipy1219/ComfyUI-TaylorSeer

>Tried it with flux and it just froze. seems to have many updates since its release so going to try it again

https://github.com/woct0rdho/ComfyUI-RadialAttn

>Pretty good for going beyond 81 frames. You'll need woct0rdho's triton, sageattention and sparge attn to run

https://github.com/Zehong-Ma/ComfyUI-MagCache

>Kinda works but doesn't play nice with light loras. Haven't had enough time to test this out properly.
>>
me waiting for base
>>
given that I only gen lewds, I have made my decision: I'm going to wait and see if any loras come out of it before I bother running any ltx workflows. I'm sticking with Wan. Also, I can't say I'm entirely trusting of using anything from an israeli company on my machine
>>
>>107799737
nice
>>
>>107799615
>seed node is broken
Your seed node is broken? That must hurt.
>>
for video reference lcm works best. Everything else I tried turns to pure slop.
https://files.catbox.moe/hhcgj8.mp4
>>
>>107799728
I've tried using a bunch of prompts, mostly just loading images I already generated without changing literally anything. None of them work. They all spit out the static garbage. And I know it's not the gguf, because like I said, it works fine in neo (minus using loras).
>>
>>107799522
loras will save it
>>
>>107799798
yeah, i'm sure people are all gonna start renting gpus
>>
the outputs are pretty funny sometimes

https://files.catbox.moe/pg3d1b.mp4
>>
Can we optimize systems more? I think we can ^^
>>
File: 1765831405285749.webm (3.9 MB, 2048x1166)
>>107798947
>>107798976
>>
https://github.com/Comfy-Org/ComfyUI/pull/11713
>>
>>107799846
>how to kill your ssd
>>
>>107799695
Swapped my 3070 to a 5060ti on Black Friday and I already forgot how sad life was with 8gb lol. Big thanks to Tongyi chinks for prompting me to finally do it.
>>
>>107799733
it's because people just post memes, add some hot girls and the interest will go way up
>>
File: 00092-3252439222.jpg (335 KB, 1344x1728)
>1girl
>>
>>107799846
my system offloads at least 30GB to swap
>>107799860
it's fine, i checked both of them this morning when debugging, i even flashed my bios with a newer bios

The drives are fine even with max thrashing for like 2 years; go back to tom's hardware and be a freak about SSDs wearing out. If you look at SSD prices rising you now know why; next it will be the CPUs and more, probably, so grab a few second hand ones like I did.
>>
>>107799860
the point is a hdd moves data at a pitiful rate of 80 - 120 MB/s, it's not fast enough at all...
>>
>>107799898
>>107799860
and also small pieces of data, yeah your hdd will mechanically die first pal
>>
File: File.png (82 KB, 1711x645)
>>107799859
comfyguy fixed everything :')
>>
>>107799846
>masterclass
man the glazing lol, first thing I ask is to stop that shit
>>
>>107799924
no, it lowers vram usage for the text encoder
>>
File: fuckingcumfart.png (1.18 MB, 1902x1236)
>>107799794
>>107798866
This is so retarded. The fp16 works fine. It just refuses my Q8 gguf now for whatever reason. Fuck this stupid app.
>>
>>107799614
Of course!
>>
>>107799943
yeah it's fucking shit but what the hell, i just use it because it generally gives solid advice, it searches reddit and everything for you and finds the answers. But always take it with a pinch of salt.
>>
File: 00016-3797307013.jpg (360 KB, 1432x1840)
>>
>>107799950
but we need that, and if he fucks it up more i'm gonna be pissed. his pinned memory is garbage as it causes the kernel to shit itself and force the system off at the plug. fuck that guy, seriously...
>>
>>107799977
oh it's very useful, I learned many things with it, but man the way it always congratulates me, it pissed me off so much I made that rule
>>
>>107799991
good call chief
>>
>>107799991
yeah, like every little thing you reveal about your life it will use in every reply, it's so fucking gay and cringe. I tend to ignore it and forgive it though, otherwise i end up arguing with it more than being productive.
>>
>>107799989
lovely freckles
>>
>>107799991
>I learned many things with it,
as did I, things i would have overlooked when considering bare metal AI, such as it pointing me to videos on youtube which explained pci devices from port io to mmio and how to set that shit up. I like it for that, but then i got drunk and it started being really fucking weird with me... But it's alright as a quick assistant model for going deep into technical topics really.
>>
https://www.reddit.com/r/StableDiffusion/comments/1q6vuo1/comfy_ram_leaks_again/

>Its the subgraph, copy paste the nodes out of the subgraph and its much better at managing memory

Is this true? Cant find any information on it
>>
File: freshening_up.mp4 (1.91 MB, 1280x704)
>>107799614
our neurons are trained

>>107799751
yep. that said wan/hyvideo do seem to still have some advantages
>>
>>107800049
I've been noticing this with the SVI workflow that has all of the samplers in subgraphs. Keeps OOMing on me when my old Wan workflow worked fine for numerous samplers.
>>
>>107799991
the best is the subscription model: you get to write the system prompt for it, or just ask the base model to make a system prompt for your own AI. You have to pay for it, but it can, say, analyze an entire video and pull all kinds of information from it, or just about anything. I have very little use for it right now unless i go back to the bare metal ai project.

>>107800068
>yep. that said wan/hyvideo do seem to still have some advantages.

but it does not, i'm sorry you have not figured this out yet. The detail i'm seeing in human anatomy, despite it allegedly being censored, says otherwise. Wan asses look flat, they always have that dulled out look, and no amount of ass wobble loras will ever change it, the loras make it look so bad. LTX2 got it perfect, so wtf bitch?
>>
ai board when
https://files.catbox.moe/6vagng.mp4

>>107799989
very good
z-image?
>>
KINO, daily news edition:

https://files.catbox.moe/2983t7.mp4
>>
>>107800068
you need more steps for it btw, try 30.
>>
>>107800114
How the hell do you not get a ppt slide with such a terse prompt?
>>
>>107800092
Hmm. I've been using the native kijai workflow for SVI and it's been great, haven't had a single oom. Only issue is the fucking random model offloading. Wonder if taking them out of the subgraphs would help
>>
>>107800114
>ai board when
this, because no audio streams in videos are allowed here. and also the constant retarded shitting up of this thread.
>>
holy shit, need to tweak the prompt but you can even get fighter jets to do shit:

https://files.catbox.moe/p7o4vi.mp4
>>
>>107800117
t2v? try it with i2v if not because i2v is definitely better.
>>
https://files.catbox.moe/uda6o1.mp4
call me crazy but it kinda looks like a human wearing a costume
>>
Chroma is only nice for furry and weird anatomy gooners... Getting a normal full body output is too much down to luck.

I can get the same kind of output from wan 2.2 T2I or ZiT.

Anime > many models do it better
Realism > many models do it better
>>
>>107800199
>>107800215
like i2v needs less prompt to describe things since it goes off the image you feed it. basically like wan 2.2 only it's better imo.
>>
>>107800232
well uh
it is?
>>
why is everything in the ai-space filled with jeets? FFS these fuckers are everywhere
>>
>>107799318
a young woman is sitting on a gamer chair and she bounces up and down on it while giggling like a maniac, she then stops and runs both her hands through her hair while still laughing. Audio of her laughter and her bouncing up and down in her gamer chair can be heard, she is clearly amused and finds it hard to catch her breath.

i think that one might work anon. :-)
>>
>>107800290
>ai-space
Have you been outside recently? It's not just AI
>>
>>107800265
well this made it more androidy, but I wish I didn't
https://files.catbox.moe/8qobpt.mp4
>>
>>107800306
The outside is dangerous, anon
>>
>>107800290
>>107800306
in your head rent free
>>
File: 1755217425364381.jpg (45 KB, 680x626)
45 KB
45 KB JPG
>>107800326
>>
File: tmpekoc68y9.png (1.57 MB, 1224x1632)
1.57 MB
1.57 MB PNG
How do I make my gens look better!?

https://files.catbox.moe/dahrbm.png
>>
>>107800319
yes it is, but it's not indians that make it dangerous where i live at least, it's whites that are pissed or smacked up, coked up...
>>
File: z-image_00873.jpg (89 KB, 1063x1016)
89 KB
89 KB JPG
>>107800326
>>
>>107800347
UK is lost soil
>>
comfy fixed ton of issues when I was away it seems
https://github.com/Comfy-Org/ComfyUI/commits/master/
>>
>>107800337
me? try ex forces white fuck head.

btw didn't click, i already know what it is, its some schizo retarded shit you post in every thread you fucking retarded junkie, you have no idea how brainwashed you are.
>>
z-image turbo face id cn when?
>>
also they are making it a ton lighter by separating everything into its own file now
https://huggingface.co/Lightricks/LTX-2/tree/main
>>
>>107800376
holy fuck this npc brownie melted under no pressure
>>
no browns ever gave me problems, imagine that. I guess it is a you problem then.
>>
>>107800395
>brown cant understand statistics
pottery
>>
>>107800367
yeah ltx2 gens are more stable for me
>>
>>107800367
when updating comfy do you always use the folder, or update within comfyui?
>>
T2V works pretty well, audio bugs out when the dialogue is too long and continuous but works well if you break it up into separate quotes

https://files.catbox.moe/celjfs.mp4
https://files.catbox.moe/z3i6y4.mp4
https://files.catbox.moe/537tjc.mp4
>>
>18 blue buzz has been added to your account
>482 yellow buzz has been added to your account
I actually feel bad shmucks are giving this site money

>>107800395
you are brown
>>
>>107800440
>conda activate env
>git pull
>pip install -r requirements.txt

do people seriously just not do this? I've never broken a comfy install since I started doing it this way.
>>
im so close to getting a proper bombing.

the man says "stop the car, or else!" toward the black SUV on the right, and a f-35 jet fighter drops a large bomb onto the black SUV from above which creates a huge explosion.

https://files.catbox.moe/xz9lg3.mp4
>>
Is there an LDG approved LTX2 guide?
>>
https://github.com/deepbeepmeep/Wan2GP
wan2gp now has ltxv support for people who don't want to wait for comfy to fix all his issues
>>
>>107800468
>deepbeepmeep
Something about this guy gives me the ick.
>>
>>107800340
Define "better"
>>
why the fuck do i have an instant "reconnecting" for ltx? not even the classic oom...
>>
>>107800483
it's still an oom but comfy didn't catch it, it just craps out instantly
>>
>>107800468
have (You) tried it?
>>
File: tmptac3zfz6.png (1.32 MB, 1224x1632)
1.32 MB
1.32 MB PNG
>>107800479
Better hands, irises, and just overall AI gunk removed. I use face detailer and adetailer, but they're very hit-or-miss.
>>
>>107800491
but why lol. i use the same settings as the reddit vramlets, and my test is 1 sec at 200x200 resolution lol. will try the --lowvram trick.
>>
>>
>qt from my youth accepts my follow request
>enough images on her acct to make a lora
AI is amazing
>>
>>107800558
:|
>>
File: 1577112207885.png (27 KB, 128x128)
27 KB
27 KB PNG
>>107799736
>--lowvram --reserve-vram 4.0
Where am I supposed to put this?
I'm using WAN 2.1 on comfyui
>>
>>107800582
>I'm using WAN 2.1 on comfyui
Why would you need this for wan?
>>
oo that was a rough one, screen went black and took out the browser
>>107800468
okay I'll give this a shot
>>
lmao im dying, so many possibilities

https://files.catbox.moe/51c6xe.mp4
>>
>>107800465
>>107800465
Bump
>>
>>107800443
ive been trying to get lipsyncing for my videos. i tried infinite talk video to video but it's not working well, and it seems to alter the video too much sometimes since i used loras in the original video generation. are you doing pure text to video or do you have an audio file? any tips for taking a prepared video + a prepared audio file and then lip syncing it?
>>
>>107800595
I don't know you should be asking the anon who replied to me while I'm complaining about video generation time
>>
>>107800558
Kek, why? Everyone you know finding out you do shit like this is 100x worse than going to prison.
>>
>>107799318
>>107800299

https://files.catbox.moe/2a5hc8.mp4
https://files.catbox.moe/oux7y4.mp4
https://files.catbox.moe/1jzi5b.mp4
>>
>>107800582
in teh komfyui bat of course :D
>>
>>107800629
who gives a shit? people are retards and we are well past the point in society where shame means anything
>>
ltx is suno ai locally?
https://files.catbox.moe/sk48rk.mp4
>>
>>107800629
how would anon ever get caught?
>>
>>107800629
you sound like a pussy
>>
>>107800445
>you are brown
you will be destroyed this year actually. we know what you have been doing to the minds of kids.

take fucking heed we are coming to smash your bot networks and all involved.
>>
>>107800541
>>107800340
Looks good to me, you're probably overthinking it. Take a break.
>>
>>107800541
Use a more modern model, i.e. not SDXL. But as the other anon said, it looks fine for WAI gens, like all the others.
>>
the minnesota crash/shooting was a government psyop!

https://files.catbox.moe/doheqb.mp4
>>
File: file.png (6 KB, 520x164)
6 KB
6 KB PNG
>>107800455
I'm lazy, I just use the manager, it does the same thing anyway.
>>
>>107800631
>https://files.catbox.moe/2a5hc8.mp4
haha yeah yours is way better than mine anon, i won't bother uploading my low res slop, that delayed action though.
>>
>>107800558
I'd rather do random girls I find hot in insta.
>>
>>107800631
it actually bounces, pretty cool, what's the initial resolution anon?
>>
File: file.png (57 KB, 719x512)
57 KB
57 KB PNG
>>107800631
>>107800721
nevermind I saw the wf thanks for sharing !
what is picrel vae decode? I'm using the comfy core one, is it supposed to be better?
>>
>>107800738
No idea, I just got that workflow from in here, from the anon spamming the ICE shooting stuff. That might be from the LTX official comfyui nodes.
>>
File: file.png (65 KB, 969x632)
65 KB
65 KB PNG
>>107800753
yes it's from there, it seems targeted towards ltx but I'm not sure it's better than normal tiled vae like picrel
I think it's because this node didn't exist when ltx1 was released
>>
>>107800714
this. but i've spoken to people on the /b/ sdg general that have literally made ai porn of their moms cousins and aunts. they showed me on discord. (yes, several dudes showed me ai porn of their family)
they were (you guessed it) brownoids lol!
they even committed incest with their cousins (let them cum inside and everything).
they like to pin incest on white people for some reason. projection, as usual.

>>107800694
will it be soon? before or after base
>>
>>107799155
Thanks will look into it all then
>>
>>107800738
>is it supposed to be better?
don't do it anon i swear, just use the normal vae decode and then the ltx audio decode. lurk moar for settings to avoid OOM and all is good. I don't know why KJ insisted on using his fucking tile decoder in his shared workflow, it's not needed unless you are doing an insane number of frames i guess.
>>
>>107800582
open nvidia_gpu.bat with notepad, and add it after ComfyUI\main.py
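for reference, in the portable build the launcher is usually called run_nvidia_gpu.bat and after the edit the line would look something like this (the --windows-standalone-build flag is what the stock portable bat already has; the two extra flags come from the post being quoted, and 4.0 means reserve 4 GB, so tune it to your card):

```bat
.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build --lowvram --reserve-vram 4.0
pause
```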
>>
>>
>>107800714
the thrill of a random thot pales in comparison to someone you know or knew irl
>>107800765
ive never jerked off to family members anyway
>>
this is so good, so emotive:

https://files.catbox.moe/qnvm2t.mp4
>>
>>107800631
the best imo
https://files.catbox.moe/2a5hc8.mp4
got a bit of dick twitch going on seeing that
>>
can you use ltxv2 to gen audio for an existing video?
>>
>>107800777
>its not needed unless you are doing insane number of frames i guess.
I actually need it sadly, it does the job
>>
File: 9.png (1.52 MB, 848x1184)
1.52 MB
1.52 MB PNG
>>
why does tran concern troll about someone wanting to dox her but doxxes ani all the time including his family?
>>
launched comfyui-zluda today and now it's broken/gives out an error when trying to start. anyone could point me to how i can see which commit i was using yesterday (when it worked fine) and revert to it?

i'm not super familiar with how git works, but i can look up whatever i need to
>>
>>107800826
probably yeah, because it can take video frames, just encode them and force the frame rate to 24 fps. Create the empty audio latents by connecting the audio vae to the LTXV Empty Latent Audio node and gen the same number of audio latents as there are original frames, run it through the sampler, then split them and decode the audio and video.

I've never tried but i can't imagine it being hard.
>>
>>
>>107800804
>got a bit of dick titch going on seeing that
the best judge of quality
>>
anyone here wants to train z image on the wikiart dataset? I'd be so grateful; kek
>>
More jiggle testing, seems like it can mostly just be prompted?

These were the best 5 out of like 15:

https://files.catbox.moe/9b65ww.mp4
https://files.catbox.moe/d4ezli.mp4
https://files.catbox.moe/fd7429.mp4
https://files.catbox.moe/ss3l19.mp4
https://files.catbox.moe/dyfj49.mp4

A couple bizarre/gross ones:

https://files.catbox.moe/50iofl.mp4
https://files.catbox.moe/qmqz3p.mp4

It frequently adds on some dance music when she gets close to the camera and starts moving.
>>
So for the lads making z-image LoRAs on poorfag hardware like the 30xx cards, how do you get it to gen in ~1 hour? I left everything at default and it took like 6 hours with 20 images.
>>
>>107800826
Yes. I think Kijai had a workflow somewhere for it.
>>
>>107800876
Link me a repo with the images because I'm lazy or point me to a scaper that works with the site
>>
>>107800859
https://github.com/patientx/ComfyUI-Zluda/commits/master/
Check any hash from yesterday
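if you just want the git mechanics, here's a self-contained sketch of finding and pinning an older commit (this builds a throwaway demo repo with made-up commit messages; in your real ComfyUI-Zluda folder you'd only run the log/checkout lines, with a hash copied from that commits page):

```shell
set -e
repo=$(mktemp -d) && cd "$repo"          # throwaway repo just for the demo
git init -q
git -c user.email=a@b -c user.name=a commit -q --allow-empty -m "working version"
good=$(git rev-parse HEAD)               # the hash you'd copy from the commits page
git -c user.email=a@b -c user.name=a commit -q --allow-empty -m "broken update"
git log --oneline                        # lists both commits, newest first
git checkout -q "$good"                  # pin the working tree to the known-good commit
git rev-parse HEAD                       # confirm you're now on it
```

when you want updates again later, `git checkout master` puts you back on the branch so the next `git pull` works.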
>>
>>107800890
https://huggingface.co/datasets/huggan/wikiart
>>
i am artist now
>>
>>107800879
how much time per gen anon? and on what hardware? some of these are quite hot
>>
>>107800911
60 seconds on a RTX 6000 pro
>>
>>107800865
a young woman is sitting on a gamer chair she quickly unstraps her bra and changes it for a pink one due to skin irritation on her shoulders.

Gemma would block this i know but maybe?
>>
Hello kings, i am looking for a one piece artstyle model and/or artstyle prompt if you have it.
>>
>>107800880
just let it train low capability for future thinking and delayed gratification nigger
>>
>>107800916
that's freaking fast, I need to check how, on my 5090 I need 8min and I'm not sure why
>>
File: characterlora_ilus.png (557 KB, 574x834)
557 KB
557 KB PNG
Does this look okay for a character lora?
>>
fresh when ready
>>107800934
>>107800934
>>107800934
>>107800934
>>
>>107800879
>https://files.catbox.moe/9b65ww.mp4
holy fucking shit, fuck wan this shit is amazing haha
>>
>>107800879
oh wow and no lora, impressive no?
>>
>>107800961
>>107800946
>thread breaks down the moment a video that isn't from the george floyd miku spammer is posted.
>>
>>107800897
Now time to figure out how to mass download and turn this into a usable dataset with captions heh
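one layout most local trainers accept is image + same-name .txt caption in one folder; a rough sketch that fakes a caption from wikiart-style `artist_title.jpg` filenames (the two filenames here are invented stand-ins, and for real training you'd want a tagger/VLM caption instead of this filename hack):

```shell
set -e
ds=$(mktemp -d) && cd "$ds"
# stand-ins for downloaded images; a real run would have actual jpgs here
touch "claude-monet_water-lilies.jpg" "gustave-courbet_the-wave.jpg"
for img in *.jpg; do
    stem=${img%.jpg}
    artist=${stem%%_*}                   # text before the first underscore
    title=${stem#*_}                     # text after it
    # crude caption: "painting by <artist>, <title>" with dashes turned into spaces
    printf '%s\n' "painting by $(echo "$artist" | tr '-' ' '), $(echo "$title" | tr '-' ' ')" > "$stem.txt"
done
cat claude-monet_water-lilies.txt        # -> painting by claude monet, water lilies
```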
>>
>>107800894
strangely enough the ones from yesterday didn't work, even though my comfyui.bat does git pull and i did use it yesterday.

anyway, it's working now with a commit from last week. thanks for the help!
>>
>>107800946
it still noticeably degrades. the skin quality tanks
>>
>>107800932

Post the original sauce. Looks sloppy already on the smaller details.
>>
>>107801004
>>
>>107801025
Ok. Try to change the pose. Or strip an item from the character. If you can't, it is overfitted.
>>
I'm trying to make a lora for my OC using the netayume model, but onetrainer doesn't support lumina models anymore, what can i do bros?
>>
>>107801042
She has a very low amount of images and Qwen just zoomed it out when I tried to do that last night
>>
>>107801025
doesnt being able to recreate the training data 1:1 mean its overfit
what do other prompts look like?
>>
>>107800929
that doesn't seem right. I just tested a 100 steps video in about 4 minutes on my 5090 (powerlimited).
>>
>>107800996
>https://files.catbox.moe/50iofl.mp4
ahahah, breaths ahahahahahaha
>>
>>107801025
Genuine question are you mentally deficient to waste this energy on a cartoon made for kids?
>>
>>107801061
I only did 1 to save on time when training
>>107801070
Where do you think we are?
>>
File: ZiT tests.jpg (2.47 MB, 9402x5379)
2.47 MB
2.47 MB JPG
>>107801060

Try Nanobanana or whatever to generate synthetic data for front, back, and side poses to break the overfit. Also try removing clothes or including more costumes in your dataset. This is my personal test of what a successful LoRA should be able to do.
>>
>>107801070
its what they do mate all day, let them have their toys.
>>
>>107801140
Any advice on prompting for Nanobanana?
I downloaded your Syuen Zit when you uploaded it on mega a few weeks back
>>
>>107801189

Register this image(or images) as the reference. Do not do anything yet.

Wait for banana to confirm.

Using the reference images: Create front view standing straight.

Using the reference images: Create side view standing straight.

etc, and so forth, including clothes removal and costume change.
>>
>>>/wsg/6067150
>>>/wsg/6067150
>>>/wsg/6067150
Migrate when ready.
>>
>>107799356
where is the result of your pic you piece of shit, i wanna see the shitstorm
>>
File: tmplkadblnv.png (1.76 MB, 1224x1632)
1.76 MB
1.76 MB PNG
>>107800701
>Use a more modern model i.e. not SDXL.
Besides SDXL for things like Illustrious/NAI, what else is there? I don't see that many effective anime loras/checkpoints for ZIT.
>>
>>107801459
Open civitai, filter Style Loras for Illustrious and scroll. Eventually you'll see a style that you like and that doesn't look like default 2.5D AI slop you currently have.
>>
>>107801477
proof?
>>
>>107801459
>I don't see that many effective anime loras/checkpoints for ZIT.
train your own lora (or hope someone else trains one) on whatever anime style you enjoy. that's all we'll get for now but imo more enjoyable than illust
maybe netayume but probably not


