/g/ - Technology

No apology, Not sorry Edition

Discussion of Free and Open Source Diffusion Models

Prev: >>107826985

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/ostris/ai-toolkit
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe

>Z Image Turbo
https://huggingface.co/Tongyi-MAI/Z-Image-Turbo

>WanX
https://github.com/Wan-Video/Wan2.2
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>LTX-2
https://huggingface.co/Lightricks/LTX-2

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
https://rentry.org/mvu52t46

>NetaYume
https://huggingface.co/duongve/NetaYume-Lumina-Image-2.0
https://nieta-art.feishu.cn/wiki/RZAawlH2ci74qckRLRPc9tOynrb

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/r/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
https://rentry.org/animanon
>>
Blessed thread of frenship
>>
What do you guys think of Netayume Lumina?
>>
Why does mini get brought up so much by the twins?
>>
New Chroma finetune (actual tune, not shitmix) released:
https://civitai.com/models/860092/kegant
The author honestly admits that it's schizo about anatomy, as expected. It does seem to have a nice style for anime though, so if you're interested in that and willing to play the seed lottery I'm sure you can get nice gens out of it.
>>
>>107829310
All Chroma finetunes are by definition shitmixes
>>
>>107829269
My main model, a Holy fucking shit/10 to me.
Speaking of which, i want to train a lora so i can prompt my OC with it, how do i do that?
>>
The random faded yellow frames from wan gguf models are really fucking annoying.
>>
>>107829389
0 issues with gguf for me
>>
Today I learned that cumfart shills are low brains.
>>
>>107829434
umm its literally unironically the reverse thoughever
>>
>>107829381
https://files.catbox.moe/rmfg5g.mp4

How can wan even compete? Seriously.
>>
>>107829447
go be busy with your retarded workflows.
>>
>>107829455
KEK
>>
>>107829396
That's why I said they're random. Two different pictures with the exact same workflow. One gen has them and one doesn't. Must be the background or something.
>>
>>107829381
https://files.catbox.moe/31xfwr.mp4

Another one. A little NSFW maybe?
>>
>change FPS to 48
>get instant higher quality
LTX2 is a dishonest model
>>
>says they will release base model
>Always told to wait two weeks
Z-image is a dishonest model.
>>
>>107829541
Epic shit, i'm waiting on my 9070xt to arrive so i can get into video prompts as well.
Btw is this also netayume? Can i have a workflow?
>>
File: 1767923827119002.jpg (41 KB, 705x629)
has anyone here managed to use gemma 3 gguf? the guide is really boring and divided into many different topics…
>>
File: 1755493238942768.mp4 (1.58 MB, 1264x720)
>>107829381
>>
>>107829557
https://files.catbox.moe/xlpvg2.mp4

Is it really better though?
>>
>>107829586
Ah, there was some kino a thread back about a valuable lesson in culture

https://vocaroo.com/11sA543nw26m
>>
>>107829618
It's really been over for the hentai and porn industry holy shit
>>
>>107829612
just don't use it for LTX-2 yet if you don't know how to do the patching and can't follow the existing guides

for basically everything else it's as easy as loading the corresponding gguf in place of the .safetensors after installing the gguf custom nodes and, if you need it, the comfyui-multigpu custom nodes
>>
>>107829641
Holy kino. What model is this?
>>
I've been getting pretty bad results with the "Full" LTX-2 workflow. Is it working well for anyone else or is it just me? One thing to note: I did install and use the res_2s sampler, maybe I messed that up somehow? Have a distilled gen for your troubles:

https://files.catbox.moe/n35gus.mp4

Guess I'll try a dev gen with euler to rule out the sampler.
>>
>>107829212
>21:9 and wider aspect on sloppas
thats a good idea
>>
>>107829310
its anime slop?
>>
>>107829649
i already use gguf in general, and it works well. but the standard gemma makes launching very slow, even the 12gb version. i'm ramlet kek
>>
>>107829634
that looks better than gens that I've done at 24fps
>>
>>107829310
the rabbit x horse picture convinced me that chroma is based but there needs to be more proof that this image generator is actually worthwhile
>>
File: compressed_1768104595843.jpg (752 KB, 2400x1792)
>>107829310
Let me shill it properly..ehem..

BABE WAKE UP NEW CHROMA FULL ANIME FINETUNE!

K E G A N T!
E
G
A
N
T
!
>NEW CHROMA ANIME FINETUNE!
https://civitai.com/models/860092/kegant
>NEW CHROMA ANIME FINETUNE!
https://civitai.com/models/860092/kegant
>NEW CHROMA ANIME FINETUNE!
https://civitai.com/models/860092/kegant
>>
>>107829694
idk, I think higher framerates just look better to our eyes. I don't think the actual quality is better. I'm just some guy though.

I think the real takeaway is that this model will basically work with whatever you ask it.
>>
>>107829661
no, i also struggled.
maybe the workflows will improve and the bugs will get ironed out soon.
>>
File: Kegan_01.jpg (22 KB, 474x248)
>>107829710
just ran MIKU TEST on this
bro
BRO
WHY IS THIS ACTUALLY CRISP
chroma finally cooking anime RIGHT
KEGANT IS BUILT DIFFERENT
>>
>>107829710
Where are the examples of the stuff I can do with this but can't with SDXL?
>>
>LTX can't render text like Wan
it's over
https://litter.catbox.moe/r58kpvvohk2r1578.mp4
>>
how the fuck does the sliding window work in wan2gp?
>>
>>107829710 >>107829742
looks nice. is it a chroma base finetune or radiance?
>>
>>107829651
I have no idea sorry. Someone asked if it was ace step but you'll have to look at previous threads, might be an answer there

>>107829710
Fuuuuck that looks good. Does it have some weird ritualistic setup like radiance x0 or can we just plug it into our regular chroma workflows?
>>
>>107829710
Does it support artist tags?
>>
>>107829765
Don't you feel embarrassed admitting you use the retard UI?
>>
File: Test_Kegan-0001.png (3.72 MB, 2304x1792)
>>107829710
>me: “just one test image”
also me 2 hours later: WHY IS EVERYTHING TURNING ANIME AND WHY IS IT SO GOOD
>>
>>107829765
it's for making videos longer than the model can output; it stitches them together, with X frames from the previous chunk used as a temporal guide for the next chunk
it's not made for ltx if that's what you're trying
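roughly, the stitching looks like this (just a sketch of the idea with made-up names, not wan2gp's actual code):
[code]
import numpy as np

def stitch_chunks(chunks: list[np.ndarray], overlap: int) -> np.ndarray:
    """Join video chunks of shape (frames, H, W, C). The first `overlap`
    frames of each later chunk were only regenerated as a temporal guide
    from the previous chunk, so they get dropped before concatenating."""
    out = [chunks[0]]
    for chunk in chunks[1:]:
        out.append(chunk[overlap:])  # guide frames already exist in the previous chunk
    return np.concatenate(out, axis=0)

# e.g. two 121-frame windows with a 25-frame overlap -> 121 + 96 = 217 frames
a = np.zeros((121, 64, 64, 3), dtype=np.uint8)
b = np.zeros((121, 64, 64, 3), dtype=np.uint8)
print(stitch_chunks([a, b], overlap=25).shape)  # (217, 64, 64, 3)
[/code]
which is also why each later window contributes fewer frames than the window size: the overlap frames already exist in the previous chunk.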
>>
>>107829771
>but you'll have to look at previous threads
I just looked and it was never revealed.
>>
>>107829780
>Test_Kegan-0001.png
>2 hours
>>
>>107829710
Did he tune ChromaHD or what? Also GG-UUUUUUFFS when motherfucker
>>
File: Kegan_Lndscp.png (1.18 MB, 832x1216)
>>107829710
LANDSCAPE BROS THIS IS NOT A DRILL
ANIME BROS MOVE ASIDE
THE SCENERY ENJOYERS ARE EATING GOOD TONIGHT
>>
>>107829310
>>107829710
why does it look like he trained it on niji slop
>>
>>107829810
why are you asking questions you already know the answer to
>>
>>107829771
>some weird ritualistic setup like radiance x0
it's one of the simpler models to set up in comfyui. pick the builtin pixel_space as the vae instead of downloading and selecting one, and (IDK if this is still needed) use the chroma radiance empty latent space.

everything else IIRC is very standard, one text encoder, one diffusion model, the usual ksampler
>>
which i2v gooner model could this be https://rule34hentai.net/post/list/chocolater34/1
like these are pretty good
>>
>>107829774
it's better and has more features than comfy. I use comfy to create input frames.
>>107829782
I understand the general idea, but when I'm using a video as input (let's say 121 frames) and I want to make another 121-frame video that continues the first one, what am I supposed to set the window size to? I understand the theory of the overlap, but it doesn't seem to shorten my newly generated video at all, which I would expect the overlapping frames to do. And when I set the option for newlines to be a new section in a sliding window, am I supposed to reference the previous line's frames at all? And why won't the multiple ending frames ever show up as input? Is that a bug or am I just being retarded?
>>
>>107829837
>>107829782
>it's not made for ltx if that's what you're trying
maybe I am just being retarded. If it's not made for ltx, then why would they include it as an option when running ltx-2?
>>
File: 1750518321799182.png (2.81 MB, 1216x1728)
>>
File: 1348-jpg-3276842278.jpg (174 KB, 1023x718)
>>107829710
new chroma finetune
anime works
landscapes work
still untested on miku
UNACCEPTABLE MIKUTESTER WHERE YOU AT ????
>>
GGUF when
>>
>>107829850
No offense but until Mikutester runs this the opinions are theoretical
>>
>>107829710
does it know artist tags? it doesn't, right? THEN IT'S FUCKING TRASH
>>
>>107829833
holy fuck the anims are so badly interpolated
and it looks like https://civitai.com/models/2053259?modelVersionId=2540892
>>
>>107829782
>it's not made for ltx if that's what you're trying
it does kind of work for ltx-2 somehow. See >>107829182
>>
>>107829867
interesting, i'll try it out. thanks
>>
>>107829710
Neat, how does hatsune miku hold up under pressure?
>>
>>107829833
Probably wan 2.2.
>>
>>107829833
Wan 2.2.

Don't expect this for ltx for at least six months.
>>
>>107829310
>b-but the dog fucker told us that t5 can't learn styles and characters
BWAHAHAHAHAH
>>
>>107829641
https://files.catbox.moe/xx40j2.mp4

Music video.
>>
>>107829641
the quality is really decent, did you do it with udio?
>>
>>107829912
>styles
where can you see styles in this model?
>>
>>107829925
>it perfectly nailed the early 00s video style
kino
>>
File: 1765100739733686.gif (13 KB, 220x241)
>>107829925
high quality shitpost, just how I love them
>>
>>107829925
Lmfao, beautiful

>>107829930
I didn't do it, sorry, just reposted it from the previous thread, whoever made it must be long gone
>>
>>107829310
You are boring, nobody presents a local model that way, you seem like a newfag, where is your shilling energy?
>>
>>107829962
>where is your shilling energy?
his 2nd try is good though >>107829710
>>
File: kegant.jpg (89 KB, 832x1488)
>>107829850
i don't have standard miku tests but it has a miku
>>
File: kegant.jpg (85 KB, 832x1488)
>>107829850
>>
>>107829850
>>107829981
Flux had miku, Chroma had miku, dunno why this finetune wouldn't have it lol
>>
File: kegant.jpg (99 KB, 832x1488)
>>
File: Get Chinese culture'ed.png (872 KB, 1920x1080)
>>107829925
Are you ready for another round of "next week"?
https://github.com/huggingface/transformers/pull/43100/files
>*This model was released on 2026-01-10 and added to Hugging Face Transformers on 2026-01-10.*
>>
>>107829986
>>107829991
does it know Lain?
https://www.youtube.com/watch?v=XtOsfHoDDdI
>>
that songs stuck in my head now
>>
>>107829993
I'm enjoying the two more weeks bants more than the models themselves.
>>
File: kegant.jpg (81 KB, 832x1488)
>>107829998
will check. it knows megumin.
>>
>>107830000
It really is an ear worm.
>>
File: Chroma v26.png (2.32 MB, 1024x1024)
>>107830006
Chroma already knows megumin
>>
>>107830006
motherfucker that's gregumin
>>
Anyone have the KJ LTX2 workflow?
>>
>>
I'll probably ask the other AI generals this down the line, but I wanted to ask you first: what do you think the future of AI is going to bring? Not just for diffusion, but in general. This stuff is obviously not going anywhere and I think there is still a lot of cool potential in the technology. I personally think this is the most exciting thing to happen in technology in a very long time, and I'm interested in what people here think since you seem to actually use the tools and models.
>>
>>107830022
>KJ LTX2 workflow
He's been spending most of his time on dick cord hypothesizing like a schizo. There isn't an official workflow yet.
>>
>>107830035
I think for coding there will eventually come a point where you can just ask it to do something and you probably won't need to check it. As much as that annoys professional programmers.

For art and stuff, who knows. Video keeps getting better and better but since the end result is all you get, the usefulness of a lot of the output is still up in the air. I think a lot of AI art tools will end up specializing in things to help in the artistic process. Look at Qwen layered for example.
>>
>>107830035
> I wanted to ask you what you think the future of AI is going to bring?
in what sense? it'll basically do every form of art

>>107829998
seems like no
>>
>>107830035
AI is already good at building on existing ideas you've already created. It's really useful if you have a creative block or don't know how to progress something. As for basic image/video/audio genning, I feel like it will be something that will be taken for granted over time as it becomes less and less of a hassle to tard wrangle, but will lead to people wanting to create something bigger. Video games and other interactive media will definitely be easier to build and may lead more people to do it creatively instead of just massive corpos or broke indies.
>>
File: 1743130739556872.jpg (122 KB, 1024x1024)
fuck ltx2
>>
>>107830081
>in what sense
I guess a broad and general one. How do you see AI being used and adopted, both by "regular people" and by businesses and industries, moving forward? Do you see AI usage becoming basically what the cloud is now, where it's just a regular part of life that nobody really thinks much about but is basically a required layer for businesses to function? Just the overall scope and impact of where you think AI is heading.
>>
>>107830095
based
>>
>>107829710
Meanwhile Midjourney is starting its anime domination once again
https://xcancel.com/nijijourney/status/2009714744597643503#m
>>
try out ltx2 kijai q8 model off his repo. image source is a qwen edit miku edit as joker

the girl says "wanna know how I got these scars?", and then she fires her silver gun once. the man on the right falls over to the right and lies on the floor, with his eyes closed.

https://files.catbox.moe/qx679p.mp4
>>
>>107830121
AI is more desirable than the cloud, but to what extent normal people will be serviced in what amount of time is very unclear to me.
>>
https://files.catbox.moe/xc6ta3.mp4

Got the whole song in there.
>>
>>107830137
But isnt modjourney saas?
>>
File: img_00023_.jpg (366 KB, 1216x1376)
>>
File: bd8.png (454 KB, 1088x1264)
>>107830035
The most optimistic outcome (and my personal cope) is that it will destroy social media and the internet.

The most likely outcome is that it will plunge earth into Niggerhell
>>
>>107830006
Nice
>>
File: 1750331830566816.mp4 (1.62 MB, 704x1280)
>>107830006
>>
File: 1754671739428008.png (157 KB, 498x430)
>>107830190
absolute kino
>>
>>107830190
timeless
or is it?
>>
>>107829212
Thank you for baking this thread, anon
>>107829260
Thank you for blessing this thread, anon
>>
>>107830137
>red cube next to blue cube
>girl with four arms holding ice creams
Why are they acting like this level of prompt following is impressive?
>>
>>107830238
can local do 4arms though?
>>
File: img_00021_.jpg (386 KB, 1216x1376)
>>
File: kek.png (466 KB, 720x720)
>>107830273
>can local do 4arms though?
chroma can do more than 2 arms unprompted yes
>>
>>107830190
beautiful.
>>
Been trying to do Amelia™ gens for the cause ("England for the English") but I keep forgetting the mission and adding huge breasts to the prompt. Also Chroma is (predictably) not good at getting the mole in the right place.

original Amelia: https://x.com/BovrilG/status/2009719322986389748
>>
>>107829863
I could possibly make something using that.
>>
>>
https://lemonsky1995.github.io/dreamstyle/

This is looking very promising: change video styles with input images.
Hopefully it's not another chinese culture moment.
>>
>>107829642
uhh I feel like the hentai video industry was bottlenecked by lack of funding and the effort required, there is gonna be a lotta slop but someone is def gonna make some high effort degeneracy in the coming year
>>
File: 1749927579533212.png (3.59 MB, 3709x1204)
>>107830355
ehh it's not bad indeed
>>
>>107830190
very good
>>
>>107830359
A skilled user of AI can make some pretty good NSFW material. My pp agrees with this sentiment.
>>
https://github.com/Comfy-Org/ComfyUI/issues/11726#issuecomment-3726697711
Comfy please
>>
File: 1764170104629548.mp4 (380 KB, 832x480)
https://congwei1230.github.io/UniVideo/
Even their cherry picked videos look like something you'd see in 2024, dunno why they would release outdated shit like that in the first place
>>
>>
Prompting for British college girls with dyed hair and blunt bangs at a protest is requiring a LOT of beautifying tags elsewhere in the prompt to balance it out...
>>
File: 1750298700488730.png (1.6 MB, 1745x1521)
>>107830355
>>107830373
we'll be able to do this kind of shit on Z-Image base as well with the I2L (Image to LoRA) method that was first used on Qwen Image
https://huggingface.co/DiffSynth-Studio/Qwen-Image-i2L
>>
>>107829212
garbage collage, kys
>>
>>107830469
Does this not run in comfy?
>>
>>107830503
it does
https://github.com/HM-RunningHub/ComfyUI_RH_QwenImageI2L
>>
>>107830512
How come it's not in the OP? This is extremely powerful.
>>
File: ComfyUI_00006.mp4 (1.08 MB, 480x832)
Wan was struggling so much with actually making the balloon inflate gradually and ltx just does it on the 1st try
Skin gets slopped to hell though
>>
>>107830530
>Skin gets slopped to hell though
on i2v it's all right but on t2v yeah it's ultra slopped, almost unusable
>>
>>107830530
now make her tits grow too
>>
File: 1753957816390442.png (222 KB, 2174x997)
>>107830526
>How come it's not in the OP?
you have to load a 7.9B and then a 7.6B model to get the "best" quality, and even at its "best" it's not that good, the concept is cool but it needs more improvement
https://modelscope.cn/models/DiffSynth-Studio/Qwen-Image-i2L/summary
>>
File: ComfyUI_00590_.jpg (485 KB, 1120x1440)
>>
>>107830190
Kek, excellent work, ldg is finally healing
>>
File: 1748519610337549.png (2.14 MB, 1024x1472)
>>
File: it's owari.png (27 KB, 1003x102)
>>107829710
>>
File: qwen_image_2512_00077_.png (3.42 MB, 1920x896)
>>
File: 1758858735570937.png (2.34 MB, 1216x1248)
>>
>>107830609
how does he know it's a lora merge?
>>
File: 1749063869376328.png (2.28 MB, 1088x1408)
which way, anons?
>>
>>107830685
the one with the cleaner ass hole
>>
File: file.png (81 KB, 1381x363)
Can someone please explain this part of the wan2.2 guide? How do I install this workflow?
>>
>>107830624
That's like asking how you know the guy you're talking to on the phone is black.
>>
>>107830704
>How do I install this workflow?
you download the json and you load it on comfyui?
>>
>>107830704
>click the link to the json
>Save
>drag json file into comfyui

That's your one question for the day because if that's where you're getting stuck we'll be here forever.
>>
>>107830704
dude, it's basic shit ask a LLM to help you out, don't waste our time with this shit
>>
>>107830705
i knew someone who refused to admit that it was possible to tell someone's race just by hearing their voice on mic. wild.
>>
>>107830739
seriously? lmao, black people sound so unique it's impossible to miss them, we wuzz unique and sheet
>>
File: 1756049474847376.png (1.61 MB, 1024x1472)
>>
>>107830725
Calm your tits, a rentry from here is not the best thing for a noob to be using. it's even got broken links and still mentions teacache ffs.
>>
>black people
just call them niggers
>>
>>107830420
just prompt for chav DESU if you want British sluts, seems to be a known term in many cases
>>
>>107829704
Chroma was always bestialitymaxed TBQH
>>
File: 1739794280421953.png (863 KB, 1600x800)
>>107829310
>actual tune, not shitmix
>>
This is in the workflow of that new chroma anime merge, but I can't find this lora. I guess it's some custom shit and not a form of lightning lora?
>>
File: 1756430566849084.png (2.39 MB, 1152x1408)
>>107830809
the author just made a lora and merged it into chroma base, but he forgot to update the wf and left the lora node in LMAO.
Many such cases, here have JK
>>
File: COME ON.png (128 KB, 360x346)
>>107830821
>but he forgot to update the wf and left the lora node in LMAO.
lmaoo, they can't even pretend right
>>
File: 1756804249060147.webm (779 KB, 1024x512)
>>107830807
>>
>>107830821
No fucking way, fucking jeets.

Here's a prompt-less grid of it anyway.
>>
>>107830834
kek
>>
File: 1743797932208407.png (2.8 MB, 1152x1312)
me and my anime wives irl
>>
>>107830821
>>107830809
it's all right, this lora identifies as a finetune
>>
>>107830821
lmao, not the first time, definitely not the last time
I remember someone ITT trying to prove that aniwan was a trained checkpoint and not a jeetmerge
>>
what's the most i can squeeze out of a 3080ti?
>>
File: 1747166892637311.png (2.85 MB, 1152x1312)
>>107830865
kino
>>
>>107830390
You can send it to v2v with ltx to make it look more believable. Though yeah, I think it'll just be better to get the 1st frame with Qwen and then proceed with it
>>
>>107830190
What are your specs? How did you even squeeze almost a minute in one go?
>>
I haven't used chroma before.
Is this how it's supposed to look? Also it takes about 2 minutes to gen 4 images at 30 steps, 1280x720, normal?

https://litter.catbox.moe/bonreg8c4e1i6haa.jpg NSFW
>>
>>107830908
>Is this how it's supposed to look like?
yep, the model is slow as fuck and the anatomy is garbage, if Z-image turbo got hyped so hard but not chroma it's for a reason
>>
How do I use GGUF for LTX2 in the WAN2GP UI? I downloaded the .gguf file and placed it in "ckpts", which is where all the models and safetensor files get loaded from when I click generate, yet the GGUF doesn't appear in the dropdown menu?
>>
>>107830908
I don't think a success rate of 25% is a good thing for an image model in 2026
>>
>>107830905
RTX 3090. I offloaded 8GB of vram onto my other 3090. You'll notice the resolution is fairly tiny. I think I could make it even bigger, but at that point that's a lot of waiting for a meme. Took about 5 minutes to generate.
>>
>>107830914
>>107830916
How did it end up like that? Can't you tell if your checkpoint is shit early on when making it?
>>
>>107830929
>How did it end up like that?
turns out it's impossible to properly undistill a model, Flux Schnell can't be saved, it is what it is
>>
>>107830821
https://files.catbox.moe/idswav.mp4

Kinda not sfw?
>>
https://github.com/Rolandjg/LTX-2-video-extend-ComfyUI

This seems promising. It extends existing videos along with their audio. Can someone test it? The memetic (and coom) potential is great
>>
>>107831010
That workflow has been out forever. But it looks like this one did a good job balancing the audio.
>>
For multigpu chads who want to run LTX2, a custom node is finally out:

https://github.com/dreamfast/ComfyUI-LTX2-MultiGPU
>>
>>107831071
Too bad the multi gpu nodes aren't really multi gpu, just using your other GPU as a big fat second stack of ram.
>>
File: sigmas.jpg (18 KB, 241x444)
>>107831010
Wish I knew what all of these numbers were doing
>>
>>107831094
it's the sigmas, aka the STEPS, or to be exact, when in the noise schedule each sampling step happens (floats going from 1 to 0)
>>
>>107831071
>>107831083
Distorch nodes work fine, at least with GGUFs, if you pull those two pull requests in the comfyui-gguf repo.
>>
>>107831083
???
That's precisely the idea, if it works like the Distorch stuff from the other multigpu node
It is still faster than loading into RAM and using the CPU for inference
>>
>>107831094
Don’t think too hard about them, each one separated by a comma is just a value used in each step. They should get closer to zero on the final step. How those numbers are derived I don’t really know, but it’s not as scary as it looks.
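if you'd rather generate the list than hand-type it, a karras-style schedule gives you numbers in the same spirit (a sketch; the sigma_max/sigma_min/rho values here are my assumptions, the exact curve the workflow uses may differ):
[code]
import torch

def karras_sigmas(steps: int, sigma_max: float = 1.0,
                  sigma_min: float = 0.002, rho: float = 7.0) -> torch.Tensor:
    """Descending schedule from sigma_max to 0: values cluster near zero,
    which is where most of the fine detail gets resolved."""
    t = torch.linspace(0, 1, steps + 1)
    sigmas = (sigma_max ** (1 / rho)
              + t * (sigma_min ** (1 / rho) - sigma_max ** (1 / rho))) ** rho
    sigmas[-1] = 0.0  # the final sigma is always zero
    return sigmas

print(karras_sigmas(8))  # paste the printed values, comma-separated, into the sigmas box
[/code]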
>>
>>107831101
But yeah, all the computation is still done on the main GPU, whatever you set up. I don't see comfy implementing a multigpu setup any time soon, so someone really needs to look at how llama-cpp-python implements it
>>
>>107831103
I know. It would just be cool if both my GPUs could be blasting away on the same job like an llm does
>>
>>107831112
>I don't see comfy implementing multigpu setup any time soon
Also known as never, that lazy grifter will never deliver anything useful outside of adding support for more API nodes or cosmetic changes
>>
>>107831112
>>107831126
he's too busy removing the stop button, plz understand
>>
>>107831112
>llamacpp-python
It's just a basic wrapper for llamacpp, the main thing is still in C/C++
>>
>>107829310
looks like SDXL, why waste my bandwidth by posting this link?
>>
>>107830712
>>107830714
>>107830725
>>107830776
I finally got it working but it looks really bad
>>
>>107830530
>>>/wsg/6070113
>>
>>107831101
Which PRs anon?
There are two PRs about the Gemma 3 text encoder
https://github.com/city96/ComfyUI-GGUF/pull/404
https://github.com/city96/ComfyUI-GGUF/pull/402
>>
>>107831228
399 and 402
404 does the same thing as 402 plus handles loading Gemma's mmproj file for the prompt enhancer node, but the author of the PR probably messed up somewhere because it runs OOM for me where 402 doesn't. Or maybe I'm a vramlet. Or both. Anyways, 399 + 402 work fine.
>>
>>107829212
Hey niggers, I'm pretty out of date when it comes to my knowledge of imagen stuff, so I've got a question about the current crop of i2v/t2v models.
Do any of them allow you to give a frame by frame controlnet, like the original animdiff did? Like if I, say, wanted to give it 60 frames of openpose or depth maps, will any of the new shiny video models use those? I ask because I only see people mentioning starting frames to control generations, rather than controlnets.
>>
>>107831246
>399 + 402 work fine.
and if you don't know how to apply them to the gguf repo, you do this

Go to ComfyUI\custom_nodes\ComfyUI-GGUF
>git checkout -b temp-test-branch main
>git fetch origin pull/399/head:pr-399
>git fetch origin pull/402/head:pr-402
>git merge pr-399 -m "Merge PR 399"
>git merge pr-402 -m "Merge PR 402"

and once those PRs are merged, if you want to go back to the original "main" branch you do this
>git checkout main
>>
>>107831247
Probably one of the vace models. I think ltx can be coaxed into using depth maps but I’m not sure.
>>
>>107831257
Mind sharing your workflow, anon? I merged the whole thing and the Distorch multigpu node for model checkpoint doesn't support the audio vae, so I wonder what you are doing for that
>>
>>107831281
>vace models
Thanks anon, looks like the vace wan 2.1 can do what I want.
Actually, looking through related stuff it seems both wan 2.1 and 2.2 can be made to use controlnets in the way I wanted, so that's extra nice to know. Question answered.
>>
>>107831344
It won't. use KJ's vae node
>>
>>107831351
Thank you kind sir, upvotes are appreciated
>>
File: 1750802317505447.png (261 KB, 992x1484)
>>107831344
it should look like this
>>
File: 1759881778499863.jpg (940 KB, 1248x1824)
>>
>>107831394
Take note of the connector too. The distill one is broken or something. The one in the Kijai repo should be good
>>
Enjoying your base model you fucking retards?
>>
>>107831434
>The distill one is broken or something.
I think he fixed it; he thought the connector shit was the same for both the non-distilled and the distilled, now they're separated as they should be, I guess
https://huggingface.co/Kijai/LTXV2_comfy/tree/main/text_encoders
>>
>>107831394
Anon, if you don't mind, may I ask: why are you offloading part of the main model, and not the text encoder, to the CPU?
>>
>>107831467
the text encoder at Q8 can fit in my 24gb vram card, ltx 2 can't though. if I don't offload it'll overflow because of the additional memory usage (it has to take into account all the frames and shit)
>>
>>107831471
Yeah, but the text encoder generates the embeddings only once per prompt. You can just free up both GPUs to handle the main video model and perhaps the two VAEs, it will probably be faster as long as you don't change the prompt
>>
>>107831489
>the text encoder generates the embeddings only once per prompt.
the text encoder gets unloaded regardless, so if the prompt doesn't change its embeddings are still in ram and I don't need to load the TE again
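what comfy is doing there is basically memoization, something like this (pure illustration with made-up names, not comfy's actual code; `text_encoder` stands in for any callable mapping a prompt string to an embedding tensor):
[code]
import torch

_embed_cache: dict[str, torch.Tensor] = {}

def encode_prompt(prompt: str, text_encoder) -> torch.Tensor:
    """Run the text encoder only when the prompt changes; otherwise
    reuse the embedding kept in RAM."""
    if prompt not in _embed_cache:
        _embed_cache[prompt] = text_encoder(prompt).cpu()  # TE can be unloaded afterwards
    return _embed_cache[prompt]
[/code]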
>>
LTXV with booba lora is a sign of things to come
https://files.catbox.moe/gi6kai.mp4
>>
I dunno anything about tensors, but this code looks like it's supposed to extend the latent audio to match the specified frames, and that doesn't happen.
I'm not confident enough to ask about it, plus nobody else seems to see a problem with it
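for reference, the naive version of "extend the audio latent to match the frame count" that I'd expect would look roughly like this (made-up names, just the tensor logic, not the node's actual code):
[code]
import torch
import torch.nn.functional as F

def match_audio_to_frames(audio_latent: torch.Tensor, num_frames: int,
                          latents_per_frame: float) -> torch.Tensor:
    """Pad (or trim) an audio latent of shape (B, C, T) along the time
    axis so it covers the requested number of video frames."""
    target_t = int(round(num_frames * latents_per_frame))
    t = audio_latent.shape[-1]
    if t < target_t:
        audio_latent = F.pad(audio_latent, (0, target_t - t))  # zero-pad = silence for the extension
    return audio_latent[..., :target_t]
[/code]
if the node never does anything like that pad/trim, that would explain the mismatch, but don't quote me on it.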
>>
>>107831453
Base on what?
>>
got the chinese culture song stuck in my head....
>>
first real nsfw lora https://civitai.com/models/2298764/prone-face-cam?modelVersionId=2586637
>>
>>107831625
>Too bad audio training isn't possible yet.
so it's useless lul, it has to be trained on those e-thot voices to be believable
>>
>>107831642
the official trainer can do audio
>>
>>107831570
catbox dead again?
>>
is there any nudify lora for flux 2?
>>
File: file.png (664 KB, 640x640)
>>
i thought about upscaling my images to 2k or 4k before training a lora, but does that even make sense when i will be training on 1024? as far as i understand they will be downscaled again anyway
>>
File: file.png (100 KB, 774x514)
>>107831690
They deployed the Anti-AI protection...
>>
>>107831782
>the Anti-AI protection
wtf? you won't be able to upload AI videos to catbox? that's retarded omg...
>>
>>107831570
>>107831690
no it's just this link that doesn't work for some reason
>>
File: LTX_2.0_i2v_00198_.webm (1.04 MB, 448x448)
>>>/wsg/6070152
>>
>>107831788
No, they're banking on a lot of people with $8 being retarded enough not to know that all you need to do to bypass any "poisoned" image for AI training is a 1-2% gaussian blur on the image before training. The ever so slight quality loss will correct itself in training thanks to the non-poisoned images.
>>
>>107831813
I mean he isn't wrong, your average artist knows jackshit about AI. I was talking with a friend of a friend of mine who works making concept art and he thinks the AI has a database of every single drawing, which it grabs and mixes with others in response to the prompt
>>
>>107831813
What kind of "poison" do they even use?
So far I've seen some images modified so much it's not even about "ai training" anymore; they look like shit even to humans.
>>
File: 60.png (2.4 MB, 1592x1128)
>>
>>107831842
>the AI has a database of every single drawing which it then grabs and mixes with another for the prompt response
It's actually pretty common, most of them think it's basically a giant zip of all their stuff, hence all the hysterical discourse about it "stealing art".
>>
>>107831872
brazilian
>>
File: file.png (1.6 MB, 792x1320)
>>107831180
>>
File: 7154403523.png (2.33 MB, 1544x1160)
>>107831888
>>
File: LTX_2.0_i2v_00203_.webm (1.07 MB, 640x1088)
>>107831897
fucker doesn't wanna do it
>>
>>107831843
It adds a noise pattern on top of the image that's supposed to confuse models into thinking the image itself is pure noise and not what it actually depicts, thus causing the final model to generate actual garbage noise. These companies always claim the noise is undetectable to the human eye, but it never is. The image always looks low quality and slightly off, like it was run through a shitty filter, which it was. Adding the blur gets rid of the "off" look but degrades image quality ever so slightly. But since the noise got broken up, the AI sees the image basically as it was intended to be seen before the poisoning. It's really not hard to set up an automated batch process to add the blur when resizing to fit a training dataset, but they bank on people being retarded and ignorant.
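the "automated batch process" part is like ten lines with PIL (a sketch; the folder names are placeholders and the blur radius is eyeballed, tune it per dataset):
[code]
from pathlib import Path
from PIL import Image, ImageFilter

def clean_dataset(src: str, dst: str, size: int = 1024, radius: float = 1.5) -> None:
    """Resize images to the training bucket and apply a light gaussian
    blur to break up adversarial noise patterns before training."""
    Path(dst).mkdir(parents=True, exist_ok=True)
    for p in Path(src).glob("*"):
        if p.suffix.lower() not in {".png", ".jpg", ".jpeg", ".webp"}:
            continue
        img = Image.open(p).convert("RGB")
        img.thumbnail((size, size), Image.LANCZOS)          # downscaling already wrecks most of the pattern
        img = img.filter(ImageFilter.GaussianBlur(radius))  # the light blur finishes the job
        img.save(Path(dst) / f"{p.stem}.png")

clean_dataset("raw_scrapes", "dataset_clean")
[/code]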
>>
>>107831879
why are you even replying to him hes an ESL retard
>>
>>107831975
Found the artist.
>>
>>107832012
?
>>
>>107832013
??
>>
>>107831965
I see. At some point, if you're an artist deliberately making your stuff look bad to fight a made-up problem, just don't publish your work: keep it on your computer and never share it with anyone, because who knows, someone might see it, get inspired, and copy your style.
>>
Why is the VideoVAE step of LTX2 so fucking slow? Any way to speed that up? Is the upscaler thingy mandatory?
>>
>im a nigbophile
>>
>>107832029
stop replying to ESL retards
>>
>>107832036
proof?
>>
>>107829310
>>107829710
Didn't expect this many (You)s before going to bed.
But unfortunately anons are right, it's a lora merged into the model.
I guess people are hungry for a decent chroma tune despite larping that the model is irredeemable dogshit.
>>
>>107832056
chroma won't be the future, even lodestone knows it, all we can do is to wait patiently for Z-image base... :(
>>
>>107832037
I eat McDonalds every day and call others niggers and enjoy when cops are killing people
how did you know I'm from the united states?
>>
File: 7543065.png (2.58 MB, 1544x1160)
>>107832048
>>
>>107832081
That's just z-image silly, for a second I thought you actually saw base lol
>>
>>107832029
some artists (especially JP ones) regularly delete all their old works
they were doing that before AI was a thing, they're weird like that
>>
>>107832081
kek
>>
>>107832029
I don't think this is true, because even a couple years ago there were papers showing you could very minimally alter images to make models completely misunderstand them without it being perceptible to humans. So unless the models are so good now that you MUST alter the image in a way even a human would notice, I believe that is still the case. Particularly because they don't "see" images the way we do, so what looks "different" to a model has very little to do with what we consider different anyway.
>>
>>107832081
now make a video where the spaceship flies over the houses while playing the chinese culture song from their speakers
>>
can we start using something other than catbox?
>blocks vpns
>randomly doesn't load because of overloaded servers
>20kb/s if it does manage to load, enjoy waiting 15 minutes to watch a 5 second meme vid
>submits all your uploads to glowie servers for analysis
how did this become the standard here
>>
File: 1009728877.png (1.09 MB, 896x1152)
>>
File: 2295726753.png (900 KB, 896x1152)
>>
>>107832125
It allows hotlinking. But I am interested if there's an alternative. pomf and desu exist, but they're not as good in some aspects?
>>
>>107832125
I think someone is maintaining a wsg thread, should be added to OP.
>>
There are like a billion fucking optimizers in OneTrainer.
Which one am I supposed to use? AdamW?
>>
>>107832166
can't go wrong with good ole' AdamW
>>
>>107832173
>>107832177
That tracks.
The boring answer sounds like the safest one.
>>
>>107832166
i think for zit the consensus is adamw8bit if you're low on vram, otherwise adamw
the onetrainer default configs are solid so just start with those
>>
File: WAN_video__00001.mp4 (1.76 MB, 1024x1024)
>>107832081
>>
File: img_00007_.jpg (465 KB, 1264x1672)
>>
>>107832251
solid loop
>>
what happened to GLM image?
>>
z image base really isn't released yet?
>>
>>107832290
i forgot
>>
>>107832159
>>107832125
We welcome audio clips >>>/wsg/6069549
>>
Comfy really messed up the ltx2 release, all he managed was to piss off users with his shitty coding
>>
>>107832302
he is being paid bigly by nvidia, we lost
>>
>>107832304
Well they need to ask for their money back because his implementation fucking sucks, he is getting mogged by a single user and a gradio app
>>
File: img_00016_.jpg (576 KB, 1264x1672)
>>
wan is too fucking slow oh my fucking god
>>
So how does LTX compare to Wan2.2?
>>
>>107829710
>no info about booru / character / artist tags
>all examples look like westoid idea of anime

miss me with that shit
>>
anyone else have youtube recommendations relating to their local gen prompts, or am i the only one being spied on?
>>
>>107832407
meds, nyaow
>>
>>107832422
why would youtube show me a giant frog the day after i generate a giant frog? i never searched for this
>>
>>107832407
and you can confirm that you've never searched for anything on google related to your gens ever?
>>
>>107832427
i did have a pet frog 15 years ago, i did want a new one 1 year ago, but then i never searched for that again
>>
i'll try prompting the weirdest specific videos to be sure pinokio spies on me
>>
>>107832448
i mean.. frogs are not that uncommon on the internet, could just be a coincidence
>>
>>107832385
Wan 2.2 has better prompt adherence, better quality and better motion. Ltx 2 has audio, variable frame rate options and built-in lip sync, and is way faster. LTX is really just better for memes. Maybe ltx will get better as they refine it; Lightricks says they will continue to train it and will release future models.
>>
I can't get TensorRT to work with Chroma, why must my life be so hard.
>>
>>107832498
use nbp
>>
>get workflow
>gorillions of nodes I need to install even though I have equivalents
>just replace them
fuck off
>>
>>107832290
coming next week
>>
>>107832385
meh. wait for ltx 2.5
>>
>>107832290
supposedly this week, but you know the chinese culture anon...
https://xcancel.com/bdsqlsz/status/2009911175019168215#m
>>
>>>/wsg/6069549
>>>/wsg/6069549
>>>/wsg/6069549
Migrate.
>>
>>107832533
racist
>>
>>107832385
good for meme gens at the moment. they need to fix i2v and comfyui needs to sort out his memory management and workflows
>>
>>107832562
FUCK OFF!
>>
>>107832519
what about wan2.5?
>>
>>107832585
three weeks
>>
>>107832585
>what about wan2.5?
in 2.5 weeks
>>
>>107832498
Does it even support Chroma?
Also Cumfart TensorRT node is kinda broken on latest pytorch.
>>
File: 09829666.png (2.53 MB, 1521x1160)
kek
>>
>>107832584
nta but i think ltx posters should migrate there, it supports audio
>>
>>107832664
lmao the joke is cuckoldry
hilarious loooool
>>
>>107832664
more like disabled edition
>>
>>107832081
is this ai?
>>
File: file.png (1.98 MB, 832x1248)
>>
File: ComfyUI_00002_.png (1.27 MB, 864x1152)
>>
>>107832707
>the joke is cuckoldry
why can't they both just be single and engaging in casual sex?
>>
m'ready
>>107832710
>>107832710
>>107832710
>>
>>107832664
benchod
>>
>>107832498
>TensorRT
qrd
>>
>>107830315
How do you prompt for interesting faces like this one?


