/g/ - Technology


Thread archived.
You cannot reply anymore.




File: loooooooooong.jpg (1.74 MB, 3264x1441)
Discussion of Free and Open-Source Diffusion models.

Image Edition

Previous: >>103539836

>UI
Metastable: https://metastable.studio
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
reForge: https://github.com/Panchovix/stable-diffusion-webui-reForge
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI

>Models, LoRAs, & Upscalers
https://civitai.com
https://tensor.art/
https://openmodeldb.info

>Training
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>HunyuanVideo
Comfy: https://github.com/kijai/ComfyUI-HunyuanVideoWrapper/
Windows: https://rentry.org/crhcqq54
Training: https://github.com/tdrussell/diffusion-pipe

>Flux
Forge Guide: https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/1050
ComfyUI Guide: https://comfyanonymous.github.io/ComfyUI_examples/flux
DeDistilled Quants: https://huggingface.co/TheYuriLover/flux-dev-de-distill-GGUF/tree/main

>Misc
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
Generate Prompt from Image: https://huggingface.co/spaces/fancyfeast/joy-caption-alpha-two
Archived: https://rentry.org/sdg-link
Samplers: https://stable-diffusion-art.com/samplers/
Open-Source Digital Art Software: https://krita.org/en/
Txt2Img Plugin: https://kritaaidiffusion.com/
Collagebaker: https://www.befunky.com/create/collage/
Video Collagebaker: https://kdenlive.org/en/

>Neighbo(u)rs
>>>/aco/sdg
>>>/aco/aivg
>>>/b/degen
>>>/c/kdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/tg/slop
>>>/trash/sdg
>>>/u/udg
>>>/vt/vtai

>Texting Neighbo(u)r
>>>/g/lmg
>>
It's not moving...
>>
It is moving too much...
>>
Videofags BTFO
>>
File: 1722334325807108.png (64 KB, 638x560)
>vramlets celebrating the non moving OP picture
It's not like I care, at the end of the day I have a 24gb vram card, you don't
>>
>>103546793
based
>>
>>103546793
LOL U POOR
>>
File: HunyuanVideo_00655.webm (840 KB, 960x544)
vramlet lurkies acting so cute rn
>>
>>103546793
https://youtu.be/VN6OThlrA6g?t=63
>>
File: 1720924371686075.mp4 (840 KB, 672x384)
babe wake up, a new model got released
https://huggingface.co/IamCreateAI/Ruyi-Mini-7B
>>
mass vidfag suicide imminent
>>
>>103546883
she's cute but I don't see the appeal in 3 second videos. how many seconds could I make with 16gb?
>>
File: 1710878507214885.mp4 (685 KB, 416x608)
>>103546922
for a 7b model it really looks like ass
>>
File: HunyuanVideo_00661.webm (957 KB, 960x544)
this is getting out of hand, now there are two of them!

>>103546928
you could probably go up to 5 or 7 seconds, but you'd need to sacrifice step count or resolution, or drop to a heavier quantization
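to put rough numbers on that tradeoff: attention cost grows roughly with the square of the token count, and tokens grow linearly with frames. quick back-of-the-envelope sketch, assuming HunyuanVideo-style compression (4x temporal / 8x spatial VAE plus a 2x2 patchify, so each token covers a 16x16 pixel patch of one latent frame) -- treat the constants as ballpark, not gospel:

```python
# Ballpark only: assumes a HunyuanVideo-style VAE (4x temporal, 8x spatial
# compression) and a 2x2 patchify in the DiT, so each token covers a
# 16x16 pixel patch of one latent frame.
def video_tokens(width, height, frames):
    latent_frames = (frames - 1) // 4 + 1   # why frame counts are 4n+1
    return latent_frames * (height // 16) * (width // 16)

base = video_tokens(960, 544, 73)           # ~3 s at 24 fps
for seconds in (3, 5, 7):
    frames = seconds * 24 + 1               # padded to a 4n+1 count
    n = video_tokens(960, 544, frames)
    print(f"{seconds}s: {n} tokens, ~{(n / base) ** 2:.1f}x attention cost")
```

so a 7-second clip is roughly 5x the attention work of a 3-second one at the same resolution, which is where the step/resolution/quant sacrifice comes from.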
>>
>>103546922
oh cool image-to-video SOTA just dropped
>The model architecture is modified from EasyAnimate V4 model, whose transformer module is inherited from HunyuanDiT
it's hunny all the way down
>>
File: low_gpu_memory_mode.png (36 KB, 1196x116)
>>103546922
> The training process is divided into four phases:
>Phase 1: Pre-training from scratch with ~200M video clips and ~30M images at a 256-resolution, using a batch size of 4096 for 350,000 iterations to achieve full convergence.
>Phase 2: Fine-tuning with ~60M video clips for multi-scale resolutions (384–512), with a batch size of 1024 for 60,000 iterations.
>Phase 3: High-quality fine-tuning with ~20M video clips and ~8M images for 384–1024 resolutions, with dynamic batch sizes based on memory and 10,000 iterations.
>Phase 4: Final video training with ~10M curated high-quality video clips, using a batch size of 1024 for ~10,000 iterations.
Why would they train at a low resolution at the start? Video isn't like images, where something low-resolution is still legible; you need enough resolution on the motion to make sense of what's going on. Also, why would they use non-standard video sizes? IMO the lowest they should've gone is 352×240.
Also LMAO at the 4090 being low memory.
>>
>>103546922
you could generate any video even if a celeb isnt in the data set, then swap them in with roop unleashed video swap.

no, not for lewds, for memes. and lewds if you care for that too I guess.
>>
File: 1731968387944394.png (1.19 MB, 1024x1024)
vpred models are the future, literal magic.

noobAI vpred 0.9, but a slightly diff model merged with other stuff to give even better anatomy (default is fine, but this is even better)

https://huggingface.co/Panchovix/noobai-XL-Vpred-0.9r-cyberfix

ZERO lora.

prompt: masterpiece, best quality, newest, absurdres, highres, safe, ayanami rei, white plugsuit, smile, peace sign gesture
>>
File: tongue.jpg (329 KB, 1024x1024)
>>
>>103547136
*also adetailer if you want to touch up anything or if eyes arent perfect, but generally there is no issue. everything pops, no washed out colors, good lineart/shading, looks like real art. and the best part is you can use any booru tag for a character or style. very few characters dont work, then you can just use a lora.
>>
>>103547136
how does it compare to NTR MIX?
>>
>>103547155
havent used that I think, before this my most used model was autismmix confetti (anime pony based checkpoint)

it's still good, but this gives even better results/better art + colors + lineart.
>>
so I'm a bit out of the loop, what's the current state of flux? I remember people being very hyped about PonyXL.
>>
>>103547139
just imagine...
>>
File: 1706221865745175.png (777 KB, 1024x1024)
>>103547136
same prompt with (jojo pose:1.3) added. any booru tag seems to work (it's trained on them, so it should). the booru autocomplete extension works really well as you type a character/style and you can just tab complete the specific booru tag.
>>
File: 1730805838537133.png (1.06 MB, 1024x1024)
>>103547293
masterpiece, best quality, newest, absurdres, highres, safe, ayanami rei, white plugsuit, smile,(jojo pose:1.3), white background, hand over face

it just works
>>
File: 1706666220187690.png (757 KB, 1024x1024)
>>103547327
one more
>>
>>103547136
>>103547293
>>103547327
The types of things you're trying "just worked" since the first nai leak.
>>
>>103547336
to an extent, noobAI checkpoints however have the best tag understanding and flexibility so far, I havent even needed to use a lora for most of this. pony, while capable, really relies on loras in general for various stuff. this model is solid without needing character/style loras as much.
>>
File: 1722003954520977.jpg (889 KB, 1024x1024)
>>103547136
Vpred has insane contrast, forcing me to use cfg rescale. 1.1 epsilon's creativity is just as good without additional issues.
>>
File: HunyuanVideo_00043.webm (430 KB, 352x640)
>>103547293
>>103547327
>>103547332
i have generated so many ayanamis over the past few days
>>
>>103547409
yeah I have both, plus some of the cyberfix models which are merges that seem to work well too. All I really need is the tag extension in most cases, I have some style loras though and some characters who aren't in the dataset. but you can also just prompt artist style tags.

https://github.com/DominikDoom/a1111-sd-webui-tagcomplete.git

great addon, just type an idea and you get a list of related booru tags.
>>
File: 1712321762776235.jpg (1.5 MB, 1248x1824)
>>103547437
My issue with all Noob merges is that they hurt creativity, which is the best thing going for Noob. Same with loras unless they are well-trained too.
>>
>>103547443
yeah, I always keep the vanilla models to use because some models will skew a certain way.
>>
File: 1705731189358601.png (1.28 MB, 1024x1024)
controlnet canny is always fun to use:
>>
vpred is AIDS man i have no idea how you get that shit working, i've tried three different vpred models besides base noobai and it always gives me some schizo nonsense
https://files.catbox.moe/t74m7r.png
>>
>>103547342
wish the illustrious base model had the same furry understanding as PonyXL
just for variety, I'm not a furry
>>
>>103547514
I had issues using my sdxl/pony local install with the model, and then I git cloned reForge in a new folder and everything worked fine. I was getting random noise till I made a new install just for noobai stuff. must be some setting that was screwing stuff up. but now, it's fine.
>>
>>103547574
i figured swarmui was compatible just fine, just needed to change the metadata and it was supposed to be good.
forge of course isnt, and i'm not even wasting my time getting reforge set up, with how buggy it is.
>out of three UI's i OOM on the one that's supposed to be best at memory management and speed
>>
>>103547618
it's been stable for me, but a fresh install of either gui should fix it, reforge for sure worked, just get the model, sdxl vae, and that's basically all you need, then install extensions/get controlnets/etc. or copy them over.
>>
File: 1718218492086236.png (1.17 MB, 1024x1024)
>>103547509
>>
>>103547514
You have two loras in your workflow. Maybe they are not compatible with vpred.
>>
>>103547763
it's explicitly a vpred lora
removing the lora still created eldritch horrors.
>>
File: 1734280901895423.png (1.24 MB, 1024x1024)
>>103547651
>>
File: 1722443391538368.png (1.28 MB, 1024x1024)
>>103547780
>>
https://huggingface.co/city96/HunyuanVideo-gguf
Based, ty city
>>
>>103547797
now where's the loader
>>
>>103547797
Let's goooooooooooooo
>>
>>103547797
oh boy oh boy time to see how lobotomized this shit is
someone grab the lowest quants NOOOOWWW let's see some blurry horrific BOGS
>>
File: 1708416661766768.png (1.31 MB, 1024x1024)
>>103547793
>>
File: 1716558548289371.jpg (1.37 MB, 1248x1824)
>>103547766
I dunno anon. Forge?
>>
File: Untitled.png (2 KB, 626x85)
>>103547806
>>
File: 1721856557942874.png (1.18 MB, 1024x1024)
>>103547822
>>
Kijaisisters... Did we lose?
>>
>>103547136
any tldr on what vpred is?
>>
File: 1728207274879468.jpg (1.38 MB, 1248x1824)
>>103547828
>>
>>103547171
from what I'm seeing, kind of forgotten
>>
File: 1706834248342300.png (1.37 MB, 1024x1024)
>>103547857
there are two types of the noobai model, vpred and eps (epsilon); with either you can get amazing results with basic danbooru tags, loras aren't even necessary in most cases.

this was just nami \(one piece\), and "beach, pirate ship", with the default positive/negative tags.
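for the anon asking what vpred actually is: instead of predicting the added noise (epsilon), the model predicts a "velocity" v = alpha*eps - sigma*x0, which is better conditioned at very high noise levels (part of why vpred models can hit real blacks and high contrast). minimal numpy sketch, assuming the standard variance-preserving schedule (alpha^2 + sigma^2 = 1):

```python
import numpy as np

# eps-pred vs v-pred training targets (v as defined in Salimans & Ho's
# progressive-distillation paper); alpha/sigma are the noise-schedule
# coefficients with alpha**2 + sigma**2 == 1.
def eps_target(x0, eps, alpha, sigma):
    return eps                       # epsilon models predict the noise

def v_target(x0, eps, alpha, sigma):
    return alpha * eps - sigma * x0  # vpred models predict "velocity"

# Either target recovers x0 from the noisy sample x_t = alpha*x0 + sigma*eps
rng = np.random.default_rng(0)
x0, eps = rng.normal(size=4), rng.normal(size=4)
alpha, sigma = 0.8, 0.6
x_t = alpha * x0 + sigma * eps
x0_from_v = alpha * x_t - sigma * v_target(x0, eps, alpha, sigma)
assert np.allclose(x0_from_v, x0)
```

same information either way, just a different parameterization of what the net regresses; the UI needs to know which one a checkpoint uses, hence the schizo noise when it's set wrong.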
>>
>>103547797
>It is intended to be used with the native, built-in ComfyUI HunyuanVideo nodes
lol
Breaking changes for Kijai's node incoming
>>
>>103547797
If you go with the BF16 version can you get the same quality as the original non gguf BF16 model but just faster?
>>
>>103547797
is Q4_1 better than Q4_K_M? some different quant mechanism?
>>
File: Untitled.png (81 KB, 885x704)
wtf? Why is this rendering without errors?
>>
>>103547872
>>103547863
Why are these styles so incredibly unappealing? Is it the nose?
>>
>>103547941
one piece has a distinct style
>>
>>103547941
I've always found one piece to be hideous desu
>>
File: HunyuanVideo_00006.webm (640 KB, 960x544)
>>103547797
oh good we will finally have more than the same 3 anons posting videos each thread

>>103547911
k quants are better. 4_1, 5_1 are older quant techniques that have higher perplexity than k quants
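rough numbers from the ggml block layouts as I understand them (double-check against ggml's source before trusting exact sizes): legacy Q4_1 actually spends more bits per weight than Q4_K while quantizing worse, because k-quants amortize their scales over 256-weight super-blocks:

```python
# Bits per weight from the ggml block layouts (as I understand them --
# verify against ggml's source before relying on exact numbers).
def bpw(block_bytes, weights_per_block):
    return 8 * block_bytes / weights_per_block

q4_0 = bpw(2 + 16, 32)            # fp16 scale + 32 packed 4-bit weights
q4_1 = bpw(2 + 2 + 16, 32)        # + an fp16 min per block
# 256-weight super-block: fp16 d + fp16 dmin + 12 bytes of 6-bit
# sub-block scales/mins + 128 bytes of packed 4-bit weights
q4_k = bpw(2 + 2 + 12 + 128, 256)
print(q4_0, q4_1, q4_k)           # -> 4.5 5.0 4.5
```

so Q4_K is no bigger than Q4_0 yet carries per-sub-block scales, which is why it measures lower perplexity than the legacy 4_1 format.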
>>
File: 1706000772030714.png (1.22 MB, 1024x1024)
>>103547945
but nami and robin are great, in any case
>>
Can decent image gen be done on a 6900XT? I want to run a couple of anime tiddy models but I'm not spending a small fortune on a new card just to do that.
>>
>>103547966
>AMD
>>
>>103547966
You can probably do some stuff with 16gb, I don't know how with AMD, but you can do it.
>>
>>103547966
My niggy i prompted with GTX970 two years ago
>>
I WILL wait for Kijai to implement the GGUF into his nodes
I WILL not use the built-in ComfyUI nodes
>>
>>103547999
You WILL NOT PULL
and you vill be 'appy
>>
>>103547999
Kijais node better somehow?
>>
i will wait until people test the quants and we know how difficult it is to generate cute girls and their feet at q4-6
>>
>>103548008
Torch compile
Block swapping
fp8_fast
Text encoder quantization
Lots of optimizations comfyui nodes are missing
>>
Why would they release the gguf files and not provide half the nodes required to use them?
>>
>>103547975
>>103547987
Cool I'll give it a bash later and see if I can get it working one way or another. I'm totally new to this but I'm sure if I follow enough instructions on the internet I can spin something up. I've got a friend who's messing around with it but he's using a 4090 so I probably can't just 1 to 1 copy his NVidia setup.
>>
>>103548036
he just quantz's models
>>
>>103548038
I hope you like Linux
>>
>>103548038
you're lucky you're getting into this now and not 2 years ago. AMD is basically 1 click on windows and ComfyUI-ZLUDA should just work out of the box for you for all your SD1.5/SDXL needs
>>
>>103548040
Not true, he provides inference nodes that were updated to load hyvid too.
>>
>>103548041
I'm not scared of a terminal so I'll be okay. WSL on the other hand is a minor cluster fuck and I don't look forward to dealing with that.
>>
>>103548041
>>103548051
there are solutions like DirectML and ZLUDA for AMD on Windows now, it's not early 2023 anymore
>>
File: Untitled.png (57 KB, 991x506)
Okay, so I put the hyvid in the gguf loader and empty latent into the ksampler. What now?
>>
>>103548036
Waiting for someone else to do the rest so they can copy it
>>
File: HunyuanVideo_00001.mp4 (998 KB, 960x544)
I stand loyal to the end with my nigga Kijai
>>
>>103548107
The man who shot down any requests for gguf support in the vain hope we'd get some magical nf4 quants? Nah.
>>
>>103548118
was nigga4 even good? looked like AIDS when it was used for flux.
>>
>>103548125
Never was. It was the most baffling format to come out of Flux's release.
>>
Inadvertent NSFW, stupid horny model
https://files.catbox.moe/hjl0qg.mp4
>>103548118
Hey! He was working hard on implementing rf-inversion!
Whatever that is...
>>
Can someone share comfy workflow for q8 >>103547797 ?
>>
>>103548169
It's not possible yet.
>>
>>103548202
So how do I test this model then?
>>
>>103548249
You do not.
>>
>>103548309
So what's the point then?
>>
>>103547797
>Based
not that based, they're not in the gguf.safetensors format like the mochi gguf of kijai, that means that you can't use torch.compile with it
>>
>>103548325
puts pressure on kijai to support it
>>
>>103548341
city applied no pressure, those must be on gguf.safetensors format, and they are not
>>
>>103546752
What would you recommend for an ancient 1060 machine? inb4 get better hardware
im working on it, but i need something to work right now
>>
>>103547999
Didn't he say he wasn't interested in gguf at all?
>>
>>103548350
dalle3
>>
>>103548036
some turkish dev will develop the proper nodes
>>
>>103548350
SDXL and PonyXL lightning 4step models
>>
File: HunyuanVideo-00111.webm (1.15 MB, 1280x720)
>>
>>103548377
I have 4090. Could you share your workflow and tell me how long does it take to generate 720p video?
>>
>>103548377
damn
>>
File: 1626366747359.jpg (110 KB, 1018x1024)
Is there any video model that can use controlnets?

Animatediff is good but I think is hard to control and most of the time It generates garbage.
>>
>>103548438
>Is there any video model that can use controlnets?
hunyuan can use some sort of controlnet
https://aivideo.hunyuan.tencent.com/
>>
>>103548377
That's a really stupid skirt design
>>
>>103548482
seems to be a high waist, polyester mini with rear centre split, what's the problem
>>
>>103548499
thanks for the details, patrick bateman
>>
What's currently the best way of prompting two people with specific designs? Regional prompting? Controlnet? Or just biting the bullet and training the lora on those designs?
Straightforward prompting with active usage of BREAK hallucinates a lot.
>>
>>103548504
well let's see Paul Allens skirt design
>>
File: out.webm (229 KB, 432x768)
>>103548513
>>
>>103548513
>>103548514
oh my god, she's even wearing a garter.
>>
>>103548465
that's useless for gamedev.
>>
>>103547797
using those means that we have to go for comfy's implementation of Hunyuan, does comfy's workflow work with sage attention 2 even?
>>
File: out.webm (133 KB, 768x432)
casual work attire has been a disaster for the human race
>>
>>103548561
What a weird response.
>>
File: 1679585602346140.png (311 KB, 513x513)
>>103548583
how is it weird.

I need something I can control.
Not some random clip of some literal e child prostitute being fucked.
>>
>>103548589
Nowhere in your previous post did you mention anything about game dev. So it's only natural that people are confused when you suddenly throw that condition out there when they provide you with an answer.
>>
>>103548589
>I need something I can control.
are you retarded or something? that's what hunyuan is providing
https://youtu.be/vhaFcqvSOSU?t=109
>>
>>103548602
what do you think controlnets are used for?
>>
>>103548608
it's a fucking video model, what does this have to do with game development?
>>
>>103548607
can't use civitai autismo nor controlnets.

It's useless for gamedev.
>>
>>103548612
he's trolling or he's legit retarded? I can't tell
>>
>>103548611
>>103548616
I need a tool that can generate sprite animations.
>>
File: 1726701385516478.png (688 KB, 1446x801)
>>103548620
and you couldn't make sprite animations with this hunyuan controlnet?
>>
>>103548626
I need custom loras of waifus similar to civitai.
>>
>>103548629
I guess you need to make those loras by yourself
>>
>>103548636
the issue is quality.

last time I tried animatediff a year ago, the quality still generated flickering.
>>
>>103548645
the fuck is he talking about? I'm saying that you can make loras on hunyuan to get the quality and the waifu you want
https://civitai.com/models/1033325/rem-rezero-hunyuan-video-character-lora?modelVersionId=1158953
>>
>Ski-bi dibby dib yo da dub dub Yo da dub dub
https://civitai.com/images/45824869
SOUL
>>
File: 1680838975455434.jpg (60 KB, 465x517)
>>103548653
Man, I've been burned for 2 years with newer overhyped magical shit that ended up being mediocre shit outside some random cherrypicked shit.

Need more evidence that is good shit beyond cherrypicked shit.
>>
>>103548661
you could also do your own research in your own free time to see where things currently are instead of posting a bunch of literally nothing statements in this thread you colossal lazy faggot
>>
>>103548661
>Need more evidence that is good shit beyond cherrypicked shit.
how does that work? if we show you more example you will dismiss them and say they're also cherry picked
>>
>>103548658
kek
>>
>>103548667
>>103548670
Already been burned like 4-5 times wasting money on paperspace subscriptions that went nowhere.

I'm already skeptical this shit can be used.
>>
>>103548680
>I'm already skeptical this shit can be used.
so you aren't convinced by all that hunyuan video spam over the last 10 threads? I guess it never existed and was just a product of our imagination or something
>>
>>103548690
I stopped caring for AI since 8 months ago after I got burned again by animatediff.

This is my first AI thread in months.
>>
>>103548704
that's up to you to catch up and look at the previous threads, the evidence is here, you just need to look at them
>>
>>103548680
i mean it's not our fault you got FOMO'd into buying ((current year)) graphics cards, just because faggots online need to justify their overpriced njudea investments
if it makes you feel better, i can confidently say we'll see the event horizon next year at this time.
Whether or not it'll be EASY, given the state of GPU's, is the only unpredictable variable. But it will get better. We didn't have illustrious/noob/hunnynyan 4 months ago even. >>103548704
so honestly, i'd say wait until summer next year to make any decisions on buying GPU's or investing time into A.I, i honestly might do the same given i have other shit going on right now and tweaking models/getting just the right settings to make what i want is a lil too time consuming on my hardware.
>>
Hello sirs, does anyone know of an illustrious model that uses LCM/HyperSD/Lightning/something else to reduce the amount of steps? I've been using SD 1.5/SDXL models of this type since my GPU is shit and 6 steps there are enough compared to the usual 20, but I checked civitai and the top 40 illustrious models, none of them had a version like that. And I tried using the HyperSD XL Lora with a regular model but the results were always worse than a checkpoint of that type
>>
>>103548711
Just reviewed last thread wembs.

Useless shit for gamedev.

I need controlnets and custom LORA.

>>103548716
I don't have money to buy GPUS.

What I do is paying for a month of some website like tensor.art or paperspace.
>>
>>103548612
How was anyone supposed to infer that was your intent? Sprite sheets are a niche use at best.
>>
Stop responding to clearly low intelligence individuals.
>>
>>103548736
I literally mention I need gamedev shit.

3D generation is even worse than 2D one.

So I can't use AI to make 3D animated models yet.
>>
>>103548743
this, he's mentally challenged
>>
>>103548743
i have no idea why i felt the need to respond to him
>What I do is paying for a month of some website like tensor.art or paperspace.
PATOOEY
>>
File: 1689484204322994.jpg (120 KB, 912x1024)
>Check out the magical examples
>The same shitty panning animations or a person standing still while doing some mouth slur
Wow, nice tech faggots.

I can clearly can make a game out of that.

It's simply much cheaper and better to spent 5-6 days doing what I want in blender over using this tech.
>>
>>103548783
are you that autist that has been shitting up /3/ for years?
>>
>>103548783
no one give a fuck about your retarded rants, we were fine in this place when you weren't there, go take another 2 years vacation like before and cope with blender or something, get the fuck out retarded nigger
>>
>>103548796
kind of.

>>103548800
don't care nigger.

I'm asking a question.

I need something like animatediff that isn't flickering garbage I can use with Illustrious loras.
>>
>>103548808
>I'm asking a question.
and we gave an answer, and you don't like our answer, that's not our fucking problem you retarded monkey, go complain somewhere else
>>
>>103548821
your answers can't be used for gamedev assets.
>>
>>103548827
I guess that there's nothing that can satisfy your requirements yet, it's that simple, come back in 2 years, maybe there will be something that will be what you want, so far it doesn't exist
>>
File: 1733377538173786.png (112 KB, 1035x1063)
>>103547797
I have a question, how do you make the clip loading work on the vanilla comfyui workflow? there isn't a unique safetensors for llama-llava so it doesn't work
>>
>>103548392
the workflow is nothing special, I use a 3090 and it takes me almost 2h to generate using block swap
I don't care I just let it run since it's in a server, and I made a prerun with low steps to select the ones I liked
>>
>>103548836
it's a requirement because I can't use a random video clip as a game sprite.
>>
>>103548582
This is almost how all my gens look if you know what I mean.
>>
>>103548821
just stop responding to him, eventually his low IQ monkey brain will trail off and go do something else.


anyway, miss piggy time.
>>103548861
what did he mean by this?
>>
>>103548582
especially the end of office lady attire
pure sadness, you only see them in Asia nowadays, and luxury brands companies in the west
>>
>>103548848
>there isn't a unique safetensors for llama-llava so it doesn't work
is there a huggingface of this that only has one .safetensors?
>>
>>103548906
I still see them at my job in western europe. Only the female engineers wear jeans and rock t-shirts; even I, as an engineer, wear formal attire.
>>
so is this cpp flux shell worth a damn or what
>>
>>103548904
hiiiiYAH
>>
File: 1727549599710775.png (150 KB, 1035x1262)
>>103548916
I managed to make it work, you have to merge the 4 safetensors shards into one "merged_model.safetensors", here's the python script to do that:

from safetensors.torch import load_file, save_file

# List of input shards, in order
input_files = [
    "model-00001-of-00004.safetensors",
    "model-00002-of-00004.safetensors",
    "model-00003-of-00004.safetensors",
    "model-00004-of-00004.safetensors",
]

# Dictionary to store all tensors
merged_tensors = {}

# Load each shard and merge its tensors
for file in input_files:
    tensors = load_file(file)
    merged_tensors.update(tensors)

# Save merged tensors to a single file
save_file(merged_tensors, "merged_model.safetensors")

and here's the workflow:
https://files.catbox.moe/wg80nj.mp4
>>
File: 1731432901813393.png (660 KB, 3339x1575)
>>103549020
picrel, how it looks on my end
>>
File: 1694568932060561.jpg (55 KB, 617x724)
>Try to use comfyui in tensor.art
>Failed to fetch error
Nice tech, faggots.
>>
>>103549050
this is what happens when you trust Chinese people
>>
>>103548936
It's on the rarer side for me, but it's always a welcome sight. Nothing like what I've seen in China and Japan though, it was a feast there.
>>
File: ComfyUI_T2i_21863_.png (1.06 MB, 1152x896)
that is fucking cool
https://civitai.com/models/1046320/sandman-beats-all?modelVersionId=1173971
>>
File: 1704908259253486.png (32 KB, 2808x218)
>>103549020
>>103549031
looks like it's more memory efficient on the comfyui workflow, I can go for 960x544x97f on Q8_0 (that one is bigger than fp8), on kijai's node I can only go for 960x544x96f fp8

This shit is really slow though, it's 20 mn on fp8 + kijai, and it looks like it's 27mn for Q8_0 + comfy, that one definitely needs a torch compile
>>
>>103549083
I've been in China but only Xiamen and Shenzhen and besides the airport and policewomen most women wore loose informal shirts. If anything western europe has more uniforms in daily life.

I think it's mostly an American thing where everyone wears a hoodie or t-shirt.
>>
>>103549107
>1 extra frame
>>
>>103546928
>I don't see the appeal in 3 second videos
I stopped visiting this place altogether when the video spam started, glad to see it's slowly coming to an end now, as it always does.
>>
File: 1729163427075169.png (8 KB, 492x155)
>>103549118
>1
*4, and Q8_0 is better quality than fp8, and I think I can go higher than 97f, there's still room to spare
>>
>>103549118
>>103549129
oh yeah, my b, it's not 96f on kijai's node, it's 93f, because only 4n+1 frame counts work on hunyuan
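since the VAE's 4x temporal compression only maps 4n+1 frame counts to whole latent frames, here's a quick sketch of the snap-down (the rounding rule is my assumption of how a wrapper would handle an invalid count, not kijai's exact code):

```python
def snap_frames(n):
    # HunyuanVideo's VAE compresses time 4x, so only 4n+1 frame counts
    # decode cleanly; snap down to the nearest valid count.
    return ((n - 1) // 4) * 4 + 1

print(snap_frames(96), snap_frames(97))  # -> 93 97
```

which matches the 96f -> 93f correction above.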
>>
>>103549109
Weird, in the company offices in China (8 years ago) I've seen so many tight skirts and pantyhoses, so nice, but then again maybe it depends on the type of company. Only Japan had more, but everyone has a work attire there.
>>
File: 1731181637784955.png (22 KB, 2762x133)
>>103549107
it's gonna be more than 30 mn, that is brutal, city, if you're here, you have to make torch.compile work on your gguf node, we need this
>>
>>103548975
>hiiiiYAH
https://files.catbox.moe/tbmpxt.png
>>
>>103549147
is there a way to convert a gguf into a safetensor?
https://huggingface.co/city96/HunyuanVideo-gguf/tree/main
>>
>>103549107
>This shit is really slow though, it's 20 mn on fp8 + kijai, and it looks like it's 27mn for Q8_0 + comfy, that one definitely needs a torch compile
So for now it looks a bit better, is slightly more efficient in vram usage, and 40% slower?
>>
>>103549144
>so many tight skirts and pantyhoses
The 70s/80s were like that in the US and Europe.
>>
tight skirts and pantyhoses are unattractive
>>
File: 1709796829901716.png (23 KB, 2780x147)
>>103549190
>and 40% slower?
yeah, usually it's 20 mn on kijai + fp8, here it's 31 mn on comfy + Q8_0, I don't know if comfy's workflow is using sage attention 2 for the speed boost, I think he does because it's really memory efficient, Q8 (14gb) is bigger than fp8 (13.2), yet when I go for 97f, Q8 + comfy only goes to 22.4gb whereas I OOM if I try 97f + fp8 + kijai
>>
>>103549208
perfect for a work environment
>>
File: 00098-4073099787.png (865 KB, 960x1440)
>>103549208
U R MR GAY
>>
>>103549187
>is there a way to convert a gguf into a safetensor?
something like kijai did on Mochi
https://huggingface.co/Kijai/Mochi_preview_comfy/tree/main
>>
File: 1729071511709253.png (254 KB, 2798x1538)
>>103549213
you motherfucker... comfy has no idea how to make a tiled vae work
>>
File: 1705242820935808.png (965 KB, 3515x1556)
>>103549323
fortunately for us we can use kijai's vae decoder to make it work anyway
>inb4 blurry shit
it was done on 1 step just for testing
>>
File: HunyuanVideo_00665.webm (321 KB, 960x544)
>takes 10 minutes longer because no torch.compile
GGUF? more like giga oof
>>
>>103549468
I wasn't using torch.compile on the fp8 either (because I have "only" a 3090), gguf quants are just slow molasses...
>>
What's the difference between clip l and t5 xxl?
>>
File: mario stretch hd.png (179 KB, 599x484)
gguf? more like ggOOOOOOFFFF
>>
>>103548783
Everyone's dropping videos already after the initial rush of interest. 3 second animations are just boring, gifs are just garbage in general. A well-made still image is better than those short "animations" that you can't even realistically edit. It's fucking slop.
>>
>>103549491
t5_xxl is a big boi who understands natural language
clip_l is a small shit that is decent at tags
>>
>>103549510
With the exception of anything by Vince Gilligan most TV and Film is a series of 5-10 second shots.
>>
>>103549510
I can't use that for sprite animations.
>>
can I do vid2vid with hunyuan now?
how?
>>
>>103549549
>can I do vid2vid with hunyuan now?
yes
>how?
use his workflow example
https://github.com/kijai/ComfyUI-HunyuanVideoWrapper/blob/main/examples/hyvideo_v2v_example_01.json
>>
>>103549512
How are they different from sigclip_vision_patch14_384?
>>
>>103549020
no need to do that, comfy has already uploaded a merged file:
https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/blob/main/split_files/text_encoders/llava_llama3_fp16.safetensors
here's his workflow btw
https://comfyanonymous.github.io/ComfyUI_examples/hunyuan_video/
>>
>>103549615
there's none
>>
So did anyone test the new c++ ui from last thread? Curios how it is
>>
>>103549622
the example video looks like shit
>>
File: 1730478955380823.png (282 KB, 1706x1293)
>>103549660
he went for hunyuan bf16 + 848x480 res, Comfy must have a beefy gpu to be able to run this
>>
>>103549650
it's shit. I'd rather use the tkinter one that came out forever ago
>>
>>103549693
Like slow or clunky or what would you say is the problem with it?
>>
>still images are slowly returning to /ldg/
is the general healing?
>>
>>103549681
>can run hunyuan bf16
>goes for fp8 text encoder anyways
is he retarded?
>>
>>103549650
No, nobody wants to be responsible for Ani's suicide.
>>
>>103549650
Does it have inpaint?
>>
>>103549748
I forgot about that
Maybe i should run it and promote it here 24/7 mhh
>>
>>103549703
disorganized code, bad use of screen space, no options for comfyui backend, inference is slower and said he can't figure out how to add llama.cpp
>>
can't say I envy code-fluent folks
>>
>>103549558
Does gif2vid work?
>>
>>103549794
yep, works perfectly
>>
fun fact, you can put gifs into img2img

unfun fact, the output doesn't move
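the reason it doesn't move: most img2img loaders just grab frame 0 of an animated GIF and ignore the rest. A quick Pillow sketch (assuming Pillow is installed) that builds a 2-frame GIF in memory and shows that a plain open/convert only sees the first frame:

```python
from io import BytesIO
from PIL import Image

# Build a tiny 2-frame GIF in memory (red frame, then blue frame).
frames = [Image.new("RGB", (8, 8), c) for c in ("red", "blue")]
buf = BytesIO()
frames[0].save(buf, format="GIF", save_all=True, append_images=frames[1:])
buf.seek(0)

gif = Image.open(buf)
print(gif.n_frames)            # 2 frames are stored in the file...
first = gif.convert("RGB")
print(first.getpixel((0, 0)))  # ...but converting without seek() reads only frame 0 (red)
```

you'd have to iterate with `gif.seek(i)` (or a video loader node) to feed every frame through img2img.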
>>
>>103549729
considering the UI is complete garbage compared to what it was before, yes
>>
File: teehee.gif (1.56 MB, 250x445)
1.56 MB
1.56 MB GIF
>>103549798
try this
>>
>>103549828
ACK
>>
I actually wonder if hunyuan has a lot of anime stuff in its dataset, it's really bad at 2d.
>>
>>103549864
>it's really bad at 2d
it's kinda good at 2d, but only at its true resolution (1280x720)
>>
>>103549874
I see.
>>
how do I load flux control net in gguf format?
I'm trying to do it under 6gb vram. this model to be specific:
https://huggingface.co/second-state/FLUX.1-Canny-dev-GGUF
>>
>>103549964
>under 6gb vram
yikes
>>
File: 1709036888546801.png (665 KB, 3460x1209)
665 KB
665 KB PNG
>>103549398
I'm really surprised by the results, the difference between fp8 and Q8 is really big, or else I messed up somewhere

>Sailor Moon juggling with 3 moons
>960x544x97f

>fp8 (OOM -> block swap -> 20 mn)
https://files.catbox.moe/0tjdrj.webm
>Q8_0 (22.7 gb -> 34 mn)
https://files.catbox.moe/zj1fck.mp4
>>
>>103550103
maybe it's the sampler, is euler + normal the good combo for hunyuan? that's what's on comfy's workflow
>>
>>103549998
didn't know controlnet v3 from xlabs is smol. maybe it can fit
>>
>>103550103
>slower
>garbage results
>>
>my miss piggy gen where she's totally clothed was flagged for furry pornography
the absolute state of these jannies, not my fault if you precummed to that shit.
>>
>>103550150
It's not a controlnet, it's an entire version of Flux because BFL are very special snowflakes
>>
>>103550103
the seams glitches are more apparent here than on kijai's node, weird
>>
>>103550170
TJD
>>
>>103549757
no
>>
>>103550212
it's over
>>
>>103550103
someone needs to rent an A100 to figure out if comfy's version even works at BF16, doesn't seem like it does
>>
>>103550273
he has an a6000
>>
>>103550292
He doesn't test anything
>>
hurry the fuck up ani, can't take this overrated noodle shit anymore
>>
File: 1722531678539320.png (23 KB, 2804x124)
23 KB
23 KB PNG
>>103550103
Something's wrong with Comfy's implementation, it's really slow even when using fp8, on kijai it's 20 mn, on comfy's workflow it's closer to 30 mn, I suspect that it doesn't recognize Sage attention or something?
>>
File: HunyuanVideo_00709.webm (399 KB, 960x544)
399 KB
399 KB WEBM
>can't take this overrated noodle shit anymore
two more years
>>
File: HunyuanVideo_00712.webm (1.07 MB, 960x544)
1.07 MB
1.07 MB WEBM
its ability to do stretching is borderline impressive at times but there's not much it seems hunyuan understands other than splits
>>
>>103550310
It sucks but I can't really see a good way to replace it. How else do you visually represent functions with inputs and outputs so retards can put them together, without just sharing some form of code? There's a reason the node/graph style interfaces are so common.
>>
File: HunyuanVideo_00703.webm (580 KB, 960x544)
580 KB
580 KB WEBM
>>
>>103550380
>Something's wrong with Comfy's implementation
does it even support block swapping or offloading? If not then what is the fucking point? It will barely work on most peoples cards.
>>
>>103550447
entirely depends on if ani has a nodegraph included. sdcpp and ggml nodes would be pretty rad and might get more people on board to get it up to snuff with pytorch
>>
>
>>
>>103550461
>If not then what is the fucking point? It will barely work on most peoples cards.
only comfy's workflow supports gguf quants so far, so the vramlets can go for lower quants if they feel they don't have enough memory
>>
>>103550461
>what is the fucking point
To claim all the credit
>>
ggufs are literally pointless they do nothing other than make outputs worse
>but muh memory
doesnt matter it still runs like ass
>>
>>103550482
fp8 is worse and takes longer
>>
>ntr \ divorce arc
>>
File: 1709635660964972.jpg (2.04 MB, 7961x2897)
2.04 MB
2.04 MB JPG
>>103550482
>ggufs are literally pointless they do nothing other than make outputs worse
not true for Q8, its quality is closer to bf16 than fp8
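rough intuition for why Q8 holds up: Q8_0 stores int8 values plus a per-block scale, so the round-trip error on the weights stays tiny. A simplified numpy sketch (the real GGUF Q8_0 layout uses fp16 scales and a fixed on-disk block format, so this is only the idea, not the implementation):

```python
import numpy as np

def q8_0_roundtrip(w: np.ndarray, block: int = 32) -> np.ndarray:
    """Quantize to int8 with a per-block absmax scale, then dequantize."""
    w = w.reshape(-1, block)
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    scale[scale == 0] = 1.0  # avoid div-by-zero on all-zero blocks
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return (q * scale).reshape(-1)

rng = np.random.default_rng(0)
w = rng.standard_normal(1 << 16).astype(np.float32)
err = np.abs(q8_0_roundtrip(w) - w).mean()
print(err)  # mean absolute error is tiny relative to unit-variance weights
```

fp8 (e4m3) only gets ~3 mantissa bits with no per-block rescaling, which is why Q8_0 tends to track bf16 more closely despite the same storage cost.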
>>
where sovl
>>
>>103550103
>>103550380
Yeah, something's definitely wrong here...
>Sailor Moon juggling with 3 moons
>960x544x97f

>fp8 Kijai (OOM -> block swap -> 20 mn)
https://files.catbox.moe/0tjdrj.webm

>fp8 Comfy (21.9 gb -> 33.24 mn)
https://files.catbox.moe/j0qh5b.mp4

>Q8_0 (22.7 gb -> 34 mn)
https://files.catbox.moe/zj1fck.mp4
>>
File: HunyuanVideo_00726.webm (2.23 MB, 960x544)
2.23 MB
2.23 MB WEBM
it even generated the fade to white transition lol

we're definitely not writing our prompts well enough
https://huggingface.co/spaces/Apollo-LMMs/Apollo-3B
throw a short video into this and see how apollo captions your video
>>
Q_4 rtx 3060 chads now is our time
>>
Anyone had luck creating ugly people, especially on XL/Pony? I tried to do people with turkey necks or receding chins but no luck; no matter what terms I tried, no model seems to know these concepts
>>
>>103550578
Anons only care about it/s
>>103550635
In heart
>>103550683
Try "syndrome" and the like
>>
>>103550683
I trained loras for ugly people. Lots of mtf selfies in the dataset does the trick, not even joking.
>>
>>103550730
kek
>>
File: 1727511922286631.png (99 KB, 2786x484)
99 KB
99 KB PNG
>>103550653
>Token indices sequence length is longer than the specified maximum sequence length for this model (80 > 77). Running this sequence through the model will result in indexing errors
Hunyuan can only handle 80 tokens? Or is it talking about just clip_l unable to go further? Idk why they keep using this piece of shit
>>
>>103550653
Wtf? the output should be the same if you're using fp8 on comfy or kijai. Why isn't that the case?
>>
>>103550683
juggernaut models were always really good at generating normal looking people
>>
>>103550779
gguf conversion might be off
>>
>>103550779
any specific prompts to get there?
>>
>>103550785
meant for >>103550771
>>
Why does my skin crawl and blood boil every time anon mentions plebbernaut.
>>
>>103550793
no, I'm not talking about the gguf, but using the same fp8 model on both kijai and comfy, you get something different even though it shouldn't >>103550653
>>
>>103550802
>comfy straight up lying he implemented a model correctly again
>>
>>103548851
gj, dont give these coomer saas masquerading as anons a thing.
>>
>>103550824
>anons realizing comfy is a hack
>>
pixel art lora for illustrious rocks, but how do you get it to not be so JPEG-compressed looking?
i have no idea what this comment here means by reduce/resize by 8, and i've used comfy in the past.
>lora in question
https://civitai.com/models/466454/pixel-art-style-xl-pony-flux-illustrious
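my best guess at what "reduce/resize by 8" means: downscale the gen by 8 (averaging out the JPEG-looking noise), then scale back up with nearest-neighbor so every "pixel" becomes a clean 8x8 block. A Pillow sketch (the factor of 8 is that commenter's number, not gospel; tune it to the lora's apparent pixel size):

```python
from PIL import Image

def pixelate(img: Image.Image, factor: int = 8) -> Image.Image:
    """Snap an image to a crisp pixel grid: average down, nearest-neighbor up."""
    w, h = img.size
    small = img.resize((w // factor, h // factor), Image.BOX)   # averages noise away
    return small.resize((w, h), Image.NEAREST)                  # hard pixel edges back

art = Image.new("RGB", (512, 512), "purple")
clean = pixelate(art)
print(clean.size)  # (512, 512), but snapped to an 8px grid
```

there are comfy custom nodes that do the same thing with palette reduction on top, which helps even more.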
>>
>>103550725
>In heart
gimme pixels
>>
>>103550837
he's a ACK
>>
File: 1660045389086.jpg (3.58 MB, 3072x3072)
3.58 MB
3.58 MB JPG
That commit anon linked last week-ish got merged so I tried to test it. Intentionally long and retarded prompt, can't say I see any improved prompt adherence. Will test more later I guess.
3 illust models
nothing | Golden Noise | MaHiRo
When you are simply prompting without controlnets or some shit, golden noise still seems to be most useful, though it limits the samplers/schedulers. Really hoping somebody will make a dedicated Illustrious version.

>a cartoon-style boxing match between two characters.
>a muscular male with a determined expression. He has short, dark hair and is wearing green boxing gloves. His physique is well-defined, indicating strength and athleticism.
>A female character with distinct cat-like features, including pointed ears and a tail. She has blue hair styled with a golden snake ornament wrapping around her head. She is wearing yellow boxing gloves and an outfit that resembles ancient Egyptian attire, including a white dress with a blue and gold collar. Her expression is focused and slightly anxious.
>The background is vibrant with warm colors, predominantly orange and yellow, giving a sense of intensity and energy. There are sparkling stars and dynamic lines, adding to the dramatic effect of the scene.
>>
>>103550854
he's trans?
>>
>>103550846
search for like pixel art or whatever in the manager, there's a custom node that does nice pixelization along with some other post processing effects
>>
>>103550877
no, not at all, I was just kidding around
>>103550310
>hurry the fuck up ani, can't take this overrated noodle shit anymore
you're in luck anon
https://www.reddit.com/r/StableDiffusion/comments/1hgewlq/new_gradio_app_for_hunyan_video_can_generate_97/
>I have adapted the Hunyuan Gradio app so that it can run on a RTX 4090. My version seems to be much faster than the comfy version and consumes less GPU memory:
You can generate 97 frames (more than 3s) at 848x480 in less than 12 minutes.
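sanity check on the "97 frames = more than 3s" claim, assuming HunyuanVideo's usual 24 fps output rate:

```python
frames, fps = 97, 24      # 24 fps is the model's typical output rate (assumption)
seconds = frames / fps
print(round(seconds, 2))  # just over 4 seconds, so "more than 3s" checks out
```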
>>
I gave up.
it's not possible to run flux + controlnet in 6gb vram. even q2 flux quants can't make it.
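back-of-envelope math agrees. FLUX.1-dev's transformer is roughly 12B parameters; even at an aggressive ~2.6 effective bits per weight (ballpark for a Q2_K-style quant, both numbers are approximations):

```python
params = 12e9            # FLUX.1-dev transformer parameter count (approx.)
bits_per_weight = 2.6    # rough effective rate of a Q2-class quant (assumption)
weights_gb = params * bits_per_weight / 8 / 1e9
print(round(weights_gb, 1))  # ~3.9 GB for the transformer weights alone,
                             # before T5-XXL, CLIP, VAE, controlnet and activations
```

once the text encoders, VAE, controlnet, and activations pile on top, 6 GB is gone.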
>>
>>103550893
>My version seems to be much faster than the comfy version
that doesn't mean anything, comfy's version is weirdly slow as fuck
>>
File: 1622752256923.png (110 KB, 1938x520)
110 KB
110 KB PNG
>>103550876
Forgot the link https://github.com/comfyanonymous/ComfyUI/pull/5975
I simply added it before the KSampler node.
>>
>>103550876
>>103550911
I guess that can't be used on flux and hunyuan because they use positive prompt?
>>
File: HunyuanVideo_00735.webm (422 KB, 960x544)
422 KB
422 KB WEBM
you couldn't pay me to watch the Paralympics (actually you probably could)

>6gb vram
your only mistake was trying in the first place
>>
>>103550956
Rin?
>>
File: 1709842691796774.mp4 (1.52 MB, 1280x720)
1.52 MB
1.52 MB MP4
https://xcancel.com/oliver_wang2/status/1868813458478915611#m
Google can be really good when they don't waste their time on woke shit
>>
>>103551231
awesome, open sores is about to see a huge boost in ((2 weeks)) because of this.

I better see some parachuting penguins in this thread when it does.
>>
>>103551263
>I better see some parachuting penguins in this thread when it does.
can someone try that on hunyuan to see how far we are?
>>
>>103549122
It's just two, at most three anons
>>
>>103551231
>really good
That's an understatement, they made the best video model and it's not even close.
>>
dont make me tap the sign
>>
>>103549122
>glad to see it's slowly coming to an end
Weird thing to say when you are lurking on a diffusion general thread, videos are part of the diffusion ecosystem
>>
>>103551274
agreed, someone get on it

>>103551304
this is the first example ive seen, had no idea this was even released.
>>
>>103551324
>this is the first example ive seen
you can see a lot of cool examples here
https://xcancel.com/search?f=tweets&q=%23veo2&f-native_video=on&cursor=DAADDAABCgABGfBQhW5bMdQKAAIZ8EBh6FpwRwAIAAIAAAACCAADAAAAAAgABAAAAAIKAAUZ8FDOwEAnEAoABhnwUM7AP4rQAAA
>>
File: 00154-88125250.png (809 KB, 832x1216)
809 KB
809 KB PNG
>>103551342
good fucking gravy that looks realistic. not shit, even.
>>
>>103550944
>Modify the guidance to scale more on the 'direction' of the positive prompt rather than the difference between the negative prompt.
I guess?
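for context, vanilla CFG extrapolates along the difference between the conditional and unconditional predictions; that PR's description says it reweights toward the positive prediction's own direction instead. The baseline it modifies, as a numpy sketch (the PR's exact formula may differ, this is just standard CFG):

```python
import numpy as np

def cfg(uncond: np.ndarray, cond: np.ndarray, scale: float) -> np.ndarray:
    """Vanilla classifier-free guidance: extrapolate along (cond - uncond)."""
    return uncond + scale * (cond - uncond)

u, c = np.zeros(4), np.ones(4)
print(cfg(u, c, 7.0))  # [7. 7. 7. 7.] -- high scales overshoot the cond prediction,
                       # which is the behavior the PR tries to tame
```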
>>
File: Damn.mp4 (1.54 MB, 1280x720)
1.54 MB
1.54 MB MP4
>>103551342
https://xcancel.com/hhm/status/1868770356234010931#m
Sora looks like a toy compared to this, not gonna lie
>>
File: maira.gif (264 KB, 512x512)
264 KB
264 KB GIF
3 years and AI photobashing still provides better restuls than most newer AI cuck papers.

lmao.
>>
>>103551406
>photobashing
what's that?
>>
>t. schizo of stagnation
>>
>>103551415
using a wacom tablet and krita to paint over and fix issues by hand.
>>
>>103551426
thats not photobashing, retard
>>
>>103551418
it's months before we get a UI that makes things fun again. noodleshit is so tiresome
>>
>>103551439
It's AI photobashing.

You need to fix issues by hand and use areas of gens to fix that shit by hand.
>>
>>103551452
still not photobashing
>>
>>103551441
Noodleshit is the best.
>>
>>103551466
cope retard.
>>
>>103551467
*+1 Chinese social credit has been added to your account*
>>
>>103551468
why do you post on a website where no one likes you
>>
>>103551487
I dont care about that.
>>
>>103551504
there are many things you dont care about like learning and improving
>>
>>103551467
if you're autistic
>>
>>103551509
dont care.
>>
>>103551513
ironic considering >>103551441 is one of if not the most autistic posters
>>
>>103551342
Doesn't quite fall into the "local" theme though anon
>>
Get a job, Chris
>>
>>103551406
the fuck is this?
>>
>>103551527
you are acting pretty autistic right now
>>
>>103551547
Kris has been unemployed for more than a decade without any signs of improving. He's not gonna get a job.
>>
https://civitai.com/images/45995790
I hate those mf who provide kino outputs but don't bother to give us the prompt or the workflow
>>
>>103551581
I don't think vaginas do that tho
>>
>>103551590
yeah i dont think vaginas shplort water out like that with muscle flexing
>>
way to out yourselves as virgins
>>
>>103551633
i own a pussy actually
>>
>>103551636
paying for an onlyfans doesn't count anon
>>
>still provides better restuls than most newer AI cuck papers.
I didn't realize you were literate. I presumed you utilized some form of text to speech.
>>
>>103551564
comfy should hire cris
>>
File: 1727218260930062.png (110 KB, 1968x484)
110 KB
110 KB PNG
>>103550653
loras don't work on comfyui's hunyuan workflow?
>>
>>103550901
anons mostly given up on flux
>>
>>103551659
there's no current model that has the same level of control.
>>
>>103551717
txt2vid never has the control I want. Google is no different
>>
>>103550901
Maybe try q1?
>>
>>103551706
sdxl forever and ever from now on
>>
>>103551705
It's broken at the moment it seems. Better open an issue.
>>
is SD3.5 useable or still dead
>>
>>103551782
ahahahahahahah

ahahahahahahahahahahah
>>
>>103551782
Anon, I...
>>
File: 00114-2733463024.png (2.62 MB, 1080x1576)
2.62 MB
2.62 MB PNG
>>103551750
>when SDXL has more staying power than fucking SD2/3 >>103551782
>>
>>103551782
useable for typography but nothing else
>>
>>103551782
Need
3.5 medium
finetune!
Or Large turbo but quantized.
>>
File: tmpsx1n3r9y.png (3.33 MB, 1432x1840)
3.33 MB
3.33 MB PNG
>>103551782
dead in the water
>>
>>103551821
it has more staying power than flux kek
>>
https://www.youtube.com/watch?v=S9L2WGf1KrM
How powerful is that really?
>>
>>103551515
if u didnt care you wouldnt have replied desu seething retardkun
>>
>>103552003
seems like a waste of money
>>
>>103551941
>vramlet copes hard
>>
>>103552083
buy the human race better gpus and maybe we will switch
>>
>>103552003
What a wholesome chungus
>>
bred
>>103552117
>>103552117
>>103552117
>>
unet>DiT
>>
>>103552121
yeah going for the ban you faggot
>>
>>103552126
try me
>>
new bread not very friendly imma stay in this one
>>
>>103552795
*Gives you a blanket*
Let's tell scary tales of pickle files around the campfire...


