Discussion of Free and Open-Source Diffusion models.

Last bread: >>103468004

>Local (Hunyuan) Video
Windows: https://rentry.org/crhcqq54

>UI
Metastable: https://metastable.studio
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
reForge: https://github.com/Panchovix/stable-diffusion-webui-reForge
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI

>Models, LoRAs, & Upscalers
https://civitai.com
https://tensor.art/
https://openmodeldb.info

>Cooking
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
Forge Guide: https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/1050
ComfyUI Guide: https://comfyanonymous.github.io/ComfyUI_examples/flux
DeDistilled Quants: https://huggingface.co/TheYuriLover/flux-dev-de-distill-GGUF/tree/main

>Guides & Tools
Share the Sauce: https://catbox.moe
Perishable Sauce: https://litterbox.catbox.moe/
Generate Prompt from Image: https://huggingface.co/spaces/fancyfeast/joy-caption-alpha-two
Artifact resources: https://rentry.org/sdg-link
Samplers: https://stable-diffusion-art.com/samplers/
Open-Source Digital Art Software: https://krita.org/en/
Txt2Img Plugin: https://kritaaidiffusion.com/
Collagebaker: https://www.befunky.com/create/collage/
Video Collagebaker: https://kdenlive.org/en/

>Neighbo(u)rs
>>>/aco/sdg
>>>/aco/aivg
>>>/b/degen
>>>/c/kdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/tg/slop
>>>/trash/sdg
>>>/u/udg
>>>/vt/vtai

>Texting Neighbo(u)r
>>>/g/lmg
Well at least it's "moving"
It's okay to only gen images.
what is a video?
A miserable pile of images
>>103475601
>Rape
Flash man, he rapes you in a flash, you don't feel anything
>>103475577
wtf
I for one am glad the Flux era is over
>>103475614
one could say it never happened at all, but you absolutely shouldn't
you wouldn't have video if it weren't for images, just remember that
>>103475636
same, the same face issue of all the trained models got sickening.
Blessed thread of frenship
>>103475488
hanbaagaa
>>103475777
that's like a cartoon mask with a human mouth, like the intro to spongebob
>>103475777
can Hunyuan do more than just Migu? I hope it's not Flux all over again when it comes to characters
>>103475794
it does asians really well, but not the arab kind
>>103475805
>it does asians really well
well it's a chinese model, seems obvious they trained it with a lot of videos local to them
>>103475816
sounds intuitive but I'm not sure. Is there even 12 billion parameters' worth of captioned chinese material on the web?
>>103475806
Official comfy implementation soon then?
>>103475827
why do we need that, kijai's node is good enough, no?
>>103475843
Gguf loader
>>103475806
was that img to video?
https://github.com/hkchengrex/MMAudio
video to audio kino coming soon, imagine a lora with sex sounds
>>103475852
nope, pure text2video
>>103475849
>Gguf loader
Comfy never did any gguf stuff, we'll have better luck if city is up to the task
can any kind anon share a good kohya_ss config file for training a 1-image lora with pony xl?
>>103475855
>video to audio kino coming soon
I think hunyuanVideo can already do that
>>103475794
IIRC they used a language model to construct the prompts for the images during training, so my guess is that a lot of knowledge about specific characters, places, etc. was lost in that process.
I definitely had issues with genning Kasane Teto.
>>103475855
finally, i fucking hate it when the guy makes a noise. instant turn off
>>103475882
>i fucking hate it when the guy makes a noise. instant turn off
yeah same, how hard is it to just shut the fuck up and let the women do the sounding
So how much longer until I can prompt POV of a harem of my highschool crushes nude giving me a blowie
>>103475861
>nope, pure text2video
looks pretty consistent with the usual cute foxgirl
>>103475912
yeah, it's pretty good at anime... when it's not moving too much
>>103475882
>hot scene
>male groans
>male dirty talk
I guess people find that hot, I don't like it.
>>103475906
don't forget the virtual headset, this shit is amazing when you look at pov porn
>>103475488
test
>>103475930
welcome back
>>103475906
when there's enough 360 videos to train a model on (never)
>>103475488
Image Diffusion?
>>103475948
no, it's written local diffusion, which means local image diffusion and local video diffusion
>>103475948
inclusively yes
>>103475906
Within the year we'll probably have fully customizable porn that can star anyone you can get at least 10 quality pictures of lol.
>>103475938
if you type VR on sukebei you're going to be surprised
>>103475906
god i miss her, why did i not make a move... now i just jerk off to the thought of her with my fucking imagination
>>103475984
and v&?
support for willow chips? when?
>>103475906
LoRA training soon.
>>103476042
doing a porn lora on this model will be so easy, it already knows so much, can do pussy, dick, breasts... perfectly, all we need is to make it learn sex poses
>>103476042
my terabytes of cached porn will finally come good, to make petabytes of cached porn
They won't let us do loras because they know how much power we'd have. They do not want the little guy to win.
>>103476069
neither did flux
>>103476065
based
>>103476099
sad, actually
>>103475922
>>103475882
>hot scene
>male
I guess gays find that hot, I don't like it.
>>103476141
don't pretend you have standards, you're in ldg
LDG forefront of AI diffusion
I vote to change the name of the general to
>/vdg/ - Video Diffusion General
>>103476042
two weeks?
>>103476214
submit a pr
Oh no Migu!
>>103475508
nice
will there be optimizations for 3090s? 17 minutes a gen is just brutal, can't believe 4090 fags do it in 6. they don't know how good they have it.
>>103476363
>will there be optimizations for 3090s?
the only optimization left is to get the Q8_0 version, that one lets you do torch.compile on a 3090
I can't get huny to output good cartoons, it wants to make realism too hard. Anyone got some good keywords? "sketch" loves to go into black and white outputs but is still realistic.
>digital illustration, cel shaded, cartoon, animated
>>103476374
how much faster would that be?
>>103476397
I think it's 30% faster, which is a big deal, I'm also waiting 19 min to make a single gen, that's horrible
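For anons wondering what that torch.compile step actually looks like, here's a minimal sketch. torch.compile is a real PyTorch API; load_model is a hypothetical stand-in for however you get the HunyuanVideo DiT as a plain torch.nn.Module (the wrapper node exposes similar options internally). The 3090 angle: pre-Ada cards have no native fp8, which is why anons want a Q8_0 (GGUF) build that dequantizes to bf16 instead.

```python
import torch

# hypothetical loader; in practice the ComfyUI wrapper node handles this
model = load_model("hunyuan_video_dit.safetensors")
model = model.to(device="cuda", dtype=torch.bfloat16).eval()

# torch.compile traces the module and fuses kernels; the first sampling
# step pays the compilation cost, every later step reuses the compiled graph
model = torch.compile(model, mode="max-autotune")
```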
>>103475577
Video gen is going to be so cool in five years when we actually have the video cards for it.
Or is 5 years too optimistic a timeline for that?
>>103476405
try out 960x544 40 steps. looks good in about 15 minutes. that's what i'm doing.
>>103476415
I'm already at 960x544x89f 30 steps
>>103476405
>Or is 5 years too optimistic a timeline for that?
dude in two years time we are all going to have super grok chips that can do video gens in like no time. we are going to think it's crazy to wait over a minute for a 2 minute 1080p video.
>16GB 40 series card
>Can use speed optimizations like fp8_fast and torch compile
>But can't fit huge frame counts
An interesting conundrum
>>103476437
Once we get the 5090, we'll be set. Thing will be a beast and optimized for AI, all for the low low price of $3,500.
>>103476383
i didn't have much luck either in the short amount of time i tried.
https://files.catbox.moe/p4hvyf.webm
got this randomly tho
>>103476480
I was hoping for a 24gb 5080 but it's 16gb again.
EATING a burger, not talking to it. MLLM NOW!
>Day ends with Y
>OH boy, time to generate 2 girls in bubblegum pink skin tight outfits doing jack shit for 3 seconds at a time for the next 12 hours
>>103476607
you need more?
>>103476569
>MLLM NOW!
this, give it to us now!
Is the video hype dead already?
>>103476437
I'm not paying more than $1000 usd for a video card, so it doesn't matter how super fast a 2026 card might be.
As long as video gen requires enormous expense the general is going to be pretty quiet. I'm sitting this one out.
>>103476405
>Or is 5 years too optimistic a timeline for that?
next cards give us 32GB VRAM, the ones after will probably keep 32GB, and the ones after that maybe 48GB, so 3+3 = 6 years at the earliest, 2031.
>>103476778
and the next model will use 64gb vram
can't win, it'll always be on the edge of consumer grade
>>103476795
>can't win
desu, you'd be fine sticking with hunyuan and flux, those are really good models
What's the meta for speech to text?
I absolutely fucking hate whisperx, it actively ruins my virtualenvs
>>103476778
Let me clarify. When I say "we" I literally mean us, the posters in this general. The average poster does not have a top of the line card; a few do, but that's not the norm. Prices have been trending up, so we can expect even fewer posters to have top of the line cards in future generations.
I am saying that video gen will be very cool when a 480p five-second video can be genned in 1 minute on an $800 card that the average poster owns. This is why I say 5 years might be too optimistic.
>>103476830
if the whole market doesn't fizzle out because it's not actually profitable outside of hype, we're in the doldrums, stuck with whatever models the geeks with clusters of decade old teslas can shit out
>>103476869
I don't think it's much of a fad. People like us can continue to work with the models, and the open source projects will probably go on regardless. But all the AI companies that provide APIs and shit will just fold.
what's the word for "un-mannequinny"
android or salvaged spinal cord transplanted into plastic body
also wake the fuck up people (except you)
>>103476830
Just skip your latte for two weeks and buy a 5090, it's that simple.
>>103476942
fleshy
>>103477060
>Just skip your latte for two weeks
>Just skip your latte
>Just latte two weeks
>skip latte
>two weeks
>skip
>just
>>103477060
>he buys latte
>he doesnt just make his coffee at home
lmao faggot, anyway hope you 5090 suckers are prepared for getting a second PSU + an upgrade to the ancient breaker system in your parents' house to afford the extra amperage
and the extra uptick in how much electricity you're paying for per month
>>103477134
I don't buy latte, that's why I currently have a 4090, can afford a 5090 when it releases, and already have a 1600W PSU
Knife tricks
vast supports paying with metamask and i have a bunch of trash left over from my defi phase 7 years ago so i will 100% be trying to do hunyuan training
>>103477171
why'd you blow up a 144p video to 1080
https://github.com/Tencent/HunyuanVideo/issues/93#issuecomment-2533257381
>the text Text Encoder is t5 flan xxl
https://github.com/Tencent/HunyuanVideo/issues/109#issuecomment-2533261573
>as it needs an 80gb gpu it seems unlikely to fit on a 24gb gpu unless it is at 4bit
>>103477220
I was testing differing ways of post-processing resolution upscaling, forgot I left it in.
>>103477263
>1600w psu
this is sick. Just dont buy a new gpu and enjoy your latte.
>>103477263
I WILL have that 5090
They're still talking about how AI's going to come along and replace everyone. But the more I use ComfyUI the bigger the skill ceiling gets.
"AI engineer" is the new job title.
People have this idea in their heads that when the next massive new model is released, that'll be it, there'll be nothing more that humans can do. But I'm really not seeing that, everything is all spread across 1000 github repositories and probably always will be. The "ultimate god AI" is just some singularity bullshit that's not even real.
>bought my 4090 for 1700€ in june this year
>now that 4090 costs 2700€
why?
>>103475636
Agreed. Plastic shit. Same face fatties from 2023.
>>103477295
>why?
everyone is buying one to make loras out of Flux
>>103477295
>the more you buy
>the more you save
HOLY SHIT HE ACTUALLY WASN'T BULLSHITTING
>>103477295
Crypto mining in full swing atm
>110 replies
>19 files
yum
>>103477412
nobody is mining with GPUs anymore.
Ethereum has been proof of stake for years now
>>103477419
If you want avatarfagging and image spam go to /sdg/
>>103477419
video is demanding to generate
>>103477419
eat a knife
>>103477419
>oh no, why isn't this place populated by retards who spam images and video all day? get the fuck out
>>103477296
>>103477419
everyone is gooning themselves to death. they're not going to waste valuable gen time to make sfw videos for blue cuck boards
>>103477600
4090 go brrrr
>>103477600
>everyone is gooning themselves to death
this, thats how AI kills all humans. it turns everyone into a gooner that withers away while generating 1girl slop.
>>103477600
>>103477631
Babe, wake up, new diffusion model
https://github.com/lehduong/OneDiffusion
https://huggingface.co/lehduong/OneDiffusion/tree/main
>>103477666
Satan...
Comfy?
Controlnets?
IP Adapter?
Loras?
8gb vram?
2MP?
Comfy?
>>103477666
LOL you need 80GB VRAM for it? top kek
>>103477696
>80GB
wait what? how can that be, it's a 2b model, where did you find this?
>>103477696
>Duong H. Le and Tuan Pham and Sangho Lee and Christopher Clark and Aniruddha Kembhavi and Stephan Mandt and Ranjay Krishna and Jiasen Lu
>>103477712
yeah i dont know what the fuck he's talking about, here's the model on HF
>>103477717
so that's 10gb for the model + its vae + the text encoder?
>>103477712
>>103477717
when I'm bored I spread disinfo to stimulate the thread
>>103477729
>text encoder is 11gb
im gonna kill myself
>>103477737
that's a big boi wtf, it's bigger than t5_xxl, dunno if my 2nd gpu (rtx 3060) will be able to run that
>>103477717
>11.2gb
so that's a ~5b model?
>>103477666
will i be able to run this on my 1650 4gb
>>103477814
the text encoder and the model are both 11gb big...
>>103477717
any comfy nodes?
>>103477717
>>103477666
comfyUI wrapper when?
time for a booster
>>103477878
pepto bismol in the veins would burn like hell
>>103477666
>We adopt the Next-DiT architecture [72] in our model. By leveraging a full transformer-based architecture, our model can work with different numbers of views N.
Oh nice, that might...
>synthetic dataset
Throw it in the trash, it's going to just generate slop.
>>103477983
why the fuck are they all doing that? fucking why, this is ridiculous
>>103477987
Even hunyuan is poisoned with this garbage
>>103475488
Luigi Mangione model dropped today: https://civitai.com/models/1025986/luigi-mangione-or-fluxd
>>103477983There are MASSIVE amounts of photos and videos from before sloppa was a thing. Why the fuck do they keep training AI on AI slop? Fucking hell
>>103478002
>There are MASSIVE amounts of photos and videos from before sloppa was a thing. Why the fuck do they keep training AI on AI slop? Fucking hell
Ikr, the fuck is their fucking problem...
>>103477987
>>103478002
>>103478019
>why
We know why, they're lazy. That's it.
>>103476942
it creeps me out to see a still frame from hunvid
>nearing halfway on posts
>32 files
It hurts remembering 2022
>>103478221
You know where to go
>>103478221
sdg is the image general
>>103478221
big surprise comin in the second half trust the plan
>>103478241
We had images too once
>>103478255
ldg hit image cap maybe two times in its entire history
>>103478263
but it does hit post cap almost every time
>>103478263
it hit cap at least twice within the last month:
>>103328816
>>103280382
It does still happen sometimes
Quality over quantity
>>103478288
Image output is a shitty metric to gauge the quality of a thread. Just look at /sdg/. It produces nothing, discusses nothing of note, they just jerk each other off. They're literally stuck in 2022.
>>103478295
>"quality over quantity"
>posts neither
Yeah "quality" is when we have 350 low effort text posts about "how much VRAM do I need for X" or "I wonder how much a 5090 will cost" or "is there a new model yet" or "why is the text encoder so big".
>we
>>103478346
Sorry to hear your 1girl didn't get any (you)s, anon
>>103478344
Image output is a great metric to gauge whether you people have any interest in genning at all. As far as I can tell 90% of this thread is about "fuuuuck I wish I could afford a 5090" and impotently hoping the next model that comes out will cure your boredom. Completely worthless discourse. Spamming slop would be better
>>103478353
You will be
>>103478346
yeah quality is when we have posts bitching about the low quality of the average posts in the thread when news is slow
im posting an old 1girl now just to spite you because i think it'd give me a gentle chuckle
The last big happening was a video model that only a small percentage of anons can run. It's not that deep.
>>103478385
>bitching about quality
I was explicitly replying to a post that implied the thread's low energy was preferable because "quality over quantity". You then pretend this was me, unprompted, bitching about the thread's low quality. You don't understand the discussion you're jumping into or who you are "spiting" or why.
Limitations of hunyuan became evident. Even the video anons aren't posting
remember when 70-80% of the images were one poster?
>>103478346
You sound unaccustomed to the quality of the average 4chin post
>>103478477
you sound unaccustomed to clicking the link in a post to see what a poster is replying to and then understanding the post *as a reply*, that is, having its meaning shaped by the context in which it was posted, that context being the post to which it is a reply
YJK who that ^ is kek
>>103478507
Haha yeah, absolutely... but why don't you say who we both think it is? So I know for sure we're on the same page.
>>103476700
twas quick, now anon is restless for the Next Big Thing
>>103478574
the whole problem with video gen is that it wasn't quick
>>103478574
Waiting for the literal who in the github PR for the wrapper to merge his LoRA inference PR and release his training script so I can get a bunch of Turkish grifter images and kick this whole thing off with a reddit post showing the grifter doing dumb shit.
Christmas Cat
>>103475488
How do you go about getting a proper hourglass body shape in your output like pic rel? Trying to get the model to do this feels like pulling teeth at this point and I'm out of ideas.
inb4 issues w/ pic, hey it's just the first pic I had on hand, relax lol
>>103478627
nice
>>103478656
You've seen that style a thousand times
>>103478663
You're talking to the author.
>>103478673
I know
>>103477484
nice gore
nom nom
>>103477620
Kek.
>>103478817
This is too smooth for most anime studios. Did you ever think about how many korean in-between artists you'll be hurting?
>>103478835
Why the fuck isn't the industry using framegen? Or they already are and it's the explanation for QUALITY
>>103478843
Nah, you can see the QUALITY frames are just rushed doodles.
>>103478835
>korean in between artists
interns that don't get paid? their life doesn't really change. maybe less suffering than being led on by a lie that they will get a paid position eventually. They can just jump into keyframing with ai assistance. the model can be trained off the assistants.
>>103478859
this
go back
>>103478872
Are you using STG at all?
I know Hun supports it, but thus far nobody seems to mention it. But what I've seen it do on other video models looks promising.
is the plan to pray that the tech outpaces your lies?
Some dude right now in both /sdg/ and /lmg/ throwing out random combative phrases to goad a response. Don't take the bait.
>>103478651
hot pic tho
important to know what model you're using
>>103478843
>Why the fuck isn't the industry using framegen?
they have been using it for years now btw. can't remember the software tho. starts with a c
>>103478892
not atm. just spooling gens while I code. I wanted to try the new comfy gui but it's so shit the video preview isn't even in the node and the queue just gave up on video previews altogether. very strange stuff happening too, as well as a worse runtime. very disappointing
>>103478929
I know, I never really do that.
so text to image was stagnating after flux and now it's completely dead with these new video models. Have any interesting improvements been made to t2i in the last couple of months?
>>103478945
hyunwunweiwun is good for still images
nice avatarfagging btw
>>103478943
Would desperately like a preview of the video gens I'm getting so I know to nope out early. Is it even possible? I know we can't actually see it moving, but surely the frames can be seen, right?
>>103478945
https://github.com/lehduong/OneDiffusion
>>103478968
doesn't this need 80gb vram? It's also not that impressive
>>103478651
I'm getting results like this after only a few minutes of trying and I'm using a guidance value of 1.37 (on flux-dev nf4). I think you can get there
prompt was
>Harlee Washington \(age 19\) nahhh she too damn cute... this angel popped up in my tiktok feed and her body is way too nice [eyes emoji x2] [fire emoji x 3] she gotta be photoshopped.
>u can tell this flabby hoe got lipo cuz why she got huge thighs and a tiny little waist. aint nobody that skinny with curves like that. men b stupid af thinking she look good. "she slimthicc" "she got a hourglass figure" no dummy she just a fattie who got lipo and a tummy tuck and she still saggy asf. the bikini is desperate too, she belong to the streets, and she got a ugly face ngl
>a curvaceous college student with a sexually mature body poses enticingly in a bikini in front of a mirror in her dorm room for her tiktok. Her thighs are enormous.
>>103478945
Illustrious
>>103478993
yes, people who gen cartoons are having fun.
>>103478982
I think the trick to an hourglass, in short, is to emphasize big thighs and a small waist.
>>103478932
>model
The three I tried were epicrealismXL, flux1-dev, & juggernautXL + some vae
>>103478964
no, with the way temporal layers work it has to be a batch. it's really annoying that this is the way things are going, since prompt travelling, loopbacks and frame interp give much more control and you can just nope out of shitty results like you said
getting really shitty slop style atm and I ain't posting that. the interwebs is already full of shitty slop. why the fuck did they train this in? so fucking frustrating
>>103478982
this one's kind of a bad gen but it does illustrate that the prompt can produce an "hourglass" of the type anon meant, even down at 1.37 guidance
>>103478982
>>103479010
I've been able to get this semi-working side/three-quarters view, but imo it doesn't look quite right. Plus a straight-on view seems nigh impossible, probably for limited dataset reasons
>>103479040
>>103478982
sorry for the lack of example output, but I already shut my pc down for the night and I'm too lazy to turn it back on at this point
>>103479040
You can't easily force Flux to do perspectives except by prompting a kind of image that has a conventional perspective. E.g. a runway image will tend to have the model walking in a very exact way toward the camera, which is why runway prompts produce good results reliably.
If I wanted to eliminate side angles I'd change the "poses in front of a mirror" part of the prompt and try to think of a different sort of conventional image where people face the camera more directly. But remember that every piece of every prompt carries its own baggage; there's no free lunch.
img2vid aint coming, is it?
>Manifests cigarette
Nothing personnel
>>103479246
vid2vid ain't great, I'd temper your expectations
>>103479324
That looks good to me. What's the issue?
>>103479324
you will finally be a woman
>>103479324
You made girl into different girl. Good enough
>>103479372
>girl
anon..
>>103479324
eh, with loras you can probably do pretty easy deepfakes
>>103479386
>with loras
With WHAT training script?
>>103479324
Looks like it mostly works. It's fine.
>Look at the issues page on the github
>sage attention
>sage attention
>sage attention
>sage attention
>sage attention
>>103479324
is it even an official implementation?
there's no mention of v2v on the repo
>>103479443
>is it even an official implementation?
Nah, just a feature added in the wrapper.
>>103479324
Up the denoising
>>103479460
I know, I've used masks before
too much and it doesn't follow, too little and it doesn't change
I'm confused, what's the issue with vid2vid?
>>103479324
hunyuan sama, please try to impregnate this robot
>>103479472
framerate apparently
>>103476374
So is that why torch compile doesn't work for me, because I have a 3060?
>>103479543
Yes, it only works with 40xx cards
>>103479482
Here you go dude
>>103478888
This one is impressive to me
>>103479585
Oops, gave you the static image.
Lemme just lower that denoise a bit.
>>103479603
It's a FURRY!!! RUUUUN!!
>>103479592
>>103479603
i... cant goon to this
>>103479649
I swear, I tried for real, but making her pregnant made her black.
I guess if you just mask out the face you can make people pregnant.
migu are you ok?
eww wtf
>>103479822
>>103479878
lol
>>103479822
>SPLOOSH!
>>103480207
chat is this real?
>>103480207
This isn't even a gen. This is just a random webm
>>103477295
IIRC NVIDIA stopped producing 40XX cards even before it was known that 50XX production had started.
Can't let those prices get too low!
>>103480338
>the more you buy the more you save
he meant it, madman
>>103477412
Yeah but the crypto is AI instead and they're mining prompts. Just like us.
>>103478221
I usually keep my weird fetishes to myself but if you insist...
>>103480388
I hate outputs like this. The weird fetish is whatever, but it's clear from the composition that the base is AI slop. It's a shame that the model seems to be poisoned with this shit.
MLLM text encoder when?
Image2Video weights when?
Multi-GPU support on Comfy when?
Turbo (4-step) hv model when?
I am going insaaaaane!
>>103480405
>MLLM text encoder when?
May never come. This was never a given.
>Image2Video weights when?
Soon, maybe. We have no idea when this is supposed to be released or how far along it is. If you look at their website, you can see it does controlnets as well, so there's a whole lot of stuff we don't have right now.
>Multi-GPU support on Comfy when?
When someone makes a PR for it on the main repo.
>Turbo (4-step) hv model when?
No idea.
I'll add some more questions.
>GGUF or those new super fast nf4 quants when?
>LoRA inference and training code released when? (It exists but is not published and merged)
>>103480242
>>103480292
>vramlet here
>>103480424
If img2video happens, then LoRAs would be pretty much useless unless they are trained on motion and not just subjects/likeness
>>103480466
Let's see how good/bad img2vid is before we discount the value of LoRAs.
>>103480400
I mean, the AI slop aesthetic could just be an artifact of some patterns being easier or harder for the network architectures in use to learn.
Though I definitely feel like the outputs look more like AI slop for vertical video and unrealistically large breasts or butts.
Unfortunately the latter two are also correlated with high body fat in outputs that don't look like AI slop.
LoRA inference any moment now.
>>103477230
>https://github.com/Tencent/HunyuanVideo/issues/93#issuecomment-2533257381
Anime pfps are more trustworthy in AI, as always.
>https://github.com/Tencent/HunyuanVideo/issues/109#issuecomment-2533261573
>FurkanGozukara
How the hell is this dude EVERYWHERE
>>103477737
I wonder if it can be used on hunyuan
>>103480534
How else will he find information to paywall
>>103480534
Holy shit, I hate furk so much.
>>103480645
>>103479822
I don't think she's fine bros...
>>103480558
that's really impressive when you think about it. what I'd love about Hunyuan is the possibility to do it like Sora: instead of doing a simple v2v you ask the model to just replace the guy with the girl, that way the model only changes the character and not everything else surrounding it
>>103479400
>With WHAT training script?
he didn't add the training script in this PR? that's fucking retarded... https://github.com/kijai/ComfyUI-HunyuanVideoWrapper/pull/72
>>103478892
>Are you using STG at all?
>I know Hun supports it, but thus far nobody seems to mention it.
because STG uses the same mechanism as CFG > 1, so when you use it it's twice as slow, and it's already slow enough as it is
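To make that cost argument concrete, here's a minimal sketch of a guidance step, with model as a hypothetical stand-in for the video transformer. CFG needs two forward passes per denoising step instead of one, and STG swaps the unconditional pass for a perturbed (layer-skipping) pass, so either way the per-step cost roughly doubles.

```python
def guided_step(model, x, t, cond, uncond, scale):
    # forward pass 1: conditioned on the prompt
    eps_cond = model(x, t, cond)
    # forward pass 2: unconditional for CFG; a perturbed pass for STG
    eps_uncond = model(x, t, uncond)
    # extrapolate away from the unguided prediction
    return eps_uncond + scale * (eps_cond - eps_uncond)
```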
>>103478438
>Limitations of hunyuan became evident.
the toy isn't complete, once MLLM and i2v are released it'll be even more hyped than now
>>103481002
what's a MLLM?
Also I agree, if they keep reducing it to make it faster like they did with flux, then it could have potential, also once crazy people make loras for this
https://chendaryen.github.io/NitroFusion.github.io/
ehh...
https://huggingface.co/spaces/ChenDY/NitroFusion_1step_T2I
>>103481014
>what's a MLLM?
the text encoder we're currently using is just duct tape (llama3), the official one is MLLM but they haven't released it yet
https://github.com/Tencent/HunyuanVideo/blob/main/ckpts/README.md#download-text-encoder
>>103481026
ok lovely. Sorry for asking something I could have easily googled. Forgive me. And thank you.
>>103481020
uh huh
>>103481063
lmao, they managed to do it worse than SD3M, new record!
https://github.com/Tencent/HunyuanVideo/issues/117
>More pressure about MLLM
You love to see it
>>103480816
it captured his yellow teeth - a testament to his model training skill
>>103480424
>May never come. This was never a given.
they said "we haven't released it YET", meaning that at some point they will
>>103481026
Sora is really disappointing not gonna lie
>>103478872
>>103478943
>>>/g/sdg/
Didn't the Mochi guys say that they'll release MochiHD or their i2v model in december or something?
>>103481313
high fidelity tho, gotta give it that
>>103481063
WTF IS THIS?!
>>103481313
>Sora is really disappointing
they're putting the turbo version on the API, the real deal would've been way more expensive for them; as a consequence, we're running a worse model than the one presented in february. their PR communication about that model is catastrophic, they should've just displayed the turbo version in february and then released the API the same month, the wow effect would've still been there because back then the best model we had so far was this
https://www.youtube.com/watch?v=Itbc12qXr30
>>103481356
more interesting cinematography too, although it makes it look like an anti-drinking ad
>>103481356
>>103481380
looks like some professional slop. I like that hunyuan amateurish look, that's how it looks in the real world; I guess I'm tired of pro pictures/videos because of Flux
>>103481380
it probably enhances any prompt you give. plus 720/1080p
>>103481313
>>103481380
Add "camera moves around to the left" to the hunyuan prompt and it's going to be "interesting cinematography" as well.
>>103480445
>>103480207
That's pretty good actually.
JOHN CENA, THE MIGUNATOR IS COMING FOR YA!
>>103481378
>their PR communication about that model is catastrophic
>our model is incredibly dangerous
>you can't ask for humans, it's TOO DANGEROUS, unless you pay more
>you can't ask for known IPs
Understatement of the century.
>>103481587
>china just let people generate whatever the fuck they want in HD video
OpenAI is so fucking gay and corporate it makes me vomit
>>103481605
yeah, MiniMax for example understands what fun means, no one wants to do some boring ass zoom in on some boring ass fields. the worst part for OpenAI is that sora is supposed to be a tool for filmmaking, how can it be when it refuses humans in the first place? I overestimated OpenAI desu, maybe their chatgpt success was a fluke, they can't keep that momentum at all
>>103481622
>sora is supposed to be a tool for filmmaking, how can it be when it refuses humans in the first place
the version for hollywood will probably be uncensored
tf is happening kek
Just a tip bros. Generate a single frame at a fixed seed at low steps, see if the blurry image you get looks like something you want, refine your prompt and iterate until the blurry first frame looks good.
Then up the steps and keep generating, see where it stops improving in quality, sometimes this can be at 30, sometimes at 100. Scale to the absolute optimum of quality.
Only then start increasing the frames.
This way you're not wasting 10 minutes on a gen. You can also batch 100+ "single frame videos" and see which seed has the best first frame to generate videos from.
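A sketch of that seed-sweep idea, purely illustrative: generate is a hypothetical wrapper around a HunyuanVideo pipeline and score stands in for eyeballing the frames, neither is a real API. Note the replies below dispute the premise, since changing the frame count changes the denoising trajectory for a given seed.

```python
# cheap pass: 1 frame, few steps, just to rank seeds for a fixed prompt
candidates = []
for seed in range(100):
    frame = generate(prompt, seed=seed, steps=10, num_frames=1)
    candidates.append((score(frame), seed))  # score() = manual inspection in practice

# expensive pass: full step count and frame count on the best-ranked seed
best_seed = max(candidates)[1]
video = generate(prompt, seed=best_seed, steps=30, num_frames=89)
```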
>>103481762
What? The number of frames completely changes the output of the seed.
>>103481762
that's not how it works, you can have a fine image on a single frame and then when you go to video mode it looks like shit, even on the same seed
>>103481784
>>103481788
Works on my machine.
>An episode of Seinfeld with Hatsune Miku in it
Not that there's anything wrong with that!
>>103480445
how much ram do you have? I think the blockswap maxes out my mere 24gb of ram lol (normal ram, not vram)
>>103481805
Bros, I got a confession to make. I never prompt Miku, I actually prompt a Miku cosplayer. Sorry for tricking you all.
>>103480445
Also why not use the bnb_nf4 setting for the text encoder?
>>103481817
yeah you need a shit ton of vram to make this shit work
>>103480445
also also, do you have 12gb vram or 16gb?
>>103481817
>>103481826
*ram
>>103481794
Post a catbox of a first frame that doesn't change when you increase the number of frames
I've been using 4xNomos8k_atd_jpg to upscale some images and it's great, but sometimes there are these odd transparent black bands on the final upscaled image. Does anyone know a good way to stop that from happening?
>>103480400
Just turn down your guidance value. It doesn't look like that at 4.0 and below
>>103481784
>>103481788
>>103481855
You misunderstood what I meant. The first frame isn't directly what the video generates, but what is visible in that frame is still represented in the final video. Seeds with better initial frames consistently produce better full videos. The quality of the steps also transfers to the final video.
>>103481924
>The quality of the steps also transfers to the final video.
I don't believe that at all, I tried your technique days ago, I had a perfect first frame, then I went to 97 frames and it was ass. looks like it has better prompt adherence when it's in image mode than in video mode
>>103481930
Yes, the video is always lower quality than the image. What you need to do is generate 100 first frames and then develop all of those frames into videos, 100 videos. Compare them all and you will see that the seed with the good-looking first frame STILL gives the best quality video, even if the video is worse than its first frame, when you compare it to the full videos of the other, worse first-frame seeds.
>>103481924
Nah dude, I think your technique is bunk
holy shit, HunYuan absolutely rocks for /ss/ bros
>>103481975
>for /ss/ bros
for what?
>>103481984
Not elaborating
>>103481975
I've prompted so much ss on anime models but I don't think I can stomach it in 3d.
>>103481975
catbox or fake
>>103481984
Super Saiyan dragonball enjoyers
>>103482023
speaking of dragon ball enjoyers, someone really needs to recreate vidrel
Come on Tencent, playing around with llama3 was funni, but now you have to give us the real text encoder
Is someone here smart enough to explain to me why Q8 quantization of the 13B HunYuan model wouldn't work?
It's not even about the size of the model being smaller for better VRAM utilization. It's about speed of generation.
>>103482195
Q8 would definitely work, like on mochi: you get better quality and you'll be able to use torch compile on a 3090, but kijai doesn't want to do it :(
>>103481826
lol I was gonna correct you. Time to get some ram then...
Bakeryy
https://github.com/kijai/ComfyUI-HunyuanVideoWrapper/commit/6ab3d0ae62bcdc878e4e68db35d5b9566745613c
>initial RF-inversion for testing
What's that?
>>103482225
>kijai doesn't want to do it :(
how do we do it
>>103482451
>how do we do it
step 1) Be a coding god
step 2) Profit
>>103481762
Someone wrote that you should instead generate many low resolution videos, then pick whatever looks the best, and generate them again (same seed) at a higher resolution.
Would that work or is the result completely different depending on the resolution?
>>103482511
The result is different based on resolution.
Mostly because the model correlates resolution with the training data at said resolutions.
Meaning 544x960 is correlated with modern tiktok videos while 320p or something is correlated with old 2000s porn videos from across the web. So the AI model thinks it needs to create something like that.
>>103482526
god damn it, so there is no way to create a bunch of them without using the full time for every generation
>>103482446
>https://github.com/kijai/ComfyUI-HunyuanVideoWrapper/commit/6ab3d0ae62bcdc878e4e68db35d5b9566745613c
RF-Inversion is amazing, this is RF-Inversion in flux, it works amazingly with loras; if they make it work then you'll be able to generate your own kinos with loras of whoever you want
if a model says 1x/2x/4x, is that supposed to be the max scale you can upscale the original image to?
>>103482539
>can't tell which one is the original
my ai sensors aren't tingling anymore... it's over
>>103482446
what would be the input, another video?
>>103482566
yeah, looks like a fancier way to do v2v
>>103482539
so it's like an inpainting trick?
>>103482552
the left is the fake one, look at the earrings, that typical ai necklace
>>103482446
https://github.com/kijai/ComfyUI-HunyuanVideoWrapper/blob/main/examples/hunhyuan_rf_inversion_testing_01.json
can't make the workflow work, I updated the node
>>103482636
nvm i'm a fucking retard, I forgot to refresh the page
>>103482636
nvm i'm also gay too
>>103482752
new bread
>>103482892
>>103482892
>>103482892
>>103482603
Bait? Look at the numbers and signage on the train car in the background, the length of the lightbulbs, and the evenness of the door plates.