The First Output Edition

Discussion of Free and Open Source Text-to-Image/Video Models and UI

Prev: >>106722132

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2203741
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>nigbo
I removed sage attention. 480s for a 720x720, lightning loras, 8-step gen. Like 50s of those are color correction and interpolation.
Sounds about right?
oh boy, I drank too much again
>>106722862
it's literally called Hunyuan Image 3.0, they're advertising it like an image model and seem to falsely believe that anybody gives a shit about how many completely generic sameface asian ladies it can gen at 1280x768 when it's an 80 BILLION PARAM model. Like no, I don't want to see ANY shit that doesn't show off the maximum extent of your 80 BILLION PARAM model's capabilities; every single one of the examples should be impressive, when in reality none of them are.
So HunyuanImage-3.0 is actually an autoregressive MoE LLM, not just some dense 80b image model
>>106723744It could be a solid gold turd. I can't run it.
>>106721558
>>106721465
model/prompt/catbox?
>>106723751ramlet issue
>>106723730Idk anon, an anon tested it towards end of last thread, and while not enough to make a conclusion yet, those gens look 80b worthy to me. They are better than anything similar I've seen so far from a local model, and its prompt following is very, very good. It is good enough to spark my curiosity and warrant possible API testing. Let's put our grudges against its steep requirements aside and see for ourselves if the model is good thru testing.
>>106723623i do sexy sometimes
>>106723813You've got fresh model brain. It's slopped.
>>106723815nice style, what artist?
>>106723717
Hopefully the girl retains her body proportions from the reference in the next wanimate version
>>106723826fresh lora i trained on >>106723142
>>106723839
She had to match Tuco's body shape, it was a hard ask.
>>106723717standing there like a silent gtav player during cutscenes
>Do something in another workflow
>Preview on workflow that is currently rendering fails to display
When are they going to fix this stupid bug? What the fuck are they doing with that API money?
>2026
>still no wiki or pastebin dedicated to ksampler settings for each model
>>106723995euler/simple
>>106723995depends on the workflow you're using and steps, so its worthless, just find a popular workflow and use it
>>106723995
Then do it yourself.
Third output of Hunyuan 3.0.
This one took 38:09. It's taking longer each time, so I need to figure out if it's some kind of leak or just higher variability than diffusion models.
I told it to use Yoshitoshi Abe's style and the output is still generic, so that's a bit informative.
>>106724047Fully chinese within 3 seconds. Impressive.
>>106724038this looks like fucking 2022 models. the random lines and chaotic fine detail
>>106722281why is this face familiar to me
>>106724054If you run mayli through a tagger it can't make up its mind. Half of the captions are asian, the other half are caucasian.
ramtorch status?
>>106724100Ask yourself what the status of any of the furries projects are and you got your answer.
reduce steps to 4 to unslop. vrambloaters hate this trick
>>106724124cool 1995 CRT TV lora!
>>106724092i also have a difficult time expressing the uniqueness of her beauty through words alone
https://huggingface.co/lightx2v/Wan2.2-Lightning/tree/main/Wan2.2-T2V-A14B-4steps-lora-250928
is it better than the previous version?
>>106724155
Looks a lot better. But it's also T2V only, with I2V coming "When an even better T2V LoRA is released".
>>106724155
Movement is a lot better and there's some improvement on everything else, but the problem of not being able to make dark scenes remains.
>>106724100turn it into a custom node so people can test it
>>106724160I'm not going to complain much, these guys are heroes, they've made AI video creation so much more fun by not having to wait half an hour anymore.
>>106724189im surprised they dont get paid by any video gen model labs to fully fine tune their models to work at 4 steps at almost full quality, given whats possible with just this lora
>>106724183no YOU do it
>>106724200
I tried but it keeps giving me a tensor shape mismatch
>>106724038this reminds me of pixart except it's 40 minutes per image
I hate Face Detailer in ComfyUI. It should always use the max res that the model was trained at. Instead, it uses values based on confusing sets of inputs which can fluctuate depending on the size of the face and where it's positioned.
I redid it manually using the standard face bbox detector with 1024x1024 inpainting boxes and it works a lot better. Fucking stupid that face detailer doesn't have that as a simple option: "always use 1 megapixel canvas sizes"
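If anyone wants to replicate that without the node, the logic is tiny. Rough sketch of the idea in plain python (detect_faces and inpaint are stand-ins for whatever detector and sampler you actually use, not real ComfyUI APIs, and it assumes the image is at least 1024px per side):

from PIL import Image

CANVAS = 1024  # always work on a ~1 megapixel crop, the res the model was trained at

def face_crop_box(bbox, img_w, img_h, canvas=CANVAS):
    # center a fixed canvas x canvas box on the face, clamped to the image edges
    x0, y0, x1, y1 = bbox
    cx, cy = (x0 + x1) // 2, (y0 + y1) // 2
    left = min(max(cx - canvas // 2, 0), img_w - canvas)
    top = min(max(cy - canvas // 2, 0), img_h - canvas)
    return (left, top, left + canvas, top + canvas)

def detail_faces(img, detect_faces, inpaint):
    for bbox in detect_faces(img):            # stand-in: any face bbox detector
        box = face_crop_box(bbox, *img.size)
        crop = img.crop(box)                  # fixed-size crop, no rescaling
        crop = inpaint(crop, denoise=0.45)    # stand-in: your sampler at native res
        img.paste(crop, (box[0], box[1]))
    return img

Same thing the detailer does, minus the fluctuating context scaling.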
>>106724038>80B
>>106724155
it looks less slopped and the speed is back again, seems to be a great improvement overall
https://www.reddit.com/r/StableDiffusion/comments/1nshjxg/updated_wan22t2v_4step_lora_by_lightx2v/
>>106724242
>t2v
I sleep. I also want 8 step i2v. 4 step is too ass for my tastes
>>106724225I pretty much do the same thing though I use different detection models
>>106724248maybe they'll improve the 4 steps version to the point it'll be as viable as their old 8 steps lora
Can't you just run any step amount with the speed loras?
>>106724264No, that's why the lightx2v guys release 4 and 8 step variants, the 8 step retains more information or something.
>>106724155
>https://huggingface.co/lightx2v/Wan2.2-Lightning/tree/main/Wan2.2-T2V-A14B-4steps-lora-250928
I'm not sure they're working in ComfyUI, I got this shit
>>106724273
yep there's an issue there
https://huggingface.co/lightx2v/Wan2.2-Lightning/discussions/42
>>106724222Imagine if Pixart wasn't bought out by Nvidia. We'd have so much soul right now.
>>106724273Ruh roh, you usually get that when you try to load a lora that was made with a different model, ie 2.1 loras on 2.2
>>106724285Could be whatever whacky format they were using for training. I think it's an easy fix.
>>106724038
Finetunes and LoRAs will fix it
Guffaw
>>106724038There could be something wrong with the implementation there maybe, just saw this output on reddit. Apparently the prompt was "Draw the main villain Deku struggles with in the My Hero Academia Forest training camp arc"
>>106724273
>>106724282
>>106724285
>>106724288
try kijai's ones, maybe they're compatible
https://huggingface.co/Kijai/WanVideo_comfy/tree/main/LoRAs/Wan22-Lightning
>>106724294ugh, fine. I'll boot up t2v for the first time in a thousand years.
>>106724292
>six fingers on an 80b model
dios mio
>>106724305Chatgpt and nano both give me 6-7 fingers every now and then, no model will ever fix this.
>>106724160
>>106724189
>i2v later
I guess in a week or so, sad.
>>106724294
>rank 128 for high
>rank 64 for low
why?
>>106724305They also trained at 1024x1024, there's only so much you can pack into that many pixels. Why they didn't train a smaller model at 2-4MP is beyond me.
>>106724311You figure i2v would be first since like 99% of videos I see posted anywhere are i2v, but whatevs I guess
>>106724321
I've seen t2v released first for so many things at this point, I think it's just a simpler model to do stuff with or something.
Most people use i2v, so it makes no sense otherwise.
>>106724311>>106724321you can try the t2v version to do a i2v video and see if it improves shit lol
>>106724038grim
>>106724310You know what never gives me six fingers? Wan, either t2v video, or when it brings a hand out that was hidden/obscured in i2v, or when I use it to generate images. Never.
does any saas have hunyuan 3.0 yet? its not on lmarena or dreamina. there isnt even a api node yet! i just want to test the model
>>106724340Maybe has to do with it being trained on video, it sees hands from every angle in long-context form, so it learns better than an image model can with static single images alone
>>106724340Wan has melting fingers
>>106724038
4th output of Hunyuan 3.0. I only changed the prompt to request Masamune Shirow style in another attempt to see if anything would be recognized. Obviously, it wasn't. It also accidentally a leg.
This one took 46:25 so something is definitely broken and causing successive generations to get slower. The first one at 13 minutes was still terrible but probably more correct.
>>106724292
I have been using only the bare transformers inference example code. Their webui has additional default prompts that might be helping it. I'll try one more with that active before I have to stop for now.
>>106724352It looks like the first anime LoRA I saw for SDXL when it was released
>>106724340
I got alien hands in i2v when the woman showed her hands in the video (it started without them).
No idea if it's due to the 4-step lora, q8, or inherent to the model.
>>106724352
>cracks in ponytail
LMAO this model is pure ass
>>106724359
>4 steps lora
Probably. I don't use lightning loras and I've never had weird hands. I have had a spare limb erupt from a body, or a tongue come out of an armpit, weird shit like that. Hand and feet anatomy is solid though.
>>106724352
>46:25
Is there any worse feeling than knowing you could have saved over half an hour by just restarting the environment and genning again?
>>106724362that's actually amazing
>>106724362
That's obviously very old dried cum.
>>106724155People will post this non helpful shit and expect help.
>>106724393turn it off and on again
>>106724398
>it learned compression artifacts
>reproduces them unprompted
OH I AM LAFFIN', M8
can i use the lightning lora with wan 5b?
>>106724412No.
>>106724393
The output is a blur,
A hazy, ghost-like image,
What could be the cause?
>>106724412>5b
>>106724294>kijai
What are your settings for non lightning lora workflows for 2.2?
>>106724423>the
>>106724398Did it output at that weird res?
>>106724423all right all right, we have to call him KJ Boss I get it
>>106724292
>>106724038
I wonder if the model is trying to approximate things it doesn't understand using its LLM or something. This was "A pencil sketch of Bayonetta". It looks nothing like the character, but looks exactly like what someone who never saw the character but read a description would envision. This is GPT's two-sentence description of the character:
>Bayonetta is a tall, striking woman with long black hair styled into an elegant beehive and sharp bangs framing her face, often wearing glasses that accentuate her confident, alluring look. She's most recognizable in her skintight black outfit made of her own magical hair, adorned with intricate designs and red accents, giving her a sleek, witch-like yet stylishly modern appearance.
>>106724398
>>106724423Kijai workflows are just better.
>>106724425the same as with lightning but with more steps
>>106724434
I found this shit here lol
https://xcancel.com/JeepersMedia/status/1972211090961715512#m
>>106724398Not even the people in the reflection are correct.
>>106724447you have no idea how much I miss these kinds of ads
>>106724443mid 90's adventure game background aesthetic
>>106724446Even the cfg and sampler?
>>106724447wtf? even the text on the bottom left is perfect
>>106724459yeah
>>106724447
>benis in bagina
KEK, what the fuck did you use to make this
>>106724459
>1 cfg without lightning lora
kek
>>106724445maybe, but that's why I don't really like his implementation. You're pretty much stuck with whatever workflow there is for his nodes and his nodes only (and that's on top of comfyui)
>>106724471
>1 cfg without lightning lora
>20 steps
If you're curious.
>>106724471
and KEKAROO to you too anon
>>106724476When the molly hits
>>106724393
SAME HERE
>>106724473His implementation is Wan's official implementation.
>>106724476>damn it shaggy this ain't weed
>>106724470
It's from /v/ but the anon that made this was hanging out in the thread. He did explain how he made some of them though.
>>>/v/721838671
>>106724476Damn, didn't know Hercules liked to get wet.
>>106724447
>All right 1995 was a pretty good year in video games
>Really the latter half of the 1990's saw some of the most iconic games coming out
yep, and I'm glad I was old enough to witness this in 4k (or in 360p if we want to keep the context accurate kek)
>>106724486That still doesn't change what I said tho
>>106724494
>he uses a1111
more proof that cumshartui is good for nothing but mass production of assembly-line slop. nobody inpaints or puts effort into their gens with comfy because it's a pain in the ass. comfy is for non-creative individuals, such as those from the indian subcontinent
>>106724494
>A1111
jesus christ, at least use forge, how is he doing the text though if he's only using noobai?
>>106724476
Smoking the good shit
chat, how do I improve the backgrounds on my 1girls?
[spoiler]there are actually 2 girls[/spoiler]
Is everyone in this thread retarded? He's clearly shopping in the text, logos and a bunch of other shit. Christ...
what an embarrassing post
>>106724524obviously pasting it on top in an image editor and applying a filter. so many tards here really waste time with shit like qwen and flux, 4+ minutes per gen only to get stuff that's 1/4 as good as simple sdxl with a bit of inpainting.
>>106724532
>spelling mistake
it's over
>>106724537Yes pretty much 90% of people in this hobby are next level retarded
>>106724537But... why would he do that?
>>106724537fluxbrowns lowered the average thread iq by 50 points, got even worse after the 4o tourists.
>>106724543Someone will call you an 'Indian' or 'third worlder' and thread devolves into political infighting.
>>106724543memes from 2010s are cancelled anon
>>106724532maybe nano banana, "make this less shit", then img2img over it
And when I get that feeling
I want sexual healing
Sexual healing, oh baby
Makes me feel so fine
>>106724537I thought that was clear. The main Lara image is what he genned, the rest is shooped, including the filter over the top. It's still cool nonetheless, but I'd be seriously concerned over anyone who genuinely thought the whole thing was genned.
>>106724155
>is it better than the previous version?
definitely better, the slow mo shit is gone, now I wait for their I2V version
>>106724498kek
>>106724567My grandmother sent me a video of a bunny the size of a dog. It was clearly AI. She thought it was real
>upscale by 1.5
>image is fine
>upscale by 2
>result is an abomination of limbs and multiple nipples
>upscale by 2 on a different checkpoint
>image is fine
>upscale by 2 on the same checkpoint with a different upscaler
>image is fine
is this related to a checkpoint's maximum output resolution or the upscaler not being compatible with that checkpoint past a certain level of upscaling?
i never changed the denoise level either so i dont know what causes this. im using illustrious/noobai
>>106724593
Some checkpoints react very differently to the denoise level.
I have one that goes all the way up to almost 0.7 with good results at 2x upscale, while another does the same amount at 0.15.
https://phantom-video.github.io/OmniInsert/
I wish we could insert some shit into some I2V process, that would be much more fun
>>106724612
>kling
kek
>>106724602i see, so its a case by case basis. this helps a lot, i thought my install was fucked or something, thanks.
>>106724593Are you using tile controlnet?
>>106724593
>is this related to a checkpoint's maximum output resolution
this one
I tried a lot of upscaling with different noob/illust models, and illust v2 was the one I had the fewest limb abominations with, because it has the biggest output reso, but since the model is shit itself it fucks up the style every time
And that anon is right too >>106724602, different models will need different denoise levels, sometimes you are lucky if you find the sweet spot between steps/denoise, but most of the time you will get a lot of abominations because they aren't trained for higher resolutions
>>106724633From what I've seen, a surprising amount of anons don't even know how to upscale, they literally try to upscale the entire image at once x2 in img2img.
>>106724633
nope, just hires.fix, nothing else
>>106724638
got it, i'll just add notes with the upscaling info to each checkpoint that i use to avoid eldritch abominations
>>106724625Kling got the netflix treatment :(
Last Hunyuan 3.0 output for now. Using the included gradio webui with the default system prompt (instead of none). This took 16:56.
It was another attempt at extracting a recognizable style. I told it to make an Asuka from Evangelion, but as concept art for Serial Experiments Lain.
>>106724440
I think this is a good theory. It does remind me of what you get sometimes if you feed the output of a verbose captioner straight into an image generator.
>>106724656
>nope, just hires.fix, nothing else
That's why. You need to learn how to use Ultimate SD Upscale. It cuts the image up into chunks that the model can manage, usually 1024x1024 chunks, then upscales each one individually. You use a tile controlnet made for illustrious or whatever model you're using, and that makes it so you can upscale at 0.6-0.7 denoise and keep the same composition, without sprouting new limbs or any other weird shit.
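Rough starting point if you want to try it (knob names follow the Ultimate SD Upscale node's labels; the values are just what tends to work on illustrious-type models, tune per checkpoint):
upscale_by: 2.0
upscale_model: any 4x ESRGAN-family upscaler
tile_width / tile_height: 1024
denoise: 0.6-0.7 with the tile controlnet, closer to 0.3 without it
mask_blur / tile_padding: defaults are fine
plus a tile controlnet for your base model at ~0.6-0.8 strength feeding the sampler.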
>>106724667there is no actual fucking reason for hunyuan 3.0 to take upwards of 40 minutes to gen 1girl slop holy shit. and thats at 1024x1024
Hunyuan 3.0 reminds me a lot of vae-less 1.5 or pre-vpred/noise offset SDXL, when everything was washed out and grey. the outputs have this sort of 2022 look to them
>>106724683It takes that long because he's offloading an 80B model onto ram.
>>106724667The quality looks pretty good here, but since it's such a fucking huge model, I'd be interested in some more difficult subjects and compositions than 1 girl.
>>106724675ill look into it, i've never used controlnet once so i'll have to mess with that first.
Didn't someone post a thing stating Seedream 4 was only 20b? Yet it generates at least 2k images that look way better than Hunyuan, plus their API spits them back faster too. What went wrong?
>>106724693
Yeah I swear this place has the creativity of a toddler. Anyway, just waiting on some place to run it so I can test this shit myself.
>>106724705
post a single creative gen you've made right now
one.
>>106724701The only way to get seedream local is to pose as a janitor at bytedance and work your way up the ladder until you have server access. I'll provide you with eye tape and a bronzing agent. The rest is up to you.
>>106724705makes you look like a superior cool guy to say that, but it's way more helpful to post common baselines that everyone has existing expectations for instead of trying to get too creative.
those new light loras, they ain't so good.
how is it even possible that the 2.1 loras are still better for movement? wack.
we got mogged right?
> we
>>106724701
https://xcancel.com/bdsqlsz/status/1966034419183124527#m
>Didn't someone post a thing stating Seedream 4 was only 20b?
I'm seeing 26b here
>>106724728
Every model ever released can do this, man. That's like testing a sports car by seeing if it can keep up with a carriage lol. The point of the newer models is prompt adherence. If you wanted 1girl looking at the screen, and I mean this in no derogatory way, why tf would you use an 80B model?
SHE IS SUPPOSED TO BE HAPPY OR SOME SHIT. People might like this though...
>>106724683It's so slow because it's probably expecting at least eight GPUs instead of one.
>>106724745it's the reward model (used for training) not the model itself
>>106724737both are awful desu
New 4 step LoRA
>>106724710
>no reply
i won
>>106724778yo wtf that looks smooth
>>106724778
pretty good, if only Alibaba was as focused on QIE as they are on Wan, we would've gotten Nano Banana at home already
is wan2.5 still in 5 second hell?
>>106724791
10
>>106724799naisu
>>106724800
10 for Wu but not for You.
>>106724811not naisu
>>106724811
>b-but, the interview where the Alibaba engineer said that-
ACK
Sage attention 3 is out.
>>106724822Beta test it for me then, slave.
>>106724124Ummm... based?
>>106724778
really impressive, imagine telling your old self from a year ago that you'd get this quality with only 4 steps, he would've laughed in your face
>>106724822
>only relevant for the 5090
*yawn*
>>106723624me as the red demon
Very artsy and tasteful, so I think it's okay for blue board.
>>106724822
You cannot do that for now:
>Note: SageAttention3 does not guarantee lossless acceleration for all models. For other video generation models, we recommend selectively using SageAttention2++ in certain layers or timesteps.
>For example:
>Apply SageAttention2++ only at the first and last timesteps,
>Use SageAttention3 for all the others.
>This hybrid approach may achieve lossless acceleration.
Once you can, it'll be worth trying for sure.
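The dispatch itself is trivial once the kernels are exposed, something like this (sketch only: attn_v2pp and attn_v3 are stand-ins for the real entry points, check the sageattention package for the actual names):

def hybrid_attention(q, k, v, step, total_steps, attn_v2pp, attn_v3):
    # the quoted recommendation: SageAttention2++ on the first and last
    # timesteps, SageAttention3 everywhere in between
    if step == 0 or step == total_steps - 1:
        return attn_v2pp(q, k, v)
    return attn_v3(q, k, v)

The annoying part is plumbing the current timestep into the attention override, not the branch.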
I know this is AI but how the fuck did they do it? it's so clean
https://vm.tiktok.com/ZNdG17uvs/
>>106724873It's not that clean? And I bet those camera flashes are covering some of the more egregious errors.
>>106724822>Sage attention 3 is out.
>>106724822nice, now waiting for the node updates to test it
>>106723624where's the rentry for Wan?
>>106724898
dunno why they deleted it, but here it is
https://rentry.org/wan21kjguide
>>106724938Street shitting caught on camera.
>>106724938you don't want to use the film interpolation node to get something less choppy?
>>106724985
That thing smooths it way too much. It gives that vaseline DLSS effect, so I'm not really a fan.
>>106724985too bad the multiplier is either 1 or 2, it would be a good compromise to go for 1.5 imo
>>106724822How do I test it? I recall seeing a node that had sage 3 in it. I have a 5090.
>>106724994
>2x interpolate
>24 fps
derp
>>106725010no, 2x interpolate is 32fps since wan is a 16fps model
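(worked out: 16 fps × 2 = 32 fps, and the 1.5x multiplier anon wanted above would land at 16 × 1.5 = 24 fps, which the node doesn't offer)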
>>106724886
going for 960x960 (which has the same number of pixels as 1280x720) gives really clean results, too bad it's like twice as slow now (took me 4 min for 720x720 -> 10 min for 960x960)
>>106725028
Who the fuck is that at the end there?
What is a good model for anime images with backgrounds that don't look terrible?
they updated the light loras again. (40 mins ago)
they work better now.
>>106725038it's Miku's twin duh!
>>106724822DOA
>Howdy Partners!
>>106725066the colors are a bit too fake, are you sure you're going for shift = 5
>>106725038yeah its a weird framerate
>>106725044
>no info
thanks for nothing
https://huggingface.co/lightx2v/Wan2.2-Lightning/tree/main/Wan2.2-T2V-A14B-4steps-lora-250928
(vid mega compressed for 4chan)
>>106725076damn.. meant >>106725018
>>106725074I think that's just light LoRAs being light loras. colors aside, I think it's a pretty good output for wan t2v
>>106725078
https://huggingface.co/lightx2v/Wan2.2-Lightning/discussions/42#68d90d04340cbd3c0f3f2fa7
minor stuff
>>106725078they just made the format compatible with comfyUi (since this fucker doesn't want the diffusers format to be compatible on his own shit)
>>106725018>completely missed the point
>>106725109ironic
>>106725109god this looks so retarded. how do you not abort that gen the second you see it in the preview
>>106725124
>this looks so retarded
he probably likes that, because he's a retard lol
>>106725136zing!
Lotta talk for nogen poorfags
Why do I have to add myself to a github list in order to install sage attention 3, wtf.
that's it. i'm buying a 6000. an entire minute just for un/loading the models. go fuck yourself
>>106725157I think buying an nvme is a more sensible decision
>>106725157
>that's it. i'm buying a 6000.
you'll be able to run HunyuanImage 3.0 with it right? kek
>>106725166hope your 1650 can do the 1.5 atleast
>>106725170>haha funny randumb and it can place multiple objects, please ignore the 2002 jpeg quality
idk why it turned into a timelapse.
>>106725174
you weren't gening in 2022 clearly
>>106725185he isnt genning in 2025 either
>>106725185>>106725191every accusation is a confession
>>106725191>>106725185cope chinkoid, the model is shit
>>106725197yet I was gening in 2022 and am gening in 2025
>>106725182
>WanVideo2_1_T2V
why not 2.2?
>>106725199be grateful for what your getting from the chinks. Cumskins havent done anything
>>106725200no you weren't
>>106725202
>why not 2.2?
heh, don't believe everything you read, kid.
>>106725203
>Cumskins havent done anything
they've done Flux and Kontext, have some respect
>>106725208you're right, I was too busy fucking your mom to gen
>>106725214>flux its ded
>>106725216
>brown skinned joke
https://youtu.be/0GCkhjDO-2s?t=27
>>106725223so this is how you defeat the chuds, just have a non white do something and they'll avoid it forever
>>106725153PLEASE, HOW DO I USE IT
>>106725236You just want an .exe file don't you.
>>106725236how should we know? it's been released really recently
>>106725247hehe
I wonder if HunyuanImage 3.0 (80b model) will crack the top 10? lol
https://artificialanalysis.ai/text-to-image/arena/leaderboard-text
>>106725236if you're going to be an early adopter, it helps to not be retarded
was there a local model that could replicate that overall style you got from dalle vivid?
There will not be a Studio Session today.
Before you get that kleenex box out, I come bearing gifts.
Last week's Nano Banana studio session is live - https://youtu.be/A4N2WBhpGyk
AND -- This list of Nano Banana capabilities/techniques is more than I could possibly showcase in a single session. Worth taking a look through - there are some AWESOME capabilities!
https://github.com/PicoTrex/Awesome-Nano-Banana-images/blob/main/README_en.md
We'll resume with Studio Sessions next week!!! See y'all then.
>>106725286
>dalle vivid?
I don't know what that is.
>>106725296
>style (defaults to vivid):
>The style of the generated images. Must be one of vivid or natural:
>– vivid causes the model to lean towards generating hyper-real and dramatic images.
>– natural causes the model to produce more natural, less hyper-real looking images.
>This param is only supported for dall-e-3.
my guess is that they used 2 models
>>106725291
https://www.youtube.com/watch?v=qL1e67jm290
Dear god, do not use high strength on the new t2v lora.
>>106725341well they're specifically made for 2.2 so you'd set the strength to around 1
>>106725341Yeah motion is better but it blows the fuck out of colours. Needs to be at like .8 or something
>>106725341yeah, it's not meant for that, the x3 strength trick worked only for the 2.1 loras
>>106725078can't wait for the I2V lora, they really nailed their new T2V one
Testing this, so far so good, vram usage seems down for me.
>>106725420on what?
>>106725420
>vram usage seems down for me.
how much improvement? give some numbers
>>106725427comfyui latest commits
>>106725445Anons here will have to do a before/after, I only noticed the change after pulling and noticing that I could swap less blocks to ram on my videos without OOM.
>>106725420went from 21.6 to 20.4 gb vram usage, damn
>>106725484that might just be enough to mitigate the memory that's leaked
Has anyone tried the new lightning lora on i2v?
>>106725420
>>106725484
>the lightx2v fags finally managed to make good speed loras for wan 2.2
>a huge improvement on memory usage has been found
today was a good day
https://www.youtube.com/watch?v=h4UqMyldS7Q
>>106725420>>106725484ok this is pretty cool
>>106725497
>2.5 weights not released
zzzzzzzzzzzzzzzzzzzzzzz
>>106725420
>Testing this, so far so good, vram usage seems down for me.
just tested it, impressive as fuck, I had to offload 1.5gb of memory to the ram to test Wan 2.2 on 720x720, now I don't need to offload anymore, great shit
>>106725420
>More VRAM savings to come.
I like this guy
>>106725420
https://github.com/comfyanonymous/ComfyUI/pull/10062/files
>changes 2 lines
>up to 2gb of vram saved
that's black magic dude
morning, tell me what is babbys first video model I should try with 12gb of vram
i need help with vibevoice using tts audio suite. i have the small and large models already and put them in the right folders, but its saying i have to install some kind of package. pip install git+https://github.com/microsoft/VibeVoice.git produces an error since the repo isnt a python project. i dont know whats its talking about, it is saying it cant find the model?
>>106725420just imagine how much vram we could actually save if we focused more on making Comfy's code optimal or some shit, I didn't expect that much improvement desu, I thought it was already optimized enough
>HunyuanImage 3.0 fuses aesthetic skill with reasoning and world knowledge
>aesthetic
>aesthetic
>aesthetic
oh they dared to say that!
>>106725555catbox/prompt?
>>106725555
>cyberrealisticPony
anon please. it's time to let go.
>>106725624>(massive breasts:3.0)
>>106725668and the rest
>>106725420
Still using SDXL here. What does this new thing give me that I don't already have? Should I switch to Comfy? Or has everyone just forgotten about SDXL users?
If that's the case, I'll stick with Invoke, they keep updating their UI for image generation.
miku hatsune cooks hot dogs on a grill
nice
>>106725674
>sdxl
irrelevant to you
>>106725674
>Still using SDXL here. What does this new thing give me that I don't already have?
if the model you're running is too big for your gpu, that's a big deal, if you have enough vram for sdxl it's a nothingburger
>>106725624
1girl, solo, sitting, massive breasts, tan lines, blonde hair, bloom, night, ugly, 3DPD, white lingerie bikini, cleavage
>>106725674These are vram savings, if you're not starved on vram when using SDXL, this won't make a difference
Are there any UI's that aren't abandoned? Something that isn't cumrag UI?
>>106725682
>720x1264
why not go for 1280 instead of 1264?
>>106725652not until you spoonfeed me sth better that I can use with sd next
>>106725685
>>106725690
>>106725694
I'm not starving but I could always use faster image gen. Should I update and switch to ComfyUI? Or does this update hurt output quality?
I have a request if anyone is bored. Baby Football. Animated would be great. Wanimate would be better.
>>106725708
just try it yourself? ComfyUI is known to be the fastest (and it's even faster with sageattention 2++)
>>106725701
>sd next
Anon I...
>>106725693catbox?
>>106725714I want to know if I press update.bat will I get faster SDXL gens, or do I need to add a node in between or something?
>>106725726it won't be faster, it's just a memory optimization
>>106725708
>Or does this update hurt output quality
Read the commit, it has nothing to do with sdxl generation like >>106725685 said.
>>106725708>use faster image genThe potential speedup with these vram savings primarily come from having to offload less from vram to system ram during generation, if you're not offloading when using SDXL this will have no effect unless it allows you to use higher resolution / batch size
>>106725726@grokk, zap this anon with lightning
>>106725726
Oh I see.
Have a good day.
>>106725685
>irrelevant to you
so it's only an optimization for flux and wan, is that it? >>106725548
Thanks to everyone who helped. I'll keep doing what I've been doing until ComfyAnon remembers SDXL users again.
Good Sunday anons.
Old forge is still superior to all and gets the job done.
>To use SageAttention 2.2.0 (containing SageAttention2++), please compile from source
i am filtered
Does this speedup work with API nodes?
>>106725584i see on huggingface lots of people having similar problems with no answers
https://huggingface.co/lightx2v/Wan2.2-Lightning/discussions/42#68d92370efc669a86a5fe3ca
>Our quantitative evaluation on an in-house test set shows that the motion dynamics scores for the base model (wan2.2-T2V-A14B), Lora-250928, and Lora-v1.1 are 10.66, 7.76, and 5.27, respectively.
>We are currently working on an improved version.
they're so fucking based desu
On a local comfyui install, how can I do texture tiling? All the guides online seem to be for cloud implementations
>>106725764what os? gpu? python version? cuda version?
>>106725774there's a lot of tutorials on how to install sageattention, don't be that lazy anon
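the gist of building from source, for the filtered anon (rough sketch of the usual steps, assuming a CUDA toolkit that matches your torch build; the repo README has the exact flags and env vars):
git clone https://github.com/thu-ml/SageAttention.git
cd SageAttention
python setup.py install
run it from the same venv/embedded python your UI uses, or it'll install into the wrong place.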
Have used SDXL for a really long time, almost always below 8 steps. Gave Flux a whirl again, but it's way too slow in comparison. What optimizations are best in slot? Or would you recommend some other arch?
>inb4 that guy
>>106725817its not sage attention its vibevoice. there is something wrong with the comfyui nodes that doesnt set it up properly for most people. maybe the set up only works on linux or something
>>106725584
>>106725836
cmd.exe in ComfyUI_windows_portable\ComfyUI\custom_nodes\tts_audio_suite
..\..\..\python_embeded\python.exe install.py
cmd.exe in ComfyUI_windows_portable\
python_embeded\python.exe -m pip uninstall vibevoice
python_embeded\python.exe -m pip install vibevoice
python_embeded\python.exe -m pip install --upgrade transformers
>>106725836oh my b, it was meant for >>106725764
>>106724038
>>106724292
Recognizes some popular characters, but no individual style tags unfortunately, similar to their slopped Hunyuan Image 2.1. A shame, in terms of output capacity this is basically an 80B version of that and it's not special otherwise.
https://github.com/lodestone-rock/RamTorch
How do I turn this into a comfy node?
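For what it's worth, the node wrapper is the easy part: drop a .py into custom_nodes/ that exposes NODE_CLASS_MAPPINGS. A minimal sketch (apply_ramtorch is a stand-in for whatever RamTorch actually exposes, check its README):

# custom_nodes/ramtorch_node.py
class RamTorchPatch:
    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {"model": ("MODEL",)}}

    RETURN_TYPES = ("MODEL",)
    FUNCTION = "patch"
    CATEGORY = "model_patches"

    def patch(self, model):
        m = model.clone()           # don't mutate comfy's cached model
        # apply_ramtorch(m.model)   # stand-in: swap linears for RamTorch's
        return (m,)

NODE_CLASS_MAPPINGS = {"RamTorchPatch": RamTorchPatch}

The hard part is making its CPU-streamed linears coexist with comfy's own model management, which is probably why nobody has bothered.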
>>106725860no one seem to care about his repo, there's no issues made so far, that's probably a huge nothingburger, like everything he's done so far (to be fair I like his willigness to make chroma VAE-less, that's the future of edit models that's for sure)
>update to and try the new 2.2 t2v loras for my i2v
>all nsfw stuff now starts lactating
https://files.catbox.moe/1c31ya.mp4
Huh, I'll take it.
>>106725899
>update to and try the new 2.2 t2v loras for my i2v
I tried to do that and I got some weird ghosting and duplications, it wasn't made for this at all lol
>>106725899
it's not working well at all with i2v
better wait for their i2v version
>>106725904>>106725925It seems to do really well for me, I see a quality improvement. But then again I am doing small motions aiming for seamless loops.
>>106725843thanks anon uninstalling and reinstalling worked
>>106725939No problem.
>>106724494
>one singular /v/ faggot made a better image than anything i've ever seen shat out of cumfartui users with flux/SaaS in these threads and /sdg/
i.. i kneel.. and will probably download neoforge this morning honestly.
>>106725420When will this land in AniStudio(tm)?
>https://github.com/woct0rdho/SageAttention/releases
How the fuck do I install this shit given all those versions of pytorch are deprecated and not even available anymore? Pytorch is up to 2.9 and 2.10.
What are your opinions on qwen image edit 2509?
I want it to swap clothes but I can't make it work.
>>106726014compile it for your env
>>106726022
>Use the girl from Image 1 as the primary subject, preserving her exact body, pose, facial features, hair, background, and the original lighting conditions of Image 1. Replace only her clothing with the outfit shown in Image 2, ensuring the new outfit matches the style, color, and details of the clothing in Image 2. Do not alter the girl's physical appearance, body shape, or any other elements from Image 1 except for the clothing. Ensure the new outfit is seamlessly integrated onto her body, maintaining proper fit and proportions as if she were naturally wearing the outfit from Image 2, while keeping the lighting, shadows, tone and highlights consistent with Image 1.
>>106726014
>Pytorch is up to 2.9 and 2.10
bro everyone runs 2.8 or 2.7.x, what are you smoking?
>>106726022
>What are your opinions on qwen image edit 2509?
still plastic, still zooms in on images, and it doesn't know how to do styles properly like the old version did, big meh
>>106726053Skill issue?
>>106725818
I tried a q5 quant of flux schnell on my 12 gb 3060 the other day... and it took 5 s/it for 1024 res
can't use any of the inference tricks on this card either, lmao
>>106726040I tried to install 2.7.1 or 2.8.0, it didn't work. It used 2.10.0 instead. Or maybe I'm from the future, fucked if I know.
>install new thing for new model
>breaks my other workflows
so do you guys use a separate embedded python for each thing?
>>106726072
>it didn't work
Bro how. Just force reinstall requirements.
https://www.reddit.com/r/StableDiffusion/comments/1nsowjd/i_trained_my_first_qwen_lora_and_im_very/
>makes a lora that finally unslops Qwen Image
>doesn't provide the lora
reeeeeeee
>>106726058if it takes even more practice to get good results with a new model, that's not a good thing
>>106726022
Tried it on an anime girl and it didn't recognize the yellow shirt, left the sleeves and neck part.
Once managed to remove the shirt but it left wrinkles from the shirt and added them to the skin.
Couldn't remove a kitchen cabinet. All it had to do was paint that area grayish white, but apparently that's too much to ask. Not that it wouldn't take 15 seconds in photoshop to fix.
>>106726088What model do I use then, nano banana is censored crap
>>106726086thats his girl. why would he provide his lora lol
>>106726099maybe that's just a colleague
>>106726106oh lets not go there. There was one of those here back in the day
>>106725755I've had it not work on some newer models for some reason, I don't know what changed, hadn't messed around with SDXL for a while, ended up having to get classic.
Bros I wanna try out hunyuan 3 but 320 gigs is nuts, I have 10tb of ssd space and I'm always running out.
>>106726177It's absolutely not worth it
>>106726177if it looked awesome I would see the point, but it looks like shit desu >>106725170>>106724398
>>106726177Just gen with qwen image then add jpeg artifacts
>>106726196kek
>>106726177
>I have 10tb of ssd space
>I'm always running out
what in the name of how?
>>106726196
lol'd
>>106726210
I dabble with LLMs too and those are freaking huge, and there's so many finetunes. I think I have like 3+ TB in llm stuff, 1 TB of imagegen stuff and 2-3 TB of video gen stuff. Other space is various apps and stuff like that.
I guess I should start archiving old stuff or just deleting it, I'm a hoarder.
>>106726271
i just wouldn't expect you to put all that on SSDs kek. i archive my old shit on HDDs and make copies of my daily drivers but i'm nowhere near 10tb of AI models including gens.
though LLMs are fucking huge, it's ridiculous, no wonder /lmg/ is suicidal, that space hasn't moved in years.
Postin (You) bait from another thread.
>>106725260no qwen?
>>106726325
qwen is 22nd
>>106726293gonna bait (Me) to this later
>>106725652What do you recommend instead (i'm on comfyUI)?
>>106726358if you unironically enjoyed cyberrealistic, then upgrade to its illustrious version.
>>106726368I'm after uncensored realistic stuff, not necessarily porn. I used SDXL bigLust for a while
>>106726411Chroma
>>106726411
well have fun searching civitai for like 100 hours worth of your month, it's really all about personal preference at this point. no one has it down perfectly.
which is why you see dorks in this thread debating something as mundane as noise levels
It's always fascinating to see the difference in intelligence between text AI people and image AI people. Most image people melt down when they can't run a model and understand absolutely fucking nothing. Hunyuan's outputs don't look spectacular, but the fact that we finally have access to an autoregressive model is cool as hell. It isn't likely to replace anything except maybe editing models, but it will be fun to play with. I'm hoping we'll see some efficiency gains or a smaller version in the future that puts this in range of 3090 vramlets.
>>106726332
>>106726440
>It isn't likely to replace anything except maybe editing models
Not with that output quality
>>106726448Can it generate non-asian people?
>>106726448
>>106726432it doesn't take 100 hours to train a (text encoder +) LoRA. why do people insist on waiting for someone else to do it when they could have already had whatever they wanted?
>>106726450Unless I'm mistaken we haven't seen how well it handles outputs. QIE isn't exactly a high bar to clear.
>>106726484Because it's really just that easy innit bruv?
>>106726482>>106726483yes
>>106726503alright BUB you got my attention what checkpoint is this?
>>106726496
buddy, you'll never know until you try
>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe
>>106726109
>posting his colleague's images here
that's just retarded
>>106726508Flux with realism lora
>>106726484sure let me boot up my 192GB VRAM computer I have lying around
okay bros i got wan image to video working. i got voice gen working. now how does lip syncing work? can i take a video of a girl dancing and make her talk with the audio or do you have to take a still image and make her talk? local music gen would also complete what i need to do
>>106726411this sdxl checkpoint is good
[spoiler]chroma[/spoiler]
>>106726698[spoiler]spoiler[/spoiler]
>>106726733[secret]wan 2.5 soon[/secret]
qwen edit understands openpose models natively. use openpose from the AIO aux preprocessor node to get a pose/skeleton, then use it as the input for image2.
the anime girl in image1 is using the pose of image2. keep her anime appearance the same.
guess the pose source.
[cope]B-bigma status?[/cope]
>>106726773what about nsfw open poses?
>>106726781why not? with a nsfw or clothes remover lora it would still work
>>106726691catbox?
>>106726773Wow, it even included Mi Maimen and Dr. Pavleheer (the crew of the Wreckage Brother).
>>106726805this was the input for image2, kek
>>106726698>>106726733>>106726762[wrong board]wrong board[/wrong board]
baker doko?
>>106726670
You want WAN 2.2 S2V, looking at the comfyui template it seems similar to i2v as you provide a reference image and audio track along with a prompt. I haven't fucked with it yet so I can't say more than that.
>>106726691
>TgirlsAndMales
no thanks
depth seems to work too.
change the pose of the anime girl in image1 using the depth map of image2. she is looking at the camera and the time of day is night.
notice the hair strands stayed intact:
>>106726876this time with a random bocchi face:
Bakermen
>>106726906and again with the openpose, for you
Bakerstreet
Bakestreet boys
*also set the resolution to 1024 for AIO aux preprocessor (to get openpose/depth/etc), works much better.
>>106726990
>sets your resolution to 1024x768
heh nothin' personnel kid
>>106726995
well the qwen edit workflow scales it to around 1024 first so it should be better, the output is more than that depending on size.
>>106726564
>>106727009yikes
>>106727009I THINK I'M TURNING JAPANESE
>>106727009
>these monster hands
scary