Discussion of Free and Open Source Text-to-Image/Video Models and UI

Prev: >>106661715

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
https://github.com/tdrussell/diffusion-pipe

>WanX
https://comfyanonymous.github.io/ComfyUI_examples/wan22/
https://github.com/Wan-Video

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Neta Lumina
https://huggingface.co/neta-art/Neta-Lumina
https://civitai.com/models/1790792?modelVersionId=2122326
https://neta-lumina-style.tz03.xyz/

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbours
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
Cookies & Cream
>>106666599
pretty cool collage
For those who missed it, maybe we'll get something better than Wan:
https://byteaigc.github.io/Lynx/
Thanks for the help with the multiple-sampler method to split the lora strengths across steps. Sadly it doesn't seem to work.
I'm trying to fix the color shift with the other wrapper wf.
>>106666683
I downloaded one video and it's 16fps, it's probably a finetune of Wan 2.2
>>106666683
i love how those numbers just do not mean anything at all.
>>106666755
>Built on an open-source Diffusion Transformer (DiT) foundation model
it's 100% wan2.2. but the somewhat interesting part is their ID adapter. that aside i see nothing else of value.
>>106666800
what's with the celery
>>106666878
you don't know Hatsune Miku's leek? how young are you? lol
https://www.youtube.com/watch?v=6ZWwqTnqxdk
>>106666890
I don't speak japanese, do you?
>>106666912
>do you?
I do know how to search for the translated lyrics on the internet, yes
Blessed thread of frenship
>>106666890
>Hatsune Miku's leek
Leekspin was orihime from bleach mikunigger
>>106666683
>>106666755
While the demos do look pretty good, it seems to still have the same 5 second cap... sigh...
If I've got the hardware, are the fp32 options worth it over fp16 precision in the comfyui nodes? Can't tell much difference.
>>106666979
https://www.youtube.com/watch?v=ekdKIKfY6Ng
damn I miss that era of youtube, all the recommendations were as kino as this
cute kote
>>106667005
likely a finetune of Wan so yeah, with all the drawbacks associated with it
>>106667011
isn't bf16 better than fp32?
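For anyone wondering what those precision options actually trade off, here's a minimal PyTorch sketch (nothing ComfyUI-specific, just dtype properties): bf16 keeps fp32's exponent range but a much coarser mantissa, while fp16 is finer-grained per value but overflows past ~65504, which is why bf16 is usually the safer half-precision pick and why fp32 rarely buys visible quality at inference.

```python
import torch

# Compare the numeric properties of the dtypes the UI exposes.
# bf16 matches fp32's range (max ~3.4e38) but has a much coarser eps;
# fp16 has a finer eps than bf16 but overflows past ~65504.
for dtype in (torch.float32, torch.bfloat16, torch.float16):
    info = torch.finfo(dtype)
    bytes_per_elem = torch.tensor([], dtype=dtype).element_size()
    print(f"{str(dtype):15s} bytes/elem={bytes_per_elem} "
          f"max={info.max:.3e} eps={info.eps:.3e}")
```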
nunchaku qwen image edit plus when???????????
>>106667003
corpo slop you'd see hung in your office to remind you not to think for yourself and to waste your life on the company while they rape you.
10/10 would kms again
Is the state of unpozzed video gen good enough that there's a thread someplace with horrible degenerate images of heterosexual coupling (and not just "female with liquid overflowing")? Where? Is there a place to try my hand at it for free?
make a plastic anime figure of the cyan hair anime girl on a round pedestal. (qwen edit)
>>106667130
>free
yes, simply buy a 5090 and you can gen it at home for free.
>>106667152
oops, didnt set upscale on the compare image nodes to lanczos. now it's better:
>>106667153
dont need a 5090 for wan. 12 or 16gb is enough.
>>106667084
>again
what are you, a fucking cat?
>>106667203
sure, if you want to wait forever just for 5 seconds.
>>106667189
with a 4080 I can get a clip in like 100-120 seconds with lightx2v loras.
>>106666591
>having good success with the 2.1 lora at 3 strength for high, and 2.2 lora at 1 for low. only 2.2 high seems to affect motion in a bad way
nigga we been known this
>>106667175
with both:
>>106665018
Catbox?
put the image on a coffee cup that is placed on a table in a coffee shop.
the edit models are so neat, and you can use this stuff with wan or whatever. you can inpaint and do stuff like that, but it'd be very hard to do this type of thing without qwen edit/kontext.
>>106667279
Woah that's crazy I had no idea this is entirely new information that no one's mentioned before thank you for sharing anon
>>106667288
yes, my bad for discussing diffusion models in the diffusion general.
point remains, these are really good tools because no amount of denoise % or controlnets would be able to do what these can do.
>>106667003
I would use that in a powerpoint.
How many GB vram do you need to train an SDXL LoRA?
>>106667320
Much like the memes you choose to edit, the discussion you bring is old and stale kek
>>106667250
Can it make a mamama apimiku style mimukawa miku?
>>106667394
kek
replace the anime girl with Miku Hatsune. Change the text "DOROTHY: SERENDIPITY" to "MIKU: HATSUNE".
nice
>>106667445
kys
>>106667455
no ty sir
>>106667445
pretty good, but hatsune miku isn't the hardest to change, most models know her well
>>106667481
youre expecting anon to do something interesting? he would never
>>106667481
you could also remove the model entirely and then swap them in with photoshop. what's neat about the edit models is they can edit or remove stuff but respect layers. can't do that with inpainting at high denoise levels.
https://huggingface.co/Qwen/Qwen-Image-Edit-2509
it's up
Incredible groundbreaking developments from the miku poster
>>106667506
>Enhanced Single-image Consistency: For single-image inputs, Qwen-Image-Edit-2509 significantly improves editing consistency, specifically in the following areas:
>Improved Person Editing Consistency: Better preservation of facial identity, supporting various portrait styles and pose transformations;
>Improved Product Editing Consistency: Better preservation of product identity, supporting product poster editing;
>Improved Text Editing Consistency: In addition to modifying text content, it also supports editing text fonts, colors, and materials
sounds good
>>106667506
>This September, we are pleased to introduce Qwen-Image-Edit-2509
>Multi-image Editing Support: For multi-image inputs, Qwen-Image-Edit-2509 builds upon the Qwen-Image-Edit architecture and is further trained via image concatenation to enable multi-image editing. It supports various combinations such as "person + person," "person + product," and "person + scene." Optimal performance is currently achieved with 1 to 3 input images.
let's fucking go dude, no more image stitching cope
>>106667506
thank you chinks, you really are our saviors
>>106667506
any fp8/q8 yet?
>>106667506
https://huggingface.co/spaces/Qwen/Qwen-Image-Edit-2509
https://huggingface.co/spaces/akhaliq/Qwen-Image-Edit-2509
there's demos in here
>>106667537
it's only been up for less than an hour, unlikely
>>106667506
already? lmao, qwen is actually speedrunning to agi in 3 years...
>>106667445
>>106667279
>>106667152
Can you try with the updated model? >>106667506
>>106667506
>and it's not over
we'll get wan 2.5 in a few days, Alibaba is better than Santa Claus lmao
>>106667546
>https://huggingface.co/spaces/akhaliq/Qwen-Image-Edit-2509
wtf is this shit ;_;
>>106667574
dont have a quant or fp8 download, I assume it has to be converted from this batch of files.
>This September, we are pleased to introduce Qwen-Image-Edit-2509, the monthly iteration of Qwen-Image-Edit.
what? so they are updating it monthly or what?
>>106667601
China does what Scam Altman dont
>>106667506
sadly it does look like just an incremental improvement
>>106667601
>what? so they are updating it monthly or what?
I think they've already done that with LLMs
at what point will you wake up and stop getting excited for chinese slop
>>106667506
NUNCHAKU NEXT TARGET
>>106667611
sure, but is this just a further monthly finetune or the actual "edit plus" they were talking about?
>>106667622
>NUNCHAKU
give up bro lol
>>106667619
slop? wan is better than any western video model and is open source. qwen/qwen edit are free. OpenAI wants you to pay $1000 a month for 5 prompts a day.
>>106667506
if that one doesn't zoom in randomly, all we have to do is SRPO this shit and we'll be back
that's nice and all but let me know when they are brave enough to include aroused genitals in the dataset
ok yeah sure
So are the lightning loras supposed to be 2 high 2 low, or 4 high 4 low?
>>106667619
never, we can save it!
>>106667611
I still would welcome a leak of dalle 3. It has a weird vibe no other model gets right.
>>106667506
>The girl from Image 2 is sunbathing on the lounge chair in Image 1
>The girl from Image 2 is drinking coffee on the sofa in Image 1
excellent, that's exactly what I wanted from an edit model
>>106667678
scale is way off in the coffee shop image
>>106667640
wan is the only non-slopped decent model from xi, it's an outlier. need i remind you of the dozen failed image models
>>106667692
it was a bad idea to use the ratio of the 2nd image, it should've been the ratio of the 1st image
>>106667506
>60GB
do I need 60gb vram to run this?
>>106667619
Qwen Image is less slopped than Flux and has the apache 2.0 licence; if only it wasn't so big it would be finetuned by another suspiciously richfag
>>106667704
it's the same size as Qwen Image, so a 24gb vram card will suffice (Q8 QIE + Q8 text encoder)
>>106667704
where'd you get that number? I'm seeing 40, and that's fp16 so fp8 will be 20
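The back-of-the-envelope math behind those numbers, as a rough sketch (20e9 is Qwen-Image's advertised parameter count; real checkpoint files deviate a bit because of layers kept at higher precision, gguf block scales, and metadata):

```python
# Rough checkpoint size: parameter count x bytes per parameter.
PARAMS = 20e9  # Qwen-Image(-Edit)'s advertised 20B parameters

bytes_per_param = {
    "fp16/bf16": 2.0,
    "fp8": 1.0,
    "gguf Q8_0 (~8.5 bits incl. block scales)": 8.5 / 8,
    "gguf Q4_0 (~4.5 bits incl. block scales)": 4.5 / 8,
}

for name, bpp in bytes_per_param.items():
    print(f"{name:42s} ~ {PARAMS * bpp / 1e9:5.1f} GB")
# fp16 ~40 GB, fp8 ~20 GB, Q8 ~21 GB, Q4 ~11 GB
```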
>>106667712
I use q8 image/edit on 16gb (4080) without issue, not even using multigpu node.
It's shit. You will cope for a week saying how it's better than nano banana until you too finally admit it's shit.
can you actually run the bf16 qie with blockswap on a 24gb card?
>>106667745
who legit cares if it's better or not than nano banana? this shit is free and wildly uncensored compared to any paid non-local model, you fat cunt.
>>106667546
>>106667745
>You will cope for a week saying how it's better than nano banana
no one will say that lol, I don't expect QIE to beat nano banana anytime soon
Is Wan limited to 4:3 resolutions, or can it do 9:16 (i.e. iphone), 1:1, etc.? I mostly gen between 4:5 and 9:16, so 4:3 doesn't work for me...
>>106667760
the eyes are sus for poor Ryan, but I like the skin texture
>>106667767
this is a dumb question.
>>106667767
just try it?
>>106667767
you can do any size; smaller is good for fast gens (ie: 640x640 vs 832x832, etc)
the model has been out one hour now
where quants
why aren't all loras migrated to the new model already?
>>106667767
If only there was perhaps a guide written that included this very information.
>>106667506
not bad at all
Qwen Image Edit PSA
Always add:
"without changing anything else about the image"
at the end of your prompts if you want to preserve anything at all from the original image
Also here's a great workflow for the old Qwen Image Edit model
https://files.catbox.moe/6wcz4m.png
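If you script your prompts instead of typing them into a node, the PSA boils down to something like this tiny helper (purely illustrative; the function name and structure are made up for the example):

```python
PRESERVE_SUFFIX = "without changing anything else about the image"

def edit_prompt(instruction: str) -> str:
    """Append the preservation suffix the PSA recommends, once."""
    instruction = instruction.rstrip(" .,")
    if PRESERVE_SUFFIX in instruction.lower():
        return instruction
    return f"{instruction}, {PRESERVE_SUFFIX}"

print(edit_prompt("Replace the anime girl with Miku Hatsune"))
# Replace the anime girl with Miku Hatsune, without changing anything else about the image
```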
>>106667839
your advice is deprecated anon kek >>106667506
>>106667821
Can you catbox those two images so I can try it with the old model? And paste the prompt too if you can
>>106667856
I found it on reddit so I can't help you with that
>>106667853
No, I'm giving it specifically because I see that the new version also needs that same sentence appended to preserve things properly, and because I see people using bad workflows for the old model and concluding it's bad when the new one is just an incremental improvement.
The new model still doesn't keep the exact same resolution, and it obviously still has the same VAE quality loss as it's still not pixelspace.
>>106667872
once lodestones finishes his radiance/pixelspace model we will likely see more models adopt it.
all pissy trolling aside, it really keeps images nicer not having to run them through a vae. this will be important for edit models, where you simply can't iterate because the quality gets raped by the vae.
do people still use guidance that makes generation take 2x as long if you use negatives? haven't done SD in a while
quants where?
>>106667642
>if that one doesn't zoom in randomly
it does, look at 33 sec
https://xcancel.com/Ali_TongyiLab/status/1970194603161854214#m
>>106667821
looks kinda bad, you can see how it completely rejects the blue jacket's texture when outpainting. looks like a 512x512 crop pasted on top
>>106667642
srpo is a meme lil bro, let it go
>>106667906
oh yeah nice catch
not uploaded yet but seems like the first quants are here https://huggingface.co/calcuis/qwen-image-edit-plus-gguf
>>106667916
we'll still have to wait for comfy to implement the multi image process too
no qwen image edit plus nsfw finetune yet?
>>106667885
>once lodestones finishes his radiance/pixelspace model we will likely see more models adopt it.
yep, for edit models it'll be mandatory to go for pixel space, maybe QIE will be the first to do it, who knows
>>106667932
This is gonna be brutally heavy on vram and ram tho
>>106667506
now that can be interesting to experiment with
>>106667938
I still think 20b is overkill, if they manage to keep the quality with 13-14b + pixel space we could manage to run this shit
>>106667886
Using CFG above 1 (which you need for negatives to do anything) will inevitably make things slower, since each step runs the model twice.
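For anyone who hasn't touched SD in a while, that's all classifier-free guidance is: one forward pass on the positive prompt and one on the negative/empty prompt per step, then an extrapolation between the two. A schematic sketch of a single step (model, cond, uncond are placeholders, not any specific UI's API):

```python
def cfg_denoise_step(model, x_t, t, cond, uncond, guidance_scale=5.0):
    """One classifier-free-guidance step: two forward passes whenever
    guidance_scale > 1, which is why using negatives roughly doubles
    the per-step cost (unless the UI batches both passes together)."""
    eps_cond = model(x_t, t, cond)      # positive-prompt prediction
    eps_uncond = model(x_t, t, uncond)  # negative/empty-prompt prediction
    # Push the result away from the unconditional prediction.
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

def denoise_step_no_cfg(model, x_t, t, cond):
    """With guidance disabled (CFG = 1), a single pass is enough."""
    return model(x_t, t, cond)
```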
>>106667782
it takes me like 10 minutes every time I "just try" something with video, I'd like to not waste hours discovering the handful of things everyone else already knows. Besides I was hoping someone might know some info about how it was trained and whether it was intended to support such resolutions or not
>>106667506
it says it supports ControlNet, holy shit
Here's the new vs old Qwen Image Edit model comparison with the Will Smith example posted above.
We need SRPO and no VAE quality loss like the Chroma Radiance model has with the new pixelspace research; this is just an incremental improvement that isn't much different, as the old model is already pretty good depending on the prompt and workflow.
>>106668015
I like the improvement, the face is more accurate and the skin texture is not as slopped as before
>>106668015
Old version workflow: https://files.catbox.moe/r0kyif.png
Same workflow as posted at >>106667839 >>106667821
>>106668015
old version looks so much better ahahah.
can you do some more comparisons? i'm out of gpu time on hf
>>106667601
>what? so they are updating it monthly or what?
lmao, are they really gonna upload a new version of QIE each month? sounds crazy, I guess they realized the training wasn't over and the loss curve hadn't flattened yet
>>106668015
>>106668028
Also keep in mind that a model's deployment parameters matter a lot, so we need to wait for the best workflow to be created for a more like-for-like comparison.
For example, with this comparison and the generated image you see there, on the old model I added
"Don't change anything about their heads at all, keeping their faces and heads exactly as they are."
to the prompt and still got the same image as the picrel without that sentence, meaning the old model can't copy the original images like-for-like the way the new one can. Even though the images are low quality themselves, it can be a showcase of the new model following the prompt better, which is important.
https://huggingface.co/calcuis/qwen-image-edit-plus-gguf/blob/main/qwen2.5-vl-7b-test-q4_0.gguf
what's this
I just bought a 5060 Ti (16 GB) instead of a 5070 Ti.
Not worth 2x the price; still a massive upgrade from my 3060
>>106668134
the text encoder? it's probably the same text encoder as the previous Qwen Image model
>>106668137
Waste of money. Should've waited until you had more and bought a 4090 or 5090. I can't imagine doing video gens on 16GB.
well I guess q8/other models will be up later today some time.
>>106668151
wan q8 works absolutely fine on a 4080 (16gb). the only thing you have to consider is not making the dimensions *too* large cause that needs more vram.
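Rough intuition for why dimensions matter so much, as a sketch assuming the commonly cited Wan setup (8x spatial / 4x temporal VAE compression, 2x2 spatial patchify in the DiT; the patchify factor is an assumption here): the DiT sees one token per latent patch, and activation plus attention memory grows with that token count on top of the fixed model weights.

```python
# Why bigger dimensions need more VRAM: more pixels -> more latent patches
# -> more DiT tokens -> more activation/attention memory per step.
# Assumed factors: 8x spatial + 4x temporal VAE compression, 2x2 patchify.
def dit_tokens(width, height, frames, patch=2):
    lat_w, lat_h = width // 8, height // 8
    lat_t = 1 + (frames - 1) // 4          # causal VAE: 81 frames -> 21 latent frames
    return lat_t * (lat_h // patch) * (lat_w // patch)

for w, h in [(480, 832), (720, 1280), (960, 1280)]:
    print(f"{w}x{h} @ 81 frames -> {dit_tokens(w, h, 81):,} DiT tokens")
# 480x832  ->  32,760 tokens
# 720x1280 ->  75,600 tokens
# 960x1280 -> 100,800 tokens
```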
>>106668134
>>106668149
if that's the same text encoder he's wasting his time, there are already ggufs of it
>>106668137
nice!
>>106667786
imo there are specific resolutions that wan works best with, and i will continue to stick to the wan 2.1 resolutions, which are 1280x720 for high res and 480x832 for low res.
>>106668137
Should've waited for the super cards. A 3090 is faster than a 5060 ti and you're now stuck with 16gb.
>Sarrs... a second model is released this week.
Wtf, I didn't even get to fully fuck around with Wan Animate yet. We're eating too good.
>>106668161
you can't do 720p, plus future wan models may not do the whole high/low split thing again. if they don't, you'll be forced to use a lower quant like with wan2.1
>>106668181
what the hell even is wan animate? is it like vace?
>>106668151
waste of money, by getting +70% faster gens? not really. Not very interested in video
>>106668168
yeah! looking forward to it.
>>106668175
I considered waiting for them, but when they're coming is not confirmed - and they're hardly going to be anywhere close to MSRP anyway
>>106667506
>Multi-image Editing Support: For multi-image inputs, Qwen-Image-Edit-2509 builds upon the Qwen-Image-Edit architecture and is further trained via image concatenation
what makes it different from the image concatenation cope we used to do on the previous QIE?
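For reference, the "cope" was literally pasting the reference images side by side onto one canvas and feeding that single picture to the old single-input model, whereas 2509 is trained to accept multiple separate image inputs. A minimal PIL sketch of the old manual stitch (file names are placeholders):

```python
from PIL import Image

def stitch_horizontally(paths, pad=16, bg=(255, 255, 255)):
    """Paste reference images side by side on one canvas, the manual
    'concatenation' trick used with the original Qwen-Image-Edit."""
    imgs = [Image.open(p).convert("RGB") for p in paths]
    height = max(im.height for im in imgs)
    # Scale everything to a common height so the subjects stay comparable.
    imgs = [im.resize((round(im.width * height / im.height), height)) for im in imgs]
    width = sum(im.width for im in imgs) + pad * (len(imgs) - 1)
    canvas = Image.new("RGB", (width, height), bg)
    x = 0
    for im in imgs:
        canvas.paste(im, (x, 0))
        x += im.width + pad
    return canvas

# stitch_horizontally(["person.png", "scene.png"]).save("stitched.png")
```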
>>106668216
>waste of money
it's literally not. vram is absolute king.
>Not very interested in video
it's beneficial for training loras too and future proofing for the latest models, but you do you. 100% wasted.
>>106668212
https://humanaigc.github.io/wan-animate/
>tl;dr
Character replacer for videogen.
>>106668236
who's the girl anyway?
>>106668234
>vram is absolute king
would you take a 96GB GTX 680 over a 16GB RTX 4080?
>>106668236
Seems VACE is a component within Wan Animate, so essentially it's doing the same thing except better I guess.