Discussion of Free and Open Source Text-to-Image/Video Models

Prev: >>107284812

https://rentry.org/ldg-lazy-getting-started-guide

>UI
ComfyUI: https://github.com/comfyanonymous/ComfyUI
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
re/Forge/Classic/Neo: https://rentry.org/ldg-lazy-getting-started-guide#reforgeclassicneo
SD.Next: https://github.com/vladmandic/sdnext
Wan2GP: https://github.com/deepbeepmeep/Wan2GP

>Checkpoints, LoRAs, Upscalers, & Workflows
https://civitai.com
https://civitaiarchive.com/
https://openmodeldb.info
https://openart.ai/workflows

>Tuning
https://github.com/spacepxl/demystifying-sd-finetuning
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/musubi-tuner
https://github.com/kohya-ss/sd-scripts
https://github.com/tdrussell/diffusion-pipe
https://github.com/ostris/ai-toolkit

>WanX
https://rentry.org/wan22ldgguide
https://comfyanonymous.github.io/ComfyUI_examples/wan22/

>NetaYume
https://civitai.com/models/1790792?modelVersionId=2298660
https://nieta-art.feishu.cn/wiki/RY3GwpT59icIQlkWXEfcCqIMnQd
https://gumgum10.github.io/gumgum.github.io/
https://huggingface.co/neta-art/Neta-Lumina

>Chroma
https://huggingface.co/lodestones/Chroma1-Base
Training: https://rentry.org/mvu52t46

>Illustrious
1girl and Beyond: https://rentry.org/comfyui_guide_1girl
Tag Explorer: https://tagexplorer.github.io/

>Misc
Local Model Meta: https://rentry.org/localmodelsmeta
Share Metadata: https://catbox.moe | https://litterbox.catbox.moe/
GPU Benchmarks: https://chimolog.co/bto-gpu-stable-diffusion-specs/
Img2Prompt: https://huggingface.co/spaces/fancyfeast/joy-caption-beta-one
Txt2Img Plugin: https://github.com/Acly/krita-ai-diffusion
Archive: https://rentry.org/sdg-link
Bakery: https://rentry.org/ldgcollage

>Neighbors
>>>/aco/csdg
>>>/b/degen
>>>/b/realistic+parody
>>>/gif/vdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/trash/slop
>>>/vt/vtai
>>>/u/udg

>Local Text
>>>/g/lmg

>Maintain Thread Quality
https://rentry.org/debo
>>107294986
>Calling it now: Flux2-Small, the only version they will release the weights for. Will probably be pretty good actually
https://xcancel.com/bdsqlsz/status/1992248711603454072#m
it might be Flux 2 small, he said that it won't be a chinese model
>>107294079
>why does it look bad?????
There are wan 2.1 workflows that are simpler than mine and look better too. I was also asking because I'm not sure what the point of the second sampler in anon's workflow even is, beyond an upscale step to get high resolution, and I can't replicate it with a basic ksampler.
Actually I was looking for the civitai workflow to show you, but I found like 3 other ones, so honestly I'll just do my own research tonight and find a really good text-to-image workflow that uses as few custom nodes as possible.
>>107295046
>SOTA in terms of portrait photography
>posts the sloppiest base SDXL-esque picture ever
Are these guys just slop blind? That literally looks like a 3D render.
>>107295185
>Are these guys just slop blind?
they probably are, in china women wear so much makeup they think plastic skin is the default or something
>>107295208
>in china
>>107295046
>he said that it won't be a chinese model
zero reading comprehension award
>>107295235
the guy that says it looks realistic is chinese, are you fucking retarded?
>twitter screencap
>>107295235
no u
>>107295245
>X (the everything app) screencap
FTFY
Do you think we might get local Sora in ~5 years? The porn (and electricity bills) would be unreal.
>>107295267
This depends entirely on the GPU market.
>>107295208
It's pretty bizarre. Maybe he thinks more detail means SOTA realism rather than a proper photoreal look, or he could be talking purely about aesthetics (think MJ-type aesthetics). MJ's "realism", while up there in detail and aesthetics, isn't strictly realistic either.
>>107295267
>Do you think we might get local Sora in ~5 years?
Sora has IP characters and IP styles in there, no company is gonna give that to localkeks, so never
is it still itoddler only kek
is hyvid2 good as a t2i model at least?
>>107295185
>one tiny image in a screencap from an unreleased model
>entire model is slop trash
please, please, touch grass
>>107295158
That's pretty great, mind sharing the prompt?
>>107295389
Often a single image is all one needs to see. Like those guys calling Seedream 4 photorealistic but then they're posting slop that looks like SDXL.
>installed comfyui
>put this safetensor file GonzaLomo XL/Flux/Pony in checkpoint folder and tried unet folder
>it doesn't show up in comfy so I drag the file in
>install the custom nodes
>now this
WTF. Any help?
New Hunyuan is just 2 seconds long???
>>107295612
no, 5 seconds takes an age but 2 is reasonable
>>107295527
Click on the selector list and pick it once more. Also try putting it into the checkpoints folder instead.
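To sanity-check where the file actually landed, here's a quick hypothetical sketch. The folder names follow a stock ComfyUI install (models/checkpoints for full checkpoints, models/unet or models/diffusion_models for UNet-only weights); the helper function itself is not part of ComfyUI, just an illustration of where the loader nodes look.

```python
import os

# Subfolders under ComfyUI/models that a stock install scans for weights.
# Checkpoint loaders read "checkpoints"; UNet-only loaders read the others.
SCAN_DIRS = ("checkpoints", "unet", "diffusion_models")

def find_model(models_root, filename):
    """Return which standard subfolder under ComfyUI/models holds the file,
    or None if it sits somewhere the loader nodes won't look."""
    for sub in SCAN_DIRS:
        if os.path.isfile(os.path.join(models_root, sub, filename)):
            return sub
    return None
```

If this returns None for your safetensors file, no dropdown will ever list it; move the file into one of those folders, then hit refresh in the UI or restart ComfyUI so the folders get re-scanned.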
>>107295288
>>107295185
SD1.5 as a video?
How does hunyuan1.5 compare to wan2.2?
How does it perform with anime?
>>107295906
the myriad of snake oil speedups wanx has notwithstanding, wan is better at sequential actions, a broader range of single actions, likeness preservation and clarity. Hyvid is 24fps, and... vramlet friendly
>>107295906
>How does hunyuan1.5 compare to wan2.2?
it's bad, stick with wan
>>107295906
the current model is incredibly slow. we need a 4steps version. the good news is that the t2v model is uncensored. i don't know about i2v
The secret to good text-to-image with WAN is literally just using the karras scheduler instead of simple
picrel: 40 steps, cfg 2.0, euler/karras, low noise model only
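For reference, the recipe anon is claiming, laid out as a plain dict whose keys mirror the stock ComfyUI KSampler widgets. This is just a sketch of the post's settings, not a guaranteed-good config:

```python
# Anon's claimed WAN t2i recipe, expressed as KSampler-style settings.
# Keys mirror the stock ComfyUI KSampler node widgets; values are from the post.
wan_t2i_settings = {
    "steps": 40,
    "cfg": 2.0,
    "sampler_name": "euler",
    "scheduler": "karras",  # the "secret": karras instead of the default "simple"
    "denoise": 1.0,         # low noise model only, full denoise pass
}
```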
>>107295612
Yep, what this anon >>107295626 said, 2 secs takes only 2 mins on my 3090, good for testing lots of different prompts
Here is 1080p imagegen with the lightning Seko rank64 lora. 37 seconds, 10 steps, euler/karras
>>107296057
I'm getting flashbacks to December 2024 with this garbage. This shit is dead on arrival. The only reason I'd even download these weights is that they mentioned using both FlashAttention and SageAttention in Hunyuan, which is retarded since SageAttention is a replacement for FlashAttention, so there may be a free lunch there. But who even cares when wan 2.2 still mogs
>>107295923
The clarity in Wan may be there, but in T2V the slop is more noticeable and the motion is slowed. With a decent NSFW tune, Hunyuan comes out on top due to being easier to adopt.
With the original 2.2 lightning t2v lora
and finally the original wan 2.1 lightx2v lora
I think I am now a Seko-recommender over just using the 2.1 lightning lora for text-to-x purposes
>>107296106
>in T2V the slop is more noticeable and the video's motions are slowed
video motion was mostly fixed for me with the release of the Seko T2V lightning loras, and I haven't really seen anything but improvements using it compared to previous lightx2v releases, especially the wan 2.2 ones
>>107295985
>we need a 4steps version.
No need to slop it further. That's the issue with Wan: it takes up to 20 mins for a simple smooth non-jagged video because the community has accepted color-shifting speed LoRAs as the standard.
>>107296092
Looks pretty good on some prompts desu. Way better solution than lightx2v Wan.
>>107296142
>tested one prompt
>draws definite conclusion
>>107296178
>color shifting speed LoRAs
never had this problem even with older loras, this is a quantlet/wf issue
>>107294974
Is the "1girl and Beyond" rentry guide and its accompanying workflows still worth using in the current year?
Would it be worth exploring different workflows for WAI? Any worthwhile benefits?
>>107295267
>Do you think we might get local Sora in ~5 years?
we will definitely get some form of a video + audio model with lipsync in 2026. all the technology has been invented, now it's just up to someone to collect the data and train it
at worst it's going to be wan 2.2 + an audio addon, but at best it'll be a fully new model. and of course it will be no less than 24gb of vram to run, probably 32gb minimum at Q6_K, so mentally prepare for that because memory is not getting cheaper or more available any time soon
>>107296178
>Looks pretty good on some prompts desu. Way better solution than lightx2v Wan.
seko is just the newest t2v release by lightx2v
seko v2 actually, i guess v1 sucked since i never heard about it
https://huggingface.co/lightx2v/Wan2.2-Lightning/tree/main/Wan2.2-T2V-A14B-4steps-lora-rank64-Seko-V2.0
>>107296186
you can't demotivate me deebster, I already got within 10% of the workflow with shark-autism nodes in this gen
>>107296198
it's hard to take opinions seriously because people may have misconfigured something. i spent 11 hours 3d printing something and thought the model was the problem, and I was super pissed at all the retards that gave it 5 stars until I realized I made a mistake
>>107296142
is this fucking AI?
>>107296142
>video motion was mostly fixed for me with the release of the Seko T2V lightning loras
It still looks pretty poor in some cases
https://huggingface.co/lightx2v/Wan2.2-Lightning/discussions/64
And they all have something in common: slow motion and slop (though even the base model has the plastic skin, so that can't be fixed easily).
Still waiting on an updated I2V LoRA though, maybe it's currently better than I think it is for that.