Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>102940941

2 4 8 16 32 64 128 Edition

>Beginner UI
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io
Metastable: https://metastable.studio

>Advanced UI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
reForge: https://github.com/Panchovix/stable-diffusion-webui-reForge
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://aitracker.art
https://huggingface.co
https://civitai.com
https://tensor.art/models
https://liblib.art
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3

>SD3.5
https://huggingface.co/stabilityai/stable-diffusion-3.5-large
https://replicate.com/stability-ai/stable-diffusion-3.5-large

>Sana
https://github.com/NVlabs/Sana
https://8876bd28ee2da4b909.gradio.live

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux
DeDistilled Quants: https://huggingface.co/TheYuriLover/flux-dev-de-distill-GGUF/tree/main

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/aco/sdg
>>>/aco/aivg
>>>/b/degen
>>>/c/kdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/tg/slop
>>>/trash/sdg
>>>/u/udg
>>>/vt/vtai
>>102949088
Oh yes
More

>>102949167
You can't do that and make a good model ever

>>102949205
>You can't do that and make a good model ever
this, at this point I'm just waiting for China to deliver the goods, the west has lost the AI battle
billions must gen
>>102949193
I've got a very old gen from you

>>102949211
China gives no shits about IPs, but they have their own problems around nsfw, so not sure they'd do nice things either.
Though at least the base quality would probably be leagues better than never using any copyrighted or non-"allowed", non-"safe" content.
The consent obsession in the west is turning everything into shit.

>>102949238
>China gives no shits about IPs, but they have their own problems around nsfw,
desu they aren't much more cucked on nsfw than the west, for example Moshi can do female nudes just fine
>3 of my gens made it into the OP
Excellent

>>102949229
Great stuff, flux?

>>102949245
>Moshi can do female nudes just fine
kino picrel
https://files.catbox.moe/t6276z.mp4

>>102949265
Oh the difference is just that individuals in the chinese companies have no problem with nsfw used in their training, while the law is very anti nsfw in general, so they always tread lightly, at least in public. In the west it's the opposite, nothing nsfw is illegal, so the individuals do that by conviction. It's kind of sad really.

>>102949252
SDXL

>>102949265
>In the west it's the opposite, nothing nsfw is illegal, so the individuals do that by conviction. It's kind of sad really.
amen anon, fucking amen... I'm just glad China exists at this point because if we had to rely only on the west to get good shit I would end up depressed kek
https://www.youtube.com/watch?v=XZcN6lIVmSo
To be fair, SD3.5 got a little better in that department, it can do female nudes now, I guess that removing Emad the prude from the team was a good idea after all

>>102949252
since you're still using webui/forge, you should play with this extension https://github.com/muerrilla/sd-webui-detail-daemon
I remember using it but never got ported to comfy, thank to migu poster to post about it in the last thread
>>102949318
>I remember using it but never got ported to comfy, thank to migu poster to post about it in the last thread
any equivalent for comfy?

>>102949318
for posting about it**

>>102949295
>Emad the prude
my view is that most of them are prudes, they all write the same "safety" shit (which always means no nsfw)

>>102949346
https://www.reddit.com/r/comfyui/comments/1g9wfbq/simple_way_to_increase_detail_in_flux_and_remove/

>>102949284
What model? it looks good

>>102949318
I will look into it, thanks mate

>>102949371
thanks man

>>102949318
that dev is so good, this is another extension that doesn't have a comfyui port
https://github.com/muerrilla/stable-diffusion-NPW

>>102949371
Idk about that method, I can't find a good value between 0.95 and 1, it's not consistent, what value consistently looks the best to you anon?

>>102949477
they all look great anon, 0.95 seems to be the perfect value
that one pic in the collage with the miku tranny is savage
>>102949522
>they all look great anon
look at the prompts, it's really hit or miss, sometimes they miss the text to be displayed, sometimes it doesn't understand it as well, I mean for you that's all right, you go for simplistic 1girl images, but if you go past that I'm not sure this is a good deal

>>102949536
He will never be a Migu, many such cases :(

>>102949318
This detail daemon extension seems to work wonders, thanks mate
kek
>>102949546
well, you could try re-formatting your prompt, have you tried using an LLM bot to rewrite your prompts? that can help with formatting and grammar
Can you multigpu on the video models that just got released? I have 1 3090 and 1 3080...
OK but why would you do any of this
>>102949601
because it's on ComfyUI, you can put the text encoder on the 2nd gpu, but that's all, I don't think you can do inference parallelism

>>102949575
don't thank me, thank migu, I totally forgot about that extension until he posted about it

>>102949613
thanks, is that faster doing that or is there no point?
I've also seen this: https://github.com/victorchall/genmoai-smol
But it's monogpu.

>>102949622
>thanks, is that faster doing that or is there no point?
I guess it's faster the moment you want to change the prompt, it won't unload and reload the text encoder, but desu it's not that much of a deal, making a video is fucking long so you won't care about saving 10 sec on the text encoder kek

>>102949477
here's another example

>>102949585
>50s horror (makeup)
>80s horror (soul)
>modern horror (cgi)

>>102949642
I see, sad
>>102948557
I did some blind comparisons with flux lite. I've got four sets of images of the same 10 seeds of this knight guy.

Lite plus artstyle lora vs Dev Q8 plus artstyle lora:
Results: 1 vote for lite, 9 votes for dev.
The artstyle lora does mostly work, but it's noticeably capturing only like 90% of the style.

And here's the second comparison without the artstyle lora:
Lite with no lora vs Dev Q8 with no lora:
Results: 3 votes for flux lite, 7 votes for dev q8

So in conclusion: flux lite is 23% worse in exchange for being 23% faster.
>>102949648
>>102949618
https://www.reddit.com/r/comfyui/comments/1g9wfbq/comment/lte0rdg/?utm_source=share&utm_medium=web2x&context=3
seems like they improved on the Sigma thing with this "LyingSigmaSampler" node, it adds details without changing the overall picture
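The trick described in the linked post is simple enough to sketch: the sampler keeps its true sigma schedule, but the model is told a slightly smaller sigma (e.g. x0.95) during the middle of the schedule, so it denoises as if the image were cleaner and hallucinates extra detail. A minimal sketch of that idea, assuming made-up parameter names (lie_factor, start/end fractions are illustrative, not the actual node's options):

```python
# Sketch of the "lying sigma" idea: return the sigmas to *report* to the
# model for each step, scaling the mid-schedule ones by a factor < 1.
# The true schedule the sampler integrates with is left untouched.
def lying_sigmas(sigmas, lie_factor=0.95, start=0.1, end=0.9):
    """sigmas: true sigma schedule (descending floats)
    lie_factor: multiplier in the affected window (<1 adds detail)
    start, end: fraction of the schedule where the lie is active"""
    n = len(sigmas)
    out = []
    for i, s in enumerate(sigmas):
        frac = i / max(n - 1, 1)
        out.append(s * lie_factor if start <= frac <= end else s)
    return out

# toy schedule: first and last sigmas stay honest, the middle is scaled
sched = [14.6, 7.0, 3.1, 1.4, 0.5, 0.0]
print(lying_sigmas(sched))
```

Values too far below 0.95 visibly change composition, which matches the 0.95-to-1 range anons settle on below.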
>>102949674
>Lite with no lora vs Dev Q8 with no lora:
>Results: 3 votes for flux lite, 7 votes for dev q8
ok so that's a fucking nothingburger, Q8 is 12gb whereas flux lite is 16gb, and Q8 wins lol

>Moderated: QUALITY
That's a new one
Still lets you download the gen though, just doesn't show it to you online for some reason

https://github.com/kijai/ComfyUI-MochiWrapper
Ok gentlemen, I just did a bf16 vs fp8 comparison, they have the exact same settings (picrel)
>A 22 years old woman dancing on the Hotel Room, she is holding a Pikachu plush
bf16
https://files.catbox.moe/fubqwj.webm
fp8
https://files.catbox.moe/92gksm.webm

>>102949717
>>102949730
>>102949738
How does one achieve this

>>102949295
3.5 Medium is apparently multi-res while Large isn't, so it might be better even

>>102949764
Not use models with zero aesthetics like flux trained on synthetic slop and instead use models trained on actual art
https://civitai.com/models/833294/noobai-xl-nai-xl

>>102949772
>3.5 Medium is apparently multi-res while Large isn't, so it might be better even
there's no way a 2b model is gonna be better than an 8b model, right? I can feel SD3.5M will be a distilled version of SD3.5-8b

>>102949675
it's amazing how webui extension devs mog the comfyui ones: adetailer, npw, detailerdaemon, resharpen, hires-fix-tweaks, webui-controlnet, I could go on...
meanwhile all custom nodes do is pollute your comfyui with schizo options and tweaks, the only good custom node dev is kijai, the rest are really autistic
>>102949717
>>102949730
I'm in awe

>>102949795
You assume everyone's purpose is to create "quality". Some of us want to generate an image that corresponds to the prompt.

>>102949755
We were so hyped after those previews... Minimax at home they said...

>>102949675
How's that work?

>>102949816
>You assume everyone's purpose is to create "quality". Some of us want to generate an image that corresponds to the prompt.
if this was true, SD3M would be a popular model and not a meme, because it follows the prompt well

>>102949823
>We were so hyped after those previews... Minimax at home they said...
we'll get Minimax at home anon, it'll be the HD version, that's the one they probably use in their demos
https://www.genmo.ai/blog
>Today, we are releasing our 480p base model, with Mochi 1 HD coming later this year.

https://github.com/kijai/ComfyUI-MochiWrapper
Has anyone managed to make sage attention work on windows? I got those weird ass errors

>>102949889
These look nice.

>>102949823
Genmo falls off very quickly as soon as you stray from the training data. Heavy cherry picking is needed too. But let's not lie to ourselves: img2video is the only use case valuable for actual production companies and larger projects, since you need some way to control subject consistency.

>>102949847
Even if the HD version is also underwhelming and suffers from the same issues as the 480p version, if it's Apache 2 we'll just have to wait for PonyGenmo in 2025. I highly doubt many people will be using the HD version locally given the VRAM requirements though, even if there's a god-tier 4bit quant of the model.
Hopefully the eventual fine-tunes improve prompt adherence too, because after being spoiled by the natural language understanding of Flux and the Chinese video models it's really frustrating when genmo doesn't listen. The only consolation is I wasted 20 cents of some Silicon Valley incubator's seed funding on the failed gen

>>102949800
no, it's literally a different model apparently. He's saying 3.5 Large is just a finetune of old 3.0 Large, but 3.5 Medium was done up from scratch, which let them add new stuff. So he believes that's why it has multi-res training but Large doesn't. It definitely won't be distilled like Large Turbo, also.
I think it's possible for it to have worse prompt adherence but better image quality, at least.
>>102949908
>Even if the HD version is also underwhelming and suffers from the same issues as the 480p version, if it's Apache 2
that's my concern, maybe it won't be apache 2 for the HD version

>>102949935
Handsome guy. At first I thought he was a lady.

>>102949908
>highly doubt many people will be using the HD version locally given the VRAM requirements
I'm sure that's possible, the 480p + fp8 version only asks for 12gb of vram during inference and only 10gb of vram during vae decoding if you go for tiled vae

>>102949960
In that case we cope with 480p or just wait 6 months for the new hotness base model. As soon as a company finds a way to be profitable with AI video, competition will increase, and Minimax beating out kling, and now the open source release of genmo (with the paper coming out soon allegedly), shows that there's no moat for video models

>>102950000
You underestimate the VRAM of the average normie, anon
Even an enjoyable SDXL experience is still out of reach for a lot of the population since they're on 8GB or less cards
>>102950017
The vram issue is because inference software sucks.

vram is waaaayyyyy too large, for ai.
The reason to have loads of vram, for games, is that players can be very random.

>>102950070
>vram is waaaayyyyy too large, for ai.
true, but we have no other choice, you can't get a good model with 1b, maybe if we improved the transformers architecture that would be possible, we'll see about that

>>102950017
SDXL is totally fine in Comfy with Nvidia Turing or higher arch cards, even at 6GB

>>102950017
How many normies do you think are running AI models locally? Their interest in AI is playing for 5 minutes with whatever the new shilled thing (DALLE3/Suno/Minimax/etc.) is and moving on

>>102950129
>How many normies do you think are running AI models locally?
a lot, there's a lot of discussion in the US about destroying the local AI ecosystem by making laws that would make it impossible to train uncucked local models

>>102950165
>there's a lot of discussion in the US about destroying the local AI ecosystem by making laws that would make it impossible to train uncucked local models
Huh? Where?

>>102950182
https://leginfo.legislature.ca.gov/faces/billNavClient.xhtml?bill_id=202320240SB1047
In Commiefornia especially, a lot of AI companies are there unfortunately
I FUCKING HATE BUILDING BINARIES, IT NEVER WORKS FUUUUUUUUUUUUUUCK

>he's actually trying to build flash attention
Give up lol, you're wasting your time. You have already spent more time than you'd ever save with slightly faster gens.

>>102950218
>slightly faster gens
flash attention is useful for memory, which is the most important resource in AI

>>102950017
According to the latest steam survey around 25% of users have 12gb or more of VRAM, which is pretty good considering a big portion of steam users are poverty Russians and Sudacas just playing DOTA2 on shitboxes

>>102950194
>>102950165
>make impossible to train uncucked local models
Technically if you're using less than $10,000,000 in compute and keeping the model to yourself it's exempt. Still gay desu.

>>102950257
>According to the latest steam survey around 25% of users have 12gb or more of VRAM
That sounds crazy when 12GB+ vram only really started existing at all after the RTX 2000 series

>>102950334
>if you're using less than $10,000,000 in compute and keeping the model to yourself it's exempt. Still gay desu.
If the US doesn't want to do it, someone else will, and that's exactly why China will win the AI race, they don't want to kill AI progress, quite the opposite, they want to make the best AI possible, oh well.

>>102950370
it's not that surprising, games nowadays aren't optimised anymore so you need a shit ton of vram, and the 3060 has a lot of vram and is cheap as fuck because Nvidia was making money off the crypto grifters during the crypto boom in 2021 kek

>>102950113
If you're okay with a single pass and no hiresfix and consider that "totally fine", sure

>>102950381
Video games being unoptimized doesn't give thirdies money they can spend on new GPUs though. I guess it doesn't matter since the future of inference is with NPU/IPUs anyways
>>102950420
>I guess it doesn't matter since the future of inference is with NPU/IPUs anyways
what's that?

>>102950443
Neural/Inference processing units. Basically dedicated hardware for running inference of models. Right now they're just embedded into the CPU and used for small things like helping blur your webcam when on a zoom call etc, but both AMD and Nvidia (as well as a lot of startups probably) are working on larger discrete accelerators
source: I worked at AMD for a bit on a project related to NPUs

>>102950494
will those NPUs be able to do graphics stuff as well? Like video games and shit? because the advantage of a GPU is that it can do both video games and AI

>>102950494
>Neural/Inference processing units. Basically dedicated hardware for running inference of models.
how much faster will it be? Let's take a comparison, how much faster would an NPU be against a 3090 for example on inference?

>>102950498
No, they're only for running inference. The advantage of an NPU is that it'll be much much cheaper than a 3k 4090 for the same ML performance, which is interesting to me because I don't really care about gayming

>>102950519
>how much faster will it be? Let's take a comparison, how much faster would an NPU be against a 3090 for example on inference?
Depends on the NPU. It'll be more efficient for sure. I wasn't working on the ML side of things but more the hardware side, so I can't give you any estimates, but I'm assuming it'll be similar to ASICs for crypto mining, where they completely mog GPUs at the same price point since they're optimized for that specific task
https://github.com/kijai/ComfyUI-MochiWrapper
Ok I managed to make sage attention work, here are the steps:
1) Install triton with these binaries
https://github.com/woct0rdho/triton-windows/releases/tag/v3.1.0-windows.post5
2) Install python 3.11.9 on your computer
https://www.python.org/downloads/release/python-3119/
3) Go to C:\Users\Home\AppData\Local\Programs\Python\Python311 and copy the "libs" and "include" folders
4) Paste those folders into ComfyUI_windows_portable\python_embeded

>>102950544
forgot step 0) -> install sage attention -> pip install sageattention
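A quick way to confirm the steps above actually took, sketched as a hypothetical sanity check (not part of the anon's tutorial): run it with the same python that launches ComfyUI (e.g. python_embeded\python.exe) and make sure both packages resolve before trying a gen.

```python
# Hypothetical post-install check for the steps above: verifies that the
# triton and sageattention packages are importable from this interpreter
# without actually importing (and initializing) them.
import importlib.util

for mod in ("triton", "sageattention"):
    spec = importlib.util.find_spec(mod)
    print(mod, "OK" if spec else "MISSING")
```

If either prints MISSING, you're most likely running the check with a different python than the one ComfyUI's embedded environment uses.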
>>102950544
>>102950555
>sage attention
>Quantized Attention that achieves speedups of 2.1x and 2.7x compared to FlashAttention2 and xformers, respectively
Neat

>>102949411
>my chatbot at the start
>by the third reply

1girl, river, walking

>>102950597
OMG IT TETO
https://www.youtube.com/watch?v=dDlljvDSLSg

https://github.com/comfyanonymous/ComfyUI/commit/f82314fcfcc4d83b307f30f06e77db44e95686cf#diff-ff903427b64d57103d983ee5eeb6c33ffb5ab760526a48a35ad42d5afafdf2fbR360-R361
Interesting, so the beta sampler had an issue on ComfyUI, maybe that could explain why that one tends to overburn the image compared to the other samplers

>>102950542
>Depends on the NPU. It'll be more efficient for sure.
what about the memory though? it's the most important thing, is it easier to add memory onto an NPU compared to a GPU?
>>102950597Now do 1girl, river, wanking
>>102950742
nice style

>>102950748
>wanking

>>102950724
>what about the memory though? it's the most important thing, it's easier to add memory onto a NPU compared to a GPU?
Memory is cheap anon, don't let Nvidia and AMD trick you into thinking it's not because of their ridiculous prices at the data center tier. They could both sell us a 32GB vram graphics card for under $1000 if they wanted to. There's just no reason to ever do that because they know people will pay 4x the price or more for it due to the gold rush and their fiduciary duty to shareholders to maximize profit

>>102950761
the problem is that it's the same greedy companies (Nvidia and AMD) that will make those NPUs, so yeah maybe NPUs will be fast, but if they only have 16gb of vram you're just as fucked as if you had a 16gb gpu

>>102950420
You can do hi-res-fix up to like 1.5x with 6GB VRAM + 16GB RAM on an SDXL model no problem. Again the card has to be Nvidia and it has to be Turing or later, though.
so many 1girl sloppu ;_;
Remember this?
https://blackforestlabs.ai/up-next/
I thought they would never release it to the public, but now that we got Mochi, maybe they'll try to enter into competition with them, that would be cool desu

>>102950788
Sure, but my point is that a professional or prosumer tier NPU will be cheaper than the equivalent GPU for equivalent inference performance

>>102950906
>the card has to be Nvidia and it has to be Turing or later, though.
I stand corrected then. How long would a 1024x1024 gen hiresfixed to 1536x1536 be on 6GB+16GB? I consider anything longer than 2 minutes per gen "unusable"

>>102951259
>my point is that a professional or prosumer tier NPU will be cheaper than the equivalent GPU for equivalent inference performance
Idk about that, if only Nvidia makes good ones, they'll just make them expensive as fuck like their overpriced GPUs
flux gave this bitch an amulet

>>102951284
I read it as "a mulet" so I was expecting this kek

>>102950544
>>102950200
btw with this method you'll be able to build your own binaries of flash attention

>>102951284
It didn't fill her other equipment slots though

>>102951140
These models are all trained really well for 1girl slop, usually if I try to make anything else it's shit. Yes there's a skill issue component, but it's easier to go with the flow and produce 1girls.

>>102950568
>>Quantized Attention that achieves speedups of 2.1x and 2.7x compared to FlashAttention2 and xformers, respectively
I wanted to verify it myself, so here you go:
>Donald Trump mocks and laughs at a kneeling, weeping Kamala Harris
>121 frames, 64 steps, seed 42
>Sage: 35:28<00:00, 33.26s/it, 16.2gb VRAM during inference
https://files.catbox.moe/dyi3cm.webm
>flash_att: 42:49<00:00, 40.14s/it, 16.2gb VRAM during inference
https://files.catbox.moe/dqxrat.webm
Sage is definitely faster, use it >>102950544
I wanted a fisherman catching a bass. Instead I get... a bass fisherman? I mean, if I'd written the words bass fisherman in the prompt, I'd get it.

>>102952099
So it's faster, but not 2.7x faster
I wonder how long it'll take to go from 30+ minutes to under 5 between optimizations and hardware improvements

>>102952099
the neat part is that sageattention is also used on image models like flux, and it's also faster than going for the sdpa optimisation, I used to get 3.6s/it on sdpa, now I'm at 2.91, let's go dude!

>>102952128
>So it's faster, but not 2.7x faster
*2.1x, the 2.7x was for sageatt vs xformers, but yeah of course they went for the most extreme of cases to get this number (as all researchers do kek), having a 1.2x speedup is still really cool though, and when I look at those 2 videos I notice the quality on sage is better, so it's a win/win situation there
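For reference, the speedups implied by the s/it numbers posted in the thread (33.26 vs 40.14 for the Mochi gen, 2.91 vs 3.6 for flux) work out like this:

```python
# Back-of-envelope check of the reported speedups: sage vs flash attention
# on the Mochi video gen, and sage vs sdpa on flux (both in seconds/iteration).
def speedup(baseline_s_per_it, new_s_per_it):
    return baseline_s_per_it / new_s_per_it

mochi = speedup(40.14, 33.26)   # flash_att -> sage, ~1.21x
flux = speedup(3.6, 2.91)       # sdpa -> sage, ~1.24x
print(f"mochi: {mochi:.2f}x, flux: {flux:.2f}x")

# wall-clock saved over the 64-step video gen, in minutes (~7.3 min,
# which matches the 42:49 vs 35:28 totals above)
print(f"saved: {64 * (40.14 - 33.26) / 60:.1f} min")
```

So in practice it's ~1.2x end to end, well short of the paper's 2.1x headline figure, consistent with the anon's conclusion.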
Does everyone know what cuda malloc does? When I disable it I don't notice any difference

>>102949755
Thanks for the comparison. I'll give bf16 a go today. I think you had an "unlucky seed" for your test btw.

>>102952265
>I'll give bf16 a go today. I think you had an "unlucky seed" for your test btw.
probably yeah, it was on fp8 too, unfortunately you can't go too far on bf16, when I go over 60 frames it overflows, I wish there was something in between fp8 and bf16 so that you'd get the quality and enough space to stack up the frames

>>102952287
>I wish there was something in between fp8 and bf16 so that you'll get the quality and enough space to stack up the frames
Why does bf8 not exist for fp8 like how bf16 exists for fp16?
kek, so this general just devolved into coomposting too. it's just sdg 2.0 now

>>102952412
These are great, having a human for scale is a nice touch
So can you run that mochi thing on 2x3090, on comfy (on linux) ?
>>102949193
>>102949229
>>102949252
>>102949318
>>102949358
>>102949371
>>102949409
>>102949522
>>102949599
are these real or AI? I don't have my glasses right now

>>102950544
https://reddit.com/r/StableDiffusion/comments/1gb07vj/how_to_run_mochi_1_on_a_single_24gb_vram_card/
Just made a long ass tutorial for those who weren't able to make Mochi work on their computer

>>102945840
ok I've made this program (claude did) to help simplify the second half of the process:
https://github.com/rainlizard/EasyQuantizationGUI/releases
For anyone on Windows who wants to convert the 24GB flux1-dev.safetensors file to a 12GB .gguf it should be pretty easy now.
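The 24GB -> 12GB figure above is roughly what you'd expect from an 8-bit quant; here's a back-of-envelope sketch (the 12B parameter count and the ~8.5 bits/weight for Q8_0 are my assumptions for illustration, not numbers from the post):

```python
# Rough size estimate for quantizing a flux-sized model, purely illustrative.
# Assumptions: ~12e9 parameters, bf16 = 16 bits/weight, Q8_0 ~ 8.5 bits/weight
# (8 bits per weight plus per-block scale overhead).
def model_size_gb(n_params, bits_per_weight):
    return n_params * bits_per_weight / 8 / 1e9

params = 12e9
print(f"bf16: {model_size_gb(params, 16):.1f} GB")   # 24.0 GB
print(f"Q8_0: {model_size_gb(params, 8.5):.1f} GB")  # ~12.8 GB
```

Same arithmetic explains the other quants floating around the thread: halve the bits, roughly halve the file.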
>>102952977
>So can you run that mochi thing on 2x3090, on comfy (on linux)?
you can make it work with a single 3090 >>102953060
guys I finally got the "room heater" comfyui extension
>>102953224
>150 steps
what? you can go lower, the default value for Mochi is 64 steps

>>102953068
>For anyone on Windows who wants to convert the 24GB flux1-dev.safetensors file to 12GB .gguf it should be pretty easy now.
nice job anon, I'll keep this link in mind, I wanna make ggufs of other flux variant models

>>102953278
I think the examples used by mochi use 200, and I can wait. I can't fucking install sageattn so I'm using pytorch's default attention, how much slower does it make the process?

>>102953302
>I think the examples used by mochi use 200
I've heard it was 64, how did you get that number?
>I can't fucking install sageattn so I'm using pytorch's default attention
Why? It works on both windows and linux >>102950544

>>102953325
>how did you get that number?
I half-remember reading it somewhere like yesterday, maybe I made it up idk

>>102953325
I know, but triton gives me a message about not being able to find Windows Kit/10/Include and I've been fiddling with pip shit for an hour now, so I've given up

>>102953357
>triton gives me a message about not being able to find Windows Kit/10/Include
did you do this? >>102950544
>3) Go to C:\Users\Home\AppData\Local\Programs\Python\Python311 and copy the "libs" and "include" folders
>4) Paste those folders onto ComfyUI_windows_portable\python_embeded

>>102953372
Whoops, no. I'll do it after I get this gen, thanks

>>102953060
Why specify 24gb when it works on a 16gb card? Well, at least in Linux it does.

>>102953525
>Why specify 24gb when it works on a 16gb card
true, but my tutorial is for both fp8 and bf16, and your 16gb card can't handle the bf16

Flux 8B model
https://huggingface.co/Freepik/flux.1-lite-8B-alpha
>>102953753
it's still distilled right?
The Russian teens are middle eastern indians now wtf. What garbage captioner did the genmo team use or is the website secretly modifying prompts?
Here's the better of the two to show it wasn't just a fluke
>>102953771
>>102953782
>1696x960
is that the official resolution you get when downloading those videos? if yes then it means their demo is using the HD version, we've only got the 480p model locally so far

>>102953545
I tried, you are right, much sadness :(
Good reddit post though.

>>102953835
>I tried, you are right, much sadness :(
it's ok, the fp8 isn't that different from the bf16, you don't lose much and you can still run Mochi
>Good reddit post though.
thanks :3

>>102953790
That's the official resolution yeah. That would explain why the catbox webms of the 480p gens look so much worse.

>>102953848
All my gens with fp8 have had considerable "wavy mirage" effects on motion, I'd hoped the bf16 would reduce that. Facial coherence and generation is sometimes as spot on as FaceDetailer for sdxl, but it's wildly unpredictable through the 8 or so gens I've done so far.
This is the first instance since I bought my GPU last Christmas (4060ti) that I've desired a new card due to the processing time, it was good while it lasted.

>>102953918
>All my gens with fp8 have had considerable "wavy mirage" effects on motion, I'd hoped the bf16 would reduce that,
I think it does yeah >>102949755, but the quality is still not on par with their demo (which probably uses the HD version), I'm trying to increase the resolution and see what it does kek (this resolution and fp8 ask for 16gb of vram)

>>102953918
>All my gens with fp8 have had considerable "wavy mirage" effects on motion
maybe Comfy's new scaled fp8 could fix this, it's supposedly better in quality than the regular fp8, dunno how to make those though
https://huggingface.co/comfyanonymous/flux_dev_scaled_fp8_test

She's looking at me judgingly because of my promptletness
The online site's text prompt moderation works like Luma's, where one prompt works but a very similar one with one word changed doesn't. In this case "bathroom" was blocking the prompt until I changed it to "apartment"

>>102954000
>...dunno how to make those though
I don't think Comfy would do it specifically for mochi, but some of the techniques might be applicable for people in that area to use if they wanted to try.
Also, I think captcha is about to break, had a few just now where the slider doesn't change the image.

Remember that meme? Good times.
https://xcancel.com/__theben/status/1829554120270987740

>>102953070
thanks anon
SD3.5L is pretty decent so far
>>102954405
anatomy is very bad for me and it doesn't work well with resolutions outside of 1024x1024, but they said they used an improved architecture for sd 3.5 medium so I'm waiting for that

Yep, their demo is definitely the HD one, because the demo renders at 1696x960, and when I put that resolution on the local 480p model I got this blurry shit kek

>>102954433
>they said they used an improved architecture for sd 3.5 medium so I'm waiting for that
desu they should've used the improved architecture for the 8b as well, they could've beaten flux that way, their 2b model won't move the needle I'm afraid

>>102954448
I'm assuming the 8b was just released as a way to generate hype and the medium model is what they put most of their work into. if they had released the medium model by itself it would have been met with disappointment ig

>>102954466
it's just a 2b model, they shouldn't focus on this little shit, they had the chance to beat flux with their 8b and they didn't go for it, why are they so fucking retarded??

>>102953068
Just curious, are we able to quantize any of the general image diffusion models like sd3 or pony and run them from sd webui forge?

>>102954490
they are a collapsing company, I doubt they have the resources to focus on the 8b anymore. back when sd3 released, the old 8b was the one they locked behind the api to generate some coin out of desperation, and even that failed because they were outclassed by the competition
what do you jabronis put in your negative when genning with flux? Bad hands, bad image, ugly, deformed? Does any term in the negative improve overall image quality?
>>102954433
Okay, I'll keep having a good time though

>>102950544
>>102953060
Do I have to use portable Comfy? I assume dropping those Python311 libraries into my Python 312 install won't work?

>>102954569
>is happy generating white plates with strawberries and chocolate splattered around
I guess I'll envy you.

>>102954529
No, the only use is if your gen has something you didn't want, add that to negs and regen.

>>102954490
>they had the chance to beat flux with their 8b and they didn't go for it
They tried, a Base Model with naked women is proof they really tried.
The bad thing about incompetence is that you can't do it no matter how hard you try.

>>102954707
>I assume dropping those Python311 libraries into my Python 312 install won't work?
I guess you can install python 312 on your computer and copy those libraries over, don't forget to also download the triton build that has python 312 in it

>>102954940
I mean, they will be using a better architecture for SD3.5-2b, if they had also done that for SD3-8b, I'm sure they would've caught up with Flux

>>102954707
>Do I have to use portable Comfy?
If you know where the python.exe that runs your Comfy is located, you can remix this tutorial to make it work I guess
https://github.com/kijai/ComfyUI-MochiWrapper
https://huggingface.co/Kijai/Mochi_preview_comfy/blob/main/mochi_preview_dit_GGUF_Q4_0_v1.safetensors
>GGUF.safetensors
didn't know it could work like that :d

>>102955078
can I run this shit on my 3060 12gb?

>>102955097
without any issues, fp8 was asking for a bit more than 12gb of vram

>>102955078
ok, I'm mildly aroused now.

>>102955078
this mf transformed a gguf into a safetensor, that's a fucking genius move if you ask me, maybe if we do the same for flux ggufs and transform them into safetensors, they'll be as fast as fp8 and won't be slow motherfuckers when we add loras on top of them?

>>102955168
neat

>>102955078
it's crazy to think of where we were at the end of 2022 compared to now

it's impressive how much of an impact this tiny node has https://reddit.com/r/comfyui/comments/1g9wfbq/simple_way_to_increase_detail_in_flux_and_remove/

https://civitai.com/models/883426/verus-vision-10b?modelVersionId=988886
This is the first finetune of dedistill, I like the skin texture, it doesn't look like plastic anymore

>>102955467
>This is the first finetune of dedistill
>finetune
not even close, it's just a Lora merge
They'll never let us down
A little bread from heaven
>>102955532they know what to do, a bigger model will make their shit relevant, we're waiting BIGMA my dead Mr. Lawrencehttps://www.youtube.com/watch?v=jElCDsfptVU
>>102955554I know you love to burn money doing inefficient shit, but efficiency makes it possible to train big models. Or maybe you like having big models restricted to people with $100k in hardware?
>>102955566
>Or maybe you like having big models restricted to people with $100k in hardware?
why are you whining like that? the llm fags are dealing with fucking 70b models, you have no idea how good you have it
>>102955593Yeah, must be nice relying 100% on Meta lmao
>>102955593>>102955598a tale as old as time
>>102955548goddamnit just give us the model from the demo I don't give a fuck give me new toys
>>102955610
you're already tired of the SD3.5 toy from yesterday??
>>102955619you should've stopped caring about SAI awhile ago
>>102955467
I tested it a few threads ago, it doesn't really improve over regular dedistill while removing a lot of flexibility. I believe any perceived skin detail increase is just the extra grain Verus adds.
Left to right: Verus > DeDistill fp8 > Distilled fp8
>>102955643
I kinda do. the last time I gave a fuck about them was during the SD3M fiasco, it was funni, and then they released SD3.5. I tested it out for an hour yesterday and quickly noticed it wasn't at the level of flux, put it in the trash and went on with my life. I'll care about them again for the next release though, if it's still inferior to flux it gets 1 more hour, if not then they succeed in their redemption arc kek
>>102954992>>102955029What model is that?
>>102955655I see, what I notice though is that the distilled version seems to have more realistic humans than vanilla flux dev, but it's just one picture so it's hard to make a definitive conclusion about that
>>102955078https://huggingface.co/Kijai/Mochi_preview_comfy/tree/mainWhich one is recommended? The Q4? The fp8? The bf16? Is there a difference in quality?
>>102955731
>Which one is recommended? The Q4? The fp8? The bf16?
>Is there a difference in quality?
so far we only have image models as a baseline, but fp8 seems to introduce some blur glitches during movement >>102949755 >>102953918
>>102955665
I immediately saw potential in XL despite the initial reaction from anon, but I also saw potential in sigma which, to be fair, got some love but never became the meta. regardless, the trajectory of SAI and the people who left for BFL seems to be larger and larger models, which is not something I want. I do like flux but I'm not going to delude myself into thinking its ecosystem will take any less time to flesh out than XL's, in fact we all know it'll take MUCH longer. I'm sure some are okay with waiting and don't mind the hardware requirements, but I think it's idiotic to completely forgo the idea of making smaller, better models. flux is bloated as hell and sd3 appears to suffer from le safety demon. it's like software in general being more about "let's put more shit in it" rather than "how can we optimize" or whatever.
I could simply be poorfag coping but I don't think I am.
>>102955775thanks anon, the videos are both very blurry anyway, I hope all local gens aren't that badI'll probably try both anyway
>>102955885
we won't get the same quality as their API demo because they are using the HD version, and we are not
https://www.genmo.ai/blog
>Today, we are releasing our 480p base model, with Mochi 1 HD coming later this year.
>>102954440 >>102953790 >>102953884
>>102955838
I just don't get how people don't see that Pixart gets decent results with 600m, much better than SD 1.5 or base SDXL, so why wouldn't a 1.6B model with an extremely efficient architecture be something that could be very good, especially as a niche model? We don't need every model to be the kitchen sink, in fact Pony is a perfect example of a niche model that everyone likes. There's no reason to believe that Sana 1.6B can't be the next hyper-specific booru anime model.
>>102955899
oh I see, and I'd guess the HD one would be unusable locally unless you get it to work on multiple h100s...
>>102955775q4 produced very blurred output and needed the frame_batch cranked right down. Testing again on a small gen rn using the default prompt and seed
>>102955926
>oh I see, and I'd guess the hd would be unusable locally unless you get it work on multiple h100...
not at all, I was able to make a high-resolution video on my 3090. of course the result looked like absolute shit because the model was never meant for that, it was just to test whether the VRAM would be enough, and it is, fp8 asks for 16gb of VRAM, we're good o/ >>102954440
>>102955958oh very cool, hopefully the hd one is released soon then
>>102955902I think base sigma being so undertrained compared to the competition did it. Perhaps some of it was skill issue and likely the audience for base models don't care about aesthetics (which it had in spades). I could shizo-babble about the west conspiring to kill any chinese competition by means of making their models look bad or something, but I think the real reason is something else. I know for a fact that, somehow, many were filtered by its install. Somehow they didn't know where to place the files for sigma but then when Flux/SD3 arrived, magically they remembered or figured it out. I wish I knew the answer to your question, but I don't.
>>102955939
I retract this, seems it was just a bad gen idk... iterate, iterate, iterate.
>>102956002It's the retarded bigger number meme. There's a reason why the Xbox 2 was called the Xbox 360 (because of the PS3). But I also think it was the timing, Pixart is too small of a model and SAI was promising a Flux-like super model so no one wanted to switch and even to this day ComfyUI doesn't natively support Pixart.
>>102956036>It's the retarded bigger number meme.so much of a meme that the SOTA local model is the biggest one
>>102956036>ComfyUI doesn't natively support Pixart.i have a suspicion that this is due to some dumbass politics between comfy and city et. al.
>>102956045
The problem with Flux in particular is that no one challenges the idea that 12B is anything but a means of preventing local competition. It's what I would've done if I were monetizing in this space: you purposely make base models impossible to train, and you get to build a loyal audience of window lickers while ensuring no one will make a Pony model to compete against you. We already see the results of this, BFL gets way too much respect despite ghosting us.
https://www.reddit.com/r/comfyui/comments/1g9wfbq/comment/lte0rdg/?utm_source=share&utm_medium=web2x&context=3
Ok now that's impressive https://imgsli.com/MzExNjQ2
>>102956074
>BFL gets way too much respect despite ghosting us.
I think you overestimate the % of trainers in the ecosystem. this model is currently being downloaded more than a million times per month, and the vast majority of people are just using models, not training them. to them Flux is excellent and they'll never see its shortcomings in training because they'll never train anything themselves, so their feelings towards BFL can only be positive overall, they got a model that is consistently good and that's it
>>102956079 dishonesty_factor is probably the funniest name for a setting ive seen thus far
dishonesty_factor
>>102956122Just sucks because this will arrest local development for a year or more.
>>102955467
>>102955508
>verus-vision was lauded as the first flux finetune by anon
>it's not actually a finetune
holy kek
>Hi everyone, yes, we are still alive! Thank you for your attention to SANA, our latest work on efficient text-to-image generation. It was developed jointly by people from NVIDIA, MIT, and Tsinghua University.
>We are preparing to open source SANA recently (waiting for the company's approval process, but whether it can be open source depends on the company's approval result). If you have any suggestions, you can leave a message directly in the channel, send an email to the SANA team (enzex@nvidia.com / junsongc@nvidia.com), or fill in this Google sheet
>https://docs.google.com/spreadsheets/d/1rQWGYdswcl8O6V5Vu3AqtBh9PotkmTkxN2inI_njDy0/edit?gid=0#gid=0
anything you guys wanna tell them?
>We initially plan to support ControlNet and expand to Video generation. We hope that with community feedback, SANA will get better and better.
i had a feeling they were planning on using sana as a base for a video gen model, it's the only way the super-compressed vae would make sense
>>102956249Super compressed VAE is good for both video and high resolution images. Also less tokens required for the model to learn which means per parameter efficiency goes up.
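The token saving is easy to quantify: a DiT-style backbone sees (H/f/p) x (W/f/p) latent positions, where f is the autoencoder's spatial downsampling factor and p the transformer patch size. Quick arithmetic sketch; the f values follow the thread (f=8 for the usual SD VAE, f=32 claimed for Sana's deep-compression autoencoder), while the patch sizes (p=2 for the f=8 case, p=1 for f=32) are assumptions for illustration:

```python
# Rough token-count arithmetic for a DiT-style backbone.
# tokens = (H // f // p) * (W // f // p)
# f = autoencoder downsampling factor, p = transformer patch size.
# f=8 vs f=32 follow the thread; the patch sizes are assumptions.

def latent_tokens(height, width, f, patch=1):
    return (height // f // patch) * (width // f // patch)

sd_vae = latent_tokens(1024, 1024, f=8, patch=2)    # 64 * 64 = 4096
sana_ae = latent_tokens(1024, 1024, f=32, patch=1)  # 32 * 32 = 1024
print(sd_vae, sana_ae, sd_vae / sana_ae)
```

Since self-attention cost grows quadratically with sequence length, 4x fewer tokens makes the attention roughly 16x cheaper at the same resolution, which is presumably where headline training-speedup claims in this range come from once other costs are included.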
>>102956079
https://imgsli.com/MzExNjYx
this shit is really amazing, I always felt flux images were a bit empty, that node fixes it perfectly
>>102956140
kek
>>102956283its not good when it wrecks eyes and hands and any other small detail
>>102956295You can use Flux I don't care
>>102956249
https://github.com/NVlabs/Sana/issues/3#issuecomment-2434357814
>The released version will be further trained. This is a prototype demo for experience.
>>102956283
>Also less tokens required for the model to learn which means per parameter efficiency goes up.
you are right, i completely forgot about that
>>102956249
>anything you guys wanna tell them?
yes, why do they want to compress the VAE so much? their model is small enough already. a good quality VAE makes all the difference, especially on details, that's not something you can just overlook and compress away
>>102956310>duh why woood u wunt efficansy
>>102956295you also have to take into account how much that model is trained / the webp compression from the demo. not saying the vae DOESN'T contribute to this but there are other factors at play
>>102956300might as well just close your eyes while you generate if you want the model to be good no matter the reality
>>102956310>yes, why do they want to compress the VAE so much?this is probably why>expand to Video generation. We hope that with community feedback, SANA will get better and betterthey want to use sana's research to dip their toes into video gen, it would make sense min-max performance then
>>102956317they're not efficient at all, it looks like shit, that's the problem anon
>>102956345
No it doesn't, and I know this for a fact given I've been posting many Sana gens and no one has said anything :)
Almost like you're just a dumb BFL employee
>>102956365>Almost like you're just a dumb BFL employeesays the Sana employee
>>102956344It always makes sense to min-max performance because it's impossible to predict what happens when things run 8 times faster
>>102956244
at least that means the result is disappointing only because he didn't really finetune the model, what a shame
now now you guys, let's not fight, we are all big adults here. if you can, please make sure to give the sana guys some good constructive criticism. i've never seen anyone else ask the community for advice like this before
>>102956376
For example, there are multiple training loss techniques no one uses because the overhead is way too high (i.e. perceptual loss) despite them objectively improving convergence and final output
>>102956338thats true but after seeing the effects of sdxls/1.5 vae and cascade im definitely super skeptical against any model that does this super compression shit
>>102956288gj miqu anon
>>102956405
There are pros and cons to everything, but please keep demanding your filet mignon and acting like that's the only thing in the world.
10 fucking minutes
>>102956441
>AnimateDiff
I thought you used Mochi for that one lol
>>102956453Oops, that is Mochi. I forgot to change the filename prefix
>>102956023
retracting this (again), I get blurry low coherence with the Q4 compared to fp8. the default 168-frame json has an error where I have to rebuild the video_combine node or it doesn't save the image or vid (Some Time Later...)
fp8: https://files.catbox.moe/sj6ecc.mp4
gguf q4: https://files.catbox.moe/e1cg5r.mp4
idkwtf is going on.
>>102956482
the (sdxl mainly) compression together with low parameter counts and a bad te IS one of the main things bottlenecking image generation right now
have u ever tried just taking a normal image, encoding it and decoding it? im sure u will hecking love the result
>>102956474>idkwtf is going on.it's not complicated, Q4 is too aggressive to be usable
>>102956482
The biggest bottleneck is training speed and requirements. We can survive just fine on SDXL VAE quality if it means 8 times faster training and lower total requirements.
>>102956474
Ok so basically a quant this low is unusable.
>>102956498sure training speed is a big problem but if the absolute quality ceiling is not good then there is no point in training in the first place
>>102956523You people were happy to slurp up shitty SD 1.5, I think you are just moving around requirements arbitrarily. And again, 100% VAE recreation is not the be-all-end-all of a model, in fact it's a very superficial requirement.
>>102956523
>if the absolute quality ceiling is not good then there is no point in training in the first place
this, case closed
>>102956549
>You people were happy to slurp up shitty SD 1.5
are you a retard? we had no other choice back then, now we do
>>102956497
It's probably this, I'm overthinking, searching for answers from the perspective of "I don't know enough about all the parameters in front of me".
>>102956516
Seems so, but I will polish the turd for a bit and see.
>>102956565
It's okay, you can use Flux. I'll use Sana. We'll see who gets bored first; I'd imagine Flux will get boring when you realize its Loras are extremely limited. Also, as always, I never see the Flux apologists ever posting gens.
>>102956079can you pls post a workflow
https://www.reddit.com/r/StableDiffusion/comments/1gb07vj/comment/ltjdvlm/?utm_source=share&utm_medium=web2x&context=3
>Apparently 200 steps is the official number too, I haven't dared to go that high yet.
HOLY SHIT, ARE THEY SERIOUS?? IT'S ALREADY TOO SLOW AT 50 STEPS AAAAAAA
Yeah I'm starting to get really disappointed in genmo, prompt adherence and world knowledge are really shit compared to minimax or kling.
And I've only been using the website. Maybe I should be prompting with tags and commas instead of natural language?
>>102956614
what workflow do you need? it's just one new node to add in between, it's not rocket science lol
>>102956371Sana is a company?
>>102956637have you tried prompting in chinese?
>>102956637
yeah their HD version isn't that good, but that's probably why they're not releasing it now but at the end of the year. they're probably trying to improve it, I guess
So am I getting this right that neither SDXL nor Flux offer any benefit to anime generations, which have already been perfected with SD1.5? I mean if you want to use a Pony you need to use SDXL, sure, but inherently it doesn't seem to add anything.
>>102956685Pony does actual porn with accurate genitals. SD 1.5 cannot do that.
>>102956685
i've been liking some pony models for anime quite a bit compared to other ones i've tried
is there a better low-res model to look at? I kinda stopped doing ai gens for a while when I swapped from my 3090 and rocm didn't yet support rdna3, so I missed out on a lot between sd1 and now
>>102956645
but where exactly? my shit looks like picrel
>>102956618https://youtu.be/4lVUuuuJU7c?t=5
Cutest 1girl so far but unfortunately she's 2young
>>102956670
kek, I haven't, but I've seen examples on the website using Arabic, Cyrillic etc and it seems to understand those languages. Genmo is a US company so English should be best
>>102956679
>They're probably trying to improve it I guess
Yeah, their pricing page implies you get early access to new models, so hopefully the new model is better AND it makes BFL release theirs in some way too
>>102956718yeah it won't work with the KSampler, take this workflow: https://files.catbox.moe/a0snom.pngand also I modified the script so that it allows for more decimals on the values: https://files.catbox.moe/4gxohm.py
>>102956613>>102956744>>102956772unrealistic, no girls post here nor would they wear merch
>>102956714checkout noobxl
>>102956759thanks bro, wut are these red ones?
>>102956826It's too bad illustrious has those issues. I choose models based on their names. This is why I will be using Sana over Flux.
>>102956854some nodes to put the text encoder on my second gpu, you don't need them I guess, just go with the regular loaders you're currently using
>>102956826>noobxlwhen proxl?
>>102956815>he doesnt know about LDG brand clothing
how much vram do you recommend for a beginner UI?
Keep it rolling >>102956911>>102956911>>102956911
>>102956826
i think i did try their prerelease stuff but i'd have to totally rework the prompts i'm using on pony atm
the early-access one did generate an image but it does seem to be missing data that models like pony have for specific niches
>>102956879
ok makes sense. also, now I get pic related error. do I have to change any settings?
>>102956958did you update comfyui?
>>102956936
noobxl v5 pred has e621 images in its dataset so that should help fill in some of the more niche tags.
>totally rework my prompts that im using on pony atm
yes, if you use pony score tags on an illustrious model you will be labeled a jeet
>>102955958
There's another option: they could just be using upscalers. I mean, 1696x960 is literally double 480p; the resolution looks like 480p but upscaled.
>>102956977I did
>>102957013
Idk man, their API demo output looks way better than what we have locally, something else is going on
>>102957006
i swapped all of that out for the typical danbooru quality tags they suggest (and that I already use typically) but still wasn't really getting what I was asking for from it.
it's in progress and still training or whatnot, so I'll probably wait to fully judge it once the full model releases.
i've had issues trying to get certain tags to show up in anime-only/anime-primary trained models, so it might just be because of that in general when that's their focus.
those models also seem to overcook quite a bit earlier at higher cfg, at least that's how illustrious seems to be
>>102957016oh what node did you get that error? ComfyUi colors the culprit node
>>102957063"SamplerCustomAdvanced"
>>102955508
That explains a lot. so even this guy doesn't have the hardware to train a full Flux model, and you think SD 3.5 Large trains easily when it's just 4B smaller? I doubt it. the reality is his frankenmerge is better than the finetunes we've had till now, which shows that de-distilled is the path. sadly, not many people have the hardware to do it, and the pony creator won't train a real model. can't we do a fucking crowdfunding to rent cloud compute for a month and train our own model?
any1 know some good chinese artists? maybe sana has lots of those in its dataset and they would look really cool
>>102957100can you show a screen of your workflow? something's weird here
>>102957124
>>102957158
you haven't used my custom script. your LyingSigmaSampler doesn't have all the decimals, and my workflow uses values that your current node can't reach, that's probably why you got an error
>>102957198
>you haven't used my custom script
but I did. I opened it up in the editor, pasted it in and saved it.
>>102957255
then why do you have 0.1 and not 0.01? you should restart ComfyUI to get it working
>>102957280I did and I still get the same error
>>102957301
can't help you further. I suggest you take a workflow that works for you and reconstruct everything from it, instead of using my workflow, which is somehow incompatible with yours
>>102957048
It's called a prompt enhancer, same as with Replika and flux.dev; some faggot with a skill issue was saying the same thing months ago.
>>102949176Could you give me a picture of the blonde?
>>102956067I mean it doesn't natively support Kolors either TBF. Only Hunyuan for some reason I think.
>>102954448I think it'll see interest if it has strong baseline resolution support / image quality, even with the almost certainly worse prompt adherence vs 3.5 Large
https://xcancel.com/OpenAI/status/1849139783362347293
>We are sharing a new approach, called sCM, which simplifies the theoretical formulation of continuous-time consistency models, allowing us to stabilize and scale their training for large scale datasets. This approach achieves comparable sample quality to leading diffusion models, while using only two sampling steps.
Really interesting