Discussion of free and open source text-to-image models

Previously baked bread: >>103005229

SD3.5M Edition

>Beginner UI
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io
Metastable: https://metastable.studio

>Advanced UI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
reForge: https://github.com/Panchovix/stable-diffusion-webui-reForge
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://aitracker.art
https://huggingface.co
https://civitai.com
https://tensor.art/models
https://liblib.art
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3

>SD3.5L/M
https://huggingface.co/stabilityai/stable-diffusion-3.5-large
https://replicate.com/stability-ai/stable-diffusion-3.5-large
https://huggingface.co/stabilityai/stable-diffusion-3.5-medium
https://huggingface.co/spaces/stabilityai/stable-diffusion-3.5-medium

>Sana
https://github.com/NVlabs/Sana
https://sana-gen.mit.edu

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux
DeDistilled Quants: https://huggingface.co/TheYuriLover/flux-dev-de-distill-GGUF/tree/main

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/aco/sdg
>>>/aco/aivg
>>>/b/degen
>>>/c/kdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/tg/slop
>>>/trash/sdg
>>>/u/udg
>>>/vt/vtai
Blessed thread of frenship
https://huggingface.co/spaces/stabilityai/stable-diffusion-3.5-medium
Now the question is, is this even better than SD3M?
>>103016063
ass collage
>>103016108
https://x.com/StabilityAI/status/1851270169173365118#m
Why are they lying like that? Flux bf16 can definitely run on a 3090
>>103016144
who's hoarding all the H100s in here
>>103016121
second
>>103016121
third, titties are for fags
>>103016108
bruh, this isn't 2022 anymore, base models shouldn't have 3 arms now
>>103016144
This chart is a mess. Don't rely on it, I think it's for redditors only.
>>103016228
well that anon is a ledditor so
>>103016231
kek
>shit model release
>shit bake to boot
All is well with LDG
>>103016063
What's stopping people from making a GUI for stable-diffusion.cpp and using that as the main one? Why go through all the python setup and embed shit?
steady as she (1girl) goes
>>103016108
GREAT SUCCESS
>>103016210
>base models shouldn't have 3 arms now
clearly a stylistic choice made to appease Elden Ring Ranfags
can it do anime girls
its so over
>>103016339
try the demo >>103016108
>>103016346
that sounds like a lot of work
>>103016339
>can it do anime girls
For your sanity, do NOT try to write "Gawr Gura" on this model... https://files.catbox.moe/9kvehy.JPG
>>103016144
The exclamation mark indicates that you need to use performance-compromising optimizations to run the model. And it's true, because no one is running these models in FP32 fully inside GPU memory.
why is the encoder bigger than the model
>>103016360
he's cute
ooh babby here we go
>>103016339
Not what I prompted for
>>103016358
kek
>>103016368
1) We don't have the FP32 weights, so we can only run the BF16
2) This comparison is only about bf16 models
3) Even supposing it were fp32 (it's not), Flux wouldn't run on a 32gb vram card like they say on the image, because fp32 Flux asks for at least 48gb of vram
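the napkin math for that chart, weights only (assuming 12B params for Flux dev and 2.6B for SD3.5M, and ignoring activations and the text encoders, which add a few GB more):

```python
# back-of-envelope VRAM for model weights only (no activations, no T5/CLIP)
def weight_gb(params_billions: float, bytes_per_param: int) -> float:
    """Decimal GB needed to hold the weights at a given precision."""
    return params_billions * 1e9 * bytes_per_param / 1e9

flux_fp32 = weight_gb(12, 4)    # 48.0 GB -> no single consumer card fits it
flux_bf16 = weight_gb(12, 2)    # 24.0 GB -> tight on a 3090, offloading helps
sd35m_bf16 = weight_gb(2.6, 2)  # ~5.2 GB -> comfortable even on 8GB cards
```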
are there any more modern approaches to upscaling? I have tried SD upscale and high-res fix, but it still seems like I have to do a lot of manual work. I can get it to generate awesome 1024x1024 images, but I just want to 2x them in size
>>103016409
Is it too much/little to slap your gen into the Extras tab in Forge?
>>103016392
looks like Flux dedistill at cfg 1 kek
>>103016430
topkek
>>103016391
Does an H100 with only 32 GB exist?
>>103016422
it doesn't work very well, either it does nothing or it does dumb things like generating multiple knees or belly buttons. I find it's more consistent to upscale the image in photoshop and then manually inpaint sections
>>103016409
for sdxl anime gens I upscale by 1.5x using highres fix with the foolhardy remacri upscaler, 15 steps and 0.4 denoise. no clue about photorealistic gens
>>103016452
>Does an H100 with only 32 GB exist?
nope, that confuses me as well, they really don't know what they're doing lol
>>103016457
Weird, so far I didn't have issues going for something like r-esrgan x4 or lanczos, but then again I usually upscale by no more than a ~1.5 factor.
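if you script a pass like that yourself, the only fiddly bit is keeping the 1.5x target dimensions divisible by 8 (64 for some backends) so the VAE/UNet accepts them. hypothetical helper, names made up:

```python
def upscale_dims(w: int, h: int, factor: float = 1.5, multiple: int = 8):
    """Scale (w, h) by `factor` and snap to the nearest multiple the model accepts."""
    snap = lambda v: max(multiple, round(v * factor / multiple) * multiple)
    return snap(w), snap(h)

# 1024x1024 at 1.5x -> 1536x1536; odd sizes get snapped onto the grid
```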
>a long queue of pill shaped buses with hatsune miku's face on it in the streets of new york city
why grainy? is it because of the webp compression?
Awh man
>>103016509
The details and architecture are horrible... at least it doesn't have the Flux blur, but still
>sexy jade beauty
you guys made fun of sana for its anatomy, at least sana knows what tits are. APOLOGIZE TO CHANG
>>103016509
>webp demo
EVERY FUCKING TIME
>>103016539
I didn't apologise for Hunyuan either, so Chang can get bent.
>>103016539
wtf, that's MJ v2 tier lmao
>anime girl with large breasts
are we really going to be stuck with sdxl forever?
>>103016574
Looks like it's gonna be a while before I hop off the pony ride.
>>102975512
>>102976024
>>102976108
>>102976199
>>102976348
>>102976533
>>102977366
prompt/workflow?
>>103016593
>masterpiece, best quality
>>103016539
>sexy jade beauty
here's what sana gave for comparison
>>103016611
try describing your prompts with full, natural sentences for a change
>a cute, chubby little raccoon in a mystical forest full of glowing creatures and fauna, the image is in a low poly style
kek all that time just to get it going backwards
>>103016620
desu it kinda works and would make for a cool idea to polish up
3.5 Medium in my testing so far doesn't really have worse or better anatomy than Large. However, the claimed 0.25 to 2 megapixel support is legit, it really does up to 1440x1440 no problem out of the box, as in the attached pic.
Something something we're back
>>103016653
wake me up when autists manage to finetune the shit out of it, or when it's easier to run on 8GB vram
>>103016653
>>103016662
I don't feel it has improved compared to SD3M
>>103016620
just pretend the blue light is a headlight or laser beam
>>103016687
you have no idea how much I'm waiting for image2video for Mochi, it's gonna be fucking epic
>>103016674
I don't see how you could possibly think this TBQH, it's WAY better.
>>103016662
uh, what makes you think it's much harder than XL to run on 8GB or even 6GB (assuming a non-ancient Nvidia architecture for the card)? It's not.
>>103016687
they're trying to burn his tires off with a laser beam, is my canon
>>103016710
>it's WAY better.
care to show some image comparisons to show the improvement?
>>103016687
nice, didn't even cross my mind to do that.
>>103016706
same here.
>>103016710
Not saying it's harder to run, but I'm just looking forward to more efficient models for consumer-grade GPUs, since current requirements are kind of on the edge already
>>103016721
Mochi really does blow the local competition away. I may make an entirely new install, just for it.
>>103016715
My first attempt got me this, which has no issues beyond kinda bad fingers on one of the hands, for example. It doesn't have anything like the very persistent and unsolvable "randomly exploding bodies" issue that 3.0M did, even at standard aspect ratios / resolutions, for all sorts of prompts.
Also, again, it really does do up to 1440x1440 native out of the box, like I mentioned earlier.
>Error You have exceeded your free GPU quota (65s requested vs. 55s left). Subscribe to Pro to get 5x more usage quota.
>>103016745
>Mochi really does blow the local competition away.
and we'll get even better shit with the HD version, the second half of 2024 was really a good period for us (we got Flux and soon MochiHD)
chang let you use their office 4090 as much as you want for sana's demo but stability ai can't do the same for sd3.5m?
can I even do this with a 1060
>>103016752
>It doesn't have anything like the very persistent and unsolvable "randomly exploding bodies"
that can definitely happen
>>103016793
Nvidia can surely afford hosting an HF space with literally unlimited bandwidth lol, SAI can't necessarily
I will never forget you Chang
>>103016807
Every model has poses that destroy anatomy. If it gets somewhat close to how it should be and can be trained, it is good.
>>103016807
It can happen in any model though, and does. The way it happened in the original 3.0M was unique and abnormal, that's what I mean. Anyone expecting 100% perfection out of every gen is just dumb, it's a 2.6B param base model and should be judged as such. So far I think it's looking like a very solid foundation overall versus XL.
>>103016108
meh
>>103016851
>It can happen in any model though
Flux dev would never give such a monstrosity, it's just too good compared to the rest. I get that SD3M is a small model and shouldn't be compared to Flux dev, but let's not lie about Flux dev and pretend that it can do fucked up shit like that, because it can't
>>103016785
>HD
I shudder at the requirement possibilities... I'm thinking 4090 only, possibly 5xxx series only, so maybe not until late Jan if it's 5 series, which would be a shame.
the vram required for generated frames rather than coherence seems to be Mochi's real issue, and 100 steps is a time constraint.
>>103016875
>I get that SD3M is a small model and shouldn't be compared to Flux dev
sana never got this treatment from anons, sad
>>103016879
>I shudder at the requirement possibilities... I'm thinking 4090 only, possibly 5xxx series only
not at all, I did a test at 720p on regular Mochi (it gives shit results but that's not the point) and it still works fine on a 24gb vram card (Q8)
>>103016805
If you sell it and buy a better card then yes
>>103016875
>Flux dev would never give such a monstrosity
Try a picture that shows the soles of a woman's feet
>>103016895
Ok, how many frames could you generate before it OOMs, and do you know by chance how much vram each 720p frame+data actually took? or an answer in seconds at 30fps is fine too, thanks.
>>103016914
flux foota works tho
https://desuarchive.org/g/thread/101689050/#q101693628
https://desuarchive.org/g/thread/101694073/#101694586
>>103016884
Sana was not released yet, so it is being treated as SaaS.
>a photo of a cafe at night, there is a cardboard cutout of an anime girl sitting on a chair
not too shabby, maybe it can be salvaged. hopefully it's easy to finetune
>>103016925
from the same thread:
https://desu-usergeneratedcontent.xyz/g/image/1722/63/1722634596793.png
Yes, as with any other model, you can cherry-pick pictures of something it can do, but you can't say
>Flux dev would never give such a monstrosity
anyone tried medium with the provided example upscale comfy workflow? my internet is dogshit trash so the model will take a while to download
>>103016964
that is a cool gimmick but the rest of the image looks like utter shite
>>103016967
>Flux dev would never give such a monstrosity
It's not a monstrosity, yeah it's an anatomical mistake, but it's not like it's a complete disaster. I've made more than 10000 pictures with flux at this point and not a single time did I get a third leg or a third arm, it's just consistently good
>>103016930
>being treated as SaaS
Absurdity
>>103016884
Sana is useless at text and looks worse than Kolors in most cases out of the box, it was just too small I think
The niggers at SAI invented a new node called "SkipLayerGuidanceSD3" to make the hands better lol, maybe it could be used on Flux to make it even better who knows
>>103017123
Alright, I will make it even better for you, this is/was flux pro 1.1:
https://desuarchive.org/g/thread/102661327/#102661327
>>103017315
one of the devs said it works similar to PAG
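for the curious, the combination step presumably looks like plain CFG plus a PAG-style push away from a prediction made with some transformer layers skipped (this is my reading of it, not SAI's actual node code, and the scale names are made up):

```python
def slg_combine(cond, uncond, layer_skipped, cfg_scale=4.5, slg_scale=2.5):
    """Hypothetical skip-layer guidance mix on per-pixel noise predictions."""
    # standard classifier-free guidance...
    cfg = uncond + cfg_scale * (cond - uncond)
    # ...plus a push away from the degraded (layers-skipped) prediction
    return cfg + slg_scale * (cond - layer_skipped)

# with slg_scale=0, or when the skipped pass matches cond, this is plain CFG
```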
>>103017315
>better
This is hilarious, wouldn't expect any less from Stability after women on the grass.
>>103017349
wrong link:
https://desuarchive.org/g/search/image/9_uYxpd3MGtXSSVDkRluoQ/
>>103017349
>>103017377
surprisingly I managed to get something better with flux dev lawl
>>103017315
This is pathetic, they can't make a model that can make good hands by itself, so they're using some mumbo jumbo layer manipulation with PAG to get something acceptable. The result looks completely artificial and burned, well done StabilityAI!
>>103017433
kek, this one scares me even more because it seems to have magical powers.
noice
https://civitai.com/models/899132/impasto-virtuoso-flux?modelVersionId=1006047
>>103016314
Thank you :)
Kandinsky 3 paper dropped yesterday
https://arxiv.org/abs/2410.21061
just skimmed through it, doesn't seem that impressive
Diffusion U-Net ~12B
Text encoder Flan-UL 20B
Sber-MoVQGAN image decoder
>>103017527
>>103017570
>Additionally, we released the source code and checkpoints for the Kandinsky 3 and extended models.
cool
https://ai-forever.github.io/Kandinsky-3/
https://github.com/ai-forever/Kandinsky-3
>Text encoder Flan-UL2 (encoder part) - 8.6B
>Latent Diffusion U-Net - 3B
>MoVQ encoder/decoder - 267M
Interesting, maybe that'll be the small and good model we've all been waiting for
>>103017570
>Text encoder Flan-UL 20B
>got destroyed by Dalle-3 on "Text alignment" anyway
sad
>>103017681
>>103017628
I think this is an old model, and the paper just got released now. On huggingface the models seem to have been pushed 11 months ago. Don't know why they would release the paper now.
>>103017728
oh ok, never heard about that one before, I guess it must have sucked pure undistilled (see the joke?) ass to not even be talked about here
https://lucefriends.fun/
Anyone made a lora for the new Vatican mascot?
>>103017865
>>103017865
Christians should've gone the aniconist route.
>dust settled
>it's so over
So Flux is still Kang?
>>103018254
*Pony
>>103018254
Have to wait for SD3.5M finetunes to see how good it is
>>103018254
>So Flux is still Kang?
of course, and it'll likely be the case for years on end, SAI can't compete with that
>>103018269
wasn't it already confirmed that it's dogshit?
>>103018254
yes
>>103017865
It was unwise of them to make a mascot that will no doubt be used almost exclusively for porn.
>updated ForgeUI
>extensions stopped working
reinstallan time
shut the fuck up
please, do speak
>>103018342
The base model quality doesn't matter as much as how accessible the training is. If training is reasonably fast, then the model can be salvaged.
>>103018376
so when will we know if it's shit or not? 2 weeks?
>>103018376
>then the model can be salvaged.
what level do you think it can reach though? ponyv6?
So, is SD Large better than Flux? Can it make girls lying on grass?
>>103018342
Anyone claiming it's "dogshit" is being disingenuous on purpose. It's a 2.6B param model (meaning not hard to run) with native support for up to 1440x1440 gens and generally very solid outputs. It'd be quite easy to argue that the quality gap between it and 12B param Flux is nowhere remotely close to as big as it should be.
>>103018416
Don't bother, I decided to scroll through the thread. As usual Stability sucks cock
>>103018394
It has enough parameters and the base architecture required to make the best porn model, especially if you take the Pony approach and long-haul finetune it. The question ultimately is how fast.
>>103018416
>So, is SD Large better than Flux?
not even close, Flux is still in a league of its own
>>103018416
3.5 Medium in my eyes is legit better than Large quite often, just due to the higher-res baseline
>>103018429
I had to train a Lora for about a day to get Flux looking remotely as "hard realistic" and non-plasticky as 3.5 does out of the box for photographic gens
I could be out socializing, but there's nowhere for me to go so here I am generating shit.
>>103018469
>here I am generating shit.
By all means post some anon
>it's lewd
Catbox
>>103018454
well, now you have your lora you can use with flux, so that problem is solved
>>103018454
>>103018427
>>103018434
Jesus Christ, rajeesh, stop shilling. Is Stability paying these guys to cope so hard?
>>103018480
How fucking retarded can you be, these models are free to use and train. SAI gets nothing if I train a porn model on SD3.5M.
>>103018495
>SAI gets nothing
they get investments if they have enough internet advertising and credibility, don't play obtuse
>>103018520
WHO FUCKING CARES IF THE FINAL MODEL IS GOOD
I can't believe how fucking childish some of you are, fucking hell, everything is them vs us even for dumb fucking bullshit. Maybe /v/ is your speed, come back in 10 years for your AI console war.
>>103018539
>WHO FUCKING CARES IF THE FINAL MODEL IS GOOD
but it's not
>>103018550
Who cares what some poorfag who can't train thinks about anything. You'll use whatever is best. Why do you even hang out here, you obviously don't give a shit about experimenting or learning anything, you just want to fap to 1girls from Flux. Go away. You contribute nothing. You're like a child who hangs around with adults and doesn't know that children are best seen, not heard.
>>103016092
ty for blessing this thread
>>103018575
>You'll use whatever is best.
yep, that's why I'm using flux
>>103018477
The Lora wouldn't exist if Flux wasn't distilled, is the point.
>>103018600
And you'll use SD3.5M if someone makes the best finetune. So shut the fuck up. These models aren't magic, any of them can be trained for 20 million steps into something good. Too bad for Flux that means 5 years.
all thirdie countries should be carpetbombed with neutron bombs
>>103018600
you think Flux has plastic skin because it's distilled? maybe it just has to do with the aesthetic finetune they did on top of it, that's also why it has this intense blur on every picture too
>>103018480
Comparing Flux to 3.5 Medium is retarded to begin with, how am I shilling? 3.5 Medium is really good for the size of model it is versus previous similar options, that is my actual opinion.
>>103018630
Because he's a literal child with 90 IQ, so abstract thought is difficult. He can't understand that 80% of Flux's weights aren't even being activated during inference.
>>103018606
>And you'll use SD3.5M if someone makes the best finetune.
you're delusional if you think that SD3M can reach Flux's level with a finetune. It's the same size as SDXL, and we've trained SDXL for a year and a half at this point, and it's still nowhere near the level of Flux, because it simply can't get there
>>103018624
It is objectively and clearly the reason Flux in general looks shiny / toylike / plastic. SD3.5 Turbo looks exactly the same way because it too is distilled.
at least you can finally upscale or img2img with sd3.5
>>103018646
SDXL is a Unet and uses shitty CLIP
>he pulled
>>103018662
Yeah, the high-resolution support in 3.5 Medium is great, does 1440x1440 no problem
>>103018664
good luck asking someone to spend tens of thousands of dollars (like pony did) to train a small bad model just for the sake of it. no matter how well you train a small shit model, it'll never reach Flux's level, deal with it, cope and seethe
>>103018682
4090s and 5090s can train SD3.5M no problem, so it's not a question of tens of thousands of dollars, just some time. I'm very sorry anon that you haven't learned how to do things for fun. People spend literal months making a model train and you're afraid of running a 4090 for a month.
>>103018682
I've trained a hardcore NSFW concept Lora for Flux, it was possible but highly annoying, much more so than I'm certain doing the same on 3.5 will be.
>>103018703
With 3.5 you won't have to rape the weights with a Lora, so the hardcore will be properly integrated without destroying all the other knowledge.
>>103018646
>if you think that SD3M can reach Flux's level with a finetune
nta but nobody cares about this. if they want flux level then they'll just use flux. image gen is just a worthless toy, and all it's good for is porn, and nobody is going to be training a competent porn model on flux, that's it. you don't need flux level for anime porn. all people want is a good enough model to make porn with.
>>103018697
>you're afraid of running a 4090 for a month.
someone did that, on Flux, to make it even better and more unreachable lol
https://civitai.com/models/141592/pixelwave?modelVersionId=992642
>>103018715
>you don't need flux level for anime porn. all people want is a good enough model to make porn with.
SDXL already exists for the "good enough". sorry if your standards are that low, but it won't be the case for everyone
How did the pixelwave guy manage to finetune flux dev without losing the guidance distillation at all? When I train loras, the built-in guidance gets all fucked up very fast, and I end up in a state where it works and learns the concept, but I have to use CFG + antiburner. Here's an example: a concept lora trained on pixelwave itself, but prompting for a generic 1girl standing just to showcase the problem. Training set is thousands of images. Generation used guidance 4 and CFG 1.
Base pixelwave: picrel
Lora epoch 1: https://files.catbox.moe/ttlynw.png
Lora epoch 13: https://files.catbox.moe/jiooiq.png
I'm training with guidance set at 1 like you should. But then, when used with normal generation settings, the lora starts acting somewhat like a non-distilled model with CFG set too low. Which makes sense I guess: you're training the lora with no distillation at guidance 1, so the model gradually forgets what to output when guidance is > 1, catastrophic forgetting and all that. But then how the fuck is pixelwave trained for hundreds of thousands of steps and it's fine?
One thing I haven't tried yet is that maybe the fact that I'm training a lora and not a full finetune messes everything up. But I've compared training a lora vs a finetune, on this very same dataset, on SDXL models, and it makes almost no difference. So surely it's not that, right?
>>103018732
i'm sure your 1girl, standing, big breasts, buttchin, blur really needs those 12 billion parameters
>>103018736
>How did the pixelwave guy manage to finetune flux dev without losing the guidance distillation at all?
he went for a ridiculously low learning rate
I'm a simple anon, and I don't need that many billions of parameters.
>>103018742
like I said, if you want your simplistic 1girl hentai, SDXL has plenty of finetunes like that. what are you trying to achieve with SD3.5M? That's just redundant, the "good enough" finetunes are there, we need to up the level
>>103018743
big guy
>>103018758
Define "level". Photorealistic images? Boomer schizo prompting?
>>103018769
dickhead even
>>103018773
that's up to you anon, what do you want to achieve with a finetune of SD3.5M? What do you want from it that SDXL can't already do?
>>103018761
ermmm.... the 16ch vae and t5 text encoder instead of clip already are huge upgrades over sdxl on their own
>>103018783
I know you haven't trained anything in your life, but SDXL is miserable to train. Here's some things:
- shitty text encoder that has to be trained for new concepts
- shitty unet that is prone to spinning off to oblivion
- lovely shade of grey for all image generations
- VAE that is incapable of drawing a face if it's less than 100 pixels
>>103018745
That might make sense if it barely modified the base model, but it didn't. Pixelwave is significantly different from dev. Side-by-side compare the same seeds, and it's often completely different images. I mean, I guess I can try a really low LR with a bunch of steps and see if it looks different than a higher LR with fewer steps. My LR is already 5e-5 with batch size 16, it's not like I'm overshooting it.
>>103018803
Flux changes rapidly even within 100 steps, it's very unstable as a model, likely due to how many parameters it has.
>>103018801
>shitty text encoder
you'll be having to deal with 3 text encoders on SD3.5M, and the "shitty text encoder" from SDXL is among those 3, lawl
>>103018817
It's really annoying when you're just a bad faith, obtuse, childish fanboy. This is a pointless discussion, it's like arguing with a child about Batman vs Superman.
>>103018826
Concession accepted.
>>103018835
Nuh uh
faggot
>>103018813
>Flux changes rapidly even within 100 steps, it's very unstable as a model, likely due to how many parameters it has.
this, that's why you need a really low learning rate to make it stable, but then it's even slower. Flux is a bitch to train, not gonna lie
https://civitai.com/models/899486/luce-vaticans-mascot-flux
PLAP PLAP PLAP PLAP
>>103018864
Jesus is the bread
new Medium has the same sharpness as the old one, which is good. Large has the XL soft blur, which I can't stand. Wonder what the difference in their training was.
>>103018877
Large is 1024-only, whereas Medium was progressive and multi-resolution trained.
>>103018888
>Medium was progressive and multi-resolution trained.
I don't get why they haven't done that for Large as well, they could've made a model good enough to be competitive with Flux, why are they so fucking retarded I swear
>>103018888
If the greater sharpness of Medium comes from multi-resolution training, I wonder why they didn't do it with Large? Shame.
>>103018903
>>103018906
because there's no way their crumbling company has the funds to continue training the 8b
>>103018906
Because SAI is a bunch of retards and the non-retards made Flux. Pixart basically gave the blueprint for training any model fast(er): you start at 256 to train the model on how to understand text, composition, etc. Then you keep adding higher resolutions, which both makes your final model capable of many resolutions and makes the general quality much better, because the model is more confident with pixel patches.
Man even though I've got a 3090 and can run Flux no problem, it's so much more pleasant to test a 2B model due to the speed, plus negative prompt
>>103018926
>because there's no way their crumbling company has the funds to continue training the 8b
https://stability.ai/news/stability-ai-announces-101-million-in-funding-for-open-source-artificial-intelligence
Somehow they manage to still get insane funding even after all their disasters kek
>>103018942
>plus negative prompt
you can go for a negative prompt if you use flux undistill, but yeah, Flux is so fucking sloowwwwwww
>>103018946
>101 million
jesus, emad really ruined that company huh. that's what happens when all your talent leaves
>>103018736
He shared the hyperparameters, in case you're interested:
https://www.reddit.com/r/StableDiffusion/comments/1gcbn4t/comment/ltsj9v8/?utm_source=share&utm_medium=web2x&context=3
>>103016593
he posted a prompt here https://desuarchive.org/g/thread/102956911/#q102963599 not sure if it's the same one tho
>>103018758
600M is All You Need
>>103018977
https://www.reddit.com/r/StableDiffusion/comments/1gcbn4t/comment/ltt1rnb/?utm_source=share&utm_medium=web2x&context=3
>Because it's so slow to train, I selected a few thousand images while trying to be as diverse as possible
>few thousands
>1x4090
>5 weeks of training
grim... really grim...
>>103019002
You'd be surprised what Pixart can do with 600M
>>103018977
Thanks, I somehow missed this. The major things I can see that are different from my hyperparameters:
1. Very low LR, obviously. But a full finetune needs a lower learning rate to learn at the same perceived speed. I find it hard to believe 5e-5 is way too high for a lora.
2. Relatively high warmup steps, at 4000. I was just using 100. Maybe there's instability at the beginning of training, and this counteracts it?
3. Adafactor optimizer vs AdamW.
4. Full bf16 training. This one is a bit weird. I've tried full bf16 in the past, both with diffusion models and LLMs. It greatly slows down learning and impairs the ability to learn at all. It's not theoretically sound, because when you take a step, you do "p += gradient * lr", basically. But the gradient is already really small compared to p, and LR is even smaller. So you end up with "value += really_fucking_small_value", in bf16 precision. The limited precision of bf16 means the update gets entirely dropped, except for a very small number of parameters that happen to have relatively huge gradients. Usually you have to use fp32 parameters, or a Kahan summation buffer in the optimizer, so you don't drop updates due to bf16 roundoff error. Would be funny if this is actually what stabilizes it.
More things to try I guess.
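the roundoff claim in point 4 is easy to demonstrate without any framework: bf16 keeps only 7 mantissa bits, so an update smaller than about 2^-9 of the weight's magnitude just vanishes on rounding, while a Kahan-style compensation buffer carries the dropped remainder until it's big enough to register. pure-python sketch, emulating bf16 by rounding float32 bits:

```python
import struct

def bf16(x: float) -> float:
    """Round a float to bfloat16 precision (round-to-nearest-even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # RNE on the 16 dropped bits
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

p, update = bf16(0.5), 1e-5  # weight, and a tiny gradient * lr step

# naive bf16 accumulation: every step rounds back to 0.5, nothing learns
naive = p
for _ in range(500):
    naive = bf16(naive + update)

# Kahan summation: the remainder that rounding discards is kept aside
kahan, comp = p, 0.0
for _ in range(500):
    y = update + comp
    t = bf16(kahan + y)
    comp = y - (t - kahan)  # what bf16 rounding just threw away
    kahan = t

# naive stays frozen at 0.5; kahan has actually moved upward
```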
>Making tons of AI art
>Inpainting and studying anatomy and touching up/correcting anatomy in photoshop gets me excited about art
>Bust out paper
>Can't draw a straight line or copy for shit, can't even make a good loomis head
Why the fuck is it like this? It's not like I can't draw individual parts of anatomy, but I just can't make a complete image to save my life. Anyone else use this because they struggle really fucking hard with normal art?
>>103019140
Dunno, I stopped doodling and replaced it with AI some time after my tablet broke, but the few times I get to do some manual work I do feel kinda rusty.
>>103019140
being good at one thing doesn't mean you're good at another thing
>>103019140
>Can't draw a straight line or copy for shit, can't even make a good loomis head
none of these things matter for good art anyway
>>103019227
And that's how we got modern art
SD3.5M sucks dick, and SD3.5L is not a "slightly worse flux", it's a "much much worse flux". anyone saying otherwise is just in a delirium caused by waiting twelve "2 weeks" to receive this shit
do you guise think sana will ever release (with a fixed AE or otherwise)
>>103019269
ya
>>103019258
amen
SD3.5 is at least an improvement over SDXL, right?
>>103019320
ya
Insufferable General
>>103019323
nice picture anon
>>103019003
they are called memetunes for a reason
Yay
>>103019370
Neat
Can someone tell me the differences so I can pick which one to use?
>Beginner UI
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io
Metastable: https://metastable.studio
>>103019995
anon this is a blue board, remove your post before you get the hammer
>>103020006
ignore those. if you want to use the new SD3.5 models you're currently limited to ComfyUI. if you don't care about that and only want flux and sdxl, I recommend Forge webui. if you only plan on using sdxl finetunes, go reForge webui
>>103019995
RIP anon
>>103020037
alright thanks
I'm trying to make a night scene for halloween, but it looks weird. the sky is dark with moon and stars, but the characters look like they are lit in daylight and the grass is glowing bright green
>>103019859
Medium?
>>103020634
Flux
Feeling good about the new Medium, I think tuners are gonna be able to do good work with it.
VERY importantly, it has good artist knowledge in my testing, unlike every other VLM-captioned model that dropped in the last 6 months. It has recognized every name I've tried with it so far, including living artists, and is clearly able to replicate their style, albeit with lacking coherence
Why can't retards understand that you should take the best or it's pointless
I use leddit for porn and there's a post with a clearly AI-generated celebrity video that has 300+ upvotes, and people in the comments are talking about it negatively but not because it's AI-generated: they all think it's the a trial celeb and that she's washed and overdoing it.
It's over, right? There's going to be like a tiny percentage of us left who can tell this shit apart, and we're going to go insane watching reality fall apart
>>103020688
I think they pulled it out of the oven early, but it's better that than it sitting in hell for another month.
>>103020710
>Real* celeb
>>103020710
Long before that, social media will have tools that tell you the probability of something being AI.
>>103020756
Yeah man, I'm sure social media companies will do the right thing and spend a tonne of money to lower engagement across their own networks
mochi can't do an extreme from-below "step on me mommy" perspective, it seems. I don't even want an upskirt, I just want ankles and face visible at the same time, with them looking down at the camera
>>103020710 Link?
>>103020922 https://www.reddit.com/r/HipHopGoneWild/s/kl86nLdRwt
>>103016429 she cute
>>103020842 Whether they want it or not, it will happen, just like the return of YouTube's dislike numbers through an extension or a piece of software.
>>103020278 >>103020627 baste
>>103020627 should be browsing reddit
Been a while since I've dabbled in genning; which UI should I be using? A1111 is effectively dead, from what I've heard, and ComfyUI was compromised a while back; which is the most secure and privacy-focused UI with a decent array of features?
>>103016144 I have trouble with bf16 on my 3090. I must be doing something wrong.
>>103016063 How does it feel to know that Debo won?
>>103020634 My favorite part of a new model drop is thinking any good gens are the new one, only to be told they are not. This has happened to me many such times.
>>103018985 thanks
>>103021319 >>103020037
>>103016596 >art station, unreal
>>103021538 But if you are one of those who can run it from the scripts <<<-----
>>103020709 anything less than perfection is worthless
>>103016108 I thought we were done with the body horror? A simple karate kick pose is already solved in basically all other new models. This just tells me instantly that the model is only good for face portraits. No dynamic poses allowed.
>>103020688 The high resolution support it has is really great IMO, I don't know how people can downplay it. It can also do normal hires-fix denoise passes without the weird artifacting around the perimeter of the image, unlike Large.
>>103021744 what was your prompt? Flux generated, uh, this for just "man karate kicking", so I think we need to be a bit more specific here
>>103019140 The knowledge you've gained doesn't have anything to do with the fundamentals of drawing. But it'll be valuable later if you continue to practice.
>>103022020 >karate man striking a high kick pose. His leg is lifted high. Here was a painting version from one of the newer Asian models (I don't remember which). It's part of the standard prompts I use to test the ability to do complicated poses. Dall-E 3 still crushes every model in these types of scenarios, though.
>>103016210 Any model will give you extra limbs if you make the image too large.
>>103016296 Total victory for China, Taiwan is no more.
>>103016108 >a diaper fetish concert at a balloon festival
>>103022305 you have a way with words
so is SD3.5 good now?
>>103022341 It's horrible and glorious: >>103022305 Here it understands the prompt and has 3-4 major faults in the image. picrel is Flux; adherence is poor, but Flux looks way better.
>>103016706 >>103016785 Hoping for img2video + an HD version that looks less blurry + still running in 24GB before the end of the year.
>>103022399 I'm hoping for static-bound llama.cpp, but I guess I just need to convert my models to ollama.
>>103022091 This image is not my opinion, but represents the liberated capacity of local gens. Every mainstream service would refuse to make this.
>>103018421 How much harder would it be to train it into a Pony-like version vs PonyXL? I'm thinking about the caption stuff, since one is tag-based and this one is natural language.
>>103018421 >>103022305 uh. It's no Flux.
>>103022506 The liberty of local models is pretty wild. Expect legislation soon.
>>103022529 T5 can learn the tag structure just fine. It's really not that "hard", it would just take a couple million steps like any other model.
>>103022558 Sure, but I'd rather keep the same cool feature of being able to write freely without the constraint of booru-tagging everything.
>>103022600 Anything can be captioned too...
How do you replicate the hires fix in Comfy? Also, has anyone tried the lying sigma trick on SDXL/SD3.5M?
>>103022636 https://comfyui-wiki.com/workflows/2-pass-txt2img.en-US (figure out the former before you start on the latter)
>>103022661 thanks anon, it's pretty easy; added the nodes and it works well. For LyingSigma though, I have no idea where to attach it.
>>103022550
>>103022746 I'm using two in plus (anti-detail). kinda digging it.
>>103022746 ok, I could make it work using SamplerCustomAdvanced
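For anyone wondering what the lying sigma trick actually does: as I understand it, the sampler wrapper scales the sigma the model is *shown* (not the real noise schedule) by (1 + dishonesty_factor) over a slice of the schedule; a small negative factor makes the model think the image is less noisy than it is, which tends to push detail. A toy sketch of the idea only; the function name, defaults, and exact windowing are my assumptions, not the actual node's code:

```python
def lie_about_sigmas(sigmas, dishonesty=-0.05, start=0.1, end=0.9):
    """Scale the sigmas the model sees within [start, end] of the
    schedule (as a fraction of total steps); leave the rest untouched."""
    n = max(len(sigmas) - 1, 1)
    return [s * (1 + dishonesty) if start <= i / n <= end else s
            for i, s in enumerate(sigmas)]
```

On a 5-entry schedule with the defaults, the first and last sigmas pass through unchanged and the middle three get multiplied by 0.95.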
>>103022385 >no faces because it's hard >all grrrlz by default I'm unimpressed.
>>103017168 can I haz catbox
>>103016063 >>103016092 Wtf is 1girl standing?
>>103018394 it's better than SDXL, so yes
>>103023052 A prompt. 1girl, standing
>>103023070 Explain it to me like I have no idea WTF you're talking about
>>103018646 Flux is literally only good because of the t5xxl embeddings and the multimodal diffusion transformer architecture; arguably 1.5 and SDXL are capable of creating aesthetic and realistic images. Flux is distilled garbage with a "shallow depth of field, masterwork, best quality" LECO slapped on. Maybe not SD3M, because SAI are cucks, but a model of that parameter size and architecture can indeed reach Flux capabilities.
>>103023114 When you create an image using an image generator, you first enter a prompt. The most common words put in a prompt are "1girl, standing". This is because on many websites this is how pornographic photos of women are tagged, and these websites and their tags were used for captioning the dataset for Pony Diffusion, an SDXL finetune designed to create pornography and illustrations of characters.
>>103023142 I suspect this is a genuine high-quality response, so thank you. >Protip: I've never used "AI" and despite being a retired programmer have zero basis for reference to this shit
>>103023179 booru tags, stinky noob
>>103023124 >Flux is literally only good because of its superior architecture and superior t5. I like 1.5 and sdxl slop. flux makes images look good on purpose, which is wrong. Maybe not sd3m because sai are cucks, but a model of that parameter size and architecture indeed can reach flux CAPEabilities. So basically Flux at any size would be superior to SD3.5M at any size, by my own logic, because it has the superior architecture.
How does Stable Diffusion work? I'm using automatic1111, but I noticed the command line is sending a command for every generation. So would I be able to create a script that sends the same command, except maybe change some things like the resolution and hair style, so I could generate 100 different images over an hour or so?
>>103023052 Salvation
>>103023253 If you're looking into something API-oriented, use Comfy. It works out of the box as an API, and the nodes are just a visual representation; you can use the JSON of the nodes to send API commands and tweak settings programmatically.
>>103023253 I think ComfyUI is designed to do that already. I'm not sure about altering the resolution for each image in a batch. For different hairstyles etc., I think you use a wildcard?
>>103023253 https://github.com/leejet/stable-diffusion.cpp
>>103023253 All the UIs just pipe to a backend. If you're just looking for hair style changes, use a wildcard in your prompt. If you're playing with the backend, be aware that the different A1111 variants don't play nice together when setting models/vaes/samplers/schedulers; it's in the documents. Or just use the FastAPI stuff to curl everything. Up to you. >>103023268 >>103023274 The comfy API is trash. You need to create a workflow and then use node references to "fill" in information. It's so limiting.
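To make the "script 100 gens" idea concrete: A1111 exposes a FastAPI backend at `/sdapi/v1/txt2img` when you launch it with `--api`, and you can just POST JSON payloads to it. A minimal sketch, where the payload field names are the stock A1111 ones but the hairstyle and resolution sweep lists are made-up examples:

```python
import itertools
import json
import urllib.request

# Example values to sweep over -- swap in whatever you want to vary.
HAIRSTYLES = ["long hair", "short hair", "twintails", "ponytail"]
RESOLUTIONS = [(832, 1216), (1024, 1024), (1216, 832)]

def build_payloads(base_prompt):
    """Build one txt2img JSON payload per (hairstyle, resolution) combo."""
    payloads = []
    for hair, (w, h) in itertools.product(HAIRSTYLES, RESOLUTIONS):
        payloads.append({
            "prompt": f"{base_prompt}, {hair}",
            "width": w,
            "height": h,
            "steps": 25,
            "seed": -1,  # -1 = random seed each gen
        })
    return payloads

def submit(payload, url="http://127.0.0.1:7860/sdapi/v1/txt2img"):
    """POST one payload to a running A1111 instance (launched with --api)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()  # JSON with base64-encoded images

if __name__ == "__main__":
    batch = build_payloads("1girl, standing")
    print(len(batch), "payloads queued")
    # for p in batch: submit(p)  # uncomment with the webui running
```

ComfyUI works the same way in spirit: export a workflow as API-format JSON and POST it to its `/prompt` endpoint, editing node inputs in the JSON between calls.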
>>103022832
>>103023022 "diaper" apparently means butts.
SD3.5M finetune status?
Help a nigga out, what's the best photorealistic model for generating porn on a 12 GB VRAM card?
>>103023253 There's an option for scripts, plus built-in scripts for basic things like that; it's at the very bottom. It's the reason I still use forge/a1111 for prompt research: it's so simple to use and automate the creation of comparison sheets.
>>103023374 Flux or one of the ponyrealism things. Both have flaws, try them out.
>>103023339
>>103016429 The pedo poster is back!!!!
>>103023400
>>103016430 >looks like Flux dedistill at cfg 1 kek It's amazing
>>103023399 Will do, thank you. Have a good day, sir.
>>103023459
>>103023539 ngl those guys look like ladies.
>>103012961 >>103016593 best I can do for you: https://litter.catbox.moe/e6r99w.png
>>103022288 >Any model will give you extra limbs if you make the image too large. But the image isn't too large here; it's the sweet spot of any model, so the fact it's making 3 arms at the sweet spot is bullshit.
>>103023583 Does it make 3 arms at 512x512?
>>103023124 >Flux is literally only good because of the t5xxl embeddings SD3.5 also has t5xxl, anon
>>103023588 >512x512 no one wants to generate at such a low resolution; it should be good at 1024x1024
>>103023602 >no one wants to generate at such a low resolution I do
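On the "too large" point: SDXL/SD3.5-class models are trained around a roughly fixed pixel budget (about 1024x1024, i.e. ~1 megapixel), and sampling well past it at gen time is what tends to invite the extra limbs; hires-fix exists precisely to get bigger outputs without sampling outside that budget. A toy helper (my own sketch, not from any repo) that snaps a target aspect ratio to the ~1MP sweet spot with dimensions divisible by 64:

```python
def snap_to_budget(aspect, budget=1024 * 1024, step=64):
    """Pick width x height near `budget` total pixels for a given
    width/height aspect ratio, rounded to multiples of `step`
    (a common latent-space constraint in SD-family models)."""
    h = (budget / aspect) ** 0.5
    w = aspect * h

    def snap(v):
        return max(step, round(v / step) * step)

    return snap(w), snap(h)
```

For example, a 16:9 request lands on 1344x768, which is one of the standard SDXL training buckets.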
So annoying managing embeddings across multiple GPUs; the caching mechanism in Kohya is cancer.
>>103023679 SD3.5M seems decent enough to full finetune. Batch size 2 for a 1024px dataset (4-5 seconds per step) with AdamW 8-bit; doing a test.
>>103022986 proud of u anon
Hold...
>>103022986 >ok I could make it work using SamplerCustomAdvanced yes
Fresh >>103024144