Discussion of free and open source text-to-image models

Previous /ldg/ bred : >>102908985

SANA Round Two Edition

>Beginner UI
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io
Metastable: https://metastable.studio

>Advanced UI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
reForge: https://github.com/Panchovix/stable-diffusion-webui-reForge
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://aitracker.art
https://huggingface.co
https://civitai.com
https://tensor.art/models
https://liblib.art
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux
DeDistilled Quants: https://huggingface.co/TheYuriLover/flux-dev-de-distill-GGUF/tree/main

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/aco/sdg
>>>/aco/aivg
>>>/b/degen
>>>/c/kdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/tg/slop
>>>/trash/sdg
>>>/u/udg
>>>/vt/vtai
this thread is for mourning the death of bigma
Once I get over the VAE compression I think I can accept this model for dicking around.
Now for drugs and sleep, good night anons.
>>102919465
the fuck is wrong with their eyes? if we can change the VAE maybe it'll be saved idk
>>102919480
It's hard to tell what's the VAE and what's from the model being undercooked. There's a reason why they haven't released the weights (it's not done).
>>102919516
>There's a reason why they haven't released the weights (it's not done).
I really don't get what they're doing, what's the point of releasing an uncooked demo in the first place? Did they want to be clowned on?
cascade anon has been on suicide watch for so long someone check up on him
>>102919537
>schizo anon is talking to himself again
>>102919530
Who knows, they made some weird decisions, like going way too hard on the VAE compression. Even 16x would've been impressive and achieved their goal. Same with switching to Gemma to save some headroom on the text encoder. If it were me I would've figured out the requirements to train the model on a 24 GB VRAM GPU at 1024px, then sized the model for that, either 2B or 3B with 16x compression. The model is too experimental, like Cascade.
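To put numbers on the compression talk (a toy sketch, not anyone's actual code; the channel counts are my own assumptions, only the spatial factors matter):

```python
# Rough sketch of how the spatial compression factor sets latent size.
def latent_shape(height, width, factor, channels):
    # each latent "pixel" covers a factor x factor patch of the image
    return (channels, height // factor, width // factor)

# SDXL-style VAE: 8x compression, 4 latent channels
print(latent_shape(1024, 1024, 8, 4))    # (4, 128, 128)
# Sana's DC-AE: 32x compression, 32 latent channels (assumed here)
print(latent_shape(1024, 1024, 32, 32))  # (32, 32, 32)
```

At 32x the transformer only sees 32x32 = 1,024 latent positions instead of 128x128 = 16,384, which is where the speed comes from, and also why fine detail gets crushed.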
>>102919576
yep, they made a serious mistake there, no one cares about small models that produce bad images, we want quality first, and they could've achieved that with a 5b model + a normal VAE
>>102919638
3B for SD3 barely fits at 768px for training on a 4090 at batch 1 with all the optimization tricks. 5B is a dream.
>>102919645
>5B is a dream.
the 5090 (32gb) will be there soon, so it won't be a dream anymore
what is 8x8? is it better than fp8?
>>102919668
8x8 is 4x4 doubled.
>>102919668
>what is 8x8?
the what? where did you find this?
>>102919652
That might mean 1024px for SD3, maybe batch size 2. 5B would still be a dream. 1.6B wouldn't have been bad if the AE wasn't so extreme. But I won't poo poo it until I can train it myself. Their training methods are questionable.
>>102919689
>That might mean 1024px for SD3
I don't get it, SD3 is a 2b model, yet we managed to train SDXL (2.7b) on a 3090 though?
>>102919686
>>102919699
Transformers use more VRAM.
>>102919638
>5b model + a normal VAE
Then where is the research? That's just another SD clone then. At the end of the day I'm glad they tried something new. Later zhangs will read their paper and create the ultimate 1girl generator with a 128x compression VAE
anyone using juggernaut v11?
>>102919711
>That's just another SD clone then.
it's not, SAI tried to make DiT models, and they suck ass, I'd say it's a smaller version of Flux, we don't need a 12b model, I'm sure we can reach the same quality with 5b, but definitely not with 1.6b, which is my point
So is Illustrious shit or promising? I wasn't around when it was fresh. Seems like a step down from my playing with it, even for coomers.
>>102919427
forgot SANA links, retard
https://github.com/NVlabs/Sana
https://huggingface.co/collections/mit-han-lab/dc-ae-670085b9400ad7197bb1009b
https://ea13ab4f5bd9c74f93.gradio.live/
>>102919768
try out NoobAI which is a derivative of illustrious
>>102919771
>forgot SANA links, retard
sana doesn't deserve to be in the OP, it sucks ass
>>102919757
You're such a size queen, you would think after seeing Florence2 that size isn't the be-all-end-all of models. Pixart at 600m was just fine. 1.6B would be fine, especially for niche models with 100,000 image datasets. Sana could very well be *the* porn model.
>>102919789
>1.6B would be fine
sana proved it's not, stop coping, they tried all the tricks that exist on earth for that one, and it still looks like ass
>>102919775
>>102919802
chill out anon, we're just having fun with a new model, it's not that deep
>>102919802
Sana didn't prove anything. I already said Pixart is fine with 600m. You don't know the final result of Sana, you just know what an undertrained alpha model looks like. Honestly you're exactly why no one ever posts anything, because you're incapable of abstract thought. Congrats anon, you're a faggot. You're someone that sees the ingredients of a cake and says "I CANNOT EAT THIS SHIT"
I'm looking to upgrade my video card to something a bit more fitting for AI image generation than my current GTX 970. I've got a Ryzen 5 5600 and a 550 watt power supply. I'm guessing I won't be able to support the latest and greatest 4090s on that, so what's a non-AMD card with a comfortable amount of VRAM that my current hardware can support?
>>102919832
>you just know what an undertrained alpha model looks like.
that's your assumption, what makes you believe it's undertrained and not the final result?
>>102919757
>it's not, SAI tried to but it sucked ass
>implying SAI is competent at all
>>102919771
Thanks anon, taking a look at the gallery, it still seems to have that sketchy, uncertain quality to the images.
>>102919832
Is the reason pixart hasn't been adopted because of anti-hype?
>>102919825
>>102919832
let me guess, you were SD3 shills back in the day as well?
>"Just trust Lykon bro, SD3 will be the best model ever"
you can't stop taking L's, can you? kek
>>102919851
>Is the reason pixart hasn't been adopted because of anti-hype?
it's because it hadn't beaten the previous local sota model, which was SDXL, simple as that
>>102919851
>gallery
researcher gens are always ass but i won't pretend it's not wonky
>>102919422
Because the prompt results are the same as in the paper, which uses prompt rewriting. For example try
portrait photo of a girl, photograph, highly detailed face, depth of field
It looks basically the same
>>102919851
600m is fine but is restricting and definitely in the niche size category, not what people are looking for in a base model. But comparatively, 600m for Pixart is the same overall quality as, say, SD 1.5.
Excuse me, this is the pixart bigma thread, chud.
>>102919889
I'm glad I decided to become an engineer rather than a researcher, looks like a field of literal retards
>Make a paper about an unfinished product
>Put shitty pictures as their cherry picked pictures
>Add a demo of their undertrained turd
I swear to god if I was working with such dumbasses I would end my life
>>102919916
Maybe because researchers can see the bigger picture while you sit there hitting your face saying "I can't eat raw flour"
>>102919427
Another bake that deliberately snubs all 1girl in favor of low effort slop. Do I need to make the real collage myself?
>>102919932
>Maybe because researchers can see the bigger picture
nah, look at the SD3 researchers, they are actual retards, and the sana team will join that list of retards
>>102919938
There are three 1girls in the collage anon
>>102919851
Best of 4 with the same prompt (with replaced prompt conditioning), illustrious definitely has better prompt adherence than pony. There's at least some feeling of Ornifex there. But the quality is so much better, and I heaped on the retarded sd1.5 prompt that seems recommended. Can't say base pony was too much better. As with all these things, we'll just have to see how things shake out.
>>102919947
SD3 took way too long and had no results. Sana is coming out after 5 months from Pixart Sigma. Completely different story.
>>102919958
it's different for that anon because he is incapable of abstract thought or nuance
>>102919958
>Sana is coming out after 5 months from Pixart Sigma.
what's the point? I'd prefer they took 1 year and made a diamond rather than 5 months to shit out a turd
>>102919950
No, there are zero.
>>102919966
**it's not different
>>102919966
>>102919978
you lost the nuance in your grammar anon kek
>>102919970
Because it's 3girls? BRAIN BLAST
>>102919969
I know you're incapable of thought, but maybe the more important part is they managed to reduce the resources required to train a model 8x? That means someone can make your precious 5B model 8 times faster. Do you know what it means to work smarter not harder?
>>102919993
>they managed to reduce the resources required to train a model 8x?
again, what's the point? it's a little turd, you won't make it great with more training, we've been training SDXL (2.7b) for a year and a half at this point and it'll never reach Flux, when you're small you're small, cope with it
>>102920004
What's the point of making better tools for training models, that make training future models 8 times faster? Surely you're not this stupid, right?
>>102920004
I don't know why you're here, obviously AI is way too experimental for you. Maybe you should come back in 10 years when things are more like your iPhone 14.
>>102920019
>Surely you're not this stupid, right?
tell that to the sana team, who decided to go for a little turd even though they would've proven to everyone how great their technique was if we'd had a big model in our hands
Sana is the biggest leap for local models since sd 1.0. I've heard rumors that sana 2 started training recently and the team is already shocked by the results. Apparently it is beating flux on most benchmarks after just 50 H100 hours of training time.
>>102920034
I accept your concession
nothing like a new model to bring out the retards
This general was founded on an irrational millenarian enthusiasm around the release of Pixart Sigma; today, while we continue to use models finetuned from Stability models or made by former Stability employees, Pixart drops another dud. The thread is, predictably, in shambles. And we absolutely deserve it.
>>102920049
>nothing like a new model to bring out the retards
true, that's why you're here right now
>>102920054
Weird, because you're an sdg faggot that prefers to be here for some reason.
>>102920058yeah
>>102920065
"some reason"... you know full well what his name is
>>102920067
I was just joking, and you were supposed to call me a nigger faggot, now I feel bad saying it, sorry anon
has anyone done a direct comparison to sigma? to see how much it improved if at all?
The issue is technology has been infested with tech illiterate retards that wouldn't be here if smartphones didn't have web browsers. Something about AI brings the 90 IQ retards.
>>102920088
>The issue is technology has been infested with tech illiterate retards that wouldn't be here if smartphones didn't have web browsers.
sana was literally made so that it could be run on a smartphone, it's written in their paper and I hate every line of it
>>102920087
Absolutely no point in doing it until the weights are dropped, every couple of seconds there's another H100 batch size 1024 being dropped on it.
>>102920096
And? It's still a prototype, it's like going to an alpha-tech trade show showing a prototype 1-inch OLED and saying "duurrrr dat is too small"
>>102920038
>>102920123
>And? It's still a prototype
the classic "prototype cope"
>Hurdur, Pixart Sigma is a prototype, the next iteration will destroy everything
>next iteration (sana) comes
>looks like shit
>well... it's a prototype duh! It's not like we were waiting for this or something, 2 weeks anon! of course with this method I'll never win, right?
>>102920104
so true, i think i've heard the researchers drop another turd on it
>>102920125
My sources are people close to the sana team. I can't reveal their names or exact positions because NvLabs may pull funding.
>game developer posts pre-alpha footage for nerds and techies
>retard gamer with smart phone: durr da game has bad grafix
>>102920151
I have sources too, and they said that they won't do any more models after sana, it's over. I can't reveal their names or exact positions but trust me bro, it's true
I do think Gemma is too retarded for prompt rewriting.
>>102920153
did you seriously make an analogy between image models and video games? Can't believe you're this retarded, image models are all about graphics, if it looks like shit no one will give a fuck, that's the only goal of an image model, to produce good looking pictures that are accurate to your prompts
It's funny because any criticism of Flux is met with "but dey gave it tou you fer FREE!" You got Sana, for free. Calm down.
>>102920184
did you seriously make an analogy between video games and image models? Can't believe you're this retarded, video games are all about graphics, if it looks like shit no one will give a fuck, that's the only goal for video games, to produce good looking graphics.
>>102920166
Don't come crawling back when Sana 2 releases. Apparently it will be ready in a month or even weeks.
>>102920211
>Apparently it will be ready in a month or even weeks.
2 weeks?
>>102920211
I heard maybe even two weeks. They had a breakthrough in quantum shrinkflation, and after the lead researcher got shrinkflated he was able to design a hyperbolic time training algorithm, I heard it from the engineers
>>102920211
>Don't come crawling back when Sana 2 releases.
took them 5 months to make this piece of turd that is sana, and you're expecting us to believe they'll make a better model in a month?
>>102920220
I can't give further details but I can say it will be at least more than a week. I'm already risking my source's identity with what I've revealed so far
>>102920232
oh my god he's retarded
>>102920241
I accept your concession.
>>102920245
(we're all roleplaying anon)
>>102920240
Oh dear sana (((alleged))) employee anon, why does sana 1 look so bad?
at least black forest labs had the balls to send a peon into the thread to answer questions, chang where are you?
>>102920269
no you don't understand, he's been here the whole time, be careful not to speak badly of the model or you will summon him
>>102919771
>demo saves as webp
Sana is the best!
>>102920371
turn PAG guidance down pls
>>102920362
True anon, Sanoa iis the Hhe Sanao the but gest
Rip. Application is busy.
can it gen sanna marin?
>>102920375
what's the difference between PAG guidance and CFG? why isn't it working with just CFG like every normal model?
>>102920442
i remember when PAG came out but can't recall what its purpose is. i think it looks bad desu.
trying out old sigma prompts on sana, i plan on killing myself soon
>a cute, chubby little raccoon in a mystical forest full of glowing creatures and fauna, the image is in a low poly style
>>102920462
Base sigma could do lowpoly?
>>102920462
>i plan on killing myself soon
why? because it looks worse than old sigma?
>>102920468
Without seeing the Gemma prompt, Gemma could've gobbled up the low poly style part of the prompt.
>>102920462
for comparison this is what sigma gave
>>102920468
>>102920470
sigma was pure soul in safetensor form
>>102920462
>>102920474
yikes, it looks way worse than its predecessor, how could they have fucked it up this bad?
i like to cope and think that maybe, just maybe, sana is not part of the pixart family of models
>single flux gen ITT
it can do low poly but the prompt adherence is awful
>low poly style render of an old rusted robot wearing pants and a jacket riding skis in a supermarket
THIS IS WHY YOU DONT USE SYNTHETIC MIDJOURNEY IMAGES YOU STUPID FUCKING CHINKS
WHY IS LOCAL SO FUCKING INCOMPETENT
every fucking model since sdxl has been trained on dogshit synthetic data, we could've had local midjourney or dalle already if these faggot bakers didn't cuck their shit.
>>102920531
From my basic tests the Gemma 2 prompt expander is very ass, it will change your prompt for the worse and miss the primary intent and style.
>>102920503
no wonder they just released the demo, they knew it was shit enough to be clowned on, I hope they'll improve it now, if you're reading this, sana employee, get back to work!
>>102920532
amen
4chang detects my post as spam if i try to post the prompt
>sana
https://pastebin.com/5CcFUbGh
>>102920562
>pixart sigma output
I think it's easily worse than SD 1.5
Let's talk about CtrLoRA again.
https://github.com/comfyanonymous/ComfyUI/issues/5314
https://github.com/xyfJASON/ctrlora
>ControlNet (Zhang et al., 2023) adds an extra network that accepts a condition image, turning a T2I model into an image-to-image (I2I) model. In this manner, ControlNet is able to generate images according to a specific kind of condition image such as canny edge, significantly improving the controllability. However, for each condition type, an independent ControlNet needs to be trained from scratch with a large amount of data and computational resources. For example, the ControlNet conditioned on canny edge is trained on 3 million images for around 600 A100 GPU hours.
>To address this problem we propose a CtrLoRA framework that allows users to conveniently and efficiently establish a ControlNet for a customized type of condition image. As illustrated in Fig. 2(a), we first train a Base ControlNet on a large-scale dataset across multiple base condition-to-image tasks such as canny-to-image, depth-to-image, and skeleton-to-image, where the network parameters are shared by all these base conditions. Meanwhile, for each base condition, we add a condition-specific LoRA to the Base ControlNet. In this manner, the condition-specific LoRAs capture the unique characteristics of the corresponding conditions, allowing the Base ControlNet to focus on learning the common knowledge of image-to-image (I2I) generation from multiple conditions simultaneously. With our framework, in most scenarios, we can learn a customized type of condition with as few as 1,000 training data and less than one hour of training on a single GPU. Moreover, our method requires only 37 million LoRA parameters per new condition, a significant reduction compared to the 361 million parameters required by the original ControlNet for each condition.
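The parameter savings are easy to picture with a toy sketch (hypothetical shapes, plain numpy, not the actual CtrLoRA code): one shared frozen base weight, plus a tiny low-rank A @ B delta per condition that starts at exactly zero.

```python
import numpy as np

class CondLoRA:
    """Toy version of the CtrLoRA idea: a frozen shared weight W,
    plus a condition-specific low-rank delta A @ B."""
    def __init__(self, W, rank=4, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = W.shape
        self.W = W                                   # shared base weight (frozen)
        self.A = rng.normal(scale=0.01, size=(d_out, rank))
        self.B = np.zeros((rank, d_in))              # zero-init: delta starts at 0
    def __call__(self, x):
        return x @ (self.W + self.A @ self.B).T

W = np.ones((8, 8))
layer = CondLoRA(W, rank=2)
x = np.ones((1, 8))
# with B zero-initialized, the output matches the frozen base exactly
assert np.allclose(layer(x), x @ W.T)
# trainable params per new condition: A + B instead of a whole new W
print(layer.A.size + layer.B.size, "vs", W.size)  # 32 vs 64
```

Same shape of argument as the paper's 37M-vs-361M figure, just at toy scale: only A and B are trained per condition, the base is learned once.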
>>102920586
I don't think you realize how fucking bad SD 1.5 is. Let's get those rose tinted glasses off, buddy.
>>102920596
>I don't think you realize how fucking bad SD 1.5 is.
this, I played with base SD1.5 a month ago, it was horrible, we really improved our shit since then
>low poly render of a man wearing glasses with a sign that says "IT'S OVER"
>Image Style: 3D Model
>>102920609
>hatsune miku with her tits out
>>102920587
>Let's talk about CtrLoRA again.
is there a model that'll work on flux?
Meta seems to be the best text to image gen AI for me and it works best in WhatsApp. Even the Instagram one sucks despite sharing the same Llama version. Is there any Android app that lets me create unlimited images every day for free like WhatsApp does?
>>102920630
this is a local thread anon
>low poly render of a man wearing glasses with a sign that says "IT'S OVER"
>Image Style: (No style)
>>102920639
they took away its soul... give it back.... GIVE IT BAAAAAAAAACKKKKKKK!!!!!!!!!!!!!!!!!!!!!!!!!!
>>102920633
My bad, I thought Llama based txt to img would be considered on topic considering it's open source
nvidia will pay
>>102920653
not only did it remove its soul, but it looks worse, Flux doesn't have much soul but at least the images look consistently good
>>102920657
>it's open source
it isn't. meta doesn't release their image models unfortunately
>>102920623
Seems like not
>5. CONCLUSION AND LIMITATIONS
>We speculate this issue might originate from the capabilities of the network architectures, specifically the architectures of VAE, UNet-based Stable Diffusion, and ControlNet. To enhance the capabilities of our framework, it is worth developing our CtrLoRA using more advanced DiT-based (Peebles & Xie, 2023) backbones such as Stable Diffusion V3 (Esser et al., 2024) and Flux.1, which we leave for future work.
>>102920671
>meta doesn't release their image models unfortunately
and their video model (that one looks amazing, goddamn I hate it)
>>102920678
ok so that's a nothingburger
>low poly render of hatsune miku
>>102920692
Did you not read
>which we leave for future work.
?
Time to start putting those mikus to use and port it to flux
I think they're overly obsessed with using numbers to guide their training; just like overusing aesthetic scores, using scoring to determine prompt adherence probably obliterates concepts.
What is better for genning: a 4060 Ti 16GB VRAM card or a 4070 12GB VRAM card?
>>102920704
not my problem, if they want to prove it works on flux, they have to do it
>low poly render of donald trump
>>102920716
Lmao not going to happen, because like everything else you need an H100. No one is going to train even a ControlNet for Flux without a hefty grant or H100s lying around.
>>102920725
that's why I said it's a nothingburger
>low poly render of the solar system
>>102920562
nice
this one isn't too bad, i feel like it beats sigma here
>sana
https://pastebin.com/F7mZzSvB
>nvidia geforce rtx 5090 gpu
>>102920759
sigma
>>102920759
hmmm... on second thought... no visible brushstrokes and the hands are worse...
>>102920781
Aesthetics are easy to fix.
>>102920791
i hope so, sigma had some of the best aesthetics for a local model back then
>sana
>a candle that looks like a cute cat
>>102920807
I think they did a worse job with their captioning.
>This application is currently busy. Please try again.
STOP HOGGING IT ANON
>>102920807
>pixart soulma
>>102920820
It's cute they let us use their office 4090.
demo queue has been stuck on 1 guy for a while... did he request 8 billion steps or something
>>102920856
>their office 4090
I hope that's not a 4090, they claimed it would be ultra fast to make a picture, but when I'm at the top of the queue and the generation is starting, it takes more than a minute
>>102920864
they sort through naughty prompts by hand and are confused by mine
>>102920867
No it doesn't. It's like 2 seconds. Actually watch the queue, it's a couple of seconds when it's processing your prompt.
>>102920864
4K with a negative prompt takes a bit, but I think it might have crashed or something.
Definitely has Warhammer 40K in the dataset.
>>102920791
lol no they're not. aesthetics/style are a key part of comprehension. which is why fluxjeets can't even do this simple midjourney prompt despite being able to do complex text on signs
https://www.reddit.com/r/StableDiffusion/comments/1g6q1x3/whats_the_process_to_create_this/
the flux results look like dogshit in comparison, a fundamental misunderstanding of aesthetic construction thanks to butchered training data.
>>102920867
takes like 2 seconds for each gen, it's just that there's a lot of people in the queue
>sana
https://pastebin.com/8YbwfDkX
>>102920890
I know you're stupid, but the reason why this problem comes up is because AI captioning is extremely bad at using style keywords in the prompts. The only way to avoid this problem is hand written prompts or including meta information.
>>102920902
>I know you're stupid
projection
Sorry anon I've been prompting "1girl" with 40 steps on random seeds this whole time. I'll stop.
>>102920900
>pixart sigma
>>102920913
I just explained to you why it happens. If you did anything productive with your time you would've known this yourself. You bitch about training, but never have you captioned 100k images.
>>102920900
>takes like 2 seconds for each gen
I fucking hate the current era we're in, either we get a giant model (Flux) that takes minutes for a single image, or we get some little shit that produces turds in 2 seconds, why are they so wary of going for the sweet spot? Something big but not too big
>we will try our best to...
Fuck
>>102920932
im assuming they plan on using sana as a base for something else like a video model or whatever, explains why they minmaxxed speed and efficiency so much
>>102920939
>Model zoo
that's the model weights right?
>>102920932
Sounds simple enough anon, I'm glad you're spearheading this. Oh wait, you want other people to spend thousands of dollars so you can call their work a turd.
>>102920948
>I'm glad you're spearheading this.
thanks anon, it sure needs to be talked about
>sana
>origami figure of a cute girl with cyan hair and long twintails, the girl's name is hatsune miku
sana still has a bit of soul remaining
>>102920944
Yes, that would be every model. The demo is the 1024px model. But I'd expect there to be a 512, 1024, 2K and 4K model. For Pixart they spent the most time on the 2K model.
>>102920963
sigmo
For the record this is pretty aligned with Gustav Klimt's work.
>>102920986
why's she looking at me like that?
>>102920932
>why are they so wary of going for the sweet spot? Something big but not too big
SAI wanted to make SD3 4b for the sweet spot but it got canceled for (((whatever))) reason
>sana
>a line of pill shaped buses with hatsune miku's face on them in new york city, honk honk
it's not too bad, i feel like sana's salvageable
>>102921012
The big question is if it delivers on the trainability aspect.
does anyone have the flux masterchief prompt?
>>102920741
What is this?
Knows Master Chief
>>102921041
>does anyone have the flux masterchief prompt?
I have
>Photo of Criminal in a ski mask making a phone call in front of a store. There is caption on the bottom of the image: "It's time to Counter the Strike...". There is a red arrow pointing towards the caption. The reda arrow is from a Red circle which has an image of Halo Master Chief in it.
>>102921058
the single best flux image. no others come close.
>a painting by leonardo davinci of a pregnant jesus lovingly carrasing his belly, a speech bubble above him has a blurry screenshot of donald trump badly photoshopped onto it
3rd try, not what i asked for but eehhhehrhehhh
>>102921058
thank you anon
>Photo of Criminal in a ski mask making a phone call in front of a store. There is caption on the bottom of the image: "It's time to Counter the Strike...". There is a red arrow pointing towards the caption. The reda arrow is from a Red circle which has an image of Halo Master Chief in it.
>>102921062
It's impressive as a concept but practically isn't how anyone prompts or wants to prompt. Especially given how most people just 1girl prompt.
>>102921062
>the single best flux image. no others come close.
it was way too sophisticated a prompt to be understood on the very first day of Flux's release, it was probably made by some BFL employee, and it sure did have the wow effect he was expecting, it's such a good prompt to show the strengths of Flux
This is what the collage should have been. Bakers shouldn't be allowed to be allergic to 1girl.
>>102921073
how come yours was much better?
>>102921081
I'm sure we haven't discovered the optimal settings
>>102921075
>it was probably made by some BFL employee,
I've been wondering, for a while now, how many of those really good early Flux images were BFL employees using Pro.
>It's time to counter the strike
>Zoom call
>powerpoint presentation
etc...
>>102921092
if it was only pro pictures it wouldn't have had the impact it had, it was such a big deal because we were able to replicate those with dev as well
>>102921092
I mean I used their prompt and still got an amazing result. It's a good prompt, Flux is a good model, it's just impossible to train and BFL has ghosted us. I guess we have a year to solve the Flux or SD3 problem until the next massive model comes. The best thing Flux will do is set a target, and something we often see in tech is people like to smash targets.
>>102921109
>it's just impossible to train
with the undistilled models we have now, not anymore, but yeah still hard because it's a big ass motherfucker
>>102921113
The undistilled models are as alpha as Sana.
>blurry cctv footage of donald trumpy menacingly floating in the night sky, full moon behind his back
i remember having to fiddle a lot more to get this image on flux, i feel like sana 'gets' what i'm going for more
>>102921116
absolutely not, dev dedistill has the same quality as vanilla dev and has all the guidance removed, so it's ready to be trained on, and someone is already up to the task
https://huggingface.co/SG161222/Verus_Vision_1.0b
You said Sana is shit and yet the demo queue is fucking long
>>102921137
There's some gold to dig in there
>>102921137
it's probably someone trying to get past the nsfw heart image filter and gen some sana tits
>>102921137
it's like watching a car crash, it's horrible but a lot of people are gathering around to see the damage done
>>102921155
Is there a reason you don't post images, are you poor or something?
>>102921165
>he says, while not posting an image
>he never posts images
STOP HOGGING THE DEMO, FUCK OFF!!
>>102921175
good goy, I asked you to post an image and you did!
What are the best 1.5 models at this point? I never moved on from yuzu.
>>102921214
sana
The more I use Sana the less I hate it.
>>102921269
>The more I use Sana the less I hate it.
Show us some pictures that made you love Sana more
>>102921269
yeah i feel the same, it's salvageable
>an anime screenshot of a wide open field, a gargantuan celestial anime girl towers up into the sky, to the left is a bright blue sky and to the right of the girl is a starry night sky, which she wears like a cape
didn't get what i wanted exactly but that's more of a bad prompt issue
>>102921278
No, you're a contrarian faggot, but it would be funny if I posted some Flux images and had you pretend you thought they were shit.
>>102921291
>No
Concession Accepted.
>>102921269
im slowly losing interest in using the demo. gimmie local damnit!
CFG vs PAG... what's that about?
My first Sana gen. I'll have to tweak some settings. Too early to tell whether this is promising or not.
>>102921269
>the less I hate it.
I never hated it nor did I love it. It simply is, and I simply am.
>>102921319
i know pag is a new thing that helps sd1.5 with anatomy and image coherence, haven't used it myself
>>102921329
try this anon's settings >>102921086
With 1.6B parameters the first thing to do is split the extreme styles apart into separate models.
>celestial princess hatsune miku, her face is replaced by a spiraling galaxy, armpit hair
prompt understanding can be a bit hit or miss at times, or maybe it's because im esl
>>102921358
I doubt "replaced with" shows up much, if at all, in the captions. And fetishes like "armpit hair" are never in the captions.
>>102921377
>And fetishes like "armpit hair" are never in the captions.
i'm killing myself
>sailor moon eating the moon
>>102921397
I wonder if it knows more characters than Migu and Sailor Moon kek
>>102921404
It's the same basic cast that Flux knows.
>>102921397
delicious lunar crisp
>>102921404
we could know if some demon wasn't hoarding the demo doing 4k batches. LEAVE CHANG'S OFFICE GPU ALONE!
>>102921319
CFG - how strongly the model adheres to your prompt conditioning
PAG - I could give the real definition, but I would rather describe it as the amount of ritalin the model takes. Occasionally something brilliant will come out of it. Usually, it will screw it up if your dose is too high.
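If anyone wants the actual arithmetic behind those two knobs, they combine at sampling time roughly like this (simplified sketch; real PAG gets its degraded branch by perturbing self-attention inside the model, this only shows how the scales blend the predictions):

```python
import numpy as np

def guided_pred(uncond, cond, perturbed, cfg_scale, pag_scale):
    # CFG: push the prediction toward the prompt-conditioned branch
    out = uncond + cfg_scale * (cond - uncond)
    # PAG: additionally push away from a prediction made with
    # perturbed (degraded) self-attention
    return out + pag_scale * (cond - perturbed)

uncond = np.zeros(4)
cond = np.ones(4)
perturbed = np.full(4, 0.5)
print(guided_pred(uncond, cond, perturbed, cfg_scale=1.0, pag_scale=0.0))  # [1. 1. 1. 1.]
```

Which is also why cranking pag_scale too high fries the image the same way a huge cfg_scale does: both just scale up a correction term.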
>sailor moon eating the moon, armpit hair
not sure if that's a toothbrush or a strange leek
>>102921423
Honestly shocked the AE manages details that fine. Maybe it can be saved.
>>102921436clearly Luna's femur
likely to be a non-commercial license OH NONONO
https://github.com/NVlabs/Sana/commit/7d32332055abbcacc97d00918d43eabe0af950f9#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R13
it's coming bbbbbbbbs
>>102921475LMAOOOOOOO, it's fucking DOA, Schnell has an apache 2.0 licence and it's way better than this small piece of shit
>>102921443It's nice to have an alternative model with high channel VAE
i guess we'll be stuck with sdxl forever huh
style wise it feels like 1.5 in a good way
>>102921494License doesn't matter when training uses consumer hardware.
>>102921622we already have SDXL for that
Sana does not work at 512x512, can confirm. Does anyone know their actual buckets?
>>102921670Looks fine to me, anon
>>102921677kek
remember OmniGen? they have a demo out now, github also says they plan on releasing the model. china save us from china?
>https://github.com/VectorSpaceLab/OmniGen
>https://arxiv.org/abs/2409.11340
demo
>https://huggingface.co/spaces/Shitao/OmniGen
prompt
>a cute cat holding a sign saying "china hello china cheeenaaaa lalalala", ultra high definition
>>102921687Get this non pixanasexual shit out of here
>>102921687ehh...
>>102921723why did you make miku a n*gger?
>>102921708the sana-samas have failed us, the age of the pixart sexual is over. it's the dawn of the planet of the omnigenders
>>102921743why not?
>>102921755that's not a fucking answer bitch
>>102921762it is nigger
trying to recreate Breezewood, Pennsylvania
>>102921775is that sana?
>>102921778Flux
>>102921783o
>>102921073
>but practically isn't how any prompts or want to prompt
It's short, simple, and to the point. What do you mean?
>>102921687kek
>>102921847yep... it's china time. i just hope the thing isn't so damn slow locally though
I don't understand. This is 1.01 cfg, 1.01 pag. Why does it look like I didn't touch the settings at all?
>>102921687
>they have a demo out now, github also says they plan on releasing the model.
so that's the new meta now? releasing the demo before the model? I mean it makes sense but I hate it being teased like that kek
>>102921891what's your prompt
>>102921847looks better than Sana, maybe this shit is the real deal
>>102921904you funny guy
>>102921687
>miku holding a sign that says "omnigen > sana"
uhhh sanabros? omnigenbros? what does this mean? And now I'm out of credits so I can't try other variations
>>102921904there's also that whole built in controlnet thing but it keeps erroring out when i try it in the demo
>>102921916china promotes equality
should I just train a lora? not sure if I can with my 8GB of vram
>ani is making a sdcpp gui
how do we get him over here to be our guy?
>demo erroring out
It's over
>>102921959which one? the sana one or the omnigen one?
>>102921938source?
>>102921968
>>102921169
>blurry cctv footage of donald trumpy menacingly floating in the night sky, full moon behind his back
images look a lot cooler at cfg 1.1 and pag 1.1
>>102921966sana
>two new models and a new gui coming
we eating good /ldg/
>>102921974
>cfg 1.1 and pag 1.1
yep. CFG can go up to about three as well. the only reason for a higher PAG is if you're doing text.
>>102921893
It could be to garner free publicity. If they gain enough attention and some investor believes in the potential evolution of their alpha version, then they will release nothing to the open source community and shift the project towards a SaaS by scaling, optimizing, tuning, etc. If they gain no publicity and no investment, then the alpha will be released as a last resort, again, to get free publicity.
>>102921966Both
so what now?
>>102922034we gen
>>102922038im depressed
>>102922044its okay anon im here for you
>>102922049*sob* *sob... uwaaaaahhhhhh... *hic*
>>102922034
Back to 1girl for you. Meanwhile I will go back to my draw threads, and when that's done I will go back to jacking off to /v/ butt threads while you play pretend with the local models. I only show up for big events like flux, sana, etc.
>>102922060thanks for stopping by
>>102921938
about half a year ago, when he found out that C++ is hard. If he hasn't released by now, he won't.
https://huggingface.co/rhymes-ai/Allegro
babe wake up, new text-to-video model just dropped
>>102922188
>Single GPU Memory Usage 9.3G BF16 (with cpu_offload)
>check downloads folder
text encoder is 19GB
>>102922217
>text encoder is 19GB
"architectures": [ "T5EncoderModel"
It's the classic T5-XXL, so we'll only be using the encoder part, which is roughly 9.2GB of VRAM in bf16.
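quick back-of-envelope math on why the download is 19GB but the VRAM hit is half that. assuming the commonly cited ~4.76B parameter count for the T5-XXL encoder half (my assumption, not a number from the repo):

```python
# Back-of-envelope memory math for the T5-XXL text encoder.
# The ~4.76B encoder param count is an assumed, commonly cited figure.
params = 4.76e9

fp32_gb = params * 4 / 1e9  # 4 bytes/param: consistent with the ~19GB download
bf16_gb = params * 2 / 1e9  # 2 bytes/param once loaded in bf16

print(f"fp32: {fp32_gb:.1f} GB, bf16: {bf16_gb:.1f} GB")
# → fp32: 19.0 GB, bf16: 9.5 GB
```

so the 19GB on disk is just fp32 weights; cast to bf16 at load time and you land right around the quoted ~9.2GB.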
>>102922188
>Apache 2.0
>6-second videos at 15 FPS with 720x1280 resolution
>175M parameter VideoVAE and a 2.8B parameter VideoDiT model
Pretty nice, I can feel it's a new local sota
New
>>102922252
>>102922252
>>102922252