/g/ - Technology


Thread archived.
You cannot reply anymore.




File: the longest dick general.jpg (1.94 MB, 3264x1543)
Discussion of free and open source text-to-image models

Previous /ldg/ bred : >>102922252

SD3 Large Edition

>Beginner UI
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io
Metastable: https://metastable.studio

>Advanced UI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
reForge: https://github.com/Panchovix/stable-diffusion-webui-reForge
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://aitracker.art
https://huggingface.co
https://civitai.com
https://tensor.art/models
https://liblib.art
https://github.com/Nerogar/OneTrainer
https://github.com/kohya-ss/sd-scripts/tree/sd3

>SD3 Large
https://huggingface.co/stabilityai/stable-diffusion-3.5-large
https://replicate.com/stability-ai/stable-diffusion-3.5-large

>SANA
https://github.com/NVlabs/Sana
https://ea13ab4f5bd9c74f93.gradio.live

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux
DeDistilled Quants: https://huggingface.co/TheYuriLover/flux-dev-de-distill-GGUF/tree/main

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/aco/sdg
>>>/aco/aivg
>>>/b/degen
>>>/c/kdg
>>>/d/ddg
>>>/e/edg
>>>/h/hdg
>>>/tg/slop
>>>/trash/sdg
>>>/u/udg
>>>/vt/vtai
>>
File: dena_00025_.png (2.8 MB, 1728x1344)
>mfw
>>
so current conclusion is that sd3.5 cannot be finetuned in practice, like flux?
>>
ldg and discord stay winning
>>
>>102926821
>so current conclusion is that sd3.5 cannot be finetuned in practice, like flux?
it can, it's not a distilled model, the issue is that it's a giant model (8b), so you won't train it with your 24gb card, don't dream about that lol
>>
File: ComfyUI_SD35L_0010.jpg (315 KB, 896x1152)
>>
I DON'T WANT TO SIGN UP FOR AN ACCOUNT JUST TO DOWNLOAD THIS SHIT FUCK YOU
>>
>>102926803
>subhuman so obnoxious not even the rest of xir fellow avatartroons tolerate xir
LMFAAAAAAAAAAAOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO
>>
>>102926821
You can almost train Flux with layer offloading, so yes, you can train a model that is 4B smaller.
>>
>>102926860
desu skill issue
>>
so do i need to download any of the other shit off of hf or just the model
>>
>>102926862
>layer offloading
that's slow as fuck though?
>>
>>102926855
Meh
>>
File: file.png (3.26 MB, 1440x1440)
looks like SD3.5 can't do high resolutions?
>>
>>102926897
it's still going to be significantly faster because 4B is a lot
>>
>>102926860
buy vpn retard.
>>
>>102926843
>>102926862
I see, thanks anons
>>
>>102926860
>I DON'T WANT TO SIGN UP FOR AN ACCOUNT JUST TO DOWNLOAD THIS SHIT FUCK YOU
this, I'll wait for someone to upload the weights elsewhere
>>
>>102926966
What does that have to do with anything?
>>
did they remove negatives again?
>>
>>102926980
it's just an ad for nordvpn, don't worry anon
>>
File: file.png (2.19 MB, 1232x1304)
they spent months retraining SD3 -> SD3.5 on women lying on grass and they still haven't nailed it lmao
>>
>>102926934
Looks like the same issue as flux and sd3. Use a different technique to upscale, like Ultimate SD Upscale. Cap the working area to 1 megapixel.
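A rough sketch of what capping the working area could look like. Assumptions that are not from this thread: "1 megapixel" is taken as 1024x1024 pixels, dimensions are snapped down to multiples of 64, and `cap_resolution` is a made-up helper name.

```python
import math

def cap_resolution(width, height, max_pixels=1024 * 1024, multiple=64):
    """Scale (width, height) down to fit max_pixels, keeping aspect ratio.

    Dimensions are rounded down to a multiple of `multiple` (an assumption).
    Sizes already under the cap are only snapped, never scaled up.
    """
    scale = min(1.0, math.sqrt(max_pixels / (width * height)))
    new_w = max(multiple, int(width * scale) // multiple * multiple)
    new_h = max(multiple, int(height * scale) // multiple * multiple)
    return new_w, new_h

# Work at the capped size first, then upscale in tiles afterwards.
print(cap_resolution(2048, 2048))
print(cap_resolution(1440, 1440))
```

Because of the rounding down, the result never exceeds the cap, at the price of sometimes landing a step below it.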
>>
>>102926990
there should be negatives because you go for CFG > 1, the demo just doesn't allow you to use them I guess
>>
File: ComfyUI_00035_.png (2.45 MB, 1280x1280)
>>102926934
1280x1280 seems about the limit before things start getting extremely fucked up. Edges still get garbled though
>>
File: ComfyUI_SD35L_0022.jpg (160 KB, 896x1152)
>>
>>102927007
Probably the safety/alignment team's fault. They’re so scared of the models learning anatomy that they butcher them.
>>
>>102927020
Flux can do higher resolutions without any upscaling just fine
>>
>>102927040
>They’re so scared of the models learning anatomy that they butcher them.
if they want to stay behind Flux for the rest of their life, that's their problem, we'll go forward with or without them
>>
>>102927007
It's real? Damn, they really didn't learn.
>>
>>102926803
welcome back to these threads, it's been a while
>>
File: 1702749643327477.png (152 KB, 319x351)
i absolutely love FLUX for being so incredibly good at character LORAs. 20 small images and you get near perfect results you couldn't even dream of using SDXL. i fucking hope SD 3.5 will manage this as well, because i can't go back now. i recently tried using my old SDXL/PONY LORAs that i thought turned out well and they look fucking dogshit, all of them.
>>
File: ComfyUI_SD35L_0024.jpg (130 KB, 896x1152)
>>
File: file.png (1.29 MB, 688x1024)
>>102927007
It really doesn't like upside-down views
>a top-down full body view of a woman lying in grass, she's holding a sign with the text "Garbage model, don't recommend"
>>
File: ComfyUI_SD35L_0028.jpg (184 KB, 896x1152)
it passes the sailor moon cenobite check. already liking it
>>
File: ComfyUI_04049_.png (1.72 MB, 1024x1024)
i'll try this flux prompt in 3.5, brb
>>
File: file.png (1.77 MB, 1024x1024)
>>102927127
This is the same prompt but with Flux dev dedistill
>>
>>102927195
her feet are fucking blurry
>>
Babe wake up, the first real finetune of flux dev (from dedistill) has been finished
https://huggingface.co/SG161222/Verus_Vision_1.0b
>>
>>102927206
I can't keep up with all these models nigga
>>
>>102927200
sorry Dan
>>
>>102927200
yeah I hate Flux for that, it makes everything blurry
>>
>>102927206
so how is it?
>>
>>102927195
>>102927200
more than that, the hands are fucked up on a fundamental level
>>
>>102927232
>the hands are fucked up on a fundamental level
compared to SD3.5 it looks fine lol >>102927127
>>
>>102927200
Skill issue unironically. I posted about how to prompt it back during release but I guess people didn't keep the knowledge alive.
>>
Quick impression. I always found flux gives its "oil paintings" a muddied amateurish look. SD3.5 gives a better result, looks more like bing which isn't a bad thing exactly.
Still no artist names sucks.
>>
File: ComfyUI_00047_.png (1.81 MB, 1024x1024)
It's fun but being limited to 1mp without upscaling kills it for me
>>
File: 00056-2576600517.png (1.18 MB, 1024x1024)
Is it over or are we back bros?
>>
>>102927241
none of that ever worked reliably on all sorts of different prompts
>>
>>102927206
>11.9 GB
I hope he's not only releasing the fp8 model, would be retarded
>>
File: 00001-2371339493.jpg (298 KB, 1120x1440)
>>
File: file.png (1.08 MB, 960x1280)
>>102927136
>sailor moon cenobite check
sana
>>
>>102927206
>warning do not use a negative prompt
lol okay
>>
File: file.png (43 KB, 675x164)
>>102927243
>Still no artist names sucks.
I feel like we'll never get something as kino as MJ for the artist names/celebrities if they keep going for the VLM path, and they like doing it that way, they are too cucked to take the risk on making the mob angry, that's why I respect MJ, they don't give a fuck they have actual balls
>>
>>102927282
>prompt: shifty looking chinaman greased with lard
>>
>>102927252
>Is it over or are we back bros?
SD3.5 looks like a worse version of Flux, desu I don't see the point, if people can run a 8b model they definitely can run a 12b one, so I'll stick with Flux I guess, the only way to win me over would be them having a model that has the styles and celebrities in there, but it looks like it's as empty as Flux
>>
>>102927310
you won't. localjeets have been psyopped into thinking text on signs and jesters juggling green cubes are more important.
>>
File: file.png (891 KB, 960x1280)
>>
>>102927272
It did, but you never tried it since you never actually heard about it, because people don't listen and let information be buried.
>>
File: file.png (363 KB, 750x1000)
>>102927336
>you won't.
I know anon, I know
>>
>>102927310
>>102927333
it's a nothingburger of a model, needs a decent finetune before even being worth checking out.
>>
>>102927349
prove your claims
>>
>>102927296
yeah, even with dedistill, when you use a negative prompt it tends to destroy the image quality, maybe that's proof it's not fully undistilled idk?
>>
>>102927206
dedistill is a red herring stop wasting your time
>>
>>102927310
It's never going to happen because you'll just do deep fakes and ruin it for everyone
>>
>>102927408
>you'll just do deep fakes and ruin it for everyone
what is celebrity loras? what is PuLID? what is InstantID?
>>
>>102927428
extra hoops that filters you
>>
>>102927433
it doesn't, I know how to use them
>>
>>102927445
no you don't, that's why you bitch about it
>>
>>102927388
nope, you don't want to try that's a you problem, people want to save flux and we're going to do it, with or without your doomerism
>>
File: file.png (692 KB, 960x1280)
sana doesn't appear to know "cenobite" really well but cool image regardless
>premiere studio anime, junji ito and yoji shinkawa, sailor moon as a cenobite from hellraiser
>>
>>102927463
by we you mean someone that isn't you
>>
File: ComfyUI_02477_.png (1.41 MB, 1024x1024)
>>
SD3.5 can do nipples
https://imgur.com/m6yJqRB
>>
File: file.png (829 KB, 960x1280)
>>102927472
CFG 1 for both of these
>>102927486
im so lonely bros
>>
File: test.png (3.51 MB, 2688x1152)
>>102927206
>An asian woman
Left to right
Verus_Vision
FluxDeDistillfp8
Fluxfp8
>>
File: file.png (2.22 MB, 1360x768)
that one is pretty good
>WWE fight, a person jumping from the ropes into another one
>>
>>102927506
looks like he removed the buttchin and the wrinkles over the mouth, nice
>>
File: ComfyUI_04054_.png (1.8 MB, 1024x1024)
>>102927152
*first pic made with downloaded sd 3.5l *

oh, way better than expected
>>
Are we back or is it so over?
>>
>>102927566
they definitely improved their model, but it's still inferior to Flux so... maybe SD4 will beat them I guess?
>>
>>102927566
it is always over. It will be days of nitpicks between models and idiots who deny that secondary i2i is always needed.
>>
File: test2.png (3.96 MB, 2688x1152)
>>102927537
Not quite. "Middle aged blonde woman" this time
>>
>>102927490
it's mispositioned
>>
>>102927611
why is it so shiny on vanilla flux? are you on cfg 1 on that one?
>>
>>102927629
Yes. That's what skin looks like with default flux
>>
>>102927506
thanks for the comparison, anon
>>
File: file.png (1.27 MB, 1360x768)
>>102927517
this is the same prompt but on flux dev dedistill
>>
>>102927566
It was never over.. personally I'm waiting for the 3.5 Medium
>>
>>102927646
>That's what skin looks like with default flux
I'm surprised dedistill has the skin more natural, he really nailed that shit desu
>>
File: ComfyUI_SD35L_0066.jpg (176 KB, 896x1152)
>>
File: ComfyUI_00360_.png (1.53 MB, 1280x720)
>>102927357
Here
>>101714916
>>101714923
>>101714958
And NEVER use "sharp background", "clear", "in focus", or other things that describe the focus of an image. Those do nothing or can make things worse.
In my experience since those posts, styles have more influence in the ability to remove DOF, but optimally you want both detailed description and style to get consistently sharp gens.
>but I don't want to describe so much shit
Sure it's unfortunate but it works and it proves it is possible.
The better solution these days is probably to just use a LoRA or use a workflow with negative. I simply just wanted to challenge myself to see if it was possible to do with a vanilla workflow and prompting alone.
I don't gen images anymore since I always just check things out once and then wait for more releases.
>>
>>102927566
>Are we back or is it so over?
I think we're back, the licence is good, it's not better than dev but better than schnell (I guess?) and it's smaller (12b -> 8b), this definitely has potential for training
>>
>>102927696
neat. can you do other stuff in that style? buff guys with swords, buxom women, general fantasy, etc.
>>
>>102927541
what about when you add some text like in your previous images?
>>
File: ComfyUI_04068_.png (1.81 MB, 1024x1024)
>>102927744
let's try. also, it can't render a full figure... i smell something fishy here
>>
>>102927704
>NEVER use "sharp background", "clear", "in focus", or other things that describe the focus of an image. Those do nothing or can make things worse.
It's insane how many anons think tokens like these help, but then you remember most don't even care to set a fixed seed when testing...
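For what a fixed-seed A/B test could look like, here is a minimal sketch. `fake_generate` is a hypothetical stand-in so the harness runs anywhere; it is not a real model call and would be swapped for whatever pipeline you actually use. `ab_compare` is likewise just an illustrative name.

```python
import random

def fake_generate(prompt, seed):
    """Stand-in for a real text-to-image call (hypothetical).

    Returns a deterministic number for a given (prompt, seed) pair;
    replace it with your actual generation function.
    """
    rng = random.Random(f"{seed}:{prompt}")
    return rng.random()

def ab_compare(generate, base_prompt, token, seeds):
    """Generate each seed twice: without and with the extra token."""
    results = []
    for seed in seeds:
        without = generate(base_prompt, seed)
        with_token = generate(f"{base_prompt}, {token}", seed)
        results.append((seed, without, with_token))
    return results

pairs = ab_compare(fake_generate, "a woman lying in grass", "sharp background", [1, 2, 3])
for seed, a, b in pairs:
    print(seed, a, b)
```

The point is only that each pair shares a seed, so any difference between the two outputs comes from the token and not from the noise.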
>>
>>102927704
why did you bother to link me posts when there is literally one sentence of usable info in them? and no metadata for the images

on second thought dont bother replying, it's cool
>>
File: ComfyUI_SD35L_0070.jpg (413 KB, 896x1152)
>>
am i stupid or wasn't there one branch of forge/reforge that has sd3 support
>>
>>102927839
kek but that's a portrait nono, pikachu doesn't need a skyscraper growing out of his head
>>
https://huggingface.co/camenduru/stable-diffusion-3.5-large/tree/main
Ok, one fine gentleman uploaded the fp16 weights on his channel, so we won't have to give our info to SAI to download them, let's go
>>
File: 00006-964609837.png (1.14 MB, 1152x896)
>>
How large is large?
>>
File: ComfyUI_SD35L_0074.jpg (359 KB, 896x1152)
>>102927871
cityscape was prompted for
>>
back status?
>>
File: file.jpg (764 KB, 3358x1810)
>>102927007
YOU HAD ONE JOB
>>
File: sdkekl.png (1.68 MB, 950x948)
>>
>>102927898
>How large is large?
8b
>>
>>102927964
thank you
>>
File: 00009-2887163644.png (1.29 MB, 1152x896)
>>
>>102927876
now reduce and compress that shit until i can run it on my 10 series chop chop
>>
>>102927944
ngl that doesnt look very stable
>>
>>102927992
kek
>>
>>102927860
dev2?
>>
File: sd35_1.jpg (390 KB, 1024x1024)
>>
File: ComfyUI_04081_.png (1.8 MB, 1024x1024)
60% of renders are plagued with blatant anatomy errors, but i'm also toying with cfg... still no legs
>>
File: file.png (1022 KB, 800x1024)
On Fal:
>SD 3.5 large -> $0.065 per megapixel
>Flux-dev -> $0.025 per megapixel
SAI is completely delusional, making you pay 2.6x more than a better model
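Quick arithmetic on those two prices, as a sketch. The prices are the ones quoted in this post; counting a megapixel as 1,000,000 pixels is an assumption about how the billing works, and `cost_per_image` is a made-up helper.

```python
# USD per megapixel, as quoted in the post
PRICE_PER_MP = {"sd3.5-large": 0.065, "flux-dev": 0.025}

def cost_per_image(model, width, height):
    """Cost of one image, assuming 1 megapixel = 1,000,000 pixels."""
    megapixels = width * height / 1_000_000
    return PRICE_PER_MP[model] * megapixels

ratio = PRICE_PER_MP["sd3.5-large"] / PRICE_PER_MP["flux-dev"]
print(f"price ratio: {ratio:.1f}x")

for model in PRICE_PER_MP:
    per_1000 = 1000 * cost_per_image(model, 1024, 1024)
    print(f"{model}: ${per_1000:.2f} per 1000 images at 1024x1024")
```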
>>
File: ComfyUI_04082_.png (1.88 MB, 1024x1024)
I like it tho
>>
File: file.png (417 KB, 512x512)
>>
File: sd35_2.jpg (339 KB, 1024x1024)
>>
File: file.png (1.69 MB, 1184x888)
it's strange that the default denoise strength on the demo is 0.85
>>
File: file.png (1.64 MB, 1344x768)
I'm sorry footfags, but you won't be eating good with SD3.5 ;(
>>
>>102927206
why is verus vision so slow? 5s/it when flux-dev is 2.4
>>
File: file.png (412 KB, 512x512)
>>
>>102928144
that's because you're going for CFG > 1, so that halves the speed, such is the fate of undistilled models
>>
>>102928111
>>102928149
Is this what I think it is?
>>
File: catbox_isguyc.png (1.33 MB, 1280x720)
>>102927803
Well, it's not impossible they could help with some models if they were trained on those tokens. In the end the issue with people is probably just not wanting to spend the time doing objective AB comparisons on the influence of tokens. And actual shitposters.
Anyway, looks like the other guy was just a troll after all that doesn't actually want to learn anything or have a productive discussion. Sad.
>>
File: file.png (414 KB, 512x512)
>>102928159
what do you think it is?
>>
File: file.png (962 KB, 1184x888)
>>
File: ComfyUI_04088_.png (1.51 MB, 1024x1024)
Jesus Christ, i'm starting to think they've done a trip to a grassy field and took a bunch of photos to train this model, but only waist up...
>>
>>102928158
its over....
>>
>>102928158
nta but why does cfg halve the speed? what is different about flux/dedistill cfg compared to classic 1.5/xl?
>>
File: test3.jpg (2.82 MB, 3072x1440)
>>102927506
>>
>>102928172
Never mind. What is it? It's interesting.
>>
>>102928180
the negative prompt makes a sort of nega-image that it uses to guide the main image away from, with cfg 1 the negative is ignored so obviously it will be twice as fast
>>
>>102928180
>nta but why does cfg halve the speed?
because cfg runs the model twice per step, once for the positive prompt and once for the negative prompt, so it's twice as slow

>what is different about flux/dedistill cfg compared to classic 1.5/xl?
nothing, that's the point, it's undistilled so it acts like a "normal" model like SD1.5/XL (those also work at CFG > 1 and also have their speed halved)
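A toy sketch of why that is, assuming the usual classifier-free guidance formula `uncond + scale * (cond - uncond)`. `toy_model` is a hypothetical placeholder that only counts how often it is called; it is not any real denoiser, and plain lists stand in for latents.

```python
calls = {"count": 0}

def toy_model(latent, prompt_embedding):
    """Placeholder denoiser (hypothetical): counts calls, returns a fake prediction."""
    calls["count"] += 1
    return [x * 0.5 + prompt_embedding for x in latent]

def guided_prediction(latent, positive, negative, cfg_scale):
    """Prediction for one sampling step with classifier-free guidance."""
    cond = toy_model(latent, positive)
    if cfg_scale == 1.0:
        # uncond + 1.0 * (cond - uncond) == cond, so the second pass can be skipped
        return cond
    uncond = toy_model(latent, negative)
    return [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]

calls["count"] = 0
guided_prediction([2.0], positive=1.0, negative=0.0, cfg_scale=1.0)
print("model calls at cfg 1:", calls["count"])

calls["count"] = 0
guided_prediction([2.0], positive=1.0, negative=0.0, cfg_scale=3.5)
print("model calls at cfg 3.5:", calls["count"])
```

Two model calls per step instead of one is where the roughly doubled s/it comes from.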
>>
File: file.png (430 KB, 512x512)
>>102928186
>horror movie screengrab, 1980s, cinematography
It's my unstable Pixart Sigma 600m finetune while I wait for Sana.
>>
>>102928185
the difference is subtle, but it looks less burned on the finetune I guess? and the woman has better horizontal proportion
>>
File: file.png (2.12 MB, 1024x1024)
what I don't like about SD3.5 is the oversaturation, they fucked up the colors somehow
>>
File: sd35_6.jpg (275 KB, 1024x1024)
Well there's definitely some "diversity". I prompted for cyberpunk geisha android and got netflix version.

>>102928213
cool stuff dude
>>
>>102928228
have you tried a slightly lower cfg. that might desaturate things a bit.
>>
>>102928247
yeah but it looks like cfg 3.5 is the "expected" value from that model to get good prompt adherence
>>
>>102928214
Yeah I'm not sure. Skin texture might be a little better but that could be the slight grain Verus seems to have, I dunno. The only thing these comparisons have reinforced for me is I'm not going back to distilled ever, s/it be damned
>>
>>102928260
model authors are wrong about this shit almost every time
>>
You guys see the new Open Source video gen model?
https://x.com/genmoai/status/1848762405779574990

The weights are open source, I wonder how much ram it takes to run though.
>>
>>102928291
What the fuck is happening with all these model releases
>>
File: ComfyUI_SD35L_0088.jpg (222 KB, 1152x896)
>>
File: file.png (559 KB, 512x512)
>>102928254
Yeah I decided to play with base Sigma since Sana was announced.
>>
>>102928291
HOLY SHIT HOW MANY MODELS GOT RELEASED TODAY??

- Mochi 1
- SD3.5
- Allegro
- Omnigen
- Emu1
- The demo of Sana

I won't forget this day that's for sure
>>
File: ComfyUI_SD35L_0096.jpg (268 KB, 1152x896)
>>
>>102928291
The quality is absolutely insane, and that's Apache 2.0? What the fuck man?
>>
>>102928291
>>102928321
Now which one is the best of the bunch and can run on a 3090...
>>
File: 1701798533824529.png (33 KB, 551x259)
what do i do with guidance with verus vision?
>>
>>102928291
>open torrent
>dit.safetensors
>40 GB
holy shit its a big boi
>>
>>102928367
>holy shit its a big boi
if that's the whole package (DiT + text encoders) then maybe it'll be runnable on Q8_0 on a 24gb card
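A back-of-the-envelope sketch of that estimate. Assumptions, none of them confirmed in this thread: the 40 GB file holds only model weights, it is stored in either 32-bit or 16-bit precision, and Q8_0 costs 34 bytes per block of 32 weights (8.5 bits per weight). Activations and anything not in the file are ignored, and `q8_0_size_gb` is a made-up helper.

```python
Q8_0_BYTES_PER_WEIGHT = 34 / 32  # 32 int8 values plus a 2-byte scale per block (assumed)

def q8_0_size_gb(file_gb, source_bytes_per_weight):
    """Estimate the Q8_0 size of a checkpoint from its file size.

    file_gb: size of the original file in GB (1 GB = 1e9 bytes)
    source_bytes_per_weight: 4 for fp32, 2 for fp16/bf16
    """
    n_weights = file_gb * 1e9 / source_bytes_per_weight
    return n_weights * Q8_0_BYTES_PER_WEIGHT / 1e9

for label, width in (("fp32 source", 4), ("16-bit source", 2)):
    print(f"{label}: ~{q8_0_size_gb(40, width):.1f} GB at Q8_0")
```

Under the fp32 guess the weights alone would fit a 24gb card with room to spare; under the 16-bit guess it gets tight before counting anything else.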
>>
File: sd35_8.jpg (306 KB, 1024x1024)
>>
File: ComfyUI_SD35L_0102.jpg (261 KB, 1152x896)
>>
Bigma sisters?
>>
>>102928291
>Minimax tier but local
Do you guys have any idea how big of a deal this is?
https://files.catbox.moe/6ddgsl.mp4
>>
>>102928291
They have a test site you can try it out on
https://www.genmo.ai/
>>
File: ComfyUI_SD35L_0098.jpg (263 KB, 1152x896)
>>
>>102927989
Jones!
>>
>>102928429
bigma dead, sd 3.5 curb stomped it
>>
File: file.png (722 KB, 512x512)
>>102928456
600m is all you need
>>
>Our commitment to safety

>We believe in safe, responsible AI practices and take deliberate measures to ensure Integrity starts at the early stages of development. This means we have taken and continue to take reasonable steps to prevent the misuse of Stable Diffusion 3.5 by bad actors. For more information about our approach to Safety please visit our Stable Safety page.
On the safety page, one of their "6 pillars for safe AI":
>Ensuring data integrity
>We maintain model integrity by carefully screening training data, excluding illegal content to uphold safe and ethical standards in our products.
And of course it's gimped lmao
SD 3.5 is DOA
>>
File: sana geisha android.jpg (262 KB, 1024x1024)
>>102928242
>>
Verus can't do nipples
Into the trash
>>
File: 00027-2500614278.png (1.49 MB, 896x1152)
>>102928454
>>
>>102928291
>The model requires at least 4 H100 GPUs to run. We welcome contributions from the community to reduce this requirement.

oof, guess we have to wait on quants
>>
Very busy day. Have like two hours to myself -_-
Is sana local yet, has pyramid released the proper model?
And uh, is everyone ok?
>>
File: ComfyUI_SD35L_0120.jpg (243 KB, 896x1152)
>>
>>102928604
>And uh, is everyone ok?
fuck off homo
>>
>>102928612
nice
>>
>>102928604
everything changed
>>
>>102928604
I'm not ok. I'm dealing with health issues that my doctors were not able to figure out. I'm in literal pain and it sucks. But anyway...
>>
>>102928456
>curb stomped
Let's not devolve into exaggeration.
>>
>>102928616
I'm happy you hate faggots, everyone does, faggot.
>>102928638
I see >>102928321
>>102928649
I hope you find a sedative that allows you mental space to get back on the diagnostic process.
>>
Is it cheaper to run locally these models, if I'm doing it for a business? I made an app that uses AI, but I ran the numbers to see how much I would spend using the Open AI API prices, I got spooked. It's expensive, especially the input tokens.
If my app is "freemium", I will be losing money unless a decent part of the free users convert to premium.
>>
File: 00033-1718397392.png (1.11 MB, 1152x896)
>>
>>102928696
>Open AI API prices, I got spooked. It's expensive, especially the input tokens.
are you talking about text gen or image gen?
>>
>>102928591
is there any info on how long it takes on 4xH100? No way it will be reasonably fast on 4090
And I thought Allegro taking 50 minutes for 6 second video is as slow as it gets
>>
>>102928716
Text gen using gpt-4o.
Image generation is expensive too though, and it would use image generation... But image generation is easier to run locally since it doesn't take that long
>>
File: sana.jpg (437 KB, 1542x1152)
>>102928612
thats fucking great
>>
>>102928716
>>102928742
I'm retarded, just realize I'm on the wrong thread again
>>
>>102928438
I signed in even but it's not letting me prompt. Shame.
>>
>>102928429
sana exceeded my expectations so i'm happy
and it can only get better
>>
anybody else have to verify email or wait 15 minutes to post?
>>
>only 8 new model releases today
Yawn
>>
>>102928882
>smacks lips
>>
>>102928882
Flux 1.1 WHEN
>>
File: sana.jpg (130 KB, 1146x766)
>>
>>102928794
>>102928438

yeah same for me, seems bugged, but it does say you get a free 30 videos per month
>>
Flux could die and I would not care
>>
>>102928434
holy shit! that's really good
>>
>>102928604
>has pyramid released the proper model?
forget about that shit, we literally have MiniMax at home now >>102928434
https://huggingface.co/genmo/mochi-1-preview
>>
File: file.png (268 KB, 640x782)
>Pony won't finetune SD3.5
It's ova...
>>
FIX
YOUR
DEMO
FAGGOT
>>
>>102928957
>The model requires at least 4 H100 GPUs to run
Come on dawg
>>
>>102928969
it's asking for 320 GB of vram? oof... fuck my life...
>>
>>102928987
so even after quants it will be like 80gb? Maybe doable if you can use regular ram as well though I bet it's slow
>>
>>102928987
it says *at least* 4 h100s so that means 320gb of vram is on the lower end
>>
Are we sovl again?
>>
>>102928741
I think Mochi should be runnable locally as long as you have two 3090s / 4090s.
Allegro runs on 4090 using 22GB VRAM, tested it myself. The transformer is 2.8B parameters, with a context length of 80k when doing the full video length.
Mochi is 10B parameter transformer, with 44k context length, which is lower because I think the VAE has more temporal compression.

So assuming you could quantize each to 8 bit, the model weights are a difference of only 7GB. Memory usage should scale linearly with context size and hidden dimension. For Mochi, context is half that of Allegro, hidden dimension is larger (need to look up the exact difference). This is all back of the envelope math and extrapolation but probably mochi can be made to run on 2 3090s with a good, efficient inference implementation.
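The weight part of that math, spelled out as a rough sketch. The parameter counts (2.8B and 10B) are the ones quoted above; treating 8-bit as exactly 1 byte per weight and a 3090 as 24 GB are simplifications, and `weight_gb` is just an illustrative helper. Activations, text encoders and the VAE are not modelled.

```python
def weight_gb(params_billion, bits_per_weight):
    """Approximate weight size in GB (1 GB = 1e9 bytes here)."""
    return params_billion * bits_per_weight / 8

allegro_8bit = weight_gb(2.8, 8)   # Allegro transformer
mochi_8bit = weight_gb(10, 8)      # Mochi transformer
difference = mochi_8bit - allegro_8bit

two_3090s = 2 * 24                 # GB of VRAM, assuming 24 GB per card
headroom = two_3090s - mochi_8bit  # what is left for activations, VAE, etc.

print(f"Allegro 8-bit weights: {allegro_8bit:.1f} GB")
print(f"Mochi 8-bit weights:   {mochi_8bit:.1f} GB")
print(f"difference:            {difference:.1f} GB")
print(f"headroom on 2x 3090:   {headroom:.1f} GB")
```

Whether that headroom is enough depends on the attention memory at 44k context, which this sketch does not try to estimate.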
>>
>>102929008
i'd say so, yeah. all the new models today have pretty sovlful gens
>>
>>102929017
whelp, if thats accurate and it's good I might finally have my excuse to get another 3090
>>
>>102928966
The sooner you accept Pony won't train a new model the happier you'll be. He's going to grift like the Summertime Saga dev.
>>
>>102926788
Horrendous gens in OP
>>
>>102929051
Sounds like you're just jealous yours didn't make into the OP
>>
File: file.png (18 KB, 2723x110)
this is something I like on SD3.5, even at CFG > 1 it's really fast (1.25s/it) compared to Flux at CFG > 1 (3.5s/it)
>>
File: file.png (563 KB, 448x544)
>>
>>102929051
my gens are in there so you better take that back, punk
>>
>>102928966
Is he still claiming to be using the super secret version of Auraflow that supposedly isn't ass?
>>
>>102929062
i did make it into the OP tho
>>
>>102929062
Bottom left is the only half decent one the rest are ass just like SD3M
>>
>>102928957
I dont have 4 H100's at home? Do you?
Back to waiting for Pyramid.
>>
File: file.png (424 KB, 448x544)
>>
>>102929098
Bottom right**
>>
>>102929098
sorry i meant to say that bottom right is ass and the rest are great** need my coffee lol :p
>>
File: file.png (509 KB, 448x544)
>>
>>102929017
>This is all back of the envelope math and extrapolation but probably mochi can be made to run on 2 3090s with a good, efficient inference implementation.
https://www.youtube.com/watch?v=oxSJFkS9iVM
>>
>>102926788
thank you for including a 1girl this time
>>
https://stability.ai/news/introducing-stable-diffusion-3-5
>Stable Diffusion 3.5 Medium (to be released on October 29th): At 2.5 billion parameters, with improved MMDiT-X architecture and training methods, this model is designed to run “out of the box” on consumer hardware, striking a balance between quality and ease of customization. It is capable of generating images ranging between 0.25 and 2 megapixel resolution.
they plan on releasing a 2.5b model in a few days
>>
File: file.png (542 KB, 448x544)
>>
>>102929222
they really wanted to bury sana before it even had a chance
>>
Holy shit, SD3.5 can do nude women just fine, it's not censored at all!
https://files.catbox.moe/w0katp.png
https://files.catbox.moe/646coy.png
>>
>>102929236
the competition just makes sana 2 that much stronger
>>
>>102929222
>with improved MMDiT-X architecture
did they also use that improved architecture for the 8b model or not?
>>
>>102929239
>wow! it's a whole bunch of nothing!
>>
>>102929252
well it's a huge improvement to the body horror of SD3
>>
>>102929239
Heh, looks like Flux actually mentally broke them. Complete 180 from the sterilized shit they were trying to do before.
>>
File: file.png (503 KB, 448x544)
>>102929236
Niche makes it fun
>>
>>102929252
At least it can do nipples unlike Flux lol
>>
>>102929245
>MMDiT-X architecture
i think it's just sd3.5m, but don't quote me on that
>>
>>102929264
>>102929279
fair points
>>
File: ComfyUI_11021_.png (1.36 MB, 768x1280)
better feet than flux
>>
>>102929275
this is why competition is good, the thousand models that released today is a sign of good things to come. i don't want a single company to dominate again
>>
>>102929301
Is it though? She only has 4 toes on each foot
>>
File: file.png (1.86 MB, 1024x1024)
>Sailor Moon playing ping pong against Hatsune Miku
It doesn't look great but there's no concept bleeding like on Flux, I think SD3.5 has some huge potential
>>
>>102927252
VRAMlets are so over, but VRAM chads are so back.
https://x.com/genmoai/status/1848762405779574990
>SD is still a shitty model.
>>
>>102929307
Nvidia still is...
>>
can sd3.5 large do armpit hair?
>>
File: file.png (491 KB, 448x544)
>>
>>102929301
>>102929311
yeah at least flux made the toes look like dicks
>>
>>102929319
yeah but i meant for image gen models
>>
flux remains winning. this recent garbage is today's equivalent of kolors and lumina. useless shovelware trash that will be forgotten within a week. sd3 8b looks no better than sdxl.
>>
>>102929339
You don't want nvidia to lose the lead?
>>
>>102929355
what?
>>
File: ComfyUI_11027_.png (1.05 MB, 1024x1024)
>>
>>102929315
I bet everyone in this thread combined doesn't have enough vram to run that locally
>>
>>102929350
>this recent garbage is today's equivalent of kolors and lumina.
the only thing worth looking at is this insane model, fucking MiniMax at home, who would've thought we'd get something that powerful so soon? >>102929315
>>
>>102929311
Yes, flux cannot do feet soles AT ALL, its pretty bad at feet
>>
File: file.png (1.75 MB, 1024x1024)
>Jim Carrey
Come on SAI... the one cool thing that would dethrone Flux: have fucking celebrities and artist styles in your fucking model
>>
File: file.png (551 KB, 448x544)
>>
File: file.png (244 KB, 1920x1145)
>>102929315
I have no idea why they decided to release this model (and with an Apache 2.0 licence), this isn't just a SOTA local model, it's probably one of the best video models that has ever existed, holy shit...
>>
>>102929415
How does CogVideoX score that high? In my experience it fucking sucks
>>
File: file.png (2.51 MB, 1024x1024)
>>102929383
I really dig the aesthetics, it has a lot of sovl, too bad the details are bad though, something's missing on that SD3.5 model I feel, it can be close to Flux with some more training I think
>>
File: ComfyUI_04132_.png (1.49 MB, 1024x1024)
Jeebus Christ...
>>
File: file.png (776 KB, 800x517)
remember to thank SAI's new ceo Prem Akkaraju for the shiny new (uncensored) models!
>>
>>102929315
>Here, I got you MiniMax local, you just need 300 Gb of Vram to run it though, BYE!
Thanks I guess? kek
>>
>>102929415
probably some ai image/video model regulation coming up pretty soon, thats my bet why companies are releasing models left and right, it will be very difficult for companies to release models in the future
>>
are they gatekeeping us with vram on purpose?
>>
File: ComfyUI_04121_.png (1.5 MB, 1024x1024)
>>102929301

Except for feet of a body lying down in the grass...
>>
>>102929473
>are they gatekeeping us with vram on purpose?
I don't think that's what they want to do, we just can't make quality models without high vram requirements, at this point we can only blame Nvidia for gatekeeping the max VRAM, they've kept it locked at 24gb since 2018, those greedy fuckers...
>>
>>102929473
Whether you like it or not, attention takes memory.
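A rough sketch of why, assuming naive attention that materializes the full score matrix for every head. The head count and token counts here are made-up illustration values, not taken from any model in this thread, and `naive_attention_scores_gib` is a hypothetical helper name.

```python
def naive_attention_scores_gib(seq_len, num_heads, bytes_per_elem=2, batch=1):
    """GiB needed to hold one layer's (seq_len x seq_len) attention scores.

    Assumes naive attention: every head stores a full seq_len x seq_len
    matrix at bytes_per_elem bytes per entry (2 bytes = fp16/bf16).
    """
    elems = batch * num_heads * seq_len * seq_len
    return elems * bytes_per_elem / 2**30


# Doubling the token count quadruples the score matrix.
# num_heads=24 is an illustrative assumption, not a real model config.
for tokens in (4096, 8192, 16384):
    gib = naive_attention_scores_gib(tokens, num_heads=24)
    print(f"{tokens} tokens: {gib:.2f} GiB per layer")
```

Fused kernels such as FlashAttention avoid storing that full matrix, so this quadratic term is a worst case rather than what every backend actually allocates. Weights and the other activations still need room either way.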
>>
File: 00016-3386998335.png (3.16 MB, 1280x1920)
>>102929457
How much coke do you think he does?
>>
>>102929473
yes so all those data centers can wring your wallet
>>
File: file.png (2.09 MB, 1024x1024)
>>102929453
after spending more than 2 months on Flux, it just feels refreshing testing out the SD3.5 model, it just has more diverse image styles and not the same shit over and over again
>>
>>102929498
who can wring my pp?
>>
What sampler are you guys using with Flux? Euler 20 steps? I want to gen faster. My autism needs it
>>
>>102929520
sounds like a skill issue
>>
File: file.png (142 KB, 474x628)
>>102929496
>>
>>102929533
>I want to gen faster.
you can try out that lora, it allows you to gen for only 8 steps
https://civitai.com/models/870028/real-fascination-hyper-8steps-flux1d?modelVersionId=973745
>>
File: 00001-2807647590.png (3.12 MB, 1280x1920)
>>
>>102929551
Thanks bro, I will try it
>>
>>102929239
Ok as expected there's no pp in this model, lame
https://files.catbox.moe/ab7piy.png
>>
Civit has existed for so long now, and it's still possibly the worst site I've ever used. Will they ever make it work properly?
>>
if you told me this was a 1.5 finetune I'd believe you. where are the 8b parameters going?? because I'm not seeing them in any of these outputs. why are the details fucking melting on an 8b model?
>>
>>102929582
>1 hit wonder
>keep doing the same shit
>it's still shit
>???
>profit
>>
>>
>Sana
SHIT
>SD 3.5
SHIT
>Verus Flux fine tune
SHIT
>Genmo Mochi 1
Unrunnable
LAME
>>
File: 00021-2287231328.png (2.76 MB, 1280x1920)
>>
>>102929383
Even if styles by name of artists aren't there, it's better than flux in this aspect. Prompt adherence seems ok (as SD3M already was) but anatomy mistakes are more frequent than with Flux.
If it behaves better with style Loras than flux does, I think it is a worthy model.
>>
Anyone know what resolutions SD3.5 is trained at other than 1024x1024?
>>
>>102929628
sd 3.5 medium will be ultra kinorino
>>
>>102929582
>why are the details fucking melting on an 8b model?
that's what I'm asking myself as well, why are the details so fucking bad? 8b isn't that far away from 12b, don't tell me what BFL did was fucking magic, it can definitely be replicated, SAI just sucks ass man...
>>
File: ComfyUI_04142_.png (2.02 MB, 1024x1024)
Nearly there...
>>
>>102929643
I can't go much higher than 1024x1024 without everything turning to garbage
>>
>>102928966
It's because the license has a limit of 1 million, so... This guy has earned more than one million with PonyXL and his service, makes you think...
>>
>>102929582
SD 1.5 finetunes were pretty bad too. You're thinking of SD 1.5 mixed with NAI... Now THAT made some pretty good models *siiipps*
>>
File: file.webm (3.64 MB, 2034x1152)
>>102928957
They fucking nailed that shit, holy fuck dude this looks incredible
https://github.com/genmoai/models
>The model requires at least 4 H100 GPUs to run.
Kek, gonna wrap a rope and kms I guess, why can't we have nice things :(
>>
>>102929665
>This guy has earned more than one million with PonyXL and his service, makes you think...
wtf? no fucking way...
>>
File: file.png (638 KB, 1521x933)
>>102929664
>I can't go much higher than 1024x1024 without everything turning to garbage
yeah, it's even worse than SD1.5 with the duplication, here it's just complete glitches, impossible to go further, are they for real?
>>
>>102929682
damn, that's hella good quality for what it's worth
>>
File: file.png (42 KB, 979x512)
Jesus Christ, I must be doing something wrong.
Flux-dev can't be that slow on my computer, right?
It's an RTX 3060 with 12GB VRAM and a Core i5 12400 with 64GB RAM.
That time was to generate a 768x768 picture with 15 steps.
I don't get it, it doesn't take that long with any of the SDXL models.
>>
>>102929017
the model is 40GB, can't we limit the vram requirement by using Flash Attention or something?
>>
>>102929744
it's slow as fuck if Flux is too big for your gpu. What quant are you running anon? If you tell me you run the official bf16 safetensors then of course you're fucked, that asks for more than 22gb of vram, your RTX 3060 will never eat this shit up
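Back-of-the-envelope math for that number, as a sketch: it only counts the diffusion model's weights (parameter count times bytes per parameter) and ignores the text encoder, VAE and activations. The 12B figure is the Flux size mentioned elsewhere in this thread, and `weight_gib` is a hypothetical helper name.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp16": 2, "fp8": 1}


def weight_gib(num_params, dtype):
    """GiB needed just to hold the weights at the given precision."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# Compare against a 12 GB card; weights alone, nothing else counted.
for dtype in ("bf16", "fp8"):
    print(f"12B params in {dtype}: {weight_gib(12e9, dtype):.1f} GiB")
```

At bf16 the weights alone land above 22 GiB, so a 12GB card has to spill into system RAM. At fp8 they drop to roughly half of that, which is why a lower-precision quant is the usual suggestion for smaller cards.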
>>
File: ComfyUI_SD35L_0152.jpg (227 KB, 896x1152)
>>
>>102926788
Now that SD3.5 won, isn't it time to close this thread and go back to /sdg/? You don't need to keep using a distilled model anymore.
>>
0/10
>>
cough
>>
for effort
>>
File: ComfyUI_04149_.png (2.06 MB, 1024x1024)
Goddamnit!
Posting this for the lulz
>>
File: file.png (45 KB, 478x493)
>>102929762
>If you tell me you run the official bf16 safetensors
I might be very stupid.
When I click on weight_dtype I can select different options. Should I be running the fp8 option then? For poorfags gpu like mine
>>
gm
>>
>>102929792
>full flux dev
>T5 fp16
There's your problem holy shit
>>
>>102929807
Alright, thanks for spotting it bro.
I'm downloading the Flux dev model that's on Comfy-org's hugging face page right now. That should work better I presume?
>>
File: ComfyUI_SD35L_0164.jpg (288 KB, 896x1152)
>>
>>102929792
I advise you to put the text encoder (9gb) into your ram, and go for fp8 yeah
https://reddit.com/r/StableDiffusion/comments/1el79h3/flux_can_be_run_on_a_multigpu_configuration/
>>
Anyone figure out higher resolutions with SD3.5? Or is it really just fucked?
>>
File: file.png (442 KB, 512x512)
>>
>>102929844
yep, impossible to increase the resolution somehow, almost like it's completely locked at the regular one, that's so odd
>>
>>102929844
>>102929910
Killed all my enthusiasm being locked to 1mp
>bro just upscale
NO!
>>
>>102929844
I think img2img is fucked just like SD3 was, when I try to do a hires upscale, I get weird artifacts
>>
>>102929987
glad i wasn't the only one. thought i might've just been retarded.
>>
>>
>>
>>102929682
I don't get why it's asking for so much vram, it's "only" a 10b model, twice as big as CogVideoX, and Cog can run on a 3090
>>
>>
>>102929835
that's a cool image
>>
Fresh

>>102930087
>>102930087
>>102930087
>>
>>102929239
the naughty bits are wobbly, not a good sign bro
>>
>>102929790
lmao no way
>>
>>102929910
did they train it on only one aspect ratio images? if so that's fuckin retarded
>>
can i run anything on a 6700xt yet?


