File: the longest dick general.jpg (2.31 MB, 3264x1895)
Discussion of free and open source text-to-image models

Undistilled Edition

Previous /ldg/ bread : >>102646216

>Beginner UI
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io
Metastable: https://metastable.studio

>Advanced UI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out

>Model Ranking

>Models, LoRAs & training


>Pixart Sigma & Hunyuan DIT
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality

File: 00132-3280264361.png (1.27 MB, 704x1664)
File: 00133-277872321.png (1.61 MB, 704x1664)
File: 00136-1418937834.png (2.46 MB, 1216x1600)
File: 00135-502578135.png (1.55 MB, 704x1664)
File: 00142-1547560536.png (1.57 MB, 896x1344)
File: 00144-3649137082.png (1.74 MB, 896x1344)
File: 00152-3335402968.png (953 KB, 576x1024)
I guess this is a technical question - how is it known that it is now undistilled? what if it's only 80% undistilled? how do you verify this
File: 00161-3036414018.png (1020 KB, 576x1024)
File: 00164-2384723655.png (789 KB, 576x1024)
File: 00166-2721046942.png (834 KB, 576x1024)
File: file.jpg (1.73 MB, 9111x1796)
>how is it known that it is now undistilled?
it doesn't burn at high CFG, that's how you know it's undistilled
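For anyone unsure what "high CFG" does mechanically, here is a minimal sketch of the usual classifier-free guidance step. The numbers are made up for illustration and this is not Flux's actual sampling code; `cfg_combine` is a hypothetical helper name. The point is only that the guided prediction is pushed away from the unconditional one in proportion to the scale, which is why a model that can't tolerate it produces "burned" images as the scale goes up.

```python
def cfg_combine(uncond, cond, scale):
    """Classifier-free guidance: uncond + scale * (cond - uncond), per element."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Toy predictions (illustrative values only, not real model output).
uncond = [0.10, -0.20, 0.05]
cond = [0.30, -0.10, 0.00]

for scale in (1.0, 3.5, 10.0):
    guided = cfg_combine(uncond, cond, scale)
    # The largest magnitude grows with the scale; extreme values are what "burn".
    print(scale, max(abs(g) for g in guided))
```

At scale 1 the result is just the conditional prediction; anything above that extrapolates past it.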
File: 00172-2695188538.png (873 KB, 576x1024)
File: 00183-1104493318.png (884 KB, 576x1024)
File: 00182-2049000285.png (1.01 MB, 576x1024)
File: 00188-4198240280.png (933 KB, 576x1024)
File: 00192-3985904370.png (949 KB, 576x1024)
File: 00193-1537400177.png (915 KB, 576x1024)
File: 00195-3419126569.png (737 KB, 576x1024)
File: 00197-1455805134.png (889 KB, 576x1024)
File: 00194-2615945741.png (952 KB, 576x1024)
File: 00204-291631721.png (846 KB, 576x1024)
File: 00201-3790433226.png (827 KB, 576x1024)
File: file.png (3.57 MB, 896x1536)
remember the previous thread when it was said that we couldn't finetune Flux? Well... the guy that previously made Realistic Vision took it personally lol
File: 00203-1467414437.png (852 KB, 576x1024)
I don't see any information on the training he did other than Civitai's "Checkpoint Trained" tag...
File: file.jpg (2.22 MB, 9999x1186)
bruh it converges at 60 steps? oh man :(
File: file.png (310 KB, 1843x1536)
>I don't see any information on the training he did other than Civitai's "Checkpoint Trained" tag...
I think he's making a real finetune because he's seeing some collapse and is talking about the undistilled flux model
>we will hope for the release of the non-distilled version of the model
what? un-distilled flux dev was released 20 days ago
File: file.png (83 KB, 1668x306)
>I think he's making a real finetune
he is
File: 00225-4124722864.png (1.26 MB, 768x1344)
File: 00232-2651050876.png (1.12 MB, 768x1344)
File: 00234-2343208380.png (1.49 MB, 768x1344)
Is there a discussion anywhere from the pony guy on how many H100 GPU-hours he would need to do it for either of these undistilled ones?
Or discussion of his GPU-hours at all for any model so I can extrapolate.
>Is there a discussion anywhere from the pony guy on how many H100 GPU-hours he would need to do it for either of these undistilled ones?
to be honest I don't really care, Schnell's quality is so bad compared to dev that even if he made an insane finetune out of it, the best case scenario would be reaching dev's quality. I think a guy who makes a "normal" finetune on dev, and does it because he doesn't care about money, will end up with a better finetune than pony's insane one on schnell
How many images did he use for his dataset?
Would you like to play a game?
Which is which, and which is SDXL?
File: file.png (94 KB, 690x948)
Ugh... he's giving up on flux because it's "distilled", I think he doesn't know un-distilled versions of flux already exist. I'm kinda curious what this "Venus Vision" is though
you already did this a few threads ago, I don't want to sound rude, but you're the only human on earth that hasn't noticed the quality difference between schnell and dev
That was SDXL vs schnell.
I can see it I’m just coping hard.
>I can see it I’m just coping hard.
fair enough :v
>Distilled Model
The only logical explanation was to tempt anon into purchasing a sub for pro, right?
The model isn't even trained yet, lol. It could be a vaporware scam.
File: file.png (220 KB, 631x513)
It was more about releasing a model good enough to get the hype and publicity, and then destroying any competition by making their "open" models impossible to finetune, such an evil genius move if you ask me
Bigma status?
File: file.png (47 KB, 1074x224)
>The model isn't even trained yet, lol. It could be a vaporware scam.
looks like Venus Vision is a continuation of the Flux undistilled model, they intend on improving over it, I like that path, those guys are definitely gonna save flux
It’s to poison the well. Remember these are people that fled Stability with everything learned there (if not outright their codebase). The reason Stability is fucked financially is because their free models are almost as good as the paid one, and have been tuned to be way better. BFL made dev specifically to be at the same level as Midjourney in benchmarks with a cuck license, and schnell just a hair above SDXL to fuck up open source, and clipped both of their wings for tuning so that they couldn’t be tuned to be as good as the next model up.
i havent seen a flux pro image that doesnt look like hyperslop tho
desu you cant desloppify it through prompt alone
>i havent seen a flux pro image that doesnt look like hyperslop tho
true, pro isn't that much better than dev, at least it should've gotten some sovl like Midjourney Niji, but I think they don't care that much, they got a partnership with Twitter, they are in a good place right now
File: file.png (1.34 MB, 1024x1024)
can't believe this took us 20 days to realize a trainable flux model existed, goddam...
is the untrained model good or bad
Can either of the undistilled Dev or Schnell models be used in ComfyUI?
Has anyone even figured out if it being distilled was the only issue preventing training? From my experience with loras I think there's more fuckery up BFL's sleeve. This could sadly still be a nothing burger
File: 57069.jpg (346 KB, 2560x2560)
It's apparently done, but they are writing and releasing a paper about it first
none of the examples I've seen other people post look like they couldn't have been done with base Dev
Is there a gguf version of this?

What's the difference to this?
File: 1699093161939375.png (1.99 MB, 1024x1024)
Is there a good AI lyrics editing software yet?

Because there is so much good music with stupid fucking lyrics. I would make this shit into a fucking global no. 1.
If I wanted to make pixel art for a game what would be the best local solution to do that?
i'm usually running the normal flux fp8 model which has baked in vae and clip, and the model runs at 1.5it/s. but whenever i try the lightweight versions like gguf, it becomes slow as shit. are there some settings i can use to get similar speeds or is this just how it works? there seems to be no reason to run the models with external clip and vae due to this.
Is it possible to use flux with only 8gb of vram? I assume no but wanted to check.
>Is it possible to use flux with only 8gb of vram?
yeah, you can run flux quanted
pick one with a file size that's smaller than your max vram, leaving some extra room for stuff like image resolution and loras. i think you should be able to use the Q4_0 quant with 8gb
Happened to be reading other stuff and saw this.
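The rule of thumb above can be sketched as a small helper. Everything here is an assumption for illustration: `pick_quant` is a hypothetical function, the 1 GB headroom is a guess, and the sizes in `EXAMPLE_QUANTS_GB` are placeholders rather than measured file sizes, so read the real numbers off the download page before trusting the answer.

```python
def pick_quant(quants_gb, vram_gb, headroom_gb=1.0):
    """Return the largest quant whose file fits in VRAM minus headroom, or None."""
    budget = vram_gb - headroom_gb
    fitting = {name: size for name, size in quants_gb.items() if size <= budget}
    if not fitting:
        return None
    # Bigger file generally means less aggressive quantization, so prefer it.
    return max(fitting, key=fitting.get)


# Placeholder sizes in GB, NOT real measurements.
EXAMPLE_QUANTS_GB = {"Q8_0": 12.0, "Q5_0": 8.0, "Q4_0": 6.5, "Q2_K": 4.0}

print(pick_quant(EXAMPLE_QUANTS_GB, vram_gb=8))
```

Raise `headroom_gb` if you gen at high resolution or stack loras, since those eat VRAM on top of the model itself.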
>GGUF is a pure compression tech, which means it is smaller but also slower because it has extra steps to decompress tensors and computation is still pytorch
Does a1111 support this format or do I need to use comfy?
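To make the "extra steps to decompress tensors" part of that quote concrete, here is a simplified 4-bit block quantization sketch in the spirit of Q4_0. It is an assumption-laden toy: the real GGUF formats pack bytes differently and pick the scale differently, and `quantize_block` / `dequantize_block` are hypothetical names. What it shows is that the stored values are small integers plus one scale per block, and they have to be expanded back to floats before the usual math can run.

```python
BLOCK = 32  # weights per block in this toy scheme


def quantize_block(weights):
    """Store one float scale plus a 4-bit integer (0..15) per weight."""
    amax = max(abs(w) for w in weights)
    scale = amax / 7 if amax else 1.0
    q = [max(0, min(15, round(w / scale) + 8)) for w in weights]
    return scale, q


def dequantize_block(scale, q):
    """The extra step paid at inference time: integers back to floats."""
    return [scale * (v - 8) for v in q]


weights = [i / (BLOCK - 1) - 0.5 for i in range(BLOCK)]  # toy data
scale, q = quantize_block(weights)
restored = dequantize_block(scale, q)
worst = max(abs(a - b) for a, b in zip(weights, restored))
print(f"scale={scale:.4f} worst_error={worst:.4f}")
```

The file gets smaller because each weight takes 4 bits instead of 16, but the dequantize step is extra work the fp8 checkpoint doesn't need, which would fit the slowdown reported above.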
>Does a1111 support
not sure but forge does
see it as an opportunity to start using comfy instead.
It's too annoying. Maybe once you get it set up with a pipeline it'll be "better" but I just don't care to do any of the fine tuning that comfy allows. Being able to select a model and type prompts immediately is good enough. I'm open to trying again because I still have it installed but there's just no usecase for me right now.
Are you on a 40XX card? I believe they have some speed hack that makes fp8 run much faster.
yeah i am. the compression also makes a lot of sense.
File: bComfyUI_124038_.jpg (782 KB, 1440x1024)
it wasn't that bad but thanks for giving me something to gen for later
File: bComfyUI_124112_.jpg (699 KB, 1440x1080)
can you go fatter
File: 0.jpg (263 KB, 1024x1024)
File: 0.jpg (254 KB, 1024x1024)
File: 0.jpg (26 KB, 169x542)
If you for some reason can't or don't want to use Photoshop's content-aware fill, here's a nice tool https://github.com/Sanster/IOPaint
File: 0.jpg (243 KB, 1024x1024)
yeah, messed up. posted wrong image.
File: bComfyUI_123627_.jpg (642 KB, 1280x1024)
I like this style
mystery , Halloween , by Simon Stalenhag by Frazetta, surreal, by Andre Kohn,

