/g/ - Technology






File: tmp.jpg (912 KB, 3264x3264)
912 KB
912 KB JPG
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>101900433

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>
If someone knowledgeable can pitch in on this I'd appreciate it
>>101902601
>>
File: ldg6.jpg (368 KB, 1999x1999)
368 KB
368 KB JPG
Fuck good thing I always refresh before baking, almost double baked
>>
File: 1723019311152745.png (1.53 MB, 1216x1536)
1.53 MB
1.53 MB PNG
>>
File: ifx25.png (1.16 MB, 1024x1024)
1.16 MB
1.16 MB PNG
>>
File: ComfyUI_00012_.png (1.25 MB, 1216x832)
1.25 MB
1.25 MB PNG
>>
Blessed thread of frenship
>>
File: Flux 31.png (1.47 MB, 768x1344)
1.47 MB
1.47 MB PNG
>>
File: itsover.jpg (26 KB, 510x510)
26 KB
26 KB JPG
>fuck amd
>google and amazon make dedicated chips
>only for their cloud
>arm makes only inference chip
>>
>>101902653
cursed shithole for nonces
>>
File: choices.jpg (173 KB, 1024x1024)
173 KB
173 KB JPG
>>
>>101902676
Go away non-fren
>>
Noob question. Is the main (practical) difference between flux and flux-dev that the second one is less censored?
>>
File: ifx27.png (1.28 MB, 1024x1024)
1.28 MB
1.28 MB PNG
>>
File: ComfyUI_00014_.png (1.06 MB, 1216x832)
1.06 MB
1.06 MB PNG
>>101902663
kek
>>
File: USSA.jpg (360 KB, 1024x1024)
360 KB
360 KB JPG
>>101902700
You mean Flux Pro and Flux Dev?
Flux Dev is a distilled model of Flux Pro. It is weaker and produces worse gens than Flux Pro. Pro is SaaS only, Dev is a local model.
There is also Flux Schnell which is a "fast" model which gens in 4 steps at worse quality.
The censorship is the same on each, and meaningless because people have already made nude LoRAs.
Beyond nudity and porn, violence is also censored. Other than that it's pretty open, and there is very little you can't gen.
>>
>>101902700
Flux Pro is only available through BFL's API or replicate and fal.ai
these all have some form of restriction but I haven't seen anyone post a Flux Pro gen that isn't possible on FluxDev
>>
File: ComfyUI_00015_.png (1.09 MB, 1216x832)
1.09 MB
1.09 MB PNG
>>
>>101902739
>>101902741
Thank you frens. Are LoRA's possible on GGUF quants? I think that's the only thing I will be able to use in the foreseeable future if at all.
>>
>>101902741
All of my gens tonight have been from fluxpro.art because I need my gpu for other tasks. I have run into 0 model restrictions that don't also apply to dev.
>>101902766
>Are LoRA's possible on GGUF
Yes, if they are possible on nf4 they are possible on gguf, all it takes is for someone to do it.
>>
>>101902778
what do you find is the biggest difference between Pro and Dev?
>>
GGUF anon, someone is asking you to quant Kolors as well kek
https://www.reddit.com/r/StableDiffusion/comments/1eso216/comment/li7qhen/?utm_source=share&utm_medium=web2x&context=3
>>
File: fp8vsfp16vsPro.jpg (853 KB, 2971x1092)
853 KB
853 KB JPG
>>101902788
Quality
>>
https://imgsli.com/Mjg3ODgy/1/3
Here's a comparison between fp8/fp16/Q8_0/nf4 on a house
>>
>>101902824
GGUF Q8 is even closer to FP16 than any of them. Very impressive. Only 12GB right?
>>
>>101902824
I never expected fp8 to be so different to fp16... damn... glad that Q8_0 exists now lol
>>
>>101902840
>Very impressive. Only 12GB right?
yep, just a little bit bigger than fp8 (8bpw) because Q8_0 is 8.5bpw
>>
>>101902848
How does the gen speed compare?
>>
>>101902855
You have all the information there:
https://reddit.com/r/StableDiffusion/comments/1eso216/comparison_all_quants_we_have_so_far/
>>
File: reddit.jpg (150 KB, 1024x1024)
150 KB
150 KB JPG
>>101902866
OK but what about without cfgmaxxing?
>>
>>101902896
without the cfgmaxxing it's twice as fast
>>
>>101902900
So essentially the same as fp8, which, given the quality improvement, is insane.
>>
>>101902906
as a /lmg/ fag it's not surprising, fp8 is just a dumb quant method putting all the weights at 8bit, there are way more sophisticated methods than that and GGUF is one of them, but yeah, Q8_0 is really really close to fp16 now, that's cool, I'm using this shit now
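To make the difference concrete, here's a toy numpy sketch of the idea behind Q8_0-style block quantization (not the actual llama.cpp/ComfyUI-GGUF code): instead of casting everything to fp8, each block of 32 weights gets its own scale, which is also where the 8.5bpw figure above comes from.

```python
import numpy as np

def quantize_q8_0(weights, block_size=32):
    # one fp16 scale per block of 32 int8 weights:
    # (32*8 + 16) / 32 = 8.5 bits per weight, matching the number mentioned above
    blocks = weights.reshape(-1, block_size).astype(np.float32)
    scales = np.abs(blocks).max(axis=1, keepdims=True) / 127.0
    scales[scales == 0] = 1.0                      # avoid division by zero on all-zero blocks
    q = np.clip(np.round(blocks / scales), -127, 127).astype(np.int8)
    return q, scales.astype(np.float16)

def dequantize_q8_0(q, scales):
    return (q.astype(np.float32) * scales.astype(np.float32)).reshape(-1)

w = np.random.randn(1024).astype(np.float32)
q, s = quantize_q8_0(w)
print("max abs error:", np.abs(dequantize_q8_0(q, s) - w).max())
```

Because the scale adapts to each block, both large and small weights keep more precision than a blanket fp8 cast.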
>>
>>101902916
I really should hang out in lmg more but it's hard to think of a use case outside of sexbots which I don't care about.
>>
What was the idea behind having Miku with dreads as the standard for comparisons since flux came out?
>>
File: ComfyUI_00017_.png (1.12 MB, 1216x832)
1.12 MB
1.12 MB PNG
>>
>try loading flux q8 gguf in koboldcpp
>Crashes

Wtf?
>>
>>101902944
nta but because it's a gen that you cannot in flux do without using dynamic thresholding and altering the cfg. Miku is way too powerful in the model and always takes over a gen.
>>
>>101902944
It's because I found that at cfg = 1, it was impossible to me to add dreadlocks and black skin to miku, so it basically became for me a good way to measure prompt understanding, and one way to make it work is by foing some CFGmaxxing
https://reddit.com/r/StableDiffusion/comments/1ekgiw6/heres_a_hack_to_make_flux_better_at_prompt/
>>
>>101902961
>>101902969
fix your internet nigger
>>
>>101902975
I made a typo, sowwy :3
>>
File: file.webm (2.98 MB, 450x800)
2.98 MB
2.98 MB WEBM
Bros.... we are so back

The future has never looked this bright
>>
>>101902981
anon, don't be a grammar nazi on yourself
>>
File: a minor typo.jpg (151 KB, 1024x1024)
151 KB
151 KB JPG
>>101902981
This isn't reddit nobody cares.
>>
File: 2024-08-15_00174_.png (1.54 MB, 1024x1024)
1.54 MB
1.54 MB PNG
it's silly how well FLUX combines women and tentacles
>>
>>101903040
I'm more interested in how you made it gen a camel toe
>>
https://github.com/city96/ComfyUI-GGUF
>LoRA / Controlnet / etc are currently not supported due to the weights being quantized.
>currently
I hope that means it's possible in the end
>>
>>101903054
Mate, 2 weeks ago it was impossible to train Flux. Look at us now.
Give it 2 more weeks.
>>
>>101903066
true true, I shall never doubt the power of flux after all the good shit that happened in such a short time
>>
i predict that next month we'll get a new model that destroys flux and puts local image gen at the very top of the food chain
>>
>>101902739
How about non-photoreal stuff like "3d" illustration style fantasy art, paintings, anime, etc
>>101903040
I've had excellent gens of this genre from an sd1.5 model
>>
>>101903054
I mean, what happens if you train on a quantized model?
>>
File: 2024-08-15_00180_.png (1.62 MB, 1024x1024)
1.62 MB
1.62 MB PNG
>>101903104
yeah me too, but I wonder what will happen with this if it gets mangled to make such topics

also, some bad advice: don't ask flux to have a dog sitting on her, might shock you
>>
>>101903124
you don't need to train a lora on a quantized model to make it work, in the LLM ecosystem they can use a full precision lora on top of a GGUF quant
>>
File: Blank 4 Grids Collage.jpg (441 KB, 1999x1999)
441 KB
441 KB JPG
>>101903104
yes
>>
>>101902778
>>Are LoRA's possible on GGUF
>Yes, if they are possible on nf4 they are possible on gguf, all it takes is for someone to do it.
nf4 for flux is already done by bitsandbytes
>>
>>101903145
are these flux pro gens?
>>
would it be possible to get a Q8 of the T5?
>>
>>101903221
it's way easier, t5 is a text encoder, the Q quants were made for text models
>>
>>101903227
that's nice to know, because the fp8 t5 is garbage
>>
File: FD_00036_.png (1.06 MB, 1024x1024)
1.06 MB
1.06 MB PNG
>>101903191
Yes, dev can do these too though.
picrel is dev, and 99% of the gens you see here are dev.
>>
>>101902957
>>try loading flux q8 gguf in koboldcpp
>>Crashes
>Wtf?
I need this too. Or something that allows me to share my two 12GB 3060s for loading the model instead of using the other for t5, clip and vae.
>>
>>101903234
i guessed these were flux pro because they are way more "lively" than your usual flux dev gen >>101902812
is flux on grok the pro version?
>>
File: FluxDev_01226_.jpg (320 KB, 1024x1024)
320 KB
320 KB JPG
>>
File: 2024-08-08_00558_.png (1.85 MB, 1024x1280)
1.85 MB
1.85 MB PNG
>>101903234
reminds me of her
>>
>>101903247
They were basic gens, and very doable on dev. Literally nothing special about them. The prompt was simply
>"style" fantasy art
>>
File: 2024-08-08_00010_.png (1.3 MB, 720x1280)
1.3 MB
1.3 MB PNG
>>101903247
you can nearly anything with flux (but genitalia) do your work and prompt correctly
>>
>>101903265
>>101903298
Definitely, they are all doable, I never questioned that. Just the composition and lighting are way weaker, and most gens look "muted" by comparison
>>
How mad is SAI right now that their dream of landing a contract with THE social networking site was just taken from them by BFL?
>>
>>101903364
your prompt is not generating this consistently
>>
>>101903332
Considering they weren't even in the running, probably blindsided. Midjourney was close to being the model used but allegedly they backed out last second.
>>
File: Capture.png (3.19 MB, 2772x3061)
3.19 MB
3.19 MB PNG
Guys, what if I told you I managed to remove the blur on photos with the ToneMap node?
https://imgsli.com/Mjg3OTE5
https://github.com/comfyanonymous/ComfyUI_experiments
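For anyone wondering what the node actually does: as far as I can tell it applies a Reinhard tonemap to the magnitude of the (cond - uncond) delta so high CFG stops blowing the image out. Rough paraphrase from memory, not the exact ComfyUI_experiments code, so treat the details as assumptions:

```python
import torch

def tonemap_reinhard_cfg(cond, uncond, cond_scale, multiplier=0.2):
    # Rough paraphrase of a Reinhard-tonemapped CFG function; normalization details
    # may differ from the real ComfyUI_experiments node.
    delta = cond - uncond
    mag = torch.linalg.vector_norm(delta, dim=1, keepdim=True) + 1e-10  # per-pixel magnitude over channels
    direction = delta / mag                                             # keep only the direction of the CFG push
    top = (mag.std() * 3 + mag.mean()) * multiplier                     # soft ceiling, the node's "multiplier" knob
    mag = mag / top
    mag = mag / (mag + 1.0) * top                                       # Reinhard curve: big pushes get compressed
    return uncond + direction * mag * cond_scale
```

So CFG 6 still steers hard, but the extreme values that normally cause the deep-fried look get squashed.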
>>
>>101903054
>https://github.com/city96/ComfyUI-GGUF
Does GGUF give better performance on CPU/APU?
I'm wondering if it'd support Vulkan?
>>
>>101903364
You could have just posted a catbox so I didn't have to see this
>>
>>101903385
I think SAI should suck a fat bag of dicks for almost getting away with intentionally killing image diffusion for reasons I still cannot comprehend.
>>
File: file.png (1.41 MB, 1024x1024)
1.41 MB
1.41 MB PNG
Is 1.5 dead? I feel left out with my shitty GPU.
>>
>>101903409
Yes, but how shitty is your GPU?
>>
File: 1723705027597982.png (1.51 MB, 2364x1706)
1.51 MB
1.51 MB PNG
Why does flux generation on this anon's macbook look so good compared to the flux generation on huggingface?
>>
>>101903418
AMD's RX5700XT
>>
File: 1723707746316748.png (845 KB, 768x768)
845 KB
845 KB PNG
>>101903420
>>
>>101903426
>>
File: FD_00092_.png (59 KB, 128x256)
59 KB
59 KB PNG
>>101903409
Flux can gen ok images at low res.
>>
File: 2024-08-15_00040_.png (1.97 MB, 1024x1024)
1.97 MB
1.97 MB PNG
>>101903409
no, 1.5 is still very valid, it can create more with less prompting
>>101903426
that's kinda sad

but the idea is .. FLUX is so mighty you have to tell it everything you want to see, while SD15 or SDXL just make the imagination spin off a few simple words
>>
File: Untitled.jpg (192 KB, 1000x1127)
192 KB
192 KB JPG
>>101903433
vs https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
>>
File: image.png (783 KB, 1024x1024)
783 KB
783 KB PNG
>>101903444
>>
Why the fuck is image generation at Q8 gguf at 1344x768 resolution taking up 23gb of vram?
>>
File: file.png (1.09 MB, 935x927)
1.09 MB
1.09 MB PNG
>>101903442
Thanks for the cope. I will defer killing myself for a few months and see if rocm 6.2 lets me run flux on this card.
>>
>>101903472
Because you are unzipping the file
>>
>>101903442
>but the idea is .. FLUX is so mighty you have to tell it everything you want to see
Dataset and tagging, dalle works just fine with simple prompts and is more creative. Nothing that a good 2M+ images finetune can't fix. I wonder which model is going to be the next lightning in a bottle.
>>
hmmm, i am trying the new flux gguf quants.
is it supposed to be this slow?
also feels like the q4 quant uses way too much memory.
>>
>>101903482
wrong thread anon
>>
>>101903496
>I wonder which model is going to be the next lightning in a bottle.
One that hasn't started being trained yet
>>
>>101903497
the workflow i use.
>>
>>101903493
That is not how ggufs work.

>>101903482
I think you're in the wrong thread, but those are the raw weights. You either want an EXL2 or a GGUF version of the quant your computer can handle.
>>
File: Comparison_all_quants.jpg (3.84 MB, 7961x2897)
3.84 MB
3.84 MB JPG
>>101903497
no it's not supposed to be this slow
>>
>>101903508
pony was what? 6 months after XL? i think we need a bigger timeframe for this model.
>>
File: 1717132289298356.png (48 KB, 424x345)
48 KB
48 KB PNG
i'm looking for something like the Power Lora Loader but for text prompts instead. i want to be able to toggle prompts on and off and add them to positives or negatives, instead of having to write them and delete them all the time. does this exist?
>>
>>101903510
I don't know why it's slow but negative prompts don't work so delete that, and prompt weights don't work so remove those too.
>>
>>101903393
What happens to the blur if you change your scheduler to sgm_uniform and the sampler to dpmpp_2m?
>>
>>101903524
>i think we need a bigger timeframe for this model.
It took fucking ages before good LoRAs started coming out for XL too, and we have a shit load of them in 2 weeks. Flux is exponential. I expect fine tunes much sooner.
>>
File: 00415-2237752481.png (1.84 MB, 1024x1024)
1.84 MB
1.84 MB PNG
>>101903409
I am not interested in anything I can't train myself locally. With 12GB, I am stuck with 1.5, but I am happy with the results.
>>
>>101902631
the future is here
>>
Is it normal that vram usage doubles when using cfg with flux?
>>
>>101903595
yes. also leave cfg at 1 unless you are using dynamic thresholding.
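Reason it doubles, if anyone cares: with cfg > 1 every step needs a conditional and an unconditional prediction, and most samplers batch the two into one forward pass, so activation memory roughly doubles. Minimal sketch of the usual pattern (placeholder model, not Comfy's actual code):

```python
import torch

def cfg_denoise(model, x, sigma, cond, uncond, cfg_scale):
    if cfg_scale == 1.0:
        return model(x, sigma, cond)             # single pass, normal VRAM use
    # cond and uncond batched together: activations (and VRAM) roughly double
    eps_cond, eps_uncond = model(
        torch.cat([x, x]),
        torch.cat([sigma, sigma]),
        torch.cat([cond, uncond]),
    ).chunk(2)
    return eps_uncond + cfg_scale * (eps_cond - eps_uncond)
```

An implementation can also run the two passes sequentially to save memory at the cost of speed.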
>>
>>101903595
like you jump from 8gb to 16? no it's not normal at all
>>
>>101903510
You can bring your image width and height down too. Then if you see something you like you can save the seed and rerender at a higher resolution
>>
File: Capture.jpg (309 KB, 2746x1414)
309 KB
309 KB JPG
>>101903541
yep, still working anon
>>
what sampler for flux
>>
>>101903618
No, I mean if you disable the tonemap
>>
What are the chances that this grok shit gets flux nerfed before it even gets started? Elon is using twitter to bring flux to normies, they are already genning taytor swift lewds and lawsuits are likely coming.
>>
>>101903496
>dalle works just fine with simple prompts and is more creative
your prompts get rewritten by GPT before being fed to DALL-E 3
>>
>>101903629
I can't disable tonemap, cfg = 6 would burn the picture
>>
File: 2024-08-12_00073_.png (2.12 MB, 1024x1280)
2.12 MB
2.12 MB PNG
>>101903619
euler+beta
heunpp+beta if you have the time
ddim+ddim_scheduler if you are going high CFG
>>
Anyone got amd to work with zluda and comfyui and flux on windows?
>>
>>101903634
on bing, not at all, on gpt+ you can ask gpt to not rewrite it, though you need a soft jailbreak for it to stick.
>>
>gguf
>still have to load CLIP separately
what is the fucking point
>>
>>101903651
>on bing, not at all
unless that's a new development then yes, especially on Bing, your prompt gets rewritten
>>
File: image (1).png (923 KB, 1024x1024)
923 KB
923 KB PNG
>>101903465
(American political stadium republican rally background), [(Taylor Swift with goth makeup), polaroid photo with VSCO filter, 1990, at night, flash photo, selfie, detailed, beautiful shadows, DSLR]
>>
>>101903665
>he doesn't know
unfortunate
>>
>>101903663
that would be retarded to not separate them, it's always the same clip, I don't want to download the same clip every time I wanna try a new flux model, I'll just download the unet model
>>
File: file.png (91 KB, 281x238)
91 KB
91 KB PNG
I understand this kind of thing happens when a model lacks training data on a particular concept, or on low resolutions (it has to spread attention over a lot of things in the image and does all of them badly, even if it could generate a single image focusing on each and do well).
What strategies do you use to correct this? For simple cases, upscaling (hires fix) or genning at a higher resolution usually does the trick. Other than that I can only think of inpainting each portion of the image to fix it individually.
I'm a comfyui noob, so maybe there are comfy-specific strategies, too.
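If you ever want the hires-fix idea outside a UI, this is roughly the two-pass recipe in diffusers (the model id is just an example, any SD 1.5 checkpoint works): gen small, upscale, then img2img with a modest denoise so small details get redrawn.

```python
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16   # example checkpoint, swap in your own
).to("cuda")
prompt = "portrait of a knight, detailed face"

low = pipe(prompt, width=512, height=512).images[0]               # first pass at native res

img2img = StableDiffusionImg2ImgPipeline(**pipe.components)       # reuse the already-loaded weights
hi = img2img(prompt, image=low.resize((1024, 1024)), strength=0.4).images[0]
hi.save("hires_fix.png")
```

Swapping the plain resize for an ESRGAN-style upscaler before the second pass usually keeps even more detail.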
>>
>>101903663
why is that an issue? you don't want CLIP and T5 tied to the diffusion model, it would be a waste of space and bandwidth
>>
>>101903643
hmm in my tests on xl sgm uniform was a standout for scheduler but i will try that
>>
>>101903679
Upscale to schizo res, face detailer, then downscale
>>
File: file.png (491 KB, 512x512)
491 KB
491 KB PNG
>>101903671
Nice prompt.
>>
>>101903671
what's with the brackets
>>
>>101903603
Yeah cause when I use the q8 gguf with cfg, it's using like ALL my vram. It's normal with no cfg, around 13gb
>>
>>101903758
that's weird, I don't have this issue at all... maybe you should make a detailed issue on his Github so that he fixes it
https://github.com/city96/ComfyUI-GGUF
>>
File: 2024-08-15_00190_.png (2.57 MB, 1280x1024)
2.57 MB
2.57 MB PNG
>CFG is a must for expression
>>101903758
>>
File: file.png (544 KB, 512x512)
544 KB
544 KB PNG
>>101903702
1.5 has a special kind of soul
>>
File: image (2).png (1.15 MB, 1024x1024)
1.15 MB
1.15 MB PNG
>>101903702
(American political stadium rally background), [(Taylor Swift) blue eyes goth makeup showing off her cute feet, masterpiece, (photorealistic:1.4), best quality, soft lighting, photograph of a beautiful [European|White] young woman with long blonde hair, RAW photo, 8k uhd, film grain, (fisheye lens:0.9), (cowboy shot:0.9), (bokeh:0.9):2]

>>101903720
It adds the character on top of the background, but comfyui has that feature already so not necessary
>>
>>101903795
Damn, this looks realistic, with some editing it would seem like something she posted here
>>
File: FluxDev_01243_.jpg (226 KB, 1024x1024)
226 KB
226 KB JPG
>>101903827
>Damn this looks realistic
nigga what
>>
File: file.png (639 KB, 512x512)
639 KB
639 KB PNG
>>101903803
pic related
>>101903827
It's epicphotogasm (sd 1.5)
>>
File: 00319-1084696360.png (1.65 MB, 1024x1024)
1.65 MB
1.65 MB PNG
>>101903803
kek, feetfag's wet dream
>>
>>101903516
>>101903614
>>101903533
thanks for the input, much appreciated.
changed the prompt and did 512*512. still slow but better
3/40 [00:27<05:41, 9.24s/it]
ah well, i'm sure the nerds will make it retard proof soon for people like me.
once it's done it does output a good image though.

wish i was younger with more time on my hands. had to beg artfags for rpgmaker assets. and i remember the 64*64 horror images we had a couple years ago. really cool stuff.
>>
>>101903774
Yeah, I'm gonna post my workflow here and see if anyone wants to see if they get the same thing happening. Maybe I'm not seeing something in the workflow that's causing it to eat twice the vram it should be.

https://files.catbox.moe/ws73hx.png

If it's not just me I'll make a report to the github
>>
>>101903868
Make an issue on his github, maybe he'll fix it:
https://github.com/city96/ComfyUI-GGUF
>>
File: 2024-08-15_00191_.png (2.54 MB, 1280x1024)
2.54 MB
2.54 MB PNG
Can we now close the debate that FLUX can't do Picasso Miku?
>>
>>101903889
only if you catbox it
>>
>>101903889
>No messi
>No obscure sport from final fantasy that that retard claimed was a common piece of well known trivia that all models should know

I'm still not convinced.
>>
File: tay_fdev.png (3.03 MB, 1494x1486)
3.03 MB
3.03 MB PNG
>>101903671
>>101903795
>>101903842
Nice, this is what I get
>>
File: tay_fdev2.png (2.93 MB, 1498x1500)
2.93 MB
2.93 MB PNG
>>101903924
and this
>>
>>101903932
there's a TS lora for Flux Dev
>>
>>101903889
We closed it a week ago when we were posting picasso mikus
>>
File: 2024-08-15_00192_.png (2.64 MB, 1280x1024)
2.64 MB
2.64 MB PNG
>>101903889
okay
https://files.catbox.moe/xay1ip.png

>>101903918
messi? explain
>>
>>101903393
I guess dev only?
>>
File: file.png (2.39 MB, 1024x1024)
2.39 MB
2.39 MB PNG
>>101903924
I was feeding the same prompt to positive and negative conditioning... I'm so fucking stupid.
I think I'm going to start saving for a new card. Iterating on this piece of crap is a pain in the ass. Not to mention I'm stuck on 1.5 forever.
>>
>>101903965
you can try it out on Schnell, the most important parts are the ToneMap node (multiplier 0.2) + CFG 6 + GuidanceNeg 10
>>
>>101903963
>messi? explain
Some guy was claiming the model was a failure because it couldn't do Messi and some very specific from a final fantasy game. When people called him a retard he got very defensive.
>>
>>101903968
That doesn't look too bad for 1.5. Background starts getting messy, but foreground seems ok.
>>
>>101903968
The other ones looked better
>>101903932
>>101903924

>>101903842
This looks most realistic

>>101903859
What settings are you using exactly?
>>
flux finetunes already coming in, nice
>>
File: file.png (2.38 MB, 1024x1024)
2.38 MB
2.38 MB PNG
>>101903995
The original 512x512 looks better (and worse, at the same time). Bilinear scaling seems to help with preventing the denoise from the second pass from messing up the details it got right in the first one.
>>101904007
>The other ones looked better
>What settings are you using exactly?
comfy defaults but I accidentally used the same prompt for positive and negative. Model is epicphotogasm last unicorn.
>>
>>101903409
1.5 forever, 1.5 will never die
>>
>>101904020
Sampling method and steps?
>>
>>101904009

But is it gguf or is it nf4, is it dev or is it schnell, do its loras work with comfy or does it need to be recombobulated?
>>
File: aopuewfh.png (466 KB, 704x512)
466 KB
466 KB PNG
Does comfy have a consistent way of getting around the fucked up faces when the face takes up a smaller area of the picture, that isn't just img2img?
>>
File: 2024-08-15_00195_.png (2.27 MB, 1024x1280)
2.27 MB
2.27 MB PNG
>>101903994
just make a lora out of messi

>>101904020
>>101903968
>>101903932
>>101903924
>>
>>101904033
euler with normal scheduler, 8 cfg, 20 steps
>>
>>101903510
to make sure it's not leaking to ram, try offloading clip to the cpu device
>>
File: file.png (553 KB, 512x512)
553 KB
553 KB PNG
>>101904042
>(cubist:1.15) portrait, taylor swift
>>
>>101904034
dont care i have 24GB lol
>>
>>101904042
>just make a lora out of messi
Literally everyone was telling him that but he kept doubling down.
>>
>>101904045
This is what I get
>>
File: ComfyUI_00682_.png (1002 KB, 1024x1024)
1002 KB
1002 KB PNG
Thought Flux would be another SD3 so I ignored it since it's been out.
That was obviously a mistake.
Now i'm at the beginning of knowing, again, reee.
>>
File: file.png (618 KB, 512x512)
618 KB
618 KB PNG
>>101904079
Mmh... Want to post full workflow so we can test properly?
I'm generating at 512x512 btw
>>
File: 2024-08-15_00005_.png (2.61 MB, 1280x1024)
2.61 MB
2.61 MB PNG
>>101903963
Sonic is too powerful. I am positive Sonic has the strongest style in the model.
>>
File: file.png (535 KB, 512x512)
535 KB
535 KB PNG
>>101904079
>(American political stadium republican rally background), [(Taylor Swift with goth makeup), polaroid photo with VSCO filter, 1990, at night, flash photo, selfie, detailed, beautiful shadows, DSLR]
euler/normal
8 cfg 20 steps
>>
>>101904099
I'm using the same settings, 512x512, 20 steps, euler, epicphoto. But on sd.next not comfyui
>>
>>101903981
Do I have to leave the negative prompt empty or put bokeh, blurry, etc?
>>
File: 2024-08-15_00197_.png (2.21 MB, 1024x1280)
2.21 MB
2.21 MB PNG
>>101904106
I will try it
>>
File: file.png (1.15 MB, 1024x512)
1.15 MB
1.15 MB PNG
>>101904123
Strange
>>
>>101904132
look at the picture again, you'll get all the answers
>>101903393
>>
File: 2024-08-15_00006_.png (2.65 MB, 1280x1024)
2.65 MB
2.65 MB PNG
Queen Elizabeth looks like a Monty Python skit.
Also I need to train a Monty Python LoRA now.
>>
File: 2024-08-15_00200_.png (2.25 MB, 1024x1280)
2.25 MB
2.25 MB PNG
>>101904106
I guess you are wrong
>>
>>101904170
I only did one gen. What did you change about it?
>>
File: file.png (1.84 MB, 768x1152)
1.84 MB
1.84 MB PNG
I love city
>>
>>101904170
baker you better put this in the collage holy shit
>>
>>101904150
Are you using comfyui? On linux or windows?
>>
File: 2024-08-16_00002_.png (2.47 MB, 1024x1280)
2.47 MB
2.47 MB PNG
>>101904189
making it taller helped
>>
>>101904189
not much I just replaced Hatsune Miku/Taylor Swift with Sonic the Hedgehog with this exact workflow >>101903963
>>
sdui has this way of inpainting where it takes a square image that includes only the masked part (instead of the whole picture), and then patches it back into the original image when it's finished. Is there a way to do this with comfy?
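The idea itself is simple enough to sketch outside any UI (PIL only, with a placeholder inpaint_fn standing in for whatever model or node actually does the inpainting): crop a padded square around the mask, inpaint that at full working resolution, then paste it back.

```python
from PIL import Image

def crop_and_stitch_inpaint(image, mask, inpaint_fn, context=64, work_size=512):
    # mask is a mode "L" image, white where we want to repaint
    x0, y0, x1, y1 = mask.getbbox()
    x0, y0 = max(0, x0 - context), max(0, y0 - context)                 # pad with surrounding context
    x1, y1 = min(image.width, x1 + context), min(image.height, y1 + context)

    crop = image.crop((x0, y0, x1, y1)).resize((work_size, work_size))
    crop_mask = mask.crop((x0, y0, x1, y1)).resize((work_size, work_size))

    fixed = inpaint_fn(crop, crop_mask)                                 # inpaint at full working resolution

    fixed = fixed.resize((x1 - x0, y1 - y0))                            # back to the original crop size
    result = image.copy()
    result.paste(fixed, (x0, y0), mask.crop((x0, y0, x1, y1)))          # only replace the masked pixels
    return result
```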
>>
https://github.com/intel/AI-Playground

intel arc bros..
>>
>>101904266
Also, how do I tell comfy if I want to use the image colors, random noise, or the existing image underneath the mask as a starting point for the inpaint?
>>
File: ComfyUI_01441_.png (1.12 MB, 1024x1024)
1.12 MB
1.12 MB PNG
>>
File: ComfyUI_01484_.png (1019 KB, 1344x768)
1019 KB
1019 KB PNG
>>
Does any anon have a good ComfyUI workflow for Flux? I'm on a 3090 with 24GB VRAM, can I speed it up somehow? 30-45s per image gen is cringe

Running the one in here https://comfyanonymous.github.io/ComfyUI_examples/flux/
>>
>>101903100
Bro, we can barely run flux. How about we just first wait for the nvidia AI accelerator monopoly to end before we dream about something better.
>>
File: 2024-08-15_00206_.png (2.23 MB, 1024x1280)
2.23 MB
2.23 MB PNG
>>101904316
look here>>101903963
>>
>>101904316
In the default workflow set your weight_dtype to e4. I gen in 20s on a 4080
>>
>>101904266
>>101904294
Nevermind, found crop and stitch. I have to get used to checking for custom nodes that do the things I want to do.
>>101904231
Comfy on linux. But I'm no longer using the same prompt for positive and negative (obv)
>>
File: 2024-08-15_00211_.png (1.98 MB, 1280x1024)
1.98 MB
1.98 MB PNG
>>
File: file.png (2.46 MB, 1024x1024)
2.46 MB
2.46 MB PNG
>>101904343
>wait for nvidia monopoly to end
You seem to believe the free market is real
>>
>>101904315
fix the lighting on ronald and it's perfect
>>
>>101904045
Goddamn nigger! 8 cfg? Are you trying to prove a point?
>>
>>101904115
Very consistent on those dick sucking lips. Hair is way better in this one. Chest is a little weird, but pretty good!
>>
File: 00042-2627143628.png (870 KB, 960x536)
870 KB
870 KB PNG
>>101903409
1.5 is still the most aesthetic I've found, specifically the Noosphere model mixed with Azovya
>>
>>101904303
Damn. For a cartoon that shit has more kinematic and physio-biological structure than almost anything I have seen.
>>
>>101904448
How the hell are you getting cute Taylor cameltoes? What prompt magic is this?
>>
File: 2024-08-15_00217_.png (2.03 MB, 1280x1024)
2.03 MB
2.03 MB PNG
>>101904577
what do you mean? can't you prompt?
>>
>>101904561
CFG 8 is perfectly normal for 1.5 models, retard
>>
I'm using Fooocus and I've got this LORA with a slider but I don't see how to actually make the slider be in the interface, what do?
>>
>>101904619
No. I am a promptcel that can only get the minimum images rendered. I would like to subscribe to your newsletter.
>>
>>101904636
I apologize, I haven't used the model for the poors, only flux.
>>
>>101904616
>cute
look again lenny
>>
>>101904638
?
You mean the strength of the lora? sliders usually work by changing the strength
>>101904655
outing yourself as a newfag, there was SD 1.x to use for a long while
>>
>>101904655
>admitting to being a newfag riding on the bandwagon of Flux

not very good at this, are you anon
>>
>>101903409
don't feel left out, sd15 can do magnificent things.
>>101903842
this one is scary good
>>
>>101904667
>>101904668
I am a newfag. That is way came here to learn from oldfags that were making tranime out of pixels before generative models even existed. Teach me sensei!
>>
File: ymuov9nlwlga1.jpg (113 KB, 709x896)
113 KB
113 KB JPG
>>101904678
my sides!

you SIR have TRULY won the internet on this blessed day!!
>>
File: 2024-08-15_00222_.png (2.04 MB, 1280x1024)
2.04 MB
2.04 MB PNG
>>101904644
just type what you want.. write it out like an essay and you get what you want with flux (anything but porn)
>>
File: joker.jpg (198 KB, 1152x832)
198 KB
198 KB JPG
Do you clean your dataset tags? For a long time I have simply removed low entropy tags, i.e., tags with really high or low count.
I've been trying to find a way to cluster images based on tags so I can automatically remove outlier images.
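One crude way to do the clustering part, assuming you have a list of tags per image: one-hot the tag sets and flag images whose average cosine similarity to the rest of the set is unusually low (the threshold here is made up, tune it):

```python
import numpy as np

def find_tag_outliers(tag_lists, keep_fraction=0.95):
    # tag_lists: one list of tags per image; returns indices of likely outliers
    vocab = sorted({t for tags in tag_lists for t in tags})
    index = {t: i for i, t in enumerate(vocab)}

    vecs = np.zeros((len(tag_lists), len(vocab)), dtype=np.float32)
    for row, tags in enumerate(tag_lists):
        for t in tags:
            vecs[row, index[t]] = 1.0

    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True) + 1e-8
    sims = vecs @ vecs.T                                              # pairwise cosine similarity
    avg_sim = (sims.sum(axis=1) - 1.0) / (len(tag_lists) - 1)         # exclude self-similarity

    cutoff = np.quantile(avg_sim, 1.0 - keep_fraction)
    return [i for i, s in enumerate(avg_sim) if s < cutoff]

print(find_tag_outliers([["1girl", "joker", "makeup"], ["1girl", "makeup"], ["car", "night"]]))
```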
>>
>>101904690
Not mine
>>
>>101903671
>>
>>101904718
oh my sweet summer child
>>
>>101904724
sir, pls help me. I'm doing it for my daughter
>>
https://reddit.com/r/StableDiffusion/comments/1estj69/remove_the_blur_on_photos_with_tonemap_an/
I highly suggest you try this new clip_l thing, it really improves the picture quality and prompt adherence
https://huggingface.co/zer0int/CLIP-GmP-ViT-L-14/blob/main/ViT-L-14-BEST-smooth-GmP-ft.safetensors
https://imgsli.com/Mjg3OTU5
>>
>>101904739
seen it, noted, thanks
>>
File: file.png (15 KB, 943x117)
15 KB
15 KB PNG
>do the needful
>>
>>101904231
bro you're using an LCM model. You need an lcm lora and cfg of 1.5-3. Look it up on civitai or just use a normal model for now
>>
>>101904738
remove irrelevant tags
consolidate important tags (e.g. if you have several tags for the same feature, pick one and use that for every image where it appears)
be consistent
if you have extra time add missing image composition tags since afaik most autotaggers don't apply them
>>
>>101904761
If it's time gated i don't really care.
If it's "premium loras", it's just retarded.
>>
>>101904738
also the most important thing for datasets is to not have any shit images, GIGO is still a fundamental computing principle
>>
>>101904720
How

>>101904767
I'm not using any LORA, what do you recommend?
>>
>>101904616
I'm using SD 1.5 (epicphotogasm last unicorn). If you look closely, the image is a mess. I had to inpaint the face, too.
>>
File: 1709192565934786.jpg (174 KB, 1024x1024)
174 KB
174 KB JPG
>>
>>101904739
thanks for figuring out good baseline numbers, at this point i've just started coming to /ldg/ to see what you're working on next lmao
>>
File: 1711753132928496.jpg (176 KB, 1024x1024)
176 KB
176 KB JPG
>>
>>101904739
is there a TE only of the BEST version?
>>
File: 1700537894412818.png (1.78 MB, 946x946)
1.78 MB
1.78 MB PNG
Kino Sd 1.4
>>
>>101904790
https://files.catbox.moe/el38ck.png

i don't have whatever comfy feature anon had, just slapped the prompt in
>>
>>101904829
can you elaborate?

>>101904819
aww, thanks for the praise anon :3
>>
File: ComfyUI_Flux_8909.jpg (139 KB, 2048x512)
139 KB
139 KB JPG
>>
>>101904851
WTF are you using
>>
>>101904865
ViT-L-14-BEST-smooth-GmP-ft is the only thing, text encoder and image encoder, the text encoder is only 1/3 the size like ViT-L-14-GmP-ft-TE-only-HF-format
>>
>>101904790
search lcm on civitai. "Theally" has one for 1.5. But LCM/Turbo/Lightning models are made for speed so the results are usually not as good and they are way more restrictive (you need the lora + cfg 1.5-3) so again I recommend you just use a normal model that doesn't have LCM/Turbo/Lightning in its name
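For reference, the same recipe outside a webui looks roughly like this in diffusers (the ids are just the common public ones, swap in whatever you actually use): LCM scheduler + LCM LoRA + very few steps + low cfg.

```python
import torch
from diffusers import StableDiffusionPipeline, LCMScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16   # example SD 1.5 checkpoint
).to("cuda")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)  # LCM needs its own scheduler
pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")      # the speed-up LoRA

# the whole point: very few steps and low cfg, otherwise the output burns
image = pipe("photo of a cat", num_inference_steps=6, guidance_scale=1.5).images[0]
image.save("lcm.png")
```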
>>
>>101904851
Very nice. Care to share your specs?
>>101904899
https://www.nayuki.io/page/png-file-chunk-inspector
>>
>razzz making Flux celeb loras
>they are shit
Every time.
>>
Is there a way in comfy to have it automatically start with "extra options" and "view queue" ticked?
>>
>>101904916
Yeah, it's the new shiny clip_l, you should use that one imo, Flux has all of these (unet + clip_l + t5xxl + VAE), that huggingface is only about a modified clip_l
>>
>>101904916
*is the whole thing
>>101904967
anon that's not... nevermind
>>
File: fingers still bad lol.png (784 KB, 1024x768)
784 KB
784 KB PNG
>>101904899
>open up reforge page
>go to issues
>someone is requesting flux support
>read comment saying to use a certain branch if you want to use flux
>google how to install just one branch of a repo in git
>install that shit
>spend like 10 minutes testing some baseline settings
>go on /g/
>post

>>101904932
3060 12gb
>>
>>101905006
forge is back up and running fella
>>
>>101904983
I think I'm starting to understand what you mean by that, the original clip_l is 256 mb
https://huggingface.co/comfyanonymous/flux_text_encoders/blob/main/clip_l.safetensors

And this one is 932 mb
https://huggingface.co/zer0int/CLIP-GmP-ViT-L-14/blob/main/ViT-L-14-BEST-smooth-GmP-ft.safetensors

I guess that it's a bigger version of clip_l or something, desu I won't cry for a 700mb increase in my vram if in the end I get better prompt understanding: https://imgsli.com/Mjg3OTU5
>>
>>101905032
yeah and then he's gonna disappear again in a month
>>
>>101905036
it's not bigger, CLIP is a text encoder and image encoder, we only use the text encoder part.
90% sure Comfy knows to load just the text encoder if you use the file that has both parts
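And if some frontend ever chokes on the combined file, stripping it down to the text encoder yourself is a few lines (the key prefixes below are the usual HF CLIP naming, which is an assumption, check the actual file):

```python
from safetensors.torch import load_file, save_file

state = load_file("ViT-L-14-BEST-smooth-GmP-ft.safetensors")
# keep only the text-encoder tensors; adjust the prefixes if the checkpoint names them differently
text_only = {k: v for k, v in state.items()
             if k.startswith("text_model.") or k.startswith("text_projection")}
save_file(text_only, "ViT-L-14-BEST-smooth-TE-only.safetensors")
print(f"kept {len(text_only)}/{len(state)} tensors")
```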
>>
File: ComfyUI_02286_.png (1.52 MB, 832x1216)
1.52 MB
1.52 MB PNG
>>
>>101905055
Tbh I didn't notice an increase in vram in the text encoder part, it's still 9.3 gb total, so I guess that comfy is only using the text encoder part of ViT-L-smooth-GmP, which is a good thing
>>
>>101904917
This is what I get with a LORA added on sd.next
>>
>>101905043
this
>>101905056
VROOOM
>>
File: 0.jpg (149 KB, 1024x1024)
149 KB
149 KB JPG
walking to school, by Sam Toft, by Serge Najar, by Richard Scarry, Studio Ghibli (minimalism:1.9)
FLUX.1 [schnell]
>>
>>101904739
hey anon, about your code changes

are all 3 files required - or just sampler_tonemap.py? since that's the node you use.

not sure if they're dependent on each other or you just changed it across files for consistency's sake
>>
>>101905086
>not sure if they're dependent on each other or you just changed it across files for consistency sake
it's for consistency's sake, so that if you want to try the other nodes, they'll work as well
>>
>>101903007
ok this is epic
>>
>>101905081
>>
File: ComfyUI_02277_.png (1.51 MB, 832x1216)
1.51 MB
1.51 MB PNG
it's nice that flux can do generic cute anime girl sometimes
>>
File: 0.jpg (288 KB, 1024x1024)
288 KB
288 KB JPG
>>
>>101905129
boink, nice one
>>101905085
I don't think raising the strength of a token does anything in flux. those gens are very nice.
>>
>>101904688
People will teach you when you stop typing like a faggot. Also if you see the CP poster just report and hide, don't engage.
>>
File: ComfyUI_Flux_8911.jpg (204 KB, 2048x512)
204 KB
204 KB JPG
flux unchained
>>
trying to gguf, what did I do wrong?
>>
>>101905081
>>101905123
lol not just any lora, an LCM lora. Here: https://civitai.com/models/195519?modelVersionId=424706
And again: 1.5 cfg, 3-8 steps and it's just better to use a different model
>>
>>101905221
DID YOU PULL?!
>>
>>
>>101905261
Yes, Comfy is up to date via manager. Installed the node. Installed the dependency.
I will try a manual pull.
>>
>>101903444
damn, republican women look like THIS?
>>
File: ifx29w.jpg (1.2 MB, 1600x1600)
1.2 MB
1.2 MB JPG
>>
>>101905330
Nope. Maybe my workflow is just shit. Can someone please catbox a gguf workflow for me?
>>
File: 1710255285299733.png (33 KB, 1031x390)
33 KB
33 KB PNG
>>101904739
So it does replace the second "clip_l" in picrel?
Or both of them?
>>
>>101905411
yeah, it just replaces the second clip_l, I should include that in the tutorial as well
>>
>>101905417
Alright thanks.
What does the first one do vs the second one?
I didn't find many explanations outside of "here is how to use flux".
>>
>>101905411
but there is only one clip_l in that image
>>
>>101905433
t5xxl is great at natural language, clip_l is good at tags, so it's supposed to get the best of both worlds
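Roughly what that split looks like in code if you encode one prompt with both (transformers API; the "CLIP pooled vector + T5 token sequence" division is how I understand Flux's conditioning, so treat that part as an assumption, and t5 xxl is huge so this is just to show the shapes):

```python
from transformers import CLIPTokenizer, CLIPTextModel, T5Tokenizer, T5EncoderModel

clip_tok = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
clip_te = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")
t5_tok = T5Tokenizer.from_pretrained("google/t5-v1_1-xxl")
t5_te = T5EncoderModel.from_pretrained("google/t5-v1_1-xxl")

prompt = "a red fox sitting in the snow, golden hour"

# CLIP: one pooled vector summarizing the prompt (tag-style guidance)
clip_ids = clip_tok(prompt, padding="max_length", max_length=77,
                    truncation=True, return_tensors="pt").input_ids
pooled = clip_te(clip_ids).pooler_output              # (1, 768)

# T5: per-token embeddings over the whole sentence (natural-language detail)
t5_ids = t5_tok(prompt, return_tensors="pt").input_ids
sequence = t5_te(t5_ids).last_hidden_state            # (1, seq_len, 4096)
```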
>>
>>101905441
NTA. Is there an equivalent for 1.5? Getting the models to be better at understanding the prompt by changing the text encoder? Sorry if this is a stupid question.
>>
>>101902778
The gguf node doesn't rely on the forge nf4 custom node implementation, right?
>>
>>101905434
*the second one named "clip_l"

>>101905441
Alright makes sense.
Would that mean that finding a superior version of t5xxl would be even better for natural language friendly models like flux?
>>
>>101905456
I suppose? Dunno what 1.5 uses as a text encoder, but if there's another one with the same architecture that has been improved, maybe you should try to replace that yeah
>>
>>101905242
No change with any option, starting to think the issue is with sd.next
>>
>>101905460
>Would that mean that finding a superior version of t5xxl would be even better for natural language friendly models like flux?
better prompt understanding basically
>>
>>101905469
it uses the same encoder
>>
>>101903868
What is this prompt?
>>
>>101905477
dude get off this LCM checkpoint and get the regular one.
>>
>>101905485
then it means you could swap its clip_l for this one, try it out and let us know if you notice better results
>>
File: ifx30.png (1.42 MB, 1024x1024)
1.42 MB
1.42 MB PNG
>>
File: flux_upscaled_00095_.png (3.78 MB, 2400x1368)
3.78 MB
3.78 MB PNG
>>101905495
I told him that twice already kek. Probably still uses the wrong settings
>>
https://www.reddit.com/r/StableDiffusion/comments/1eslcg0/comment/li8myda/?utm_source=share&utm_medium=web2x&context=3
Looks like the GGUF loader is now working on Forge as well?
>>
>>101905485
>>101905497
Will try and report back (in a few hours). Thanks.
>>
File: Untitled.jpg (219 KB, 1500x753)
219 KB
219 KB JPG
>>101905515
>>101905495
What's the regular checkpoint? I used the same settings as the guy with comfyui but in sd.next
>>
Is there a way in comfy to have multiple "desktops", aka I'd like one purely for upscaling, one for generation, one for image to text...
Next best thing would be allowing multi instance.
>>
>>101903868
(full body portrait, standing pose), (best quality:0.8) perfect anime illustration,on a black surface, on a plain black background, a pretty, happy man with short curly brown hair, wearing a hoodie
then i use InspyrenetRembg to remove the background. works pretty well.
the part after background was actually randomized by an llm. i play around with auto making character cards and picture creation with various expressions.
>>
>>101905541
NTA but that looks similar to what I get when I upscale with a denoise parameter that is too low.
>>
>>101905549
ah, i fucked up.
was meant for >>101905493
>>
>>101905544
Not sure what you mean... it's a web front end. Just open another tab and put a different workflow in
>>
File: gadget0016.jpg (142 KB, 1304x1304)
142 KB
142 KB JPG
>>
File: end_gen~6.jpg (258 KB, 1448x1280)
258 KB
258 KB JPG
>>
>>101905558
Wouldn't that mess up my currently open project?
Basically having multiple tabs would work and everything would be separated?
I'm a bit of a techlet for this.
>>
>>101905541
you shouldn't use more than 8 steps with an LCM model. And the size shouldn't go beyond 768 for 1.5. So use 512x768 for example. And you still seem to use a random lora, not the lcm one. The name of that lora is LCM_LoRA_Weights_SD15. Also if you really want to download epicPhotoGasm then just download the original https://civitai.com/models/132632/epicphotogasm not an LCM version. Then you don't need to do all this stuff
>>
File: ComfyUI_03946_.png (1.01 MB, 1024x1024)
1.01 MB
1.01 MB PNG
>>101904739
So much has improved this day
>DynamicThreshold got a better successor, Tonemap
>A better version of clip_l can replace the old clip_l on flux
>Q8_0 is closer to fp16 than fp8 and can be used without much drawback (it's a bit bigger than fp8)
Now with all those changes it's way easier to make cool prompts on flux, that's nice
>>
>>101905541
https://civitai.com/models/132632?modelVersionId=223670
>>101905515
those underwater gens look really good. did you also do those squids in deep water?
>>
boys, is flux just another model I can pop into my stablediffusion folder and start using, or is there more I need to do to get started?
>>
>>101902289
>Is it possible to use any version of flux on an RX 5700 XT
Of course. Simply recompile the entire ROCm stack all the way through MIOpen and composable_kernel. And patch MIOpen and composable_kernel so they work less shitty on the 5700. Then compile PyTorch.
>>
>>101905588
Each tab is its own instance of the front end. The back end is like a "server", it just handles api calls (queued prompts) from any instance of the front end.
>>
>>101905632
Nice, thanks!
>>
File: flux_upscaled_00059_.png (2.67 MB, 2400x1368)
2.67 MB
2.67 MB PNG
>>101905611
no. Only jellyfish
>>
File: file.png (1.36 MB, 2390x1100)
1.36 MB
1.36 MB PNG
Managed to run flux gguf, but not with this clip-vit-large-patch14.bin in the clip loader. I used a clip_l.safetensors instead. I have no idea if this is how I'm supposed to use it, but it worked. I found a model "clip-vit-large-patch14.bin", but it's clip vision, and when I tried to load it, the loader complained, so I guess it's the wrong model but with the same name?

Anyway, I'm getting a 4.3s/it on a 4060Ti.
>>
File: Untitled.png (38 KB, 1208x414)
38 KB
38 KB PNG
anyone know why this random pony model has the term "flux" in its version?
https://civitai.com/models/592983?modelVersionId=724618

what the hell does it have to do with flux?
>>
>>101905651
>Managed to run flux gguf, but not with this clip-vit-large-patch14.bin in the clip loader. I used instead a clip_l.safetensors.
go for that one anon, it's a better version than the classic clip_l >>101904739
>>
File: ComfyUI_Flux_8973.jpg (214 KB, 1024x1024)
214 KB
214 KB JPG
>>
>>101902766
>Are LoRA's possible on GGUF quants? I think that's the only thing I will be able to use in the foreseeable future if at all.
Why not a node that allows you to load a LoRA separately on its own into VRAM instead of merging with model weights? The math should let you keep it side-by-side no? You'd just lose a bit of VRAM on the LoRA itself and a tiny loss of speed from the LoRA math each gen.
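That's basically how LoRA math already works, a minimal sketch: the base layer (quantized or not) stays frozen and untouched, and the low-rank delta is just added to its output at runtime.

```python
import torch
import torch.nn as nn

class LinearWithLoRA(nn.Module):
    # y = base(x) + alpha * B(A(x)); the base weights are never modified,
    # so they can stay quantized while the small A/B matrices live in fp16
    def __init__(self, base, in_features, out_features, rank=16, alpha=1.0):
        super().__init__()
        self.base = base
        self.lora_a = nn.Linear(in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)         # start as a no-op
        self.alpha = alpha

    def forward(self, x):
        return self.base(x) + self.alpha * self.lora_b(self.lora_a(x))

layer = LinearWithLoRA(nn.Linear(4096, 4096), 4096, 4096, rank=16)
y = layer(torch.randn(2, 4096))                    # extra cost per call: two small matmuls
```

The only real cost is exactly what you said: the LoRA weights in VRAM plus a couple of small matmuls per layer.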
>>
>>101905660
Yes, the one making it clicked wrong.
>>
>>101905627
I actually got started down that path at some point. But I'd rather wait for rocm 6.2. It's supposed to work with gfx1010. Then I will be able to use the shittiest gguf.
>>
File: Untitled.jpg (126 KB, 1500x750)
126 KB
126 KB JPG
>>101905611
>>101905604
Same thing, must be some setting in sd.next
>>
Ready to go with some fresh bread...
>>101905755
>>101905755
>>101905755
>>
>>101905737
cfg should be 1.5 not 6
>>
>>101905737
now remove that goddamn LCM lora, raise steps to 20 ish, pick a better sampler and off you go
>>
>>101905700
rocm 6.2 is out, but checking composable_kernel at least gfx1010 still isn't a default build target, so unless your distro loves you (debian) it still needs recompilation.
I think 6.2 has the bugfix for rocblas so stuff like llama.cpp that only uses rocblas will just work on gfx1010, but not pytorch.
>>
>>101905779
nvm this isn't the LCM model anymore. Disregard what I said
>>
>>101905737
lora weight to 0.5, cfg to 1, steps to 6
t. genned with LCM for 5 months, switched to HyperSD which let me gen in 4 steps with better quality
>>
>>101905660
Just a jeet trying to get more eyes on his shitty merge.
>>101905682
There was no misclick.
>>
>>101905737
>>
File: chara_card_v2 (1).jpg (153 KB, 1024x1024)
153 KB
153 KB JPG
Do any of you have any idea what was used to gen this?
Original PNG: https://avatars.charhub.io/avatars/similar_reply_6303/student-vs-teacher-7408d42bf204/chara_card_v2.png
>>
File: YOU.jpg (168 KB, 1024x1024)
168 KB
168 KB JPG
>>101903007
Well...I guess this is why I am here forever.



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.