/g/ - Technology




File: tmp.jpg (821 KB, 3264x3264)
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>101902610

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>
Got GGUF working but it's 4x slower than fp8 Flux for me, using the Q5.
>>
File: delux_mp_000002.jpg (76 KB, 1000x1000)
>>
File: file.png (662 KB, 512x512)
how's it going fellow genners
>>
File: ifx31.png (1.29 MB, 1024x1024)
n1
>>
File: jdog.jpg (7 KB, 275x183)
>>101905794
>>
File: flux_upscaled_00099_.png (2.67 MB, 2400x1368)
that's not an anglerfish
>>
File: Comparison_all_quants.jpg (3.84 MB, 7961x2897)
>>101905786
Q5_0 is surprisingly slow compared to the rest, dunno why
>>
>>101905811
>1girl, holding_animal, animal, dog, holding, solo, faux_traditional_media, fine_art_parody, traditional_media, realistic, painting_medium, sky, upper_body, black_hair, looking_to_the_side, closed_mouth, brown_hair, long_sleeves, rating_general, fine_art, artist_joshua_reynolds, portrait, style_rococo

It's trying its best
>>
File: ifx34.png (1.14 MB, 1024x1024)
>>101905794
hola
>>
>>101905793
I'm using Ubuntu atm, so I guess I'm in luck.
>not pytorch
Fuck. Still, they will eventually release a wheel that uses 6.2, right? It shouldn't be too difficult to compile anyway.
>>
https://www.reddit.com/r/StableDiffusion/comments/1eslcg0/comment/li8b7om/?utm_source=share&utm_medium=web2x&context=3
>Also note that the weights are dequantized on the fly, so it's not as optimized as a stable-diffusion-like implementation that operates directly on quantized weights
So that means the GGUF quants can get even faster? nice
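For reference, Q8_0 is dead simple: blocks of 32 weights, each block stored as one fp16 scale plus 32 int8 values. A rough numpy sketch of what "dequantized on the fly" means (block layout is from the GGUF spec, the function itself is mine):

import numpy as np

# Q8_0: each block of 32 weights = one fp16 scale + 32 int8 values
def dequant_q8_0(scales, qs):
    # scales: (n_blocks,) float16, qs: (n_blocks, 32) int8
    return scales.astype(np.float32)[:, None] * qs.astype(np.float32)

# "on the fly" = this conversion runs before every matmul instead of once at
# load time; a kernel that multiplies the int8 values directly would skip that cost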
>>
File: ComfyUI_00058_.png (1.65 MB, 1024x1024)
Cock with sand up StabilityAI's asshole. Cripple our models and make us subscribe? Fuck you!
>>
>>101905815
Weirdly so, I tried q8 and it's 3x faster.
>>
>>101905862
oh
So I do not need the NF4 node thing for it.
nice
>>
File: 151701_00001_.png (1.29 MB, 1024x1024)
40 seconds for a 1024x1024 fluxdev image on a 4060ti is far too long, fix it pls, ty.
Custom tensorrt models for flux on comfy when?
https://github.com/comfyanonymous/ComfyUI_TensorRT
>>
>>101905898
>Custom tensorrt models for flux on comfy when?
How much speed improvement are we expecting from tensorrt usually?
>>
>>101905898
Skill issue
>>
>>101905914
https://github.com/comfyanonymous/ComfyUI_TensorRT/discussions/34
>>
File: flux_upscaled_00100_.png (2.44 MB, 2400x1368)
this one is a bit shy

>>101905891
congrats. You did it
>>
>>101905898
40 seconds? I'm getting 2 minutes per image. 30 steps with euler, 4.2s/it, VRAM at 14GB.
>>
>>101905891
Will any model ever surpass SD 1.5 in sovl? Better models just look too perfect.
>>
i was getting OOM on comfyui with all the gguf models, but it just werks on forge
>>
>>101905972
wait you can use gguf quants on forge? that was fast
>>
>>101905932
magnificent. nah that gen was me trying it out and now I am stuck in taytay loophole. someone drag me out, help!
>>
>>101905945
are you using bnb nf4 v2? try that if not.
>>
>>101905984
yeah, git pull, make sure to read the settings from
https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/1050

lots of goodies in that update too
>>
>>101906004
>2 days ago
the fuck? I thought the GGUF anon only made his shit work today
>>
>>101906004
>Support LoRAs for Q8/Q5/Q4 GGUF Models
come on, Comfy, copy paste it already
>>
>>
File: Capture.jpg (723 KB, 3300x1760)
A bit disappointed by deis on that one, only euler managed to make her black
>>
>>101906015
i guess he kept it secret til last night
>>101905972
>>
File: Untitled.jpg (245 KB, 1500x930)
>>101905932
Ok it's better, I thought more steps gave better quality. Still weird that the other anon with the same input and settings got different images in comfyui
>>
File: 1875988048.png (1.56 MB, 1344x768)
>>
>>101906085
>>
>>101906020
no need to copy paste
dtypes just need to be made compatible
>>
>>101906098
Any idea how to get rid of that line in the middle?
>>
>>101906085
you're not using the LCM model anymore. You can go back to using regular settings like 20 steps and a cfg of 7. You also dont need that lora if you dont use LCM models
>>
hyper lora; heun AYS
>>
is grok free
>>
File: ComfyUI_Flux_9035.jpg (129 KB, 1024x1024)
>>101906037
why are anons obsessed with this prompt
>>
File: file.png (531 KB, 512x512)
>>101906085
That anon here. If you want, I can try to replicate the exact same gen again. I probably had some different params because my workflow was a mess earlier.
>>
File: Rdg_121.jpg (152 KB, 1448x1280)
>>
>>101906038
I suck at comfy ui, is there a tard-level guide on adding upscaling to a basic flux workflow? Preferably not a youtube video.
>>
>>101906130
What's the difference between LCM model and the other one from civit.ai? I downloaded the LCM models from stability matrix ui

>>101906172
>>101906143
I used it first on flux huggingface and it worked ok
>>
Anyone else try the new clip model? Are you seeing improved gens?
>>
>>101906214
yeah it knows what a vegana looks like now
>>
>>101906194
Do you gen at 768 straight up or upscale from 512?
>>
>>101906214
>Are you seeing improved gens?
yeah, it helped for removing the blur for me
https://reddit.com/r/StableDiffusion/comments/1estj69/remove_the_blur_on_photos_with_tonemap_an/
https://imgsli.com/Mjg3OTU5
>>
File: 01444-1553280503.png (2.91 MB, 1523x1741)
Yeeah buddy, loras working on nf4.
Can't wait to see what kind of depraved realistic stuff can be made with this model once the loras become plentiful enough.
>>
>>101905892
You need to update comfy
install https://github.com/comfyanonymous/ComfyUI_bitsandbytes_NF4
and then install the GGUF Custom Node to use them
>>
>>101906245
>Yeeah buddy
https://www.youtube.com/watch?v=FJRMldSmy-M
>>
>>101906241
but negative prompting still works with the original CLIP
>>
File: flux_00164_.png (1.94 MB, 1344x768)
>>101906194
lcm/turbo/lighting models are fast and dont require as many resources at the cost of quality. They are useful for real-time stuff or txt2video

>>101905972
kek nice negative
>>
>>101906245
>Yeeah buddy, loras working on nf4.
desu you better use Q4_0, it's closer to fp16 than nf4 >>101905815

And GGUF lora is a thing on forge >>101906004
>>
>>101906269
it's for regular sdxl models >_<
>>
>>101906258
it works, but not as well; look at how the blur completely disappeared with the improved clip_l https://imgsli.com/Mjg3OTU5
>>
>>101906194
you in the back
>>
so is like gguf is like 8gb and as fast as nf4 but as good as fp18? is that what i heard?
>>
>>101906285
lol

>>101906233
Directly but it adds some line in the middle

>>101906269
Then why the fuck were they the most downloaded ones
>>
>>101906273

This thing updates so quickly I can't even keep up with these models.
I'll have to check that out.
>>
File: file.png (638 KB, 1380x628)
>>101906214
Here's a test. I'm not sure yet.
>>
>>101906299
>so is like gguf is like 8gb and as fast as nf4 but as good as fp18? is that what i heard?
Q4_0 is better than nf4 (same size) yeah, but it's nowhere near the fp16 level kek >>101905815
>>
>>101906304
because people are dumb af
>>
>>101906315
>This thing updates so quickly I can't even keep up with these models.
same, I'm afraid to sleep because it means I'll be losing track, no joke kek
>>
File: file.png (693 KB, 1365x617)
>>101906333
Here's the same test for the prompt
>a dog jumping on a coach
>>
Someone try the black migu prompt to see the prompt adherence for the new clip?
>>
don't engage
>>
just report
>>
File: flux_00166_.png (1.98 MB, 1344x768)
>>
File: x.png (263 KB, 746x539)
>>101906299
It's not what you heard it's what you're hearing.

Listen.

Listen. >>101905815
>>
On a 10GB 3080 + 32GB RAM, I get 2s/it at 512x512 with the q4 flux gguf, but 240s/it at 1280x720. I assume this means any amount of overflow into system RAM = your gen times get totally fucked just like the (semi-)old days, but it's interesting that I was getting 6s/it on the full size Schnell for 512x512. I didn't think it was meant to fit into 10GB. Is there a level of overflow into RAM swapping that still gives okay speeds, or does Flux just barely fit into 10GB VRAM even without being quanted as long as you're genning small images?
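For a back-of-envelope sanity check, assuming the Flux transformer is roughly 12B params (text encoders and VAE are on top of that, so treat the numbers as loose):

params = 12e9                          # rough Flux transformer size, assumption
gb = 1e9
print("bf16:", params * 2 / gb)        # ~24 GB, no chance on 10GB
print("fp8: ", params * 1 / gb)        # ~12 GB, still over 10GB, weights get swapped
print("Q4_0:", params * 4.5 / 8 / gb)  # ~6.8 GB at ~4.5 bits/weight, fits with headroom

so even fp8 never fully fits in 10GB and the backend is always shuffling weights; activations are what grow with resolution, which would explain 512x512 squeaking by while 1280x720 falls off a cliff.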
>>
>>101906214
On SD 1.5, it's virtually the same as before.
>>
>>101906359
looks better, kinda. interesting
>>
>>101906359
that's no coach
>>
>>101906471
lmao you're right. Couch. ESL please understand.
>>101906457
Yeah, but it's arbitrary. In the tests I've done, it's essentially as if you chose a different seed.
>>
no neg prompt in flux sucks
>try to prompt gloves
>completely random fingerless gloves or not
ree
>>
I pulled the recent changes to comfyui and updated python dependencies. As a result the same workflow with flux dev fp8 tripled in time compared to earlier. Never pull if something's working
>>
>>101906504
>no neg prompt in flux sucks
you can neg prompt anon
https://reddit.com/r/StableDiffusion/comments/1estj69/remove_the_blur_on_photos_with_tonemap_an/
>>
>>101906123
This is a really good image.
>>
File: flux_00168_.png (1.35 MB, 1344x768)
>>
>>101906504
>prompt viking
>get human with horns because horned helmets
>>
>>101906534
node based uis are for fucking retards so i will wait patiently
>>
File: ComfyUI_31702_.png (2.12 MB, 1280x1280)
>>
>>101906569
I'm not sure if the tonemap method works on forge, could be wrong though; you need that to make it work
>>
>>101906537
back to limgrave
>>
>>101906569
Normally I would ask you not to be so rude, but I'm fluctuating between getting 1.6s/it and 60s/it each time I hit generate at the moment on 512x512 q4 flux in comfy, so you might be right
>>
>>101906496
gonna try it out later, for now I am stuck genning TAY
>>
File: Gen_1723688091_0001.jpg (1.05 MB, 1792x2304)
>>
>>101906534
checking out 2 git repositories and downloading a .rar with 3 python scripts from catbox because someone on reddit told me to doesn't feel right somehow
>>
>>101906640
you're not obligated to use the AdaptiveGuidance and therefore not obligated to download the .rar, like I said on the tutorial that shit is optional
>>
>Outpainting is still a nightmare
Is Fooocus still the state of the art on this? I'm surprised nobody made more of an effort to solve this issue.
>>
File: flux_00171_.png (1.42 MB, 1344x768)
>>
>>101906715
i've noticed that flux is super attached to capes on armor
>>
the text encoder ViT-L-14-BEST-smooth-GmP-ft.safetensors does not appear to work on flux for gguf models, only "clip l" works; otherwise I get "AssertionError: You do not have CLIP state dict!"
>>
File: Capture.jpg (319 KB, 3296x1704)
>>101906726
it works for me
>>
>>101906741
on forge, that is
>>
File: flux_00169_.png (1.34 MB, 1344x768)
>>101906719
I actually prompted for a "tattered cape". Sometimes it just turns into a scarf tho. Also it doesn't really know what knight helmets look like
>>
>>101906756
oh, my b
>>
>>101906741
also pls share that workflow, i want to see why i keep getting OOM on comfyui with mine (which looks similar to yours but maybe there's something i missed)
>>
>>101906757
Post prompt, please? I would like to try to replicate flux pics with 1.5 as accurately as possible. Just for practice.
>>
>forge repo mentions GGUF models two days before city posted his implementation here
So who the fuck came up with this
>>
>>101906773
Sure: https://files.catbox.moe/go1c5o.png
>>
>>101906796
I'm really surprised Forge didn't hype with his GGUF at all, this shit is way better than his stinky nf4 he's trying to shill so hard >>101905815
>>
>>101906778
>in the middle of a detailed gorgeous meadow field during sun down. An exhausted knight in a tattered cape sitting on the ground, his armor is very damaged and a little blood is leaking out of the gaps, the visor of his helmet is closed. Rusty, battered and scarred armor, worn metal. He is too weak to pick up his bloody sword that is stuck in the ground beside him. He is alone left to die
Boomerprompting works best for me, so I doubt it will look good in 1.5 since tags work better there
>>
>>101906640
The script fix changes one line. Just literally open them up side by side. The original node is from comfy so you have nothing to worry about
>>
>>101906651
Nice try glowie
>>
>>101906741
>>101906726
Is there any advantage to using ViT-L-14-BEST-smooth-GmP-ft instead of the normal clip?
>>
ok one final tay gen, the title is "deconstruction"
>>101906741
ty for workflow
>>101906719
quite a few concepts it easily attaches to, like a red circle with everything japanese
>>
>>101906831
>Is there any advantage to using ViT-L-14-BEST-smooth-GmP-ft instead of the normal clip?
>>101906241
>>
>>101906799
thx, i get this error and cant find the node (with manager/missing modules)
>>
File: flux_upscaled_00111_.png (3.25 MB, 2400x1368)
>>
>>101906820
Thanks. I'll see if I can adapt it.
>>
File: Capture.jpg (91 KB, 1512x830)
>>101906830
are you a retard? the only node you need to download and install is from a github made by Comfy himself, your IQ, now!
>>
takes like 2 years to highresfix on my shitbox but it's still neat
>>
>>101906856
oh yeah, it's because I updated the script on the multigpu tutorial, it has OverrideMODELNode now
https://reddit.com/r/StableDiffusion/comments/1el79h3/flux_can_be_run_on_a_multigpu_configuration/
>>
>>101906861
I SAID NICE TRY GLOWIE YOU'RE NOT STEALING MY NUDES THAT EASILY
>>
>>101906869
Nice gen gsister
>>
>>101906916
Sir, this is a Flux thread, Flux gens only.
>>
File: buttcoin.png (3.13 MB, 2048x1024)
>>101906214
it does seem so. But hard to tell from random noise.
>>
File: file.png (864 KB, 768x512)
I'm using adaptative guidance for the first time, so I don't know if it's hurting the quality. I don't know what I'm doing.
>gorgeous meadow, sunset, medieval knight, tattered cape, sitting, bloody, helmet, rusty, worn metal, bloody sword
>>
File: FLUX-SunflowerHedgehog.png (957 KB, 1024x1024)
Back when Dalle 3 was released I was running around posting its sunflower hedgehogs and challenging people to reproduce that style with a Stable Diffusion based model, and none could do it; they produced either realistic hedgehogs or cartoon styles that were completely different from this.
Flux can do it.
>children card illustration vintage style page scan of cute sunflower hedgehog sitting at the park. kindergarden eyes
That was it, and I didn't even need an artist in the prompt. No idea if Flux even knows them, though they were making a difference back then; it just shows the style like this regardless.
>>
I haven't been following the SD for a while, what's the best way of inpainting nudes at the moment?

train lora and generate or use some decent inpainting model? what webui to use etc?
>>
File: 00002-2617544267.jpg (181 KB, 512x768)
>>101905972
Same, even werks on 6 GB vram
flux q4 + t5 fp8 + clip_l
>>
>>101906872
thx, that loaded, but i'm still getting an OOM on a 3090ti so who knows
it works (other than the clip model) on forge for now so that's good enough
>>
>>101906965
basically, the thing you need to touch on Adaptive Guidance is the threshold. The smaller the value, the more steps end with cfg = 1, which cleans the image more; but if the threshold is too low, cfg = 1 kicks in too hard and you lose the prompt adherence the high CFG provides. You need to find a good balance between the two; for me it's between 0.990 and 0.995
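for anyone wondering what the threshold actually measures: the node compares the cond and uncond predictions each step, and once they agree closely enough it stops doing CFG. A minimal sketch of the idea, not the node's actual code (names are mine):

import torch

def adaptive_cfg(cond, uncond, cfg, threshold=0.995):
    # cosine similarity between the two noise predictions
    sim = torch.nn.functional.cosine_similarity(
        cond.flatten(1), uncond.flatten(1), dim=1).mean()
    if sim >= threshold:
        return cond                         # predictions agree -> behave like cfg = 1
    return uncond + cfg * (cond - uncond)   # normal CFG otherwise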
>>
Are any of these even good...? I thought Flux and Pony were the best models right now.
>>
File: flux_00174_.png (1.25 MB, 768x1344)
>>101906965
looks good. You could use high res fix to upscale it. On A1111/Forge it's the star emoji below the image preview
>>
>>
>>101907002
if you have the ram you can run FP16 T5 on the CPU, only takes a couple of seconds for me and avoids having to swap the weights
>>
>>101907004
Model?
>>
>>101907002
>6gb
Did you have to do anything special? Show me your ways.
>>
>>101907039
With short prompts sure, but really long prompts take minutes on CPU.
>>
>>101907039
Wait, coming back to SD after months, is inference on CPU possible now? I know it is with LLMs.
>>
>>101907072
Nothing, just use the latest update

Version: f2.0.1v1.10.1-previous-297-g5fb67f49
Commit hash: 5fb67f49e8552c1e4e40feafd92cad9aca5c92c2
>>
>>101907101
>Wait, coming back to SD after months, is inference on CPU possible now? I know it is with LLMs.
You can force the text encoder to be in your cpu with this script yeah >>101906872
>>
>>101907075
just did an enormous prompt in 10 seconds, 2300 characters, surely you're not that much of a boomerprompter
>>101907101
talking about the text encoder, T5, which is a transformer LLM
>>
>>101907015
There are many SD1.5 finetunes that would kick the ass of RealVisXL4; in fact, SD1.5's architecture proved superior, and promptsharingsamaritan had to go back to it for version 3 of his RealisticStockPhoto model.
XL models just can't be as detailed.
Stable Diffusion 1.6 scores better than SDXL, and its outputs have nothing on a model like AnalogMadness 5, which is completely uncensored, too.
All of them became hidden gems, but I guess it's a moot point after FLUX made all of them obsolete. No matter what model I mention, you may get eyes of different sizes or hands with the wrong number of fingers.
>>
>>101906796
>who the fuck came up with this
>copypaste
>shill
Now you came up with it
>>
>>101906935
fuck flux, until i can use it on A1111 and can generate boobies
>>
>>101907128
>>101907126
Man, damn. Probably doesn't save a ton of VRAM, huh?
>>
File: Tonemap_multiplier.jpg (3.94 MB, 9785x2097)
>>101906534
Just tried to see which "multiplier" value is best on Tonemap for cfg 6; seems like 0.22 is where you get consistently accurate results on text, I guess
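for context, the multiplier scales the ceiling of a Reinhard curve applied to the length of the (cond - uncond) vector, so guidance spikes get squashed instead of frying the image. Roughly this, written from memory of the ComfyUI_experiments node, so treat the details as approximate:

import torch

def reinhard_tonemap_cfg(cond, uncond, cfg_scale, multiplier=0.22):
    noise = cond - uncond
    mag = torch.linalg.vector_norm(noise, dim=1, keepdim=True) + 1e-10
    direction = noise / mag                      # unit-length guidance direction
    top = (mag.std() * 3 + mag.mean()) * multiplier
    scaled = mag / top
    new_mag = scaled / (scaled + 1.0) * top      # Reinhard: compress outliers, keep small values
    return uncond + direction * new_mag * cfg_scale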
>>
File: ComfyUI_31720_.png (1.18 MB, 1024x1024)
>>
>>101907151
t5 bigg
>>
>>101907151
T5 at FP16 takes 10GB of VRAM so yes, it does.
>>
>>101907151
>Man, damn. Probably doesn't save a ton of VRAM, huh?
It saves a lot of vram, t5xxl fp16 is asking for more than 9gb of memory
>>
>>101906838
looks like a fusion with that linus troon guy
>>
can flux generate anime shit ?
>>
>>101907039
>on the CPU
Is this OOTB in Forge or do I have to change settings?
>>
>>101907105
>AttributeError: module 'torch' has no attribute 'float8_e4m3fn'
FUCK AMD AND FUCK MY SHIT UP GOD FUCKING DAMNIT
>>
File: tmp0d_ax6o2.png (771 KB, 550x880)
>>
>>101907139
you could be having fun with it right now, anon
https://files.catbox.moe/79tiar.jpg
>>101907234
don't know, don't use Forge
>>
>>101907212
It's not its strength. You can still basically do it, but think of how DALL-E 3 is incredible at photos/memes/comics/etc and only pretty good at anime and stays fairly stuck on one (decent) style. Flux is great at photos/memes/comics/etc and sort of mediocre at anime and stays stuck on one (ugly) style. That said, I'm referring only to the foundational model, so your next best local comparison is the dogshit that comes out of base SDXL. There's already a lora on civitai that looks like it enhances the anime style quality a fair bit, I'm sure it's only a matter of weeks before we're seeing finetunes and loras that if not great are at least fairly positive.
>>
File: Fred Calleri.png (1.51 MB, 1024x1024)
>>101907004
>flux doesn't know artist prompts
But it doesn't know that it doesn't know. Here's Fred Calleri in the prompt, does it look like his art at all? No idea, who cares, the point is it gives a very different style.
>A cute sunflower hedgehog, round eyes, Children book illustration page scan. Art by Fred Calleri
>Seed: 35853
>>
File: ComfyUI_31725_.png (1.46 MB, 1024x1024)
>>
File: tmpuaaucsta.png (772 KB, 550x880)
Still can't get rid of that blur in the middle
>>
>>101907212
it's ok but it doesn't seem to be a priority in the model
>>
>>101907262
>heavysweating.jpg
>>
>>101905755
slop
>>
File: bnpa5kzf0rid1.png (1.07 MB, 2390x1100)
Can someone post the workflow for the gguf from this image?

Don't know why they used litterbox instead of catbox. Any workflow for gguf is appreciated as well
>>
File: ComfyUI_31727_.png (1.22 MB, 1024x1024)
>>
>>101905972
Lvmin Zhang is the savior of VRAMlets
>>
>>101907295
ouch, wtf man. looks like it processed the 2 parts with different cfg values and everything or just a complete vae decode trainwreck
>>101907174
just base model tay + weight slider
>>
File: ComfyUI_31728_.png (1.37 MB, 1024x1024)
>>
>>101907238
i had some errors too that got fixed by just closing it and starting it up again, which made it download some more python shit, somehow
>>
>>101907273
alright, I' m downloading flux once they release anime stuff, I don't care about 3D whores or 2000's memes
>>
>>101907335
It will take you like 60 seconds to assemble all that yourself.
>>
File: t_1.jpg (447 KB, 688x1216)
>>
>gguf
>Q#
>nf4
No idea what any of this means, it's hard to keep up when you're a techlet, I've just been using fp8 this whole time, what's the current meta?
>>
>>101907354
I know this error well. The version of ROCm I'm forced to use will not work. Not that I had a lot of hope anyway.
>>
>>101907354
nevermind i missed the mention of "AMD"
I am sorry for your loss
>>
>>101907360
I don't think the odds of a massive anime finetune of FluxDev happening are good due to the license
>>
>>101906838
>Tay who got fat trying to get into her old concert outfits
there must be a lora for this
>>
>>101907372
you don't need to know the specifics, just look at what it does and act accordingly; if you were using fp8, ditch that and go for Q8_0 >>101905815
>>
File: chara_card_v2 (1).jpg (153 KB, 1024x1024)
Any idea what was used to gen this? Just genuinely curious about what the fuck went into this extreme uncanny valley shit.
Original PNG: https://avatars.charhub.io/avatars/similar_reply_6303/student-vs-teacher-7408d42bf204/chara_card_v2.png
>>
Can someone with dual GPU splitting kindly post their flux WF? I don't mind if its messy or whatever, I just want an easy copy of the nodes with manager
>>
>>101907372
Q8 which is as big as FP8 but much closer in output to FP16
anyone who needs lower than that should just save for a new GPU
>>
>>101907262
>Unable to find workflow in 79tiar.jpg
I feel robbed anon.
>>
File: tmpco4_16gy.png (722 KB, 550x880)
>>101907350
I have no idea why it ends up like that
>>
>>101907407
proprietary prompt, please understand
>>
File: ComfyUI_03992_.png (1.52 MB, 1024x1024)
>>101906534
I'm impressed how well this shit is working, having some blur on my anime gen was making me crazy
>>
>>101907394
there's one there >>101906799
>>
>>101907408
do you have adetailer enabled or something? when you gen does it show two progress bars?
>>
File: grid-0103.jpg (311 KB, 2688x1536)
Does Flux have any artist or art styles that it does really accurately?
It seems to have knowledge of some few, but they only pushes it slightly in that direction.
If you just write something like anime or comic book you get the most generic stuff possible and oil painting are basically photographs with a filter.

Still love it so far.
>>
>>101907366
dont @ me if you not posting the workflow. Anyways dont need it anymore found it. Thanks for nothing
>>
does flux have img2img or something similar to controlnet ?
>>
File: tmp48sjgork.png (657 KB, 550x880)
>>101907439
No, and it's the same on other prompts too
>>
>>101907335
>has picture book
>cant read it
>>
>>101907408
I do not know but it's a trainwreck. there seems to be some sort of tiling going on but I'd have to take a look at it. need to a/b test everything that could affect the image: any special quality option, tiled something, disable anything until you get a clean gen. try 512x512 just to see if it disappears. you got the right vae?
>>
File: flux_upscaled_00117_.png (3.17 MB, 1368x2400)
>>101907408
could be vae related but idk
>>
>>101907451
Wait lmao, dont tell me you actually tried to assemble that childs play ass workflow and failed and came here to beg for a readymade workflow out of frustration. Glad I am not you.
>>
File: 2024-08-15_00231_.png (2.18 MB, 1280x1024)
>>101907441
anyway, you just have to prompt correctly, it's bad at manga, it's good at everything else

pic related
>a cat made out of cheese painted by Kandisky
>>
is flux reasonable with a 10GB 3080?
>>
>>101907139
its getting there
https://files.catbox.moe/8lhon7.jpg
>>
File: file.png (1.47 MB, 1152x896)
>>101905972
when i use vitl instead of clipl like this anon does, it says i dont have a CLIP state dict... anyone know the error in my ways?
>>
>>101907562
you could run Q4_0 with it + offloading the text encoder into your cpu >>101905815
>>
File: file.png (1.35 MB, 768x1152)
>>
>>101907567
show your workflow, also FLUX doesn't work with clip skip
>>
>>101907589
>aslo FLUX doesnt work with clipskip
you can't use clip skip on clip_l though?
>>
File: MarkuryFLUX_00294_.png (2.38 MB, 1400x1024)
>>101905815
Q16_0 when?
>>
>>101907579
boobs nice, downloading forge
>>
>>101907535
>>101907525
I don't have any vae; it goes away at 512x512 but appears at any other size, even 600x600
>>
>>101907596
if it wasn't trained with clip skip you don't want to clip skip
>>
>You do not have T5 state dict!
Where do I get this thing and how do I set it to run on the CPU?
>>
File: 2024-08-15_00224_.png (2.15 MB, 1280x1024)
>>101907596
no you can't, clip skip is not an option for FLUX
>>
>>101907589
same forge setup as the anon i replied to, except q4 flux dev and fp8 version of t5xxl
i dont know what the hell a clipskip is
>>
is it possible to offload the text encoder to cpu on stable diffusion models? is that how it works?
>>
>>101907608
768x768
>>
>>101907613
I tried downloading google's flan-t5-base but forge complains about shape mismatches. I have no idea what I'm doing. Using q4 gguf.
>>
>>101907657
don't care, quit genning rn chode kid
>>
>>101907656
of course, you do the same method as on flux on ComfyUi and you're good to go
>>
>>101907608
try adding the 1.5 vae then. Most models have it baked in but it seems this one doesn't. Your images are also a little colorless, which is a sign that the vae is missing
>>
>>101907392
that image has no metadata retard
>>
File: 00010-348817366.jpg (101 KB, 1024x1024)
>>101907652
clip skip is what SD15 and SDXL used.. it cut out the last layer(s) of CLIP (which is a pretty weak text encoder) and toned its influence down, you needed that for NAI prompts like pic related.. its useless for FLUX
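mechanically, clip skip N just means taking CLIP's hidden states N layers before the end instead of the final output. With transformers it's roughly this (hypothetical minimal example, model name is the stock CLIP-L):

from transformers import CLIPTextModel, CLIPTokenizer

tok = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
enc = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

ids = tok("1girl, solo, masterpiece", return_tensors="pt").input_ids
out = enc(ids, output_hidden_states=True)
normal = out.last_hidden_state                  # clip skip 1, i.e. no skip
skipped = enc.text_model.final_layer_norm(
    out.hidden_states[-2])                      # clip skip 2, what NAI-style models expect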
>>
>>101907567
see >>101906726
>>
>>101907687
retard, no image has meta data on 4chan
>>
File: forge.jpg (232 KB, 1851x919)
drooling retard here,
what exactly is it telling me i need to get?
>>
>>101907677
shit im on forge
i dont see the option anywhere in its settings.
>>
>>101907720
that only applies if you attach it to the post, it can't strip metadata from a url, double retard
>>
>>101907723
a VAE state dictionary, whatever that is
>>
>>101907723
A vae state dict
>>
>>101907723
lmao I can help you and you can help me. You need https://huggingface.co/black-forest-labs/FLUX.1-schnell/blob/main/ae.safetensors in the vae models dir. And I can see from your pic what I need. Thank you!
>>
>>101907723
https://huggingface.co/black-forest-labs/FLUX.1-schnell/blob/main/ae.safetensors
>>
>>101907723
add ae.vae
>>
>>101907669
try this one
https://huggingface.co/comfyanonymous/flux_text_encoders/tree/main
mentioned in this guide:
https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/1050
>>101907699
sounds neat, i don't use it though so i doubt its related to my issue of forge not recognizing vitl
>>101907712
why did it work for the guy with the lightning wizard cat then
>>
How well can flux do porn? And when are flux finetunes coming out?
>>
>>101907015
Those kinds of rankings are so dumb, they're actually a list of mediocre models, because those models are generalized to be "decent" at everything, instead of excellent at one thing. They're not specialized and finetuned.
>>
>>101907763
Softcore/artsy and who knows.
>>
>>101907669
flan-t5-base is not the right T5
>>
>>101907755
>why did it work for the guy with the lightning wizard cat then
that was also me, and that lightning cat gen shows the wrong flux model (it was still on nf4-v2, even tho it shows the gguf)
i think forge needs more work on loading the VIT-L ones

this is how it works for me now:
>>
>>101907699
NAI was trained with clip skip, the benefits of clip skip will not be there on a model that was not trained with clip skip
>>
File: grid-0111.jpg (362 KB, 2688x1536)
>>
>>101907794
>>
>>101907764
general models that can do everything are much more interesting tho, a lot more possibilities
>>
get beleted Teebs lal
>>
>>101907375
just update it nigger
>>
>>101907794
ohh ok, i guess ol clippie will do for now
>>
>>101907808
i hope you realize that the reason they renamed the term "child porn" to "CSAM" is so that they can go after people like you, for non-pornographic images.
enjoy your jail time.
>>
File: 1717345881502861.png (854 KB, 1632x1020)
anyone got any more memes like this
>>
>>101907684
Added VAE, still has the blur
>>
>>101907834
RX5700XT will not work on anything past ROCm 5.2, and that function is missing on anything before PyTorch 2.1, which does not support such an old version of ROCm. Until I change my card, or AMD miraculously decide to upgrade their software to support my card, I'm fucked.
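the missing attribute is literally that torch builds before 2.1 don't define the fp8 dtypes, so the only clean local workaround is a fallback like this (which of course costs you the fp8 memory savings, it just stops the crash):

import torch

# torch.float8_e4m3fn only exists from PyTorch 2.1 onwards; on older builds
# (e.g. anything stuck on ROCm 5.2) fall back to fp16 for the unet weights
unet_dtype = getattr(torch, "float8_e4m3fn", torch.float16)
print(unet_dtype)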
>>
>gguf quant lobotomies killed all progress in local llms
>now the same shit will have to image models
it's so tiring
>>
>>101907817
only true in the sense that you mean "smarter base = more interesting", like how dall-e 3 is better than xl is better than 1.5. Working from any given base (let's say XL), it's IMO much *less* interesting to just create an XL finetune that's relatively "decent" at everything, vs making an XL finetune that gets the best possible anime out of SDXL, or the best possible aesthetic photorealism, etc.
once we live in a perfect world where dall-e 5 is creating absolutely anything with complete flexibility based on just the prompt, sure, don't bother finetuning your good base towards a specific task. until then, take your jack of all trades XL and turn it into an aesthetic anime 1girl machine like animagine, or a sex pose understander like pony, etc.
>>
File: ComfyUI_00110_.png (1013 KB, 896x1136)
I'm using a flux q5 quant with a fp8 t5. prompt:
>Cinematic portrait of Syndel wearing violet leotard posing for Mortal Kombat loading screen
When I tried this prompt on spaces though it would always generate the MK logo and text. Locally, however, there was none so far, even though the quality is fine. Is it the model that was different, or is it because of the quants?
>>
>>101907869
>lobotomies
lol, Q8_0 is more accurate than fp8 (the one the majority of people were using so far), it's the opposite of a lobotomy; we got a better quality quant out of this >>101905815
>>
>>101907866
>RX5700XT will not work on anything past ROCm 5.2
That was fixed in 6.2 (if your distro compiles for your ISA; I don't think AMD does), but you could have just compiled other versions with some patch.
>>
File: 00000-3597725605.png (951 KB, 896x1152)
>>101907740
sweet
>>
>>101907878
>only true in the sense that you mean "smarter base = more interesting"
No, take XL, train it on danbooru AND real porn and it will be more interesting to me than two models only trained on one of them.
>>
>>101907858
ok last attempt, your GUI, whatever it is (I can't identify it), do a search for (your gui) and something like "tiles appearing". someone else must've had that problem too - or, nuclear option, nuke your GUI and use something solid, like comfyUI or... erm.. forge?
>>
Is there any good pixelizing custom node set for comfy? I can only see ComfyUI-PixelArt-Detector, which unless I'm blind has a bunch of fucking retarded design choices: instead of a node which downscales your image by a factor of N, it forces you to use an all-in-one save node (i don't want to save every gen) and forces you to save in webp (what the fuck)

searching archives got me nothing but it did get me this pretty nice looking repo https://github.com/KohakuBlueleaf/PixelOE?tab=readme-ov-file not a comfy node but the example looks fantastic, maybe i'll implement it as a comfy node
>>
>>101907687
If it had metadata I'd have just looked through it instead of asking here.
>>
>>101907956
and you can't adapt that ComfyUI-PixelArt-Detector to do what you want? what a techlet
>>
>>101906004
>>101906015
>>101906796
>>101906819
>edited 9 hours ago
he added it after city did
>>
Will Forge abandon their ui again if A1111 catches up?
>>
>>101907799
dirtier
>>
>>101907953
Found this
https://www.reddit.com/r/StableDiffusion/comments/1arfs35/how_to_get_rid_of_this_weird_seamline_in_the/
>>
>>101907849
Do not engage, just ignore and report.
>>
File: 4275460115.png (1.07 MB, 896x1152)
>>
>>101907982
i mean creating a node that just downscales your image by a factor of n with a selection of scaling algorithms is probably a 15 minute job after learning the boilerplate of making a custom comfy node, so if im making my own node i'll just do it from scratch or implement the cool black magic stuff in the linked repo rather than dig through somebody's hardcoded webp jpeg autosave rubbish. but i dont want to reinvent the wheel if there's already a decent pixelizing node set i can use
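fwiw the boilerplate really is tiny; a minimal sketch of a downscale-by-N node, assuming current ComfyUI conventions (IMAGE tensors are [batch, height, width, channels] floats in 0..1) — untested, adjust to taste:

import torch.nn.functional as F

class PixelizeDownscale:
    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {
            "image": ("IMAGE",),
            "factor": ("INT", {"default": 8, "min": 1, "max": 64}),
        }}

    RETURN_TYPES = ("IMAGE",)
    FUNCTION = "pixelize"
    CATEGORY = "image/postprocessing"

    def pixelize(self, image, factor):
        x = image.permute(0, 3, 1, 2)                     # BHWC -> BCHW for interpolate
        small = F.interpolate(x, scale_factor=1.0 / factor, mode="area")
        big = F.interpolate(small, size=x.shape[2:], mode="nearest")  # back up with hard pixels
        return (big.permute(0, 2, 3, 1),)

NODE_CLASS_MAPPINGS = {"PixelizeDownscale": PixelizeDownscale}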
>>
What does T5 even do? If I'm gonna blow 10 gb VRAM extra, it better be good.
>>
>>101908065
>What does T5 even do? If I'm gonna blow 10 gb VRAM extra, it better be good.
put that in your cpu dude
>>
File: 1707074702754903.png (699 KB, 720x1080)
>>
>>101908051
Model/prompt? I like the art style. Reminds me a bit of Castlevania.
>>
>>101908065
It's the part of the model that understands the prompt in natural language
>>
Where can I find the force/set clip device node for comfyui?
>>
>>101908097
Download the manager extension and have it download missing nodes for you.
>>
>>101906966
cute sonic
>>
>>101908065
>>101908078
Are we able to do that now? Feel like this is the optimization vramlets actually need
>>
>>101908097
here >>101906872
>>
>>101908076
Looks a little overbaked/weirdly smoothed. Are you using a highly quanted version of Flux, or is it just kind of naturally like that?
>>
>>101908128
>Are we able to do that now?
yeah you can force clip to go on your cpu with this script >>101906872
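the whole trick in that script is just overriding where comfy thinks the text encoder lives, something along these lines (the hook name is from memory of recent ComfyUI builds, treat it as an assumption):

import torch
import comfy.model_management as mm

# make ComfyUI load/run the text encoders (clip_l + T5) on the CPU,
# leaving the VRAM for the diffusion model itself
mm.text_encoder_device = lambda: torch.device("cpu")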
>>
>>101908065
think of it this way: how good the image model is dictates how nice the image will be if you type in "beautiful girl, sexy blonde sitting on bed, high quality photograph". sd1.5 vs sdxl vs flux vs dall-e 3 changes this a lot even though all you really asked for was "draw me a sexy girl".
having a better text encoder is the important bit for if you were prompting "beautiful girl, sexy blonde with emerald ring on her left hand, blue headband, large perky breasts with pink nipples and small areolae, sitting on bed with beige sheets and red satin pillows, holding a sign that has a very racist word on it". t5 will get you the exact areola size and racist slur you're looking for. sure, maybe on a weaker model it'll still be an ugly retarded picture, but the better text encoder is what gives it the fine-grained ability to actually follow a long complicated set of instructions, regardless of how easily it generates beautiful images that look like they might have been made by an artist or be a real photo
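concretely, Flux takes two views of the prompt: CLIP-L's pooled vector for the overall vibe and T5-XXL's full token sequence for the fine-grained instructions. A sketch with transformers, assuming the stock HF model names (and note t5-v1_1-xxl is a huge download):

from transformers import CLIPTextModel, CLIPTokenizer, T5EncoderModel, T5Tokenizer

prompt = "blonde girl, emerald ring on her left hand, blue headband"

clip_tok = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
clip_enc = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")
pooled = clip_enc(**clip_tok(prompt, return_tensors="pt")).pooler_output  # (1, 768) global summary

t5_tok = T5Tokenizer.from_pretrained("google/t5-v1_1-xxl")
t5_enc = T5EncoderModel.from_pretrained("google/t5-v1_1-xxl")
seq = t5_enc(**t5_tok(prompt, return_tensors="pt")).last_hidden_state     # (1, tokens, 4096) per-token detail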
>>
>>101906831
No because the model wasn't trained with the improved CLIP.
>>
File: 3006526815.png (886 KB, 896x1152)
>>101908090
flux q4 with an ayami kojima lora that's on civitai
>>
the hands are still sussy
>>
>>101908153
Thanks, anon! I appreciate the explanation, that makes a lot of sense.
>>
File: 1701796085431553.png (833 KB, 720x1080)
>>
I'm trying to use forge but it says
>Moving model(s) has taken 13.35 seconds
And just stays there forever taking 100% CPU.
>>
https://github.com/city96/ComfyUI-GGUF/commit/88fb6fa0014850615ca5b3e0ec1c018f67319237
Interesting commit. When you update comfyUI, does it also update the packages/custom_nodes in it?
>>
>>101908184
Oh damn, I wasn't expecting a lora for her already. Nice. Thanks for showing it off.
>>
File: grid-0117.jpg (396 KB, 2688x1536)
>>101907998
Mixing styles didn't quite go the way I wanted.
>>
>>101908184
It's already possible to use loras with quants?
>>
>>101908254
oof
>>
File: 1710970278349608.png (801 KB, 720x1080)
>>
>>101908236
>improved existing quants
Where is mikufag, we need an updated comparison
>>
>>101908278
I'm not gonna do that again ;_;
>>
File: 12gb.png (81 KB, 1902x586)
12GB VRAMLET REPORTING IN
RECENT UPDATE TO KOHYA'S SD3 BRANCH IS SMOOTH SAILING, NO ADDITIONAL FIXES REQUIRED

obviously the time for lora completion isn't accurate since it's only 126 steps, was just trying to get this shit running

FUCKING HAPPY AS A PIG IN SHIT BOYS
>>
why is it only really shit "artists" that cry about ai all the time?
https://old.reddit.com/user/alan_reddit_m
None of the theft!!! arguments apply to you bro
>>
>>101907995
both are getting abandoned. gradio sucks ass
>>
>>101908236
I'm assuming this also means that the premade gguf quants were based on a f32 gguf and not the included f16? Since flux's original weights are bf16.
>>
File: grid-0121.jpg (373 KB, 2688x1536)
>>
>>101908308
gradio is hetero tho
>>
File: file.png (5 KB, 352x52)
>be vramlet
>decide to try out upscaling an image in forge
i dont feel so good
>>
>>101908331
it's gay. not listening to your faggot lies glowie
>>
>>101908308
cumfart lost the ui wars get over it already, everyone still uses auto.
>>
File: 1715066436771314.png (1.06 MB, 832x1248)
>>101907212
sort of
>>
File: ComfyUI_00142_.png (2.15 MB, 1456x816)
The finetunes and loras are fucking up the text capability and prompt understanding of the model. The low quality lora spam of SD 1.5 and SDXL isn't going to fly with Flux.
>>
>>101908342
k FBI man
>>
>he doesn't know gradio telemetry spies on you
>>
File: 1694495208792029.png (1.48 MB, 832x1248)
>>
>>101907855
kek, i know exactly who the bottom left is talking about
>>
>he doesn't know about Cumfy's trip to CHYNA
>>
>>101908367
>he doesn't know everything spies on you
>>
File: 1703860668262515.png (2.54 MB, 1536x1536)
>>
File: grid-0122.jpg (370 KB, 2688x1536)
>>
>>101908305
because it's weird and scary to them
>>
>>101908319
The 1D weights are in FP16 (didn't matter in my testing for quality but added 150MBs to the weights, keeping them in BF16 killed speed due to needing to be dequantized as well).
The weights that get quantized down get converted to FP32 before being quantized to Q8_0/etc, though FP16 here is also fine.
The issue was that the dequantizing algo itself was running in FP16, and it was mangling some of the intermediate values while doing the maths in that part for Q8_0 and Q4_0. Q8_0 should be even closer to native FP16 now.

Yes I'll fix the actual bugs with loading soon, I just had these changes left over from yesterday that I still needed to test.
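the fp16-intermediates point is easy to see in isolation; same Q8_0 maths, different accumulation dtype (toy numbers, not actual flux weights):

import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, 32).astype(np.float32)    # one toy Q8_0 block

scale = np.abs(w).max() / 127.0
q = np.round(w / scale).astype(np.int8)           # quantize once, shared by both paths

deq32 = q.astype(np.float32) * np.float32(scale)                       # dequant maths in fp32
deq16 = (q.astype(np.float16) * np.float16(scale)).astype(np.float32)  # dequant maths in fp16

print(np.abs(deq32 - deq16).max())  # extra error purely from the fp16 intermediates

tiny per block, but it's applied to every weight on every step.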
>>
File: flux_lora_7.jpg (421 KB, 1024x1024)
>>101907212
It can generate anime shit with lora
>>
>>101908338
I can recommend https://github.com/upscayl/upscayl
>>
>>101908356
Some people still follow dreambooth practices when training loras like retards.
>>
>>101908415
That's very weird, chinese outsourced animu
>>
File: 1701133472164373.png (741 KB, 720x1080)
>>
>>101908446
...what's the new meta then?
>>
File: Galaxy_brain.jpg (14 KB, 220x220)
14 KB
14 KB JPG
>Curate dataset for fetish
>"Accidentally" put it on huggingface for scraping
>All models are now better at my fetish without having to spend a cent on training
>>
>>101908456
Not only did Flux become the best at handwritten text, it beat all existing fonts in the process.
>>
The oven door just opened and a loaf of bread appeared...
>>101908455
>>101908455
>>101908455
>>
>>101908458
Caption the images accurately and fully, don't use mangled names when training characters/celebrities.
>>
>>101908466
Did you mean to post this is /lmg/ or does this apply to image models as well?
>>
>>101908500
it's a meme, sir
>>
>>101908466
Was it pantyshots? There was a clear before and after with them: before, you'd get complete darkness below skirts; afterwards it was like when booru girls sit, but the panties are now photorealistic.
Dreamshaper 8 got them and then infected every model that merged with it with very detailed underwear.
>>
File: 4107775908.png (1.62 MB, 768x1344)
>>101908248
Neither was I, just saw it by chance on civitai.
>>101908257
Yeah, on forge it is. It's still early though and some stuff works with q4 but not with nf4 and vice versa.
>>
using flux1 schnell nf4, what settings do you recommend?
this is cfg1, Euler a 4 steps
>>
Minimum GPU requirements for Flux?
>>
>>101908720
it seems with forge webui and the schnell NF4 version it's quite usable on lower GPUs; not more demanding than SDXL, maybe even less
>>
>>101908486
What about an activation phrase? Even if it's not a new mangled string of tokens
>>
>>101908740
Nice. Would it work on gtx 1080?
>>
>>101907438
thank you!!
>>
>>101908861
anon, buy a new card
i dont know, maybe forge can offload to cpu but it wont be fast i'd guess
>>
>>101906089
Prompt?
>>
>>101908383
Name names
>>
File: 0.jpg (141 KB, 1024x1024)
>>
File: Untitled.png (266 KB, 470x560)
>>101905788
>>
chads, what PyTorch version should I have?
Launching Web UI with arguments: --listen --api --ckpt-dir=d://sd-checkpoints
Total VRAM 10239 MB, total RAM 32719 MB
pytorch version: 2.3.1+cu121
WARNING:xformers:WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.1.2+cu121 with CUDA 1201 (you have 2.3.1+cu121)
Python 3.10.11 (you have 3.10.11)
Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
Memory-efficient attention, SwiGLU, sparse and more won't be available.
Set XFORMERS_MORE_DETAILS=1 for more details
WARNING:xformers:A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
xformers version: 0.0.23.post1
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 3080 : native
Hint: your device supports --cuda-malloc for potential speed improvements.
VAE dtype preferences: [torch.bfloat16, torch.float32] -> torch.bfloat16
CUDA Using Stream: False
Using pytorch cross attention
Using pytorch attention for VAE
>>
>>101909374
https://x.com/voicesbyzane

he farms interactions with "anti-ai" arguments because he's a twitter personality with no voice acting career to speak of
>>
>>101908184
are loras working with q4 on comfyui?


