/g/ - Technology






File: flux8.png (913 KB, 1024x1024)
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>101914501

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>
>leave PC and have it autogenning
>come back
>turns out one of the first gens was the best by a very large margin
It keeps happening...
>>
>no collage
Thread ruined already.
>>
Purple goes great with green.
>>
File: ComfyUI_00059_.png (1.96 MB, 1248x1848)
>>101917935
I know that feel anon
>>
This gives me a Dr. Stone vibe
>Flux is quite good
>>
>>
>no collage
I'm going to kill myself
>>
File: kig news.jpg (712 KB, 1216x832)
>>
This fortnite one is spot on.
>>
>>
File: ComfyUI_00001_.png (622 KB, 832x1216)
We need a complete and total shut down of "photorealistic" slop until we figure out what the hell is going on.
>>
>>101918017
>We need a complete and total shut down of "photorealistic" slop until we figure out what the hell is going on.
you can remove the slop with this method
https://reddit.com/r/StableDiffusion/comments/1enm9og/discovered_by_accident_a_trick_to_make_flux/
>>
>>
File: ComfyUI_00071_.png (3.24 MB, 1248x1824)
>>
>>
>>
>>101918055
>>
File: file.png (880 KB, 768x768)
A Japanese girl in a bikini at the Beach. It's raining a little bit and the waves are slightly choppy, but you can see the sun rays in the distance.
>>
File: FD_00102_.png (1.43 MB, 1344x768)
>>
File: file.png (709 KB, 768x768)
>>101918071
2/2
>>
File: 00060-2738255433.png (1.28 MB, 896x1152)
>>
File: file.png (773 KB, 768x768)
so much for black butler anime prompt
>>
File: 00061-3103772079.png (1.15 MB, 896x1152)
>>101918137
its the 5gs that cause viruses, duh
>>
>>
>>
File: file.png (879 KB, 768x768)
1/2
>>
>>101918137
2 cents have been credited to your Pfizer food stamps account.
>>
File: t4mod5vtwft31.jpg (48 KB, 640x348)
>>101918137
>eat horse paste instead
That line sounded exactly like that line in Idiocracy
>>
File: file.png (681 KB, 768x768)
2/2
>>
File: 1717489504723024.png (683 KB, 896x1152)
neat, reactor also works with flux in forge

behold, gosling trump: txt2img btw, probably even better swaps in img2img
>>
File: kanye 1.png (626 KB, 768x768)
>>101918225
at least the horse paste would kill any parasites you might have from undercooked food
>>
>>101918137
oh no... debo has hijacked /ldg/, it's over...
>>
File: 1707261323702619.png (1.11 MB, 1024x1208)
>>101918233
>>
>You find an unmarked crate of H200s unattended, no cameras around, factory sealed and large enough that you can carry in your arms.

What do you do?
>>
File: delux_flebo_00035_.png (2.34 MB, 1152x1536)
>>101918242
I bless thee
>>
>>101916458
what is this extension?
>>
>>
>>
File: 1700960888872783.png (1.23 MB, 1024x1208)
sdxl/pony gens can also be used with forge/comfy inpainting for text or whatever you like too, fun stuff.
>latent noise + 1 denoise
>get signs
>>
>>101917989
derricious
i wish real asian women had asses like that
>>
>>
this shit should be illegal
>>
File: FD_00107_.png (1.41 MB, 1344x768)
>>101918137
>bill gates and fauci want to put tiny bruce lees in your body to fight covids. its anti american. eat horse paste instead
>>
>>101918399
fuck off tripfag
>>
>>101918366
They do anon, you just won't get to meet beautiful Asian women like that easily.
>>
File: 1723399415386986.png (1.1 MB, 1200x1200)
>>
>>101918431
it will be illegal after kamala is elected
even though her campaign is using the tech to fake crowds.
>>
>>101918447
it will be illegal 4 u, not for them
>>
>>101918294
The people will be getting the ultimate flux finetune(s) for realistic porn and hentai, for free.
>>
so reactor works 100% fine with flux (in forge, I assume comfy is the same), there is your loophole for celebs not in the dataset.
>>
wow! celebs!!
>>
>>101918294
I briefly consider stealing them before realizing I don't have all the other required equipment for actually mounting, running and cooling them, and walk away
>>
>>
File: 1719515153442698.png (882 KB, 1024x1024)
>>101918460
>>
>>101918294
A shit ton of XY plots to find the best settings for my workflow kek
>>
File: 1711291327084139.png (1.42 MB, 1200x1200)
guys there's a loophole, for the celebs!!!
>>
>inpainting in comfyui still intolerable after all this time
>>
>>101918447
I certainly do hope America makes it illegal so China can be the one to develop AGI.
>>
I think I'm done with heavy image genning for now. Time to take a break and go back to text gen for a while. Miku could use a break from all the walking herself.
Good night /ldg/.
>>
>>101918475
The cost of the peripherals would be minuscule compared to the profit you could generate from any number of uses for a crate full of enterprise GPUs.
Play your cards right and you have a legitimate passive income from the rental alone.
>>
>>101918499
it's a shame, I just keep a Fooocus install around for inpainting but I wish I didn't have to
>>
>>
File: 1701974691801744.png (935 KB, 1024x1024)
corporate miku:
>>
what the FUCK is bitsandbytes and how do you install it?
>>
>>101918532
pip install bitsandbytes
>>
>intelligent models have a slopped style
>soulful models are retarded and don't understand prompts
what do
>>
>>101918539
in which folder? main?
>>
>>101918548
don't think it matters
>>
>>101918447
Flux is open source. fuck the police. You can't stop Flux from existing on every computer. The results of the election won't stop the existence of open source AI. It will continue to spread and can not be stopped. No amount of zionist influence can control open source AI. It is the future. It is inevitable.
>>
>>101918548
Open the file explorer, go to the comfy ui custom nodes folder and the nf4 (which I assume you're using) and type cmd in the top bar
then write, pip install bitsandbytes
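the folder doesn't actually matter for pip btw, it installs into whatever Python environment is active, not the directory you opened cmd in. a minimal sanity check, assuming ComfyUI runs from the same interpreter you installed into (run it with the exact python that launches comfy, e.g. the embedded one in a portable install):

import sys
try:
    import bitsandbytes as bnb  # should import if pip hit the right environment
    print("bitsandbytes", getattr(bnb, "__version__", "unknown"), "in", sys.prefix)
except ImportError:
    print("NOT installed for this interpreter:", sys.executable)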
>>
>>
>>101918562
>Flux is open source
Why does everyone keep saying that? It's not open source, they just threw us the weights to play with.
>>
File: Comparison_all_quants.jpg (3.84 MB, 7961x2897)
>>101918548
do not use nf4 dude, it's worse than Q4_0 (which has the same size)
>>
>>101918435
>They do anon
nope
I've never seen an amazing-assed azn until this thread and I'm almost 40
>>
File: FD_00108_.png (1.44 MB, 1344x768)
>>
>>101918557
>>101918566
did it in main, I don't have any nf4 folder yet because it says to first install bitsandbytes and then the node

>>101918579
this shit is esoteric as fuck, the only way to know for sure is to test all this crap for yourself and compare
>>
>Trump is pro crypto and pro Artificial Intelligence
>Kamala is anti crypto and and anti AI
wtf I love Trump now

shit, I wasn't going to vote because the system is a joke and I don't trust the government but now this might convert me into a voter again
>>
>>101918597
>this shit is esoteric as fuck, the only way to know for sure is to test all this crap for yourself and compare
The simple fact that nf4 is the only quant that puts Miku in a different pose than the rest is a huge red flag, this shit stinks
>>
File: 1710494146180167.png (1.23 MB, 1200x1200)
>>
>>101918598
well duh, democrats are pro censorship and republicans are pro freedom, nothing new under the sun
>>
>>101918611
I did test it on forge and it kinda sucks, BUT I find everything in forge sucks by default
if I can install this fucking shit I'll report back
>>
>>101918611
He's right. They should all produce nearly the same image but with spot the difference levels of variation.
Miku being in a different pose is concerning.
>>
>>101918625
i love pro-freedom politicians that require me to swear oaths to israel if i want a government job
>>
>>101918627
forge supports both nf4 and gguf quants, so you better go for Q4_0, it's not really good but way better than this piece of nf4-v2 shit
>>
How to make the text more clear?
>>
>>101918640
both side do that though? it's not like voting democrat will help you on that shit
>>
>>101918640
the real path is to use AI to free these polticians from Israel control
>>
>>101918641
where can I get Q4? what are the advantages? does it load as a checkpoint or a unet?
>>
>>101918640
both parties and candidates are pro israel so you're fucked there
>>
>>101917989
brap
>>
>>101918673
this
>>
>>101918673
yeah this is real reason i didn't want to vote. unless they have the balls to denounce Israel and have a pro nuclear energy stance in their campaign, they aren't good enough.
>>
File: Capture.jpg (17 KB, 686x222)
>>101918665
>where can I get Q4?
https://huggingface.co/city96/FLUX.1-dev-gguf
>what are the advantages?
like I said, it's the same size as nf4 but produces better quality overall >>101918579
>does it run in chackpoints or unet?
in ComfyUI it's a unet loader because the huggingface repo gives you unet-only models (I prefer it like this, it's retarded to have to re-download the same vae + text encoder with every unet change)
https://github.com/city96/ComfyUI-GGUF
>>
>>101918374
Love this, catbox?
>>
>>101918705
you're a gentleman and a scholar
I'm going to test both because autism, and compare all the other models as well
>>
>>101918705
NTA but has a LoRA workaround been figured out for ggufs yet? It's very high on my importance list.
>>
>>101918723
>you're a gentleman and a scholar
thanks, and have fun with those anon o/
>>
File: FD_00112_.png (1.65 MB, 1024x1024)
>>101918579
Is it just me or are all the Q models half the speed of the F ones?
>>
>>101918753
not at all, look at the speed again, it's equivalent, only Q5_0 is fucked in that regard (my theory is that Q5 isn't a multiple of 2 so the GPU doesn't like it)
>>
>>101918715
Thanks
https://files.catbox.moe/qpqk8g.png
>>
>>101918741
>6.79 GB
oh shit, so small
ok, so based on your comparison table, Q4 would be perfect for 12Gb vRAM, right?
the rest are obviously way better, but go beyond 12Gb so they should be considerably slower, am I in the ballpark?
>>
>>101918753
No, but using CFG for some reason eats up all my vram even at q8
>>
>>101918779
Q5_0 is a little too big for 12gb vram but Q4_1 is exactly what you need, it's between Q4_0 (too small) and Q5_0 (too big)
https://huggingface.co/city96/FLUX.1-dev-gguf/tree/main
>>
>>101918765
>>101918781
Not talking about the picture or cfg I mean in practice on my 4080 in a standard workflow
>>
so I gave this a shot to compare to joy caption, https://huggingface.co/qresearch/llama-3-vision-alpha

got it working with Llama-3.1-8B-Lexi-Uncensored-V2 instead of the 4bit version of llama-3-8b-Instruct

seems to actually be uncensored, but the main issue is it squeezes the images to 384 x 384; with joy caption I don't know what it resizes to, it handles images differently
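for anyone wondering what the 384x384 squeeze actually costs, this is the difference between a hard resize and an aspect-preserving downscale. illustration only, the real preprocessing lives inside each captioner's own code, and the path is just an example:

from PIL import Image

img = Image.open("sample.png")          # example path
squeezed = img.resize((384, 384))       # aspect ratio lost, small text gets smeared
kept = img.copy()
kept.thumbnail((384, 384))              # fits within 384x384, ratio preserved
print(img.size, squeezed.size, kept.size)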
>>
>>101918809
Q quants aren't supposed to be slower than the fp ones if you have enough vram to run, if you don't, then I don't know how it behaves
>>
>>101918543
Learn to prompt each the way it’s supposed to be prompted.
Learning is a process that involves being wrong repeatedly. Some people are incapable of enduring this discomfort and thus never learn anything. They pretend to know everything and scurry through life hiding from themselves.

Now, how the fuck is it possible for AUTOMATIC1111 and Comfy to gen normally (SD 1.5) while Forge takes whole minutes for 2 measly low res iterations?
>>
>>101918797
yeah, if anything it might be worth testing Q4 and Q8
I'll do so
>>
File: 1720045341036217.png (1.07 MB, 1024x1024)
gen made in sdxl/pony, inpaint edits (to test) with flux/forge. flux doesn't have all the character loras yet, but you can use existing loras/models together with flux, to do whatever you like. flux inpainting even without controlnets is good. (school uniform before edits)

flux excels with text and many other things, but for specific characters sdxl/pony/1.5 are still the best option unless flux has the character in it (like Miku)
>>
>>101918353
>I open an image editor
>I type text onto the sign
???
>>
>>101918435
That ass is as fake as those “Japanese palm trees”. Clearly a Californian that has never been to Asia conjured that image. A tiny Californian living inside your computer.
>>
>>101918811
joycaption certainly isn't 384, it can pick up some pretty small text
>>
>>
>>101918598
The companies that run your country have already decided how it’s going to go down. Your candidate of choice will only affect the flavor of the whining from the plebeians.
>>
>>101918705
what about Q4.1 and Q5.1, have you tried those?
>>
>>101918853
>>
what does this mean? happens every time I try to run a gguf
>\ComfyUI\custom_nodes\ComfyUI-GGUF\dequant.py:10: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
data = torch.tensor(tensor.data)
>>
>>101918502
Neat gens
>>
>>101918872
she has momface (positive)
>>
>>101918867
We know how the Q behave, it's like this;
Q4_0 < Q4_1 < Q5_0 < Q5_1 < Q8_0
The comparison image was made simply to see how some of those Q quants perform against nf4 and the fp4/fp8
>>
>>101918832
How the fuck is it possible for AUTOMATIC1111 and Comfy to gen normally (SD 1.5) while Forge takes whole minutes for 2 measly low res iterations?
>>
>>101918867
i cant get 4.1 to work in forge. q8 works fine.
>>
>>101918892
>>101918904
roger

>>101918897
imo forge is fast but sucks ass, and speed over quality is useless
>>
File: 1699249603097897.png (1.43 MB, 1024x1024)
>>101918777
Thank you bro
>>
Finally, comfyUI can be experienced the way it was meant to be
>>
>>101918925
forgot pic
>>
>>101918935
>lightmode
literal monster
>>
File: 1699778459948193.png (1.12 MB, 1024x1024)
>>101918843
devil noko:

surprised the inpainting is consistent in flux even without openpose or canny being used. still want all the controlnet stuff though.
>>
>>101918935
>>
>>101918951
wonder if it's a model size thing
like how a big LLM can intuitively one shot stuff that needs to be explained to a smaller one with examples
a large image model can inpaint coherently without assistance
>>
File: wtf.png (51 KB, 1126x614)
okay wtf is this shit?
why does the search and the nodes look different now
>>
>>101918987
Because the UI was updated to look less shit
>>
>>101918962
yeah cause normally i'd need to use openpose to 100% get a figure in proper proportion if inpainting, or canny for 1:1 lineart (and it's fast, not really a problem), but flux seems to work well as is which is nice
>>
>>101918951
What are you using for inpainting? Forge was trash.
>>
>>101919001
swing and miss, I guess
>>
File: 1702187293077325.png (1.1 MB, 1024x1024)
only prompt: large white angel wings. anime style.

made two blobs on both sides and voila, wings. I know inpainting is nothing new but flux is good at lots of stuff including text in perspective so it's also good for edits.

>>101919011
this is forge with the flux nf4 model, default img2img inpaint tab. seems like auto1111 to me
>>
>>101919016
>>101918987
Comfy really made a UI change on a system used almost exclusively by autistics. I don't think he thought this through.
>>
>>101919033
>this is forge with the flux nf4 mode
ditch that shit anon and go for Q4_0 instead >>101918579
>>
>>101919064
Would if I could Anon. This is all I get.
>>
>>101919083
maybe you should make an Issue on his repo so that he can fix it
https://github.com/city96/ComfyUI-GGUF/issues
>>
>>101918919
Nice
>>
>>101919093
Is it even a comfy issue because nobody else seems to be getting it. It's something on my side but I have no idea what
>>
>>101919083
did you update that GGUF package? he did fix some shit for the vramlets
>>
>>101919061
well I think comfy's UI is utter trash anyways, but the point is that it looks the same or slightly worse
>>
LoRA for gguf when?
>>
File: 1655701130683.png (165 KB, 960x540)
>>101919108
maybe there's more people having the same issue as you and aren't as vocal about it and decided to give it up, you're different anon, you're a man of sheer will and determination, and you will make an issue so that he can look at it and make it work with you
>>
>>101919129
already possible on forge, the gguf anon said he's working on it to make it work for the comfyUi node
>>
>>
wtf is "distilled cfg" vs regular cfg
>>
>>101919203
distilled cfg is a cfg replacement that works on flux because regular cfg burns the picture at cfg > 1. if you want my opinion, distilled cfg is a fucking meme and nothing beats the old cfg; fortunately you can run cfg > 1 without burning the image with Dynamic Thresholding
https://reddit.com/r/StableDiffusion/comments/1ekgiw6/heres_a_hack_to_make_flux_better_at_prompt/
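for reference, this is roughly what "regular" cfg computes at every step: two model passes blended by the scale, which is also why turning real cfg on halves your speed. minimal sketch, "model" here is just a stand-in for the denoiser, not any real API:

def cfg_step(model, x, t, cond, uncond, cfg_scale):
    eps_cond = model(x, t, cond)      # prediction with your prompt
    eps_uncond = model(x, t, uncond)  # prediction with the empty/negative prompt
    return eps_uncond + cfg_scale * (eps_cond - eps_uncond)

flux dev's "distilled" guidance instead feeds a guidance number into a single pass, so it's cheaper but behaves differently, hence the hacks above.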
>>
>>101919235
For how much you've been shilling your process you'd think your images would look better than they do
>>
>>101919250
srsly
>>
File: 1716304732143457.png (1.73 MB, 1024x1216)
if i want to train a flux lora, i should use the fp16 model, right
>>
>>
>>101918811
yeah, strangely this llama-3-vision-alpha can read text (like their book example image) but it tends to go off the rails and skip over stuff

someone made a nice modification where you can change the image without restarting https://huggingface.co/qresearch/llama-3-vision-alpha/discussions/2
>>
>>101919131
I did it
>>
>update comfy
>math nodes no longer work
Many such cases
>>
>>101919083
I get that too but it works, that being said, nf4 is faster
>>
>>101919250
Silence, Nigger. His CFG method has unsloppified many of my gens.
>>
>>101919250
They don't? https://imgsli.com/Mjg2Mzgy
>>
>>101919235
i answered my own question by looking in the previous bread but ty
it doesn't make sense to call it "distilled cfg" when its conditioning guidance. i must have missed that one
>>
Is there any way to get rid of that plastic look of flux dev right now? I tried some "realism" Loras but they don't really make much difference.
>>
>>101919368
I've found lower guidance of 2.0 or 2.5 can help sometimes. Beta scheduler too.
>>
>>
Comfy bros, they are making fun of us, bros. That Forge guy apparently has some superior tech, bros. What do we do now? This Comfy cuck is leaving us wanting, bros. He's making a joke of us, bros. Should we start a harassment campaign against this Forge guy so he gives us his tech, bros?
https://www.youtube.com/watch?v=90cz8prKfMY
>>
>>101918579
how much vram do you have anon?
>>
>>101919432
It says in the image dumbass
>>
>>101919432
I have 2 GPUs, the first one (3090, 24gb) is for the unet model, the 2nd one (3060, 12gb) is for the text encoder
>>
File: ComfyUI_Flux_10.png (1.17 MB, 1216x832)
Fucking love GGUF Q4
>>
>>101919449
lick my taint, faggot

>>101919450
dang, no wonder you get those speeds
>>
>>101919273
if you're using kohya sd-scripts he said it has to be that model or it wont run
>>
>>101919464
>dang, no wonder you get those speeds
the speed you're seeing in the picture is GPU1 running the unet; you can get the same speed if you force the text encoder onto your cpu, the text encoding will take some seconds, but after that you'll get normal speed on your gpu as long as it has enough vram to run your unet model
https://reddit.com/r/StableDiffusion/comments/1el79h3/flux_can_be_run_on_a_multigpu_configuration/
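the general idea in plain pytorch if the reddit post ever dies. toy modules standing in for the real text encoder / unet (assumes at least one cuda device), just to show that only the small embedding tensor has to cross devices:

import torch
import torch.nn as nn

second_device = "cuda:1" if torch.cuda.device_count() > 1 else "cpu"
text_encoder = nn.Linear(16, 64).to(second_device)  # stand-in for t5/clip
unet = nn.Linear(64, 64).to("cuda:0")                # stand-in for the flux unet

tokens = torch.randn(1, 16)
with torch.no_grad():
    emb = text_encoder(tokens.to(second_device))     # runs once per prompt, can be slow
    out = unet(emb.to("cuda:0"))                     # every denoising step stays on the fast gpu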
>>
>>101919486
thanks
>>
>>101919453
Can you post (or link to) your settings? What are your gen times? Are you doing anything special to offload the model? Schnell or dev?
I'd appreciate if you could share your workflow. I've been struggling to get usable times with 8 VRAM.
>>
>>101919450
how the hell did you do that? i've wanted to do that for ages with my handful of GPUs
>>
jesus christ, I've practically re-coded app.py for joy caption to get it to run locally without any errors or issues. add that the requirements.txt is missing 15 different requirements so you have to manually pip install

all because huggingface is cucked and won't give me full uptime on a gpu even if I pay for it. I hate this shit
>>
>>101919521
look at this reddit post >>101919490
>>
File: FD_00115_.png (1.4 MB, 1344x768)
>q4 is better than fp8
what the fuckl
>>
>>101919235
Even when using your setup, I find it still deepfries the hell out of my gens with the CFG set to 6, especially on photos.
>>
>>101919548
Dynamic Thresholding is bad on realistic pictures, go for its alternative, Tonemap
https://reddit.com/r/StableDiffusion/comments/1estj69/remove_the_blur_on_photos_with_tonemap_an/
>>
>>101919527
>add that the requirements.txt is missing 15 different requirements so you have to manually pip install
I just used my ComfyUI venv and missed no deps lmao you brainlet
>>
>>101918883
I get this too, but it still runs, so I'm ignoring it for now since the loader is literally called "bootleg" you don't need to think too much about it
>>101919083
oh i see, it doesn't work for you. good chance you installed the gguf package incorrectly and missed your comfyui environment. try launching your terminal of choice, installing it there, and immediately running comfy from that same terminal
>>
>>101919490
interesting
I'm on 12Gb so I doubt I'll make a difference
so far NF4 is the fastest, but the quality doesn't seem worth the speed against Q4
what about the workflow? is it better to use the fp8 workflow, or the dev workflow?
>>
>>101919573
lmao I'm a literal retard
oh well, its over now
its done, it works
running on --listen so I can comfily caption from my laptop
>>
File: FD_00116_.png (1.33 MB, 1344x768)
>>101919578
Weirdly it ran for these 2 images, now it gives the same error
>>
>>101919583
>what about the workflow? is it better to use the fp8 workflow, or the dev workflow?
can you elaborate? I don't think I understand what those mean? And if you want to use a fp8 model, go for Q8_0 instead, that one is the closest to fp16 >>101918579
>>
>>
What is the difference between _1 and _0 and which one do I need for 16GB VRAM?
>>
>>101918597
>this shit is esoteric as fuck, the only way to know for sure is to test all this crap for yourself and compare
i have said it when SD started
I have said it when the chinese spellbook got posted
I have said it when LLMs started
and I will say it again:
THIS AI SHIT IS BLACK VOODOO
>>
put me in the collage
>>
>>
>>101919612
sounds like some kind of oom thing then. i also get inconsistent behaviour from quanted flux, i can run base flux just fine (but at 6s/it so i have to use schnell and it still takes half a minute), while q4 either gives me 2s/it or overflows into system ram and takes >60s/it. so for now i'm just doing pixel art on pony and waiting for the half-working flux configurations to be a bit more user friendly, since even after i get it semi working flux is still pretty slow for me
>>
>>101919650
Prompt:
>Anonymous 08/16/24(Fri)19:52:58 No.101919650▶
>put me in the collage
>>
>>101919614
here:
https://comfyanonymous.github.io/ComfyUI_examples/flux/
the FP8 workflow and the DEV workflow have different nodes
the FP8 workflow doesn't have the the dual clip loader
>>
Flux widened the skill gap so much I can't fucking make non-slop gens now.
>read the reddit post
Nothing works and do not try to help.
>>
Anyone got a workflow with multiGPU support and Loras?
>>
>>101919662
I am running 16GB on a 4080. The model seems to unload when I finish a gen, as my vram and ram usage goes down. It's very confusing.
I am re-downloading the models to see if that helps.
>>
>>101919677
So you just want to whine? This isn't your blog post nigga
>>
>>101919700
Shut the fuck up I am furious and will continue to be until I have a heart attack or figure it out.
>>
>>101919676
If you want to use flux-dev then go for the dev workflow, simple as that, and change your model accordingly on the unet loader if you want fp8
>>
>>
>>101919706
why won't you just say what's the problem? I can try to help, 2 brains is better than 1
>>
File: file.png (2.97 MB, 2902x3350)
>>101919431
Forge has lora support for all the quants, but it's pretty hacky and it "patches" the lora in every time you generate an image, even if you re-generate with the same loras and weights
>>101919504
My workflow is a bit of a mess and relies heavily on AnythingEverywhere nodes so I drew over it to help understand it a bit better
The load times are with cfg 6 and negatives, so without them it's even faster. Using the modified tonemap node >>101919565
>>
>>101919692
I wouldn't bother with that, just learn to check a file's sha to confirm nothing changed it https://huggingface.co/city96/FLUX.1-dev-gguf/commit/860865faf9e5ddf33132b703afe11e21baeac32d it's easier than anything else you've done to get flux running. fwiw if i cared more to get flux running, my own next step would be to just try forge, it's probably more idiotproof than comfy. If you strongly prefer comfy's ui you'll probably be able to move back to it within a few days regardless, these nodes are barely 24 hours old. leech off the commons and let someone else improve it.
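minimal sha check in python if you don't want to fiddle with certutil. filename and hash are placeholders, grab the real sha256 from the file's page on huggingface:

import hashlib

path = "flux1-dev-Q4_0.gguf"                       # example filename
expected = "paste_the_sha256_from_huggingface_here"

h = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # 1 MB chunks so RAM stays flat
        h.update(chunk)

print("OK" if h.hexdigest() == expected else "MISMATCH: " + h.hexdigest())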
>>
>>101919631
8_0 would fit in your vram completely so don't bother with the rest.
Also, what's the point of a 4_1 quant? It won't fit in 8gb vram, and 10+gb vram cards are much better off using 5_0 or 5_1
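rough rule of thumb: the unet file has to fit in vram with a few GB spare for activations, the vae and whatever else is loaded. back-of-the-envelope check for a file you already have; the headroom number is a guess you'll want to tune for your own setup:

import os
import torch

gguf_path = "flux1-dev-Q4_1.gguf"   # example filename
model_gb = os.path.getsize(gguf_path) / 1024**3
total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
headroom_gb = 3.0                   # activations, vae, text encoder if it shares the card

fits = model_gb + headroom_gb <= total_gb
print(f"{model_gb:.1f} GB model vs {total_gb:.1f} GB vram -> " + ("should fit" if fits else "will likely spill to system RAM"))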
>>
Is Flux vram usage not dependent on resolution being generated..? I seem to be able to gen at 512x512 fairly quickly, but it gets orders of magnitude slower at 1024x1024, but nobody else seems to be mentioning attainable resolutions as a factor in which quant to run.
>>
>>101919711
let me explain again:

you can use GGUF and DEV (dev-fp8) models with t5xxl_fp8_e4m3fn or t5xxl_fp16, those are all inside the UNET folder

then you have NF4 and comfy's own dev-fp8 that go inside the checkpoint folder, and are not paired with t5xxl_fp8_e4m3fn or t5xxl_fp16; the clip loader node is not present in that workflow
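roughly, going by the above (standard comfy folder names, filenames are just examples):

ComfyUI/models/
  unet/         flux1-dev-Q4_0.gguf or a unet-only dev fp8/fp16, loaded with a unet/gguf loader
  clip/         t5xxl_fp8_e4m3fn.safetensors or t5xxl_fp16.safetensors + clip_l.safetensors (dual clip loader)
  vae/          ae.safetensors
  checkpoints/  flux1-dev-bnb-nf4-v2.safetensors or comfy's all-in-one dev-fp8, no separate clip/vae nodes needed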
>>
>>101919801
>Is Flux vram usage not dependent on resolution being generated..?
of course, like every single image model
>>
>>101919779
>what's the point of a 4_1 quant
It's for the 9gb vram cards
It all works now after re-downloading the models, except my math nodes for the upscale apparently don't work
Same workflow just with gguf and the clip_l changed to ViT-L and t5xxl changed from fp16 to fp8.
I don't really get it
>>
>>101919868
clip_name1 should be for the t5, and clip_name 2 should be for clip_l or that VIT-L, you switched the both of them
>>
>>101919801
Because of how huge the models are (before all the quants and stuff that have been made), the people who could run it had enough VRAM that it didn't matter what resolution they genned at. With the smaller models being made and people with less VRAM starting to use Flux locally, I think in the next week or so we'll see more people test how resolution size impacts VRAM usage
>>
Comfy quant lora yet?
>>
>>101919881
Seems to be working with how I set it. I will do the other way and see what happens.
>>
>>101919881
the order doesn't matter
>>
How the fuck can Forge run so fast? How can one man be so good at optimizing code. I thought that was Comfy's shtick.
>>
>>101919955
>Comfy's shtick
Is copypasta.
>>
>>101919923
>>101919903
>>101919881
Correct, it does not matter. I got identical outputs
>>
>>101919955
Same speeds for me except the UI is buggy as fuck and it regularly OOM's when comfy doesn't. Loras and inpainting work tho.
>>
>>101919882
That makes sense, thank you
>>
>>101919955
Because he's not a retard like A1111. The whole point of Forge was to make a gradio interface that wasn't dogshit.
>>
>>101919729
>My workflow is a bit of a mess and relies heavily on AnythingEverywhere nodes so I drew over it to help understand it a bit better
Thank you, absolute legend.
>>
Just found a comfy node implementation for JoyCaption.

https://github.com/StartHua/Comfyui_CXH_joy_caption
>>
>>101919955
I would love to use forge but image quality looks like shit, no idea why
>>
>>101919992
Bro, I don't read chinese.
>>
>>101920000
Instructions in the README look fairly obvious just from the screenshots
>>
>>101919992
I don't get it, is JoyCaption as good as 4o or InternVL?
>>
>>101920010
>4o
No, but it is uncensored.
>InternVL
Maybe.
>>
>>101919955
Does it work with fp16?
>>
I still don't understand why my math nodes broke.
Can any Anon take a look and see if you can figure it out for me? I am absolutely lost.
https://files.catbox.moe/19ct20.png

I must have changed something setting this up but I swear I didn't.
>>
>>101919955
Forge is downright unusable for me. I'll stick to A1111 and Comfy thank you.
>>
>>101920037
I think he updated it recently to work with fp16 yes.
>>
>>101920028
If it's not as good as 4o then it's not as good as InternVL as it is SOTA anon, take a look
https://huggingface.co/spaces/opencompass/open_vlm_leaderboard
>>
>>101920000
If only we could develop a technology that let us translate text automatically. Perhaps some day we might even get it integrated in the browser. That would be awesome.
I guess you're out of luck until then, bro.
>>
>>101920058
And which of those models are local, Anon?
>>
>>101920058
>76B
Joy Caption uses llama 3.1 8B so I don't think you should be comparing them
>>
>>101920075
>>101920058
you use GPT4V for SFW captions, and that joy thing for NFSW captions?
>>
File: file.png (126 KB, 2368x527)
>>101920040
Try explicitly converting your values? idk
>>
>>101920041
same, shame that A1111 can't flux
>>
File: image_webp(3).png (1.32 MB, 1024x1024)
>>
>>101920088
>sfw/nsfw status of finetuned model/lora output now relies on whether your prompting style pattern matches better to gpt4v or joycaption writing style and topic tendency
>>
What the fuck is the point of the Guidance Scale? What was wrong with CFG? I'm a CFG guy. That's how I learned to use this shit and that's how I mean to keep using it.
>>
>>101920040
>>101920093 is essentially what i had to do but i also swapped the math node for int math from searge
>>
>>101920106
>What the fuck is the point of the Guidance Scale? What was wrong with CFG? I'm a CFG guy. That's how I learned to use this shit and that's how I mean to keep using it.
that's what I do, I let GuidanceScale positive at 3.5 and I only change the CFG
https://reddit.com/r/StableDiffusion/comments/1ekgiw6/heres_a_hack_to_make_flux_better_at_prompt/
>>
>>101919431
>not using both
NGMI
>>
>>101920106
FluxDev was trained to emulate CFG without actually having to do CFG
They used FluxPro to generate a ton of images at different CFGs then finetuned FluxDev with those images plus the CFG value used concated to the prompt so it learned what different CFG values look like
>>
>>101920106
Using real CFG makes it twice as slow
>>
File: 00002-2813909806.jpg (462 KB, 1344x768)
>>101919453
With 6 GB vram I got OOM in Comfy but just werks in Forge
>>
File: image_webp(3).png (1.32 MB, 1024x1024)
>>
>>101920098
>muh japanese retrowave look
>but all the settings are a weird mix of californian and 50s americana
>>
>>101920133
*finetuned FluxPro into FluxDev
>>
>>101920107
>>101920093
Yeah that's the issue. Both were int and the node was looking for floats.
Thanks Anons.
>>
>>101920088
InternVL 26b is completely local and fits on a 3090, as it is better than 4V you don't need more.
>>
>>101920133
Sounds plausible. Source?
>>
File: file.png (86 KB, 1317x508)
what in the sam hell is this shit and why is it taking so fucking long
>>
>>101920160
But it still begs the question, what changed? It was working perfectly this morning.
>>101920107
Did yours break today too?
>>
>>101920152
Forge is the new king. We should put him first in the general. Now all he has to do is to make his own node based ui to completely cuck Comfy.
>>
>>101920172
share your workflow
>>
>>101920133
Interesting. I guess I'll have to learn to use the Guidance Scale then.
>>
>>101920172
This is the same problem I have with forge. Even when I use SD 1.5.
>>101920190
>your workflow
kek clueless
>>
>>101920172
Open your task manager and watch your RAM/SSD usage, maybe it offloads shit there
>>
>>101920163
The source is the description of FluxDev, the BFL code that runs it and my average IQ connecting the dots.
>>
>>101920195
I am clueless what are you doing?
>>
Finally gave in and decided to install the Flux to see what it's like. It's 21 fucking GB. Fuck that shit. That's 10% of my hard drive. I guess this one is only for the rich kids who can afford the storage.
>>
File: Capture.jpg (145 KB, 1436x1215)
>>101920161
If InternVL can't say that this character is eating with her foot, then it's not better than 4V, can it do that? https://files.catbox.moe/d0evak.png
>>
>>101920180
yes, and now workflows / seeds ive used before produce different outputs but that must be something on my end because i imagine others would have reported this. i used the math node for some resolution... math, but now even when i simply replace that with a primitive node, specifying exactly the res it would have been with the math, i get a different output. not sure whats going on.
>>
>>101920228
storage has evolved since 2005 bro
>>
>>101920228
>Finally gave in and decided to install the Flux to see what it's like. It's 21 fucking GB. Fuck that shit. That's 10% of my hard drive. I guess this one is only for the rich kids who can afford the storage.
download the Q8_0 quant instead, it's twice as light (12gb) and the quality is similar to fp16 >>101918579
>>
>>101920210
NTA but in what instance could forge offload a 7 GB model to disk if I have 52 GB of RAM? Is there a setting to control this?
>>
File: 00000-961690828.png (615 KB, 768x1024)
>>101920210
ah yep, that's what's happening
i lowered the resolution a bit to where it would only use VRAM and it's zoomin' again
guess i just need to find that sweet spot
>>
>>101920156
Sadly Flux cannot actually do vintage photography properly until someone makes a Lora. Not only does it not replicate the analog + film grain style, it also does not output period-specific hairstyles etc
>>
>>101920152
Something has been severely wrong with Comfy's implementation, and I say this as a 24GB user: it has never been smooth sailing. It always slows down when I start using my PC, so I have to close everything and restart. Not even having completely cleared RAM fixes the issue apparently, so it seems to be a caching issue.
>>
CCCOOOOMMMFFYYYYYY!!!!!!!!!!
>>
>>101920250
Thanks. Now that's more reasonable. 21 fucking GB. Seriously? Are we expected to run this on a server?
>>
>>101920284
it can be run on a 24gb card like the 3090/4090 but yeah... it's really reaching its absolute limit and I hate that, glad that quantization is a thing kek
>>
>>101920284
anon, it's not their fault you're a third worlder running mid 2000s hardware
>>
>>101920270
Also I have 64GB of RAM, I may upgrade soon but there shouldn't be a problem when I tend to have a ton available for Flux.
>>
>>101920244
Hmm, my outputs seem identical. It's only the math nodes that broke for me. But my workflow is pretty basic, just a gen and an upscale
>>
>>101920298
64GB of RAM may be enough for Forge, but Comfy isn't for poorfabs. Comfy was fast compared to Auto, yes, but only because Auto is a fucking mess. Comfy's priority was always quality. If you want quality you go Comfy. If you want something fast and badly coded you go Forge. Simple as. I know which side I'm on. I advise you to at least double that RAM if you want to experience what Comfy truly is capable of doing.
>>
>>101920058
taking a look at https://github.com/Z-L-D/Autocaption
>>
>>101920298
I got 32GB and don't have OOM issues
Comfy does have some memory management issues, for example if you change the encoder it won't unload the previous one first and OOMs but that's something you rarely do
>>
>>101920238
Feel free to test it out https://huggingface.co/spaces/OpenGVLab/InternVL
>>
File: file.png (1.63 MB, 1024x1024)
hehehehe
>>
>>101920325
Wow imagine being this much of a faggot.
You go comfy if you want nodes, you go forge if you want a webui interface, and you go a1111 if you are retarded or new.
>>
>>101920268
holy shit, someone already did it, I had no idea lol
https://civitai.com/models/646686/japanese-photo-1980s-style-1980
>>
nochekaiser881 started doing Flux loras, here comes the anime flood
>>
>>101920389
Can Joy do it?
https://huggingface.co/spaces/fancyfeast/joy-caption-pre-alpha
>>
>>101920363
HF seems to be broken, use
https://internvl.opengvlab.com/
Their default is pro, their best model, change it to 26b in settings.
>>
>>101920389
>>101920394
>>
File: Capture.jpg (317 KB, 3669x1455)
>>101920363
I can't ;'(
>>
>>101920412
>realistic uncircumcised state
>>
>>101920324
i reverted to a commit from a few days ago and im still getting different outputs
fuck this gay ass earth im going to bed
>>
>>
>>101920389
>>101920412
This is not the foot prompt you described but NSFW...
>>
>>101920389
>>101920394
this is joy, going to test this next: https://github.com/Z-L-D/Autocaption

The image is a photograph featuring a penis with a cartoonish, exaggerated expression. The penis is circumcised, with a smooth, pinkish-red glans. The shaft is slightly darker and more textured. Above the penis, in blue, handwritten text reads "TEH REIL" in a playful, childlike script. The background is a plain white surface, likely a bed sheet or a tablecloth, which provides a stark contrast to the subject. The overall tone of the image is humorous and playful, with an emphasis on the absurdity of the situation. The image's lighting is bright, with no shadows, which highlights the textures and colors of the subject. The style is reminiscent of amateur, humorous digital art. The image is likely intended to be humorous or ironic, rather than serious or educational. The subject is explicit, and the image may not be suitable for all audiences. The image's composition and style are straightforward and unadorned, focusing attention on the central subject. The image's tone is lighthearted and playful, aiming to elicit a chuckle or a raised eyebrow. The image's explicit content and playful tone make it a memorable and attention-grabbing visual. The image's focus on the penis and the accompanying text creates a sense of whimsy and humor. The image's style and content are reminiscent of amateur, humorous digital art. The image's overall tone is lighthearted and playful, aiming to elicit a chuckle
>>
File: Capture.png (975 KB, 1389x1378)
>>101920363
And... that's a failure!
>>
>>101920444
>a stark contrast
GET OUT OF MY HEAD GET OUT OF MY HEAD GET OUT OF MY HEAD
>>
>>101920444
note I've modified joy caption to use the uncensored llama 3 and modified the prompt, but from testing it doesn't help too much
>>
>>101920430
to clarify, the math node is working again but im still getting different outputs and somehow they are more sloppa than before. it must be a custom node that changed but im not about to git revert every single one. goddamnit.
>>
>>101920444
Prompted this
>>
>>101920468
Certainly!
>>
>>101920392
God fucking dammit I was hoping he'd be filtered, now if I post my Loras to shitvit they'll sit beside his shitheaps
>>
>>101920468
It's important to remember that language is a tool we use to convey ideas, and sometimes common phrases can be a convenient way to express a thought. While "a stark contrast" might be a familiar expression, it serves its purpose in illustrating differences clearly. Let's not let the choice of words overshadow the message we're trying to communicate. The real value lies in the ideas and perspectives we're sharing, not just in how they're phrased.
>>
>>101920412
lmaoooooo, that's impressive
>>
>>
>>101920499
If I didn't already have enough datasets to recaption to last me a lifetime I'd redo every lora he puts out but better, out of pure, unadulterated hatred
>>
File: 1714706426927015.jpg (339 KB, 2304x2304)
aren't we using flux1-dev-bnb-nf4-v2 ? that's 11GB, am I using the wrong model?
>>
>>101920522
that's the old thing, we're all about the new thing now
>>
>>
File: file.png (80 KB, 1324x480)
>>101920260
i take it back, it's still fucking up
goddammit
>>
>>101920522
>aren't we using flux1-dev-bnb-nf4-v2 ? that's 11GB, am I using the wrong model?
use Q4_0 instead, same size, better quality >>101918579
>>
>>
Come and get that nice late night/early morning bread...
>>101920553
>>101920553
>>101920553
>>
>>101920522
we all hate nf4 and think it stinks! try city96's gguf quants instead. probably q4 if you're an 8-12GBer, q8 if you have 16+, though it depends on the speeds and resolutions you're wanting to work in. just start with the q4 and upgrade if you find the resolutions you want already generate plenty fast enough for you. for comfy you'll also need his "bootleg" gguf loader for now. think they're natively loadable in forge.
>>
>for some reason joycaption uses up exactly 4 GB vram per each of my GPUs and refuses to run on just one of them, no matter what I add to the code or modify in accelerate config

I give up, use them both then
>>
>>101920558
what about the _1s ?
>>
So does Comfyui have malware or not?

>t. newfren
>>
>>101920238
>>101920460
Are you sure you are using 4o or 4v?
>>
im going to have an aneurysm
>>
>>101920604
the _1 are better than _0 albeit a bit bigger, it's something like this:
Q4_0 < Q4_1 < Q5_0 < Q5_1 < Q8_0

>>101920611
>Are you sure you are using 4o or 4v?
Yes I'm sure
>>
>>
>>
File: 47908640.png (1.34 MB, 896x1152)
>>
File: 0.jpg (96 KB, 1024x1024)
All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.