/g/ - Technology






File: vocaloid teto.jpg (3.39 MB, 3072x3072)
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>102000715

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/g/sdg
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>
File: 1722484139767935.jpg (331 KB, 1536x2048)
>mfw
>>
File: ComfyUI_01084_.png (1.1 MB, 1280x720)
>>102003576
Thanks for the bread baker
>>
File: ComfyUI_08009_.png (904 KB, 1200x768)
>>102003621
>haven't used pony at all since Flux came out
I can never go back
>>
>>102003588
I would be embarrassed to make such garbage level gens after putting that many hours into the hobby
>>
File: ComfyUI_00999_.png (893 KB, 720x1280)
>>102003663
Flux really did set the bar high for any competitor
>>
File: 00002-2024-08-21-cJak.jpg (2.78 MB, 2048x2688)
>>102003677
Post cut
>>
File: ComfyUI_05047_.png (1.94 MB, 1200x1200)
PSA: LoRAs with Network Dim 16 and Alpha 16 can support multiple costumes in Flux Dev via token activation and booru tags.

>>102003405
>>102003569
>>
>>102003695
I must admit, I like first one better. oh man lol
>>102003663
you'll lust for "tentacle sex" sooner or later.
>>
File: bComfyUI_105561_.jpg (393 KB, 768x1024)
>>
File: 00018-3991584975.png (2.81 MB, 1280x1920)
>>
Did anyone try the T5 GGUF? If so, how do the quants compare to f16 (are they comparable or shit like the fp8 version)?
>>
How are the T5 GGUF quants? are they comparable to the fp16 version or shit like fp8.
>>
File: ComfyUI_01905_.png (724 KB, 1024x1024)
>>
File: ComfyUI_00936_.png (930 KB, 1280x720)
>>102003873
The results are pretty good
Not much of a difference so far
>>
>>102003873
>>102003881
use Q8 or Q6_K, it's free:
https://www.reddit.com/r/StableDiffusion/comments/1ewpwtp/flux1_t5_v11xxl_gguf_clip_encode_compare_gguf_vs/
>>
File: ComfyUI_01103_.png (872 KB, 1280x720)
>Is that true anon?
>You won't disappear once I take my meds?
>>
LoRA training. 1024x1024 or 512x512 (can't be both, it fucks things up because of buckets)
Rank 16 or rank 32... or more?

So many little levers to pull.
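
For anyone weighing those levers, here is a minimal sketch of what rank and alpha do numerically. It assumes the common LoRA convention where the low-rank update is scaled by alpha / rank; the function names and the 3072 layer width are hypothetical, picked only for illustration, and a given trainer may define things differently.

```python
def lora_scale(alpha: float, rank: int) -> float:
    """Multiplier applied to the low-rank update under the alpha / rank convention."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one adapted linear layer.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


# Hypothetical layer width, used only to show how the count grows with rank.
WIDTH = 3072

for rank in (16, 32):
    scale = lora_scale(alpha=16, rank=rank)
    params = lora_param_count(WIDTH, WIDTH, rank)
    print(f"rank={rank} scale={scale} params_per_layer={params}")
```

Under that convention, doubling the rank while leaving alpha at 16 halves the effective scale, so the two settings are not independent.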
>>
>>102003930
Tuning hyper parameters is my kind of pseudo-science.
>>
>>102003952
You don't write good luck prayers in the script then comment them out?
>>
After extensively testing the instant x flux union model, I can safely conclude that it is complete garbage. It will never adhere to the input and the outputs by simply having it on are guaranteed to be fucked up.
>>
You guys are fucking weird man
>>
>>102003725
>tentacle sex
I think a well trained LoRA can satisfy that. Besides, single GPU (24gb) finetuning is now fully available.

https://github.com/kohya-ss/sd-scripts/tree/sd3
Time to caption your dataset.
>>
>>102003696
*Smell
>>
Fuck yes thank God for math nodes, I fixed my upscaling issue. Setting a +32 pixel buffer on the tiles helps a lot.
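
A rough sketch of the arithmetic behind a padded tile grid, assuming the buffer means growing each tile by 32 pixels on every side and clamping to the image edge. `padded_tiles` is a hypothetical helper, not a node from any UI, and the actual workflow may compute its boxes differently.

```python
def padded_tiles(width, height, tile, pad=32):
    """Return (left, top, right, bottom) boxes for a tile grid.

    Each box is grown by `pad` pixels on every side and clamped to the
    image bounds, so neighbouring tiles overlap by up to 2 * pad pixels.
    """
    boxes = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            boxes.append((
                max(left - pad, 0),
                max(top - pad, 0),
                min(left + tile + pad, width),
                min(top + tile + pad, height),
            ))
    return boxes


for box in padded_tiles(2048, 2048, tile=1024):
    print(box)
```

The overlap gives the sampler some shared context across each seam, which is presumably why the buffer helps.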
>>
>>102004050
even me?
>>
>>102004050
it's like a freakshow, it's the only reason to come here
>>
File: ComfyUI_05049_.png (2.15 MB, 1200x1200)
>>102003930
16 is pretty good. I crammed 5 costumes into 16 ranks. 768x768... Cause even my 4090 on W11 seems to be leaking memory to RAM on 1024x1024 and slowing things to a crawl. Maybe there are a few optimization options available to make 1024x1024 viable, haven't checked all of them yet.
>>
>>102004104
ESPECIALLY you
>>
>>102004124
c'mon man..., don't be like that
>>
File: ComfyUI_01117_.png (933 KB, 1280x720)
>>102004124
Me! Me! I wanna be weird!
>>
File: fakecat.jpg (45 KB, 578x566)
>>102003621
>>
>>102004132
>ellipses followed by a comma
The weirdest of all.
>>
>>102004156
Something is wrong with your cat. He may have a syndrome.
>>
>>102004158
..., ?
>>
File: ComfyUI_01090_.png (1004 KB, 1280x720)
>>102004156
Does it mean I win?
>>
File: ComfyUI_01119_.png (848 KB, 1280x720)
>>102004156
He just saw lots of weird things lately, don't mind the cat
>>
File: drippinga2.png (3.41 MB, 1536x1536)
>>
>>102004201
Cool outfit. Prompt?
>>
File: ComfyUI_00782_.png (992 KB, 1024x1024)
>>
File: 1722098651210483.png (1.95 MB, 1024x1024)
dude get off my The Sims 3(tm) lawn

wait why am i an insect creature? how kafkaesque of you...
>>
>>102004123
Using Ai toolkit or Kohya?
>>
>>101997631 >>101996982 >>101998799 >>101999002
these gens look awesome
>>
>outputs randomly coming out as fried noise
is this a comfy update or something? was working fine before
>>
File: 00136-13434.png (3.92 MB, 1912x1744)
>>102004293
thanks anon, if you want to reproduce that you can use this lora and this kind of prompt:
https://civitai.com/models/667307
>y2k style cover art with a stylish girl drawing, y2k style text at the bottom: 'FRESH'
>>
>>102004212
i dont remember i made it with an a2 lora a while ago. i think i used a gradient
>>
File: ComfyUI_01127_.png (811 KB, 1280x720)
>>
File: ComfyUI_01128_.jpg (95 KB, 783x720)
>>102004342
>>
File: ComfyUI_01129_.png (811 KB, 1280x720)
>>102004361
>>
File: ComfyUI_01131_.png (508 KB, 1280x720)
>>102004374
>>
>Train LoRA at 512x512
>Sample images come out with the quality of a 512x512 image

I don't know why I expected any different.
>>
guys...

https://civitai.com/models/668646/full-female-anatomy-fineart-nudity-flux-lora-nsfw-breastspussyassand-more
>>
File: Capture.png (456 KB, 2773x1452)
https://github.com/city96/ComfyUI-GGUF/issues/54#issuecomment-2300129278
if your images went blurry with ddim_uniform scheduler it's probably because of this
>>
>>102004445
it's funny how easy it was to uncensor flux, like this took not even 2 weeks whereas for SDXL it took fucking months to achieve something remotely close to this, and this is just with loras
>>
I want a workflow that generates a sdxl/pony image with a lora, then uses flux to generate backgrounds based on that image (via controlnet depth or something like that)

would be neat, the only thing pony can't do is detailed backgrounds like flux. sdxl/pony still have all the best character loras (for now). I mean anime style, there are plenty of good flux loras for real people.
>>
File: 00035-3647639194.png (1.77 MB, 1024x1440)
>>102004405
I've trained a few now at 512x512. I'm getting good results and likeness but I'm noticing a lot of fine detail is lost. I deliberately chose images where her cheek mole is very prominent for my dataset and it barely shows up
>>
>>102004291
Kohya SS, with GUI
>>
Is there an efficient loader for Flux?
>>
>>102004306
they finally added sovl into flux.. the future is bright.. i close ma eyes..
>>
>>102004474
it's not going to fit together, it'll look green screen
>>
Feature/Update News:

[--fast] A '--fast' option has been added that can boost performance for 'float8_e4m3fn' on 'RTX 4000' or 'ADA' GPUs. (A recent version of torch is required.)
>>
Since lora training seems to work so well, we can be sure that finetuning will work equally well or better, right? Like it isn't technically possible for a model to respond well to lora training yet somehow be untunable?
>>
File: 19240467.jpg (21 KB, 460x460)
>>102004488
Right, I'm looking into it now, it seems the GUI isn't caught up to the SD3 branch yet... or is it?

Its documentation isn't as straightforward as AI toolkit, and all the discussions of the subject are full of the cerfucking guy.
>>
>>102004509
or another option, just take an image and use something like segment anything to make a perfect mask in flux.
>>
>>102004512
I cant wait for anime finetunes so Miku can play with her friends without a bunch of loras.
>Miku Hatsune fighting Saber from Fate Stay Night with a celery stick
>>
>>102004512
lora and finetuning are functionally the same thing one is just on a mass scale
>>
>>102004467
and that lora actually works, it's a bit samefacey (but the creator wrote so) but he is already working on v2 ... and with a few tricks I actually achieved tentacle porn with just that lora
>>
File: 00038-2024-08-21-cJak.jpg (2.96 MB, 2048x2688)
>>
>>102004525
>I cant wait for anime finetunes so Miku can play with her friends without a bunch of loras.
this, exactly this, what flux needs is just more knowledge of characters/celebrities and we won't be far from perfection
>>
>>102004519
>Inserts himself into discussion
>Spams his ugly fucking face
>Subscribe to my patreon to see my findings!
Hate this fag
>>
>>102004512
>Like it isn't technically possible for a model to respond well to lora training yet somehow be untunable

Probably not. They're very similar in principle. I think the main thing is that it might be difficult to actually "improve" the model. The dataset seems pretty good, so whatever you add in a finetune probably also needs to be immaculately captioned and curated as well.
I think you can do little fine tunes here and there and it probably won't do much, but big fine tunes might be a bigger undertaking than before.
>>
File: ComfyUI_01138_.png (1001 KB, 904x1280)
>>102004536
>>
>>102004486
Did you tag the cheek mole?
>>
>>102004445
>all front facing portraits
>all the same stupid landing strip
it's all so tiresome. training complex concepts isn't this easy.
>>
>>102004543
the worst part is that I looked up at his patreon and this fucker is making fucking thousands per month, that's insane, I live in a world surrounded by retards...
>>
brainlet here, the clip encodes your prompts and then sends the conditioning to the sampler, right. Could you encode a bunch of prompts you like separately, save them, then later send them directly to the sampler without ever having to load clip?
>>
>>102004556
>>102004543
I hate him so much it's unreal. He literally does nothing. He just hangs around developments like a bad smell. He contributes nothing, teaches nothing that isn't explicitly outlined in the documents, he just IS.
>>
File: 00031-2024-08-21-cJak.jpg (2.88 MB, 2048x2688)
>>
>>102004544
And the inability to sell the resulting finetune removes commercial incentives, meaning that only rich autists will be able to afford training the vast number of steps required to substantially change the model

Which means the first serious finetune that totally changes the model will probably be made by furries
>>
File: ComfyUI_01140_.png (1.06 MB, 904x1280)
Tell me again why I need a LoRA?
>>
>>102004567
What, like load a bunch of clip converted prompts from the text encoder into a document or something and then fire them into the sampler at a later date?

I don't see why that wouldn't be possible, but I don't know enough about what's going on under the hood.
>>
>>102004445
>no puffy nipple
>>
>>102004577
People are screeching way too hard about the license. Until BFL actually comes knocking for their payment I call bullshit on them ever enforcing it.
>>
>>102004590
>What, like load a bunch of clip converted prompts from the text encoder into a document or something and then fire them into the sampler at a later date?
yes, just as you described it
>>
>>102004604
who's gonna take the risk? do you think people are willing to pay thousands for a finetune and then having the stress of possibly getting a lawsuit because they decided to get that money back through donation and shit?
>>
>>102004614
That sounds possible, in fact, I bet you could paste the script that produces the t5 outputs from the text encoder into ChatGPT, ask it to modify the node to write them to a text document in the form of a list, and then make another node that takes that list as its input. You could even cycle between your cached prompts with a slider or something.
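
A minimal sketch of the caching idea discussed above, using only the standard library. Everything here is an assumption for illustration: `fake_encode` is a hypothetical stand-in for the real text encoder (it just hashes the prompt into a few floats), and `ConditioningCache` is not an existing node or API. A real version would store the encoder's actual tensors instead of JSON lists.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def fake_encode(prompt):
    """Hypothetical stand-in for a text encoder: returns a short list of floats."""
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    return [byte / 255 for byte in digest[:8]]


class ConditioningCache:
    """Store encoded prompts on disk so the encoder only runs on a cache miss."""

    def __init__(self, folder, encode):
        self.folder = Path(folder)
        self.folder.mkdir(parents=True, exist_ok=True)
        self.encode = encode
        self.encoder_calls = 0

    def _path(self, prompt):
        # Key each file by a hash of the prompt text.
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        return self.folder / f"{key}.json"

    def get(self, prompt):
        path = self._path(prompt)
        if path.exists():
            # Cache hit: the encoder never has to be loaded.
            return json.loads(path.read_text())
        self.encoder_calls += 1
        conditioning = self.encode(prompt)
        path.write_text(json.dumps(conditioning))
        return conditioning


with tempfile.TemporaryDirectory() as folder:
    cache = ConditioningCache(folder, fake_encode)
    first = cache.get("miku holding a celery stick")
    second = cache.get("miku holding a celery stick")
    print(first == second, cache.encoder_calls)
```

The second lookup is served from disk, which is the part that would let a low-memory setup skip loading the text encoder for prompts it has already seen.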
>>
File: 00051-1495168978.png (2.45 MB, 1024x1440)
>>102004549
I included "she has a cheek mole" in my captions for the closeup photos. Maybe I'll try WD14 tags next
>>
>>102004630
I think you're overestimating the cost of the fine tune and the potential risk to the uploader. There are other concerns around BFL getting their money as well including international borders and bad KYC on the part of the uploader. Assuming the uploader was intentionally hiding their identity, which I don't think is even necessary.
>>
>>102004540
thats the only thing it lacks, I want an anime finetune and a real/celeb finetune. loras are fine but what is neat about pony models like autismmix is you can get GREAT results even with basic booru tags. it knows what saber/mordred/nero are, even without a lora.
>>
>>102004586
a simple lora enables full lewds if you desire it
>>
>>102004630
Just lie about what the base model is... I doubt they could tell after a fine tune.
>>
>>102004586
I wish I was as easy to please as you.
>>
>>102004656
yeah, you want to include a new character in your gens, with a greatly finetuned model that can be instant, loras are just a patch, you have to search civitai, pray that your beloved character exists, then download it, then load the lora, then wait for the model to unload/reload to include the lora (that can take minutes), and often the lora is too strong and alters flux's whole capabilities, and you can't stack too many loras at the same time, it gets slow and flux starts to shit itself, it's the antithesis of fun, nothing will beat a model that already has the knowledge inside of it
>>
File: 1701969225418166.png (1.33 MB, 1024x1024)
it's neat how much the model seems to know conceptually, even a "world of warcraft ingame screenshot with UI":
>>
>>102004630
How does CivitAI do it? People can pay buzz for early access to Flux Dev loras. Buzz can be exchanged for USD. Both CivitAI and the lora maker are profiting from Flux Dev.
>>
>>102004667
how are you gonna do that? the finetune will keep the model's size, everyone will know it's still a 12b model, and only flux is a 12b model
>>
>>102004667
There's no other 16ch model like it on the market, anon
>>
>>102004687
Civitai is actually paypigging BFL for a commercial license though
The discussion is about amateur trainers who won't want to do that
>>
>>102004687
>How does CivitAI do it? People can pay buzz for early access to Flux Dev loras. Buzz can be exchanged for USD. Both CivitAI and the lora maker are profiting from Flux Dev.
maybe a % of that money is given to the BFL labs?
>>
>>102004679
yeah, flux makes amazing mikus for example because it has knowledge of miku in the data: now if a finetune did this for all popular anime characters you wouldn't even need a lora for most gens. But loras are still very useful, it's like plugging in a character preset.
>>
>>102004693
If you use it commercially, I assume you're talking about setting up an API endpoint instead of just giving the weights.
>>
>>102004688
Those 2 red dots are gonna cost you three days my dude.
>>
>>102004718
if it's just making an API that is problematic, then the pony-fag would've worked on flux dev already, it's probably about the donations aswell
>>
>>102004655
>I think you're overestimating the cost of the fine tune and the potential risk to the uploader.
we don't live in the SDXL era anymore, flux is a 12b model, it's gonna be really expensive to make a real finetune of it, the pony-v6 finetune cost tens of thousands of dollars, and it was just for a 3.5b model, now imagine for 12b
>>
File: ComfyUI_01145_.png (1.22 MB, 904x1280)
>>102004725
kek, its about time we replace the mods with AI
>>
what is the diminishing return point for steps in flux? 20? is there any benefit to a 50 step gen, default is euler but it's not euler a
>>
>>102004758
I think most people use 25-30 for euler. That's what I've seen from the workflows posted here at least
>>
File: 1711485332033427.png (1.26 MB, 1024x1024)
now we're gaming. not even a WoW aesthetic lora, just base model.
>>
>>102004758
>what is the diminishing return point for steps in flux? 20?
I think it's 30, that's when you get consistency in quality
https://reddit.com/r/StableDiffusion/comments/1er3wt7/if_you_want_a_good_compromise_between_quality_and/
>default is euler but it's not euler a
because the ancestral thing doesn't work on flux unfortunately
>>
>>102004741
We've yet to really see how flux responds to finetuning, but it's actually quite resource efficient so the price might not be as out of this world as you think if everything goes well.
>>
>>102004634
I only ask because it might make the process of using larger models on systems with less memory more efficient
>>
>>102004771
I hope so anon, and I agree it's true that flux needs less work to be good, SDXL base needed a lot of work, what flux needs the most is more concepts and some NSFW training and that's pretty much it
>>
>>102004783
My hope is that you can train what you want into the model, throw the model out there and someone else will merge it back into their model until an ultra muttified model is produced.
>>
>>102004794
that was the process that created good models for SD15 at the beginning.. omy good old berrymix
>>
File: 1706539424716252.png (1.15 MB, 1024x1024)
rip and tear!
>>
File: 00060-2024-08-21-cJak.jpg (3.25 MB, 2048x2688)
>>
>>102004783
the core of flux is very good so the gigabrain people working with it will make really good stuff with it. SD3 had fundamental flaws that they can't fix.
>>
File: ComfyUI_01148_.png (1.16 MB, 904x1280)
Goodness no! What have I created!
>>
File: ComfyUI_01149_.png (1.11 MB, 904x1280)
>>102004829
I wanted something cute like this! Not that furry cancer!
>>
>>102004848
>yuri
a man of culture I see
>>
File: ComfyUI_01151_.png (1.13 MB, 904x1280)
>>102004848
Here, I failed gen for you
>>
File: 2.jpg (1.57 MB, 3072x1536)
Two half-assed kissing prompts on two different models
>>
how do i actually do a proper hiresfix/2nd pass on flux? i turned denoise to 1 to test it and it just generates a blurry image rather than the oddly scaled images like 1.5. when i lower the denoise, the same thing happens: i just get a slightly more blurry version of the input
>>
>>102004950
I don't understand why people are still using Stable Diffusion workarounds on Flux...
>>
>>102004950
Flux works at any res out of the box because of its architecture. Although in practice, results may vary.
>>
>>102004950
I've never had issues with inpaint on comfy with flux. Forge was fucky.
>>
>>102004950
Don't use latent upscale
Upscale the decoded image with nearest exact/bicubic (or an ESRGAN model if you want to get fancy), then vae encode and run img2img at low denoise
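
To make the first step concrete, here is a toy nearest-neighbour upscale done in pixel space on a plain 2D grid. It is only a sketch of what "upscale the decoded image" means as opposed to resizing the latent; `nearest_upscale` is a hypothetical helper, and the VAE encode plus low-denoise img2img pass that follow are left to whatever UI is in use.

```python
def nearest_upscale(pixels, factor):
    """Integer nearest-neighbour upscale of a 2D grid given as a list of rows.

    Every source pixel becomes a factor x factor block of identical values.
    """
    out = []
    for row in pixels:
        # Repeat each value horizontally, then repeat the widened row vertically.
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


small = [
    [1, 2],
    [3, 4],
]
for row in nearest_upscale(small, 2):
    print(row)
```

The enlarged image carries no new detail by itself; the low-denoise sampling pass afterwards is what fills it in.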
>>
How much benefit would I get from moving this from my sata to nvme ssd?
>>
>>102005021
I imagine model loading speeds would go way up but nothing else
>>
>>102005021
the model would load much faster for the first gen
after that no benefit, unless you're swapping models a lot
>>
File: 2024-08-20_00368_.jpg (295 KB, 2560x1440)
>>102004950
either this >>102004990
or use SDUltimateUpscale .. I scale 1280x720 gens with SDUltimateUpscale using flux itself

pic related
>>
File: ComfyUI_01159_.png (1.12 MB, 904x1280)
>>
>>102005046
>Those fused arms.
>>
File: ComfyUI_07921_.png (1.15 MB, 1200x768)
LoRAs on Flux are so much more effective than they were on Stable Diffusion. Flux is just so perfect in many ways. Just makes me more pissed that Stability AI was trying to pass off SD3 as the next big thing.
>>
File: ComfyUI_01160_.png (1.02 MB, 904x1280)
>>102005048
>Those fused arms.
Love at its finest
>>
>>102004800
>berrymix
jesus, that was a long time since.
NAI v3, Elysium, URPM, ATSB....
>>
>>102005074
Bstaber
>>
>>102005082
Isn't that the pizza model?
>>
>>102005089
people said that, but I only ever used it for landscapes
it was really good for them for some reason
>>
>>102005089
that was the mythical cstaber
>>
When are we getting actually good controlnets, not the cheap shit that instant x is putting out?
>>
File: Nice.jpg (2.49 MB, 3072x3072)
>>
trinart... My sweet
>>
>>102005157
why does this make me so uncomfortable to try to read/look at
it's like uncanny valley or I'm having a stroke
>>
>>102004543
Jesus fuck, even if I filter him from GitHub just knowing he exists irritates me. It's worse than Nochekaiser
>>
File: ComfyUI_05011_.png (1.32 MB, 1024x1024)
>>
>>102004961
>>102004974
a flux image at 2k doesnt magically produce a bigger version of the 1k with the same seed. it's stupid to waste time generating at 2k+ base res only for the seed to be shit.
>>
>>102005217
>a flux image at 2k doesnt magically produce a bigger version of the 1k with the same seed.
it doesn't, but the quality is still great at 2k, so why not make your gens there
>>
File: ComfyUI_05012_.png (1.43 MB, 1024x1024)
>>
>>102005217
If you want to upscale then use a proper upscaler, not the hiresfix shit
>>
File: ComfyUI_05013_.png (1.57 MB, 1024x1024)
>>
>>102004446
They do
>>
File: ComfyUI_00129_.png (1.19 MB, 768x1280)
Man, a fashion lora with designer clothes and more face variety would be sick just to make outfits and cool poses. Flux really enables way more creativity...
>>
>>102004567
Yea, this is possible. But seeing how t5 is quantized now the time it saves is negligible
>>
>>102004990
DAT that shit.
>>
Can I set the outputs folder for saved images on comfy to a different drive? Every time I try to set one it errors. I don't want them in the outputs folder
>>
>>102005298
No I didn't recommend that for a reason
If he's impatient enough to be trying to use latent upscale to avoid waiting for the extra vae steps, there's no way he's going to tolerate the speed of DAT
>>
File: ComfyUI_05016_.png (1.24 MB, 1024x1024)
>>
File: ComfyUI_05017_.png (1.33 MB, 1024x1024)
>>
File: ComfyUI_00939_.png (1.32 MB, 928x1232)
https://civitai.com/models/668799/game-assets-cartoon-style-3d-isometric-background-assets-for-small-games-flux?modelVersionId=748705
that one is a win
>>
>>102005321
that absolutely makes sense. I'm easing into it. the heavy ones are nasty tho.
>>102005351
very nice!
>>
It just occurred to me how fucked Stability AI actually is. Basically anyone of any worth left, all they have is a handful of finetuners. All the researchers are gone. Wonder what they’re even doing at this point.
>>
File: ip_adapter_example1.jpg (525 KB, 2661x1497)
https://huggingface.co/XLabs-AI/flux-ip-adapter
holy fuck that's a huge deal no? with this you just need a picture of something to make it work
>>
Please gen literally anything other than Miku, I'm sick of seeing her obnoxious face, I'm begging you. Anything else. Every thread is spammed with Miku. I'm so tired..
>>
>>102005413
there aren't many native characters other than Miku that Flux knows...
>>
>>102005412
but can I run it with less than 16gb VRAM yet
>>
>>102005445
it's also somehow both undercooked and fried at the same time. just be patient, a proper ipadapter is going to take months.
>>
>>102005445
>ditch the original comfyUi repo?
are you the same anon who didn't understand he had to pip install the requirements.txt from yesterday... be honest...
>>
>>102005427
Given you can run flux with 8GB VRAM using the gguf quant checkpoints and that these models are 2GB combined, I'd think yes.
>>
>>102005412
>Go to ComfyUI/custom_nodes/x-flux-comfyui/ and run python setup.py
I don't think it's that simple, it would be installing the packages on the default python shit, and not on ComfyUi's own venv
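
One way to check that concern is to ask the interpreter itself. The sketch below uses only the standard library and assumes the goal is to install into whichever Python actually launches the UI; the helper names are hypothetical, and it does not reproduce what the repo's setup.py does.

```python
import sys


def in_virtualenv():
    """True when the running interpreter belongs to a venv.

    Inside a venv, sys.prefix points at the venv while sys.base_prefix
    still points at the base installation.
    """
    return sys.prefix != sys.base_prefix


def pip_install_command(requirements="requirements.txt"):
    """Build a pip command bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", "-r", requirements]


print("inside a venv:", in_virtualenv())
print("interpreter:", sys.executable)
print("command:", " ".join(pip_install_command()))
```

Running this with the same Python that starts the UI shows where packages would land; invoking pip as `python -m pip` with that interpreter avoids installing into the system Python by accident.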
>>
>>102005483
>gguf
wait, is this compatible with gguf quants though?
>>
File: 1716804847210620.png (1.74 MB, 1024x1024)
>>102005472
No, he's not, that's me
>>
File: 1699812400729585.png (3 KB, 334x46)
>>102003576
3060 isn't enough anymore
>>
>>102005412
>Download our IPAdapter from huggingface, and put it to ComfyUI/models/xlabs/ipadapters/*.
I don't see a "xlabs" folder in there, should I create it?
>>
File: FLUX00013.png (1.89 MB, 1536x1248)
>>
>>102005626
yes.
>>
>>102005412
having an ipadapter is a huge deal for fixing style if it works, since you can just show the model an image in the style you're looking for

but xlabs' controlnets fucking suck, so I don't have high hopes for their ipadapter being good
>>
File: ComfyUI_00008_.png (1.22 MB, 1024x1024)
>>
>>102005661
>having an ipadapter is a huge deal for fixing style if it works
and for characters aswell, no need loras with this shit
>>
File: 1700388400653404.png (2.3 MB, 1416x920)
flux does inpainting well too, here's a random inpaint of a random KCD image with a square blob:
>>
File: 1718467357756014.png (2.49 MB, 1280x1280)
>>102005529
>>
File: 1699541494302768.png (88 KB, 400x87)
>>102005683
Thought you were bullshitting at first, that little HUD at the bottom looks so damn clean
>>
>>102005676
it's not as accurate, it sees only one perspective of the character
>>
File: Capture.png (63 KB, 1494x1151)
>>102005412
wtf it's OOMing the clip encoder part
>>
>>102005712
I only did an inpaint on the mid-right area in a box. normally it's a forest area. but if you do WoW UI, it kinda knows how it looks.
>>
>>102005714
yeah, InstantID is better for example, I was able to use just one picture of anyone and it could work on every angle
>>
6GB vram mafia bros...
>>
Is latest comfy update still a bit broken for some?
>>
>>102005412
It's a shame quality suffers from the race to be first, then nobody else bothers to try. Like cool, it's ipadapter for flux, except it looks worse than 1.5 and takes minutes per generation.
>>
>>102004644
Does your lora work with different poses and camera distances besides the generic portraits?
>>
>>102005725
anon... it still is making shit up, it only has one perspective to work with
a lora can get all around detail
InstantID can't know if your character has a huge scar on the back
>>
>>102005744
>InstantID can't know if your character has a huge scar on the back
that's too specific to be a redflag, 99% of characters have normal bodies
>>
https://github.com/XLabs-AI/x-flux-comfyui?tab=readme-ov-file

ipadapter readme, dling files now
>>
>>102005786
and even if it isn't perfect it's new, but controlnet functionality in general is one of the best things about 1.5/XL. When canny started working for SDXL models you could literally do one click edits, source image -> prompt/lora character
>>
File: Capture.png (1.14 MB, 3387x1294)
>>102005412
Ehhh??? It's not even accurate to the original picture
>>
>>102005661
>but xlabs' controlnets fucking suck, so I don't have high hopes for their ipadapter being good
this, the quality is atrocious
>>
File: file.jpg (17 KB, 720x194)
>>102005814
>>
>>102005825
xlabs always sucked though, so the "beta" thing is just copium
>>
>>102005723
Can you catbox that?
>>
>>102005814
you got a blonde anime girl with no girl in your prompt, it even has the details like a bow, that's still impressive
>>
>>102005829
it may ta
>>
>>102005841
>you got a blonde anime girl with no girl in your prompt,
that's because I used the exact same prompt as their workflow, I guess that's how it's supposed to work?
>>
File: ComfyUI_temp_zxslf_00009_.png (1.19 MB, 1024x1024)
>>102005412
ya it works, somewhat. This was made with a picture I made in SDXL. It changed the style considerably. For comparison this is the original ... (next post)
>>
File: 1693468798522464.jpg (1.2 MB, 2560x1440)
>>102005839
all I did was inpaint a box on the right part of the image with the prompt "restaurant", here is the source image: all default settings, denoise was 0.75 and default inpaint settings.
>>
>>102005852
anon I think you're retarded
like no offense
just the way you respond to things is retarded
like you don't quite get what is being said
>>
File: Capture.png (1.35 MB, 3342x1325)
>>102005412
>>102005814
>If you get bad results, try to set true_gs=2
ok that's better, maybe using a character with no background was a bad idea, I'll try something else
>>
File: 00462-3801570587.png (3.88 MB, 2048x2048)
>>102005857
.. this is the original. Maybe it works better for photorealism.
>>
>>102005860
>all I did was inpaint a box on the right part of the image with the prompt "restaurant"
Oh shit that is impressive then. Thanks for the heads up, will try
>>
>>102005863
>a retard calling others retards
such many cases
>>
File: 1696589800512102.png (2.39 MB, 1280x1280)
>>
>we are patching diffusion model, be patient
is it patching the base checkpoint? is it fine for non ipadapter gens?
>>
File: 1709466849984599.png (2.4 MB, 1280x1280)
>>
Error occurred when executing XlabsSampler:

_scaled_mm_out_cuda is not compiled for this platform.

do I need a new cuda or torch or something?
>>
>>102005882
My IQ is 124
>>
File: Capture.png (2.48 MB, 3365x1343)
>>102005871
>Maybe it works better for photorealism.
ok in the trash it goes kek
>>
Yes these numbers are coming together nicely indeed.
>>
>>102005922
>124
>He can't even count
>D>>102005930
oesn't realize 3 comes after 2
What a fucking unrepentant tard..
>>
>>102005412
>x labs
From the moment Flux was released, they have been sprinting to be the first to release the most undercooked and shitty controlnets possible. It's almost offensive how bad their work has been.
>>
>>102005945
Wtf how did that quote of >>102005930
get in there, I'm committing sudoku out of embarrassment
>>
Everything is changing so fast.
I very much doubt the way i had flux set up over the weekend is the most efficient method any more.
Is there a handy compiled chart floating around of which models work at what level of quality and speed? Did we come to a consensus?
I use Linux on a 4060 Ti 16GB with a 13th gen 13500, 64GB DDR5 and a fast M.2.
Speed is the priority but happy to give up some for maintaining the highest quality that fits the specs with wiggle room for Loras etc. No worries if not, just wondered if I'd missed some vital resources.
>>
>>102005683
>KCD
based
>>
>>102005930
kek, yea this seems to be a work in progress at best:
I tried to make her hair black - fail
I tried to make her ride a bike - fail
I tried to make her realistic - fail
I tried to make her be painted in picasso style - fail
I tried to make her do a V sign - fail

.. two more weeks.
>>
>>102005968
I'm not convinced the quant models actually help with speed at all, it's just as slow for me using them as not. I figured they were just to prevent oom.
>>
_scaled_mm_out_cuda is not compiled for this platform.

I dont get why im getting this error, I have cuda 11.8, torch is installed, torchvision, etc, files are all there. fp8 checkpoint, but error when I try to use the ipadapter.
>>
>>102006004
>I figured they were just to prevent oom.
it's a huge deal for the vramlets, they aren't forced to offload some of the model into the ram, and that makes shit way faster
>>
>>102005968
the consensus is, shit's whack. gonna be a while before things calm down.
>>102005892
kinda soothing, I like
>>
File: ComfyUI_00019_.png (3.1 MB, 1536x1536)
>>
File: ComfyUI_05019_.png (1.44 MB, 1024x1024)
>>102005373
indeed
>>
Error occurred when executing ApplyFluxIPAdapter:

'Linear' object has no attribute 'out_features'

fun
>>
>>102006004
there's nothing inherently faster about them, they just compress the model to be smaller which helps a lot if you can't fit it all in VRAM and is useless if you can
>>
>>102006027
did you update comfy and that xlabs shit?
>>
>>102006035
will try, used to doing an occasional git pull but there are rapid changes I guess

still this

Error occurred when executing ApplyFluxIPAdapter:

'Linear' object has no attribute 'out_features'

it dies at apply flux ipadapter node.
>>
File: ComfyUI_05020_.png (1.3 MB, 1024x1024)
>>
>>102006055
im gonna do the clone repo stuff and do it fresh, might be something that messed up
>>
>>102006069
Error occurred when executing XlabsSampler:

_scaled_mm_out_cuda is not compiled for this platform.

do I need a certain CUDA version or torch, everything else for flux works
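
Not a confirmed fix, since the thread never resolves this, but the error names an FP8 scaled matmul kernel, so the first things worth checking are which torch build is installed and what compute capability the GPU reports. A minimal sketch follows. The helper names are hypothetical, and the 8.9 threshold is the compute capability PyTorch requires for FP8 matmul (Ada and Hopper or newer). The error text itself reads like the installed build was compiled without that kernel, which is a guess.

```python
def fp8_matmul_capable(major: int, minor: int) -> bool:
    """Hypothetical helper: FP8 scaled matmul needs compute capability 8.9+."""
    return (major, minor) >= (8, 9)


def describe_environment() -> str:
    """Collect the version info worth pasting into a bug report."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    lines = [f"torch {torch.__version__}, built for CUDA {torch.version.cuda}"]
    if torch.cuda.is_available():
        major, minor = torch.cuda.get_device_capability(0)
        lines.append(
            f"{torch.cuda.get_device_name(0)}: sm_{major}{minor}, "
            f"fp8 matmul capable: {fp8_matmul_capable(major, minor)}"
        )
    else:
        lines.append("no CUDA device visible to torch")
    return "\n".join(lines)


if __name__ == "__main__":
    print(describe_environment())
```

If the card reports a capability below 8.9, or the build turns out to be the problem, a non-fp8 checkpoint or a gguf quant is the obvious thing to try next.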
>>
stop wasting your time with xlabs trash, nothing they make is worth using
they are rushing out garbage because they want to be first and they don't care if it's actually usable at all
>>
>>102006113
this, 100% this
>>
Does nvlink give you more available VRAM?
>>
>>102005892
>>102005915
calming
>>
>>102006142
no, sadly nvlink doesn't work with desktop cards and torch, best you can do is use your second card to load the text encoder for a minor speed improvement
>>
>>102005727
>gtx1060 wait forever*
ftfy
>>
>>102005860
>>
File: ComfyUI_00027_.png (2.9 MB, 1536x1536)
>>102005727
>not cpu forever
casual
>>
>>102006223
How long does a CPU gen take?
>>
File: ComfyUI_00029_.png (3.09 MB, 1536x1536)
>>
>>102006222
awesome, inpainting blew my mind when I first used it and it still blows my mind how it can alter reality, basically. To do that without AI would take ages in photoshop.
>>
>>102006231
forever
>>
>>102006068
s-sovl
>>
>>102006236
yup. my cock gonna be so long.
>>
>>102006231
I'm not actually doing it but lol, it was 3 minutes per iteration when I tried. Maybe that could be improved down to 1 minute with the fastest possible RAM and a really good CPU?
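
For a sense of scale, total time is just minutes per iteration times the step count. The 20-step figure below is an assumption, not something from the thread. The per-iteration rates are the measured 3 minutes and the hoped-for 1 minute.

```python
def gen_minutes(minutes_per_iteration: float, steps: int) -> float:
    """Total wall-clock minutes for one image, ignoring load and VAE decode time."""
    return minutes_per_iteration * steps


STEPS = 20  # assumption: a typical step count, not stated in the thread

for rate in (3, 1):
    total = gen_minutes(rate, STEPS)
    print(f"{rate} min/it x {STEPS} steps = {total} min ({total / 60:.1f} h)")
```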
>>
>>102006259
>Flux
>cock
>>
>>102006068
which lora/style is this? I know it's animal crossing
>>
>>102006285
nm found it

https://civitai.com/models/668799/game-assets-cartoon-style-3d-isometric-background-assets-for-small-games-flux

looks like it sorta
>>
File: ComfyUI_00031_.png (2.84 MB, 1248x1848)
>>
>>102006259
>>102006236
flux is an inpainting beast. hands, feet, guns, you name it. of course, one thing it can't do, the naughty stuff. I inpainted an old (real) stupid image I had lying around with a cup and a pricetag and changed one letter, it even created the jpg artifacts. its scary good.
>>
>>102006316
it can do lewds with the topless loras I think, for regular gens they work, and if not you can just do that in sdxl/pony inpainting if you want
>>
>>102006303
cool, that'd be an amazing skylight desu
>>
File: Capture.png (201 KB, 1370x1597)
https://github.com/aredden/flux-fp8-api
Looks like he improved the fp8 speed, I wish it would be the same for Q8
>>
sorry nogen because working on smut
>>102006331
yeah "fluxtastic" seems ok, havent had enough time to try the other ones yet. I ended up doing some inpainting with a sd15 model for non mutated nipples and bulge removal, works too I guess.
>>102006303
slick
>>
>>102006303
this would be a good gen if it weren't about some soulless psychiatric ward-white abomination that only a sheikh with shit tastes would consider tasteful
>>
File: honry.png (1.12 MB, 768x1280)
For the attention of horny anons across the globe, Flux can create nudes if you prompt it in the other languages it's trained on, which are French and German.
>>
File: ComfyUI_00033_.png (2.61 MB, 1248x1848)
>>102006345
I'd buy it
>>
File: 1717932653892760.png (1.08 MB, 1024x1024)
https://civitai.com/models/668799/game-assets-cartoon-style-3d-isometric-background-assets-for-small-games-flux

Doom Eternal ingame screenshot, the main character is Miku Hatsune, she is dressed in a green suit of armor like doomguy from the videogame DOOM and is holding a large shotgun. The background is the first level of the FPS game Doom Eternal with fire coming from the ground and skulls filling the floor. The image is in the style of the game Animal Crossing.

random doom prompt + animal crossing style + lora, magic
>>
gday anons
>>
File: 1167752492635494098-SD.png (1.8 MB, 896x1152)
>>102006468
>attachment missing
>>
>>102006426
any French or German training was incidental and happened only with T5, not Flux
>>
>>102006426
very nice. gonna try me some honry later, in german. danke
>>102006468
BRO
>>
>>102005727
can flux run on 1060
>>
Am I taking crazy pills or are all the things xlabs is putting out simply not working? I mean that in the sense they actively ignore the controlnet input while making all the output gibberish.
I think ONE of their canny models stuck to an input once. Aside from that it's all been shit.
>>
File: honry.png (1.17 MB, 768x1280)
>>102006494

explain what you mean to a low IQ zambo like me, I thought the text encoder is intrinsic to the model used? If Flux used T5 in the training, wouldn't that mean Flux has horny concepts trained for german/french tokens?
>>
>>102006527
Run is a strong word. It can limp or crawl on a 1060
>>
>>
>>102006531
Being correct is not important, what matters is being first and shouting the loudest, such is the nature of the grift
>>
File: 1720075885721747.png (1.08 MB, 1024x1024)
>>102006462
oh! Miku met a villager!
>>
>>102006541
I mean yes it'd be a humiliating experience but for some reason I thought nf4 didn't work on older gpus. renting cloudgpu would make more sense I guess
>>
>>102006545
catbox? i always appreciate a good art render
>>
File: 1716328063506362.png (1.06 MB, 1024x1024)
>>102006561
but they are friends so it's fine.

flux is going to be a godsend for concept artists/modelers/people that want graphics btw, that's a perfect isometric mario.
>>
>>102006567
>nf4
Nobody is using NF4, just variations of gguf.
>>
>>102006539
The words in different languages will be linked to a certain degree but the diffusion model itself was trained with English captions and T5 itself was trained for English.
Where did you read that T5 knows French and German?
>>
File: 1167752492635494078-SD.png (1.65 MB, 896x1152)
>>102006479
Is this fine establishment hiring?
>>
File: honry.png (1.3 MB, 768x1280)
>>102006596
The source of all known knowledge that's infallible, Reddit. A Redditor said so, which makes it the truth.
>>
>>102006608
redditors are retarded, I know because I am one
>>
>>102006608
What the fuck is German for boobies? Boobenßnell or something?
>>
>>102006629
gros seins
>>
>>102006629
Boobenbounsendammerung
>>
>>102006426
it can do that for English as well
>>
File: cunt.png (1.65 MB, 790x786)
Hey guys...
>>
>>102006650
turkman made it to 4chin, no shilling the patreon here sirs
>>
>>102006539
catbox or not real
>>
File: honry.png (1.32 MB, 768x1280)
>>102006629
I asked it nicely for a little "aux seins nus, cuisses épaisses" and received thighs and tiddies
>>
>>102006568
https://files.catbox.moe/puyvjd.png
it's messy and you'll get a lot of errors about missing nodes probably
>>
>>102006650
I hate this fucker. It's always the same test images, he never tries something complex that would show his shitty loras are trained shittily.
>>
File: 1697919110043987.png (1.13 MB, 1024x1024)
>>102006578
the power of friendship!

it's kinda wild how you are only limited by creativity and prompts now, in the past you could try to paint something wild, but it would still take a lot of effort to make a finished product, this takes mere seconds to conceptualize. Technology is awesome.
>>
File: 2024-08-21_00323_.png (1.68 MB, 1024x1536)
>>102006664
>aux seins nus, cuisses épaisse
it may work in french.. just tried it in German.. no go .. got ugly german old women with bikinis
>>
>>102006666
thanks, i can figure it out
>>
>>102006596
"certain degree" is the key expression here. using french or german (those horndogs!) words just gives the latent vector created by the text encoder a little nudge in whatever direction. the model obviously can generate ass&tiddies.
>>102006629
brüste. titten. (rude) euter. (even worse, think cow), etc
>>102006578
this looks fantastic lol
>>
File: cunt.png (1006 KB, 798x805)
>>102006668
Is there a feature on github where the owner of a repo can block people from posting in the discussions? I shouldn't have to see a link to this guy's patreon when I'm reading up on the latest Kohya pull.
>>
>>102006661

https://files.catbox.moe/wiuwuk.png

Isreal
>>
>>102006669
>it's kinda wild how you are only limited by creativity and prompts now, in the past you could try to paint something wild, but it would still take a lot of effort to make a finished product, this takes mere seconds to conceptualize. Technology is awesome.
the only limitation now is the lora, we don't have a lora for everything, I hope a serious finetune will add more concepts to flux so that we can get everything we want without downloading any loras
>>
File: 0_3.jpg (518 KB, 1344x896)
>>102006677
I forgot to include the starter image since it's a Flux img2img upscale of a small midjourney gen, picrel
>>
>>102006697
Flux nipples are kind of boring.
What I liked about pony was the sheer variety of nipples. Flux is just little circles, and the LoRAs coming out have not helped that at all. Nobody here has heard of an areola.
>>
I get topless output and thick thighs in english as well
I dunno what you guys are on about
>>
>>102006697
bwahahahaah those are not tittes nor puss
>>
>>102006697
tumor nipples
>>
>>102006703
damn maybe flux can generate that natively one day. midjourney remains winning for art, as always.
>>
>>102006700
I think the base model knows what animal crossing is but this lora was trained on graphics like AC and other games so it gives really good results

https://civitai.com/models/668799/game-assets-cartoon-style-3d-isometric-background-assets-for-small-games-flux

finetunes with game characters or anime characters will be amazing, for pony there is a model named autismmix and it can do TONS of characters with basic booru tags. loras are great but you dont even need them a lot of the time.
>>
>>102006697
so it's the same big chewable nipples that Flux creates with English words
you fucking retard
>>
>>102006697
>https://files.catbox.moe/wiuwuk.png
yaaa okay I see what you meant but that's still flux nudity, I can do that too with good prompting in english. the problem is the nipples are borked and her pussy is just a black triangle. but you are right, at least with the french it gets to nude more easily than in English (just tried with a google translation of my prompt). you can do it in English too tho, but you always get those puffy nipples and no genitalia
>>
File: 1694844795166521.png (1.27 MB, 1024x1024)
Miku Hatsune is in the game Legend of Zelda, she is dressed in a dress like Princess Zelda. The background is the first level of the game Legend of Zelda. She is talking to Super Mario from the game Super Mario Bros and giving him a high five and smiling. The image is in the style of the game Animal Crossing.
>>
>>102006725
yeah but midjourney's gens are small and their upscaler is shit
so I'm using MJ to gen starter images and then making them huge and nice with that Flux workflow
works nice as you saw with the cat
>>
can anyone tell me about align your steps sampler?
>>
File: 1706605325123386.png (1.15 MB, 1024x1024)
ignore derpy mario, miku as zelda is pretty cute:
>>
>>102006767
>miku as zelda

I'm like 99% sure that it's just Zelda and the Miku prompt got overpowered.
>>
>>102006747
I'm impressed all it takes is "The image is in the style of the game Animal Crossing."
>>
File: MarkuryFLUX_00191_.png (1.41 MB, 768x1280)
Well GEE YOU GUYS I'M SORRY if everyone here is genning nudes with english tokens, for me they never worked and instead I got feet
>>
Early morning bread, get it while it's fresh...
>>102006777
>>102006777
>>102006777
>>
>>102006273
I'm getting 5 minutes/it for a 1 mpix gen. Must be busted to hell.
>>
File: 1706667774516978.png (1.19 MB, 1024x1024)
Miku Hatsune is in the game Animal Crossing. She is talking to Tom Nook from the game Animal Crossing. In the background is a store with a sign that says "Tom Nook's cocaine shop". The image is in the style of the game Animal Crossing.

lol

>>102006775
im using the lora trained on 3d/isometric graphics too, it makes perfect villager types.
>>
>>102006788
you have to use something like "artistic nude photography" or "she is unclothed and nude also she is naked" .. like double or triple it up.. but if you wanna see what is possible with just one lora and flux check this:
https://litter.catbox.moe/0jjboj.png
>>
File: 2024-08-21_00343_.png (1.22 MB, 1024x1536)
>>
File: 0.jpg (115 KB, 1024x1024)


