/g/ - Technology






File: 1704037730348534.jpg (877 KB, 3264x3264)
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>101704253

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Kolors
https://gokaygokay-kolors.hf.space
Nodes: https://github.com/kijai/ComfyUI-KwaiKolorsWrapper

>AuraFlow
https://fal.ai/models/fal-ai/aura-flow
https://huggingface.co/fal/AuraFlows

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>
File: 1girl.jpg (314 KB, 1728x2304)
>>
File: file.png (1.63 MB, 1024x1024)
>>
Blessed thread of frenship
>>
File: file.png (1.62 MB, 1024x1024)
>>
File: ComfyUI_Flux_2105.jpg (161 KB, 1152x864)
always a pleasure seeing your own gens in the bread collage
>>
Hello /long dick general/, is it correct to say that flux schnell is a worse version of flux dev for people with smaller GPUs? How much worse is it?
>>
>>101709609
Dev is better for coherence but takes twice as long or more to generate. The wolverine pictures above are Schnell. For text you need to use Dev but for regular text-free images Schnell seems fine.
>>
File: ComfyUI_Flux_2095.jpg (205 KB, 1152x864)
>>
>>101709662
>>101709595
it's like warhammer fantasy movie done with 80's aesthetic
>>
File: file.png (1.35 MB, 1024x1024)
>>
File: ComfyUI_Flux_2117.jpg (205 KB, 1152x864)
>>101709690
it's a character description of Mohg The Lord Of Blood from Elden Ring. I'm cycling through the character wikis
>>
File: ComfyUI_30634_.png (2.71 MB, 1536x1536)
>>
File: flux192.png (871 KB, 1024x1024)
>>
File: file.png (1.63 MB, 1024x1024)
>>
File: FD_00042_.png (936 KB, 1024x1024)
>>
File: ComfyUI_30636_.png (2.57 MB, 1536x1536)
>>
>>101709625
Thanks, I had been using schnell this whole time and was already very pleased with it.
>>
>>101709433
Wish flux wasn't so bad with generic styles. Why does it give them all an anime style? Even when prompting for manga. It's uncanny, they went overboard with the DPO.
>>
I like idealised depictions of ancient China.
>>
File: ComfyUI_00144_.png (1.8 MB, 1024x1024)
What scheduler and sampler do you recommend for Flux (Schnell)?
Also, this is a big sheep.
>>
File: file.png (1.68 MB, 1024x1024)
>>
Can someone make me a realistic milf with big tits wearing a dress?
>>
File: ComfyUI_01857_.png (1.68 MB, 1280x1024)
Hello
>>
File: FLUX__00096_.png (1.17 MB, 896x1152)
>>101709974
ye probably
>>
>>101709924
These paintings turned realistic 3D cause a huge uncanny valley for me.
When flux is trying to imitate 2D styles and failing due to being mostly trained on 3D, it accidentally does the most convincing (and thus uncanny) 2D -> 3D conversions I've ever seen in image gen. Compression and pixelization artifacts are also the best.
>>
File: FD_00535_.png (1.11 MB, 1024x1024)
>>101709974
>>
Replicate forcing the safety filter on and then writing:
>The safety checker is intended to protect users
What if I don't want any protection? I want to live dangerously.
>>
>>101710056
This better?
>>
>>101710118
please understand
>>
File: 00070-2063253280.jpg (706 KB, 1210x1613)
>>101709974
>>
>>101710118
By users they mean shareholders
>>
>>101709924
Great image senior, I was truly a frog in a well.
>>
File: 00080-2063253277.jpg (688 KB, 1210x1613)
>>
File: 00085-2063253277.jpg (820 KB, 1210x1613)
>>
>>101709974
>>
File: ComfyUI_Flux_2141.jpg (212 KB, 1152x864)
>>
>>101710090
Tits look fake
>>101710153
That's nicer, is that flux?
>>
>>101710282
>That's nicer, is that flux?
it's bstaber
>>
>>101710282
>That's nicer, is that flux?
Nah just 1.5 mix
>>
>>101710124
actually there's nothing wrong with those uncanny photorealistic paintings

this one looks uncannily like a Chinese painting made by Shishkin, lmao. (and 3D is still a bit leaking). This model is incredible.
>>
Flux Schnell img2img:
Denoise up to 0.81: very few changes
Denoise >= 0.82: almost entirely different image
Is this normal?
>>
>>101710318
On flux1-dev-fp8 when img2img with euler karras 6 steps, denoise 0.85 works well, but denoise 0.9 turns the picture into a blurry mess.
>>
Sorry if this is a stupid LoRA question, but... How do people make LoRAs with specific objects/options constantly appearing?

Like, I would like to make a LoRA where each picture generates with a similar border around it (like a polaroid picture frame with some tape on it) and have it generate the image inside of it. Do I just do a regular style lora and tag everything BUT the picture frame?
>>
It's really frustrating how Nvidia is holding consumer AI back by being so fucking cheap with vram. THIS IS NOT OKAY!!!!
>>
>>101710373
>Do I just do a regular style lora and tag everything BUT the picture frame?
you can still tag the frame, it will just need the tag when generating, and tagging is always the better option because by not tagging you risk the model not really learning that thing
>>
>>101710310
I love me some Shishkin. He and Vereshchagin were staples of my gens back during 1.5 times. These gens I'm posting now of course use no artist tags at all.
>>
File: ComfyUI_30644_.png (3.69 MB, 2048x2048)
>>
File: 00122-2063253280.jpg (722 KB, 1210x1613)
>>101710373
I think you want to tag the frame since SD should know polaroid style photos. This kind of lora could also work without tags, try it out.
>>
>>101710394
it's a monopoly issue (AMD doesn't compete) that should hopefully be solved when intel makes more GPUs
>>
File: FLUX__00192_.png (1.22 MB, 896x1152)
>>
File: frog for sacrifice (1).jpg (200 KB, 1024x1024)
>>101709772
Now make the loli a frog instead.
>>
File: ComfyUI_01917_.png (1.63 MB, 1280x1024)
>>101710124
I like it a lot
>>
File: 00158-2063253280.jpg (581 KB, 1210x1613)
>>
>>
File: 00172-2063253280.jpg (426 KB, 1210x1613)
>>
What is guidance and what are the min and max number for it?
>>
>>101710480
>intel makes more GPUs
Don't hold your breath.
>>
>>101710480
Intel would be lucky to not die and be saved by the American government in the next years, they're in no position to invest in better GPUs.
>>
I like flux, but the fact it can never get better is really hurting my hype.
If I can't train a LoRA or find better fine tunes, then what's the point? It's just a toy. Can controlnets even be trained for it?
>>
File: SDXL_base_00307_.png (1.37 MB, 896x1152)
>>
>>101710978
it's been 1 day, relax
>>
>>101710978
>then what's the point
It helps me with some stuff for my illustrations, I'm a digital artist
>>
Bros, why don't we just make our own models? Imagine a model as good as flux but low on vram and incredibly easy to finetune. WHOA!
>>
>>101711009
I'm sure we all have the hundred million USD and knowledge/dataset to do the compute.
>>
>>101711009
Just spend 10M bro, just hire a team of 12 phds bro
>>
>>101711009
>we (read: someone who isn't me)
>>
>>101711009
bigma
>>
>>101711065
Bigma who?
>>
>>101709595
Very nice
>>
>>101710978
Let's just wait and see. At the very least Flux is proving what is possible. This is the first model that makes me think the output is good enough. It's not perfect by any means, but it's good enough. And the fact it can run even on 12gb is great news. In the coming 5 years, VRAM should be more easily available and we may have cards specialized in AI as well, and there's also still a lot of room for quantization. In 5 years it's entirely in the realm of possibility that mid sized communities could train a model comparable to Flux, so even in the worst case scenario, I think we will be okay. Fully customizable models will be a thing, although it sucks to wait.
>>
File: image.jpg (97 KB, 1024x768)
Anyone managed to generate real micro bikini on this piece of trash?
>>
>>101711337
its not in the dataset, you can't
>>
>>101711377
FUCK
>>
>>101711382
Just wait. Porn models will be coming any day now.
>>
>>101711382
just like there is no genitalia in the data set, it is what it is.
>>
https://x.com/ostrisai/status/1819865159227338946
>>
File: index (3).jpg (4 KB, 212x238)
>>101711397
>>101711401
I can and will simply wait a year.
>>
>>101711406
tfw no 16ch
>>
The more I see from pony dev the more I realise he's a piece of shit. He and his dataset should be separated and given to someone with the balls to use it.
>>
File: ComfyUI_01111_.png (1.06 MB, 1200x768)
>>101711497
What did he do?
>>
>>101711504
He's an anti loli moralist cuck.
>>
>>101711504
Every time a promising model comes out he screeches about the license then does nothing with it.
>>
>>101711406
This... this it is! HERE WE GO!!!!!
>>
File: file.png (182 KB, 256x256)
>>101711079
bigma deez nuts
not going to be the level of Flux though
>>
>>101711497
Pony is the only good sex model
>>101711519
Make your own model then retard
I'm an OSS developer and hate users so much, no matter how much you give them they always demand more and more and never thank you, they just nag
>>
>>101711406
>...intentionally breaking down the step compression to hopefully end up with a training base that we can train LoRAs on that will work with schnell
medium hype
>>
>>101711539
No. Fuck you. You're going to make models for the community and you're going to make them good. Now stop this whiny bullshit and start working. And no, we won't say thank you.
>>
>>101711519
it also doesn't help that he has no competition currently, maybe if someone made a good enough finetune to replace pony he would do something about it.
>>
>>101709433
Flux is so good at text bros. How is it so much better than ideogram? Crazy.
>>
File: ComfyUI_temp_cpmuu_00478_.png (1.34 MB, 1024x1024)
>>
File: ComfyUI_temp_cpmuu_00486_.png (1.32 MB, 1024x1024)
>>
File: Flux_00251_.png (1.12 MB, 1400x1024)
>>
File: Sigma_12324_.jpg (2.07 MB, 2048x2048)
>>101711536
BIGMA HYPE
>>
How do you get flux to put text on something? I want to generate a character that has a text tattoo.
>>
>>101711746
thank you, honey
>>
>>101711009
Just wait for Bigma. When it comes out 10B will seem like a waste.
>>
>>101711791
>A tattoo that reads "text"
>>
File: FLUX__00002_.png (1.09 MB, 1024x1024)
>>101711791
double quotes
>>
File: Sigma_12327_.jpg (2.56 MB, 2048x2048)
>>
>>101711746
Amazing. What is the prompt?
>>
File: 2024-08-04_00247_.png (1.13 MB, 1024x1024)
what a time we live in
>>
File: 2024-08-04_00250_.png (1008 KB, 1024x1024)
>>
File: 2024-08-04_00256_.png (906 KB, 1024x1024)
>>
Man, I thought the advice to use an LLM to flesh out prompts was for absolute scrubs after I tried it for SDXL, but it really worked well just now when I tried it with flux. I used windows copilot and I suspect they must have used an OAI model to generate their tags because it really seemed to speak flux's language. Flux handles long prompts way better than SDXL, too.
>>
File: 2024-08-04_00260_.png (814 KB, 1024x1024)
>>
>>101712010
This is no secret. It's because of how the model was trained.
>>
File: Sigma_12329_.jpg (2.61 MB, 2048x2048)
>>
File: 2024-08-04_00261_.png (884 KB, 1024x1024)
>>
File: Sigma_12330_.jpg (3.11 MB, 2048x2048)
>>
>>101711497
>>101711519
Previous gen datasets don't work with this model. Even if Pony could train the model right away, theoretically he'd still be unable to, since he would have to manually caption the dataset for it to be meaningful (otherwise you're just going to be getting SDslop by teaching it booru tags, and no one wants that), and this is an expensive task. Couple that with how exponentially more expensive this model is to finetune. It's safe to say that it's likely outside of Pony's budget, since a full team has to be hired just to do the captioning and everything else.
>>
>>101712099
It's going to be tagged by an LLM. No human is going to do it.
>>
File: Image.jpg (882 KB, 2688x1152)
>>
>>101712131
Look at this list-
https://huggingface.co/spaces/opencompass/open_vlm_leaderboard
There is no fully uncensored VLM, he would have to make one himself
>>
>>101712148
He can't, he's a retard. Just a furfag with a lot of money.
>>
File: Flux_00272_.png (1.22 MB, 1000x1200)
>>
File: ComfyUI_01044_.png (1.31 MB, 1024x1024)
>>
>>101712310
Offensively boring generation.
>>
>>101712310
Helen of Troy looking ass bitch.
>>
>>101711753
Lol
>>
File: ComfyUI_01047_.png (1.26 MB, 1024x1024)
>>101712322
Sorry to offend you
>>
File: FLUX__00032_.png (1.09 MB, 896x1152)
>>
File: Flux_00277_.png (1.13 MB, 1024x1024)
>>
File: 0.jpg (1 MB, 2048x1024)
>>
File: ComfyUI_30661_.png (3.84 MB, 2048x1224)
>>
File: FLUX__00034_.png (1015 KB, 896x1152)
>>
File: ComfyUI_30662_.png (3.46 MB, 2048x1224)
>>
File: Image.jpg (550 KB, 1792x1152)
>>
does the huggingface link for FLUX from the OP work with mobile? I want to send to someone to try but I'm on desktop
>>
I'm speechless. Flux is just phenomenal. And it's only going to get better
>>
File: ComfyUI_00225_.png (1.26 MB, 1200x768)
Something Flux seems to excel at over other image generation is that most things are proportionate. SD 1.5 and SDXL struggled with this and SD3 continued to struggle with this.
>>
>>101712683
>forgot pic
>>
>>101712683
>And it's only going to get better
That's where you're wrong, bucko. Flux is dead on arrival. It's an untrainable advertisement for their saas model.
>>
File: image (7).jpg (180 KB, 1024x768)
flux is insane. i am a vramlett but good for you chads to be able to run such a beast locally uncensored. bless the makers.
>>
File: image (9).jpg (171 KB, 1024x768)
>>101712800
>>
>>101712800
>>101712830
Prompt for mixed anime and real?
>>
>>101712842
Stolen from /lmg/.
>A professional real estate photograph selfie in a living room, 24mm, f/16 lens. The background is sharp and in focus. An anime cutout of Hatsune Miku is edited into the photo. There is a fat otaku man with manboobs standing beside her with his hand around her shoulder.
>>
>>101712848
lol posted my fat otaku edit.
>A professional real estate photograph selfie in a living room, 24mm, f/16 lens. The background is sharp and in focus. An anime cutout of Hatsune Miku is edited into the photo. There is a photogenic man standing beside her with his hand around her shoulder.
but its pretty much the same.
>>
So is there any way to run flux now on 3080 ti 12gb that i have 4 of
>>
File: ComfyUI_170287_.png (1.07 MB, 1024x1024)
https://comfyanonymous.github.io/ComfyUI_examples/flux/#simple-to-use-fp8-checkpoint-version

If anyone is looking for an easy to use single file version of flux dev in fp8. Just make sure you update ComfyUI.
>>
>>101712870
>that i have 4 of
LLM?
>>
Is there a way to run flux dev without comfyui? I have tried the example diffusers script and it doesn't even load the model.
>>
>>101712874
Ty
>>
>>101712882
Managed to fit a 70b model on all of em
>>
File: image (10).jpg (155 KB, 1024x768)
How far we have come. I remember the 64*64 horror pics not so long ago.
>>
>>101712756
This is proof of what's possible for image generation. And honestly, it's impossible for it to get worse from here on- it can only get better
>>
https://twitter.com/ostrisai/status/1819865159227338946
>>
>>101712966
>forgot the pic...
>>
>>101711825
You can't just force things to be over because you go and gen a fat lady singing. It's just not the right way.
>>
>>101712756
>>101712966
i dont understand, why can't flux get better?
>>
>>101712856
>>101712848
>The background is sharp and in focus.
>Doubt
>>
>>101713048
speculation that finetunes aren't possible
>>
>>101713048
it can, just not in a reasonable way right now >>101712977
think of it like this, what crysis was for pc gaming, flux is for imggen
>>
>>101713115
>it can
Not a sure thing yet.
https://x.com/ostrisai/status/1819865743929790533
>>
File: image (11).jpg (143 KB, 1024x768)
>>101712756
This is you

>>101713055
Flux loves the blurry background. I bet SD fags especially are bothered by this. I think you always needed to explicitly prompt a blurry bacckground. lol
>>
File: ComfyUI_01070_.png (1.43 MB, 1024x1024)
>>
File: FD_00049_.png (1.48 MB, 1024x1024)
>>
>>101713071
From what I've heard it's just a hardware limitation. In that case, consumer level hardware will grow into it in a few years
>>
File: IMG_1281.png (453 KB, 1024x768)
>>
File: FD_00059_.png (1.31 MB, 1024x1024)
>>
>>101713244
and something about distillation from the pro model
I dunno, I'm not well versed in this stuff
>>
>>101713244
>40gbs for a lora
>consumer level hardware will grow into it in a few years
not with nvidia at helm
>>
finetuning won't be limited by architecture but by hardware. people cried about sdxl finetuning thanks to the refiner but we still have sdxl finetunes and barely any need the refiner. the problem is the model is so fucking big that nobody can afford to finetune it. sdxl already took months to get semi-decent finetunes. with this it will be years. nvidia has local by the balls, you need $100k+ to seriously consider finetuning this
>>
>>101711518
Ponies are lolis though, just in pony form.
>>
>>101713291
>nvidia has local by the balls
what happened to trust busting?
>>
File: 00005-3804204968.png (1.06 MB, 1280x720)
>>101713253
how do you get accurate copyrighted characters like Goku? mine are always way off. do I just need better models?
>>
>>101713279
Whatever happened to QLoRA (which is now QDoRA?) Don't tell me you don't plan on running Flux Q4.
>>
>>101713384
Unironically write the prompt as if you were praying.
>>
>>101713384
I just wrote "with Goku from Dragonball Z"
>>
>Create anything, just make it really cool to impress everyone

NLP might indeed solve my issue with ordinary styles
>>
File: image (13).jpg (147 KB, 1024x768)
>>101713384
With flux it's easy.
>>
>>101713291
Nvidia should unironically be forced to license CUDA. It's pretty much a form of vertical integration that is giving them a virtual monopoly.
>>
>>101713384
>filename
huh
>>
>>101711065
>>
>>101713456
>>101713456
Most of mine are ending up with 3D characters, how do I make them 2D? Just roulette?
>>
File: image (14).jpg (142 KB, 1024x768)
>>101713486
>A professional, high-resolution photograph inside an opulent mansion, 24mm wide-angle lens, f/16 aperture for deep focus. President Donald Trump and Russian President Vladimir Putin standing side by side, appearing photogenic and formal. Behind them an anime cutout of goku from dragon ball Z is edited into the photo. A thoughtbubble above their heads that reads "Flux is cool..."
>>
File: FD_00093_.png (1.05 MB, 1024x1024)
>>101711079
>>
>>101713291
So- out of the frying pan and into the fire

If we don't get any extras for flux for a year or two then I think I can cope. Who knows how far we can get just by improving prompting methodology. And there's always img2img with flux as input if it comes to that.
>>
Is PAG node worth it? It doubles the gen time
>>
>>101712874
>17GB
Why is it bigger than this one? https://huggingface.co/Kijai/flux-fp8/tree/main
>>
>>101713558
my thoughts: >>101686791
we wont get anything insane for flux in a year but by that time flux will be outdated. it was only a year ago that sdxl released. right now the local limit is around 8b. 12b is simply too much for any kind of ecosystem to develop around it. flux is a nice model that proves what's possible, but it's unwieldy thanks to gimped local hardware.

8b is the sweet spot for the 24gb high-end. a good model baked with 8b as the target, not distilled, will be local's next mass-adoption. i don't think what flux does is only possible on 12b, i think the comprehension and more can be fit on 8b especially if they improve their datasets.
>>
>>101713291
All that does is give birth to or create a need for finetuning services. A pretty smart business model if that's something flux devs are considering imo.
>>
The truth is:
I don't care if Flux gens are accurate to the prompt when they're all ugly
There, I said it
>>
>>101712874
Why do we need this as opposed to just doing this?
>>
File: FLUX__00124_.png (632 KB, 1024x1024)
>>
>>101713634
is this not just renting out runpod or something? i mean for an actual finetune (not lora) you'd easily need money in the 5-digits minimum
>>
File: ComfyUI_170295_.png (1.09 MB, 1024x1024)
>>101713644
You can do it that way too if you want. There's just a lot of people confused about the multiple files.

>>101713596
That's a fp8 version of the diffusion model only, Mine has everything in it so you can use the regular "load checkpoint" node with it.
>>
File: ComfyUI_00168_.png (812 KB, 1024x1024)
>>101710630
>>
>>101712618
>does the huggingface link for FLUX from the OP work with mobile? I want to send to someone to try but I'm on desktop
please, is anyone here on mobile who can check?
>>
File: FD_00351_.png (1 MB, 1024x1024)
>>101713947
>>
File: 1722742523.png (1.09 MB, 1024x1024)
>>
>>101713870
>>101713644
>>101712874
Produces different results. Left is yours, right is with weight_dtype
>>
>>101714093
forgot pic
>>
File: ComfyUI_170320_.png (1.13 MB, 1024x1024)
>>101714093
>>101714108
fp8 T5 vs fp16 T5?
>>
>>101714014
how are you prompting pixelart? just "in the style of pixel art?" or "pixelized"?
>>
>>101714137
I guess? I don't really know what I'm doing.
It's like this. The rest is the same.
>>
>>101714151
You are free to implement such a feature.
>>
>>101714151
it's pretty jank. steganography would be better but has risks of hidden files
>>
File: FD_00127_.png (1.51 MB, 1024x1024)
>>101714146
Literally just "pixel art" at the front of the prompt.
>>
You are forcing me to download ComfyUI.
>>
>>101714235
Foxfucker has monopoly as tight as nvidia.
>>
File: ComfyUI_00174_.png (1.36 MB, 1024x1024)
>>
>>101714235
technically voldy is
>>
>>101714260
prompt? I know cutout of miku works for mikus with real people
>>
File: FD_00146_.png (1.82 MB, 1024x1024)
>>
>>101714235
>forcing
Don't like comfy? or just can't be bothered?
>>
>>101714235
you can set up the workspace by loading a file or dragging an image. it's honestly very simple.
>>
File: fallsched_2.jpg (274 KB, 1240x1448)
>>
File: ComfyUI_00073_.png (1.22 MB, 1024x1024)
>>101714256
I think that auto1111 is based for not giving so much of a shit about his webui anymore, considering how little you get in return for open-source software development. As time goes on, the effort to maintain the project increases, and so does the amount of annoying comments, demands, and criticism that you receive (maybe this comment itself fits in the "annoying comment" counterproductive category).
Meanwhile, comfy's efforts are commendable. He works nonstop on his project and is very passionate. I'm glad he figures out how to run the model for me so that I don't have to spend weeks experimenting myself.
>>101714292
kek yes.
>>101714334
I agree. The UI is very fast as well. But it feels nicer to use auto's UI, at least to me. I'm used to Gradio UIs and find them nice for their amateur look and big buttons.
>>
File: ComfyUI_00089_.png (911 KB, 1024x1024)
>>
>>101714355
for flux which has no stuff like controlnet/etc yet, it's fine. I like auto for sdxl/ponyxl and making characters or doing controlnet stuff like canny meme edits of photos with characters/loras.
>>
>>101714311
"Photograph of Donald Trump smiling and hugging a body pillow. The body pillow has a large anime drawing of Hatsune Miku on it." It can do this kind of stuff surprisingly well.
>>
File: 1714905724462602.png (1.07 MB, 1024x1024)
>>101714413
neat, cutout also works really well
>>
>>101714344
Neat
>>
File: ComfyUI_00118_.png (1.03 MB, 1024x1024)
>>
>>101714445
I love how Donald Trump has become the new Emma Watson for testing shit lmao
>>
File: ComfyUI_00120_.png (890 KB, 1024x1024)
>>
>>101714516
how did you prompt text per panel or panels in general? 4 panel comic or something?
>>
File: ComfyUI_00128_.png (1 MB, 1024x1024)
>>
>>101714531
anon, just describe LITERALLY, there's no issue with that. literally describe that it's a comic with N panels, describe each panel, just PROMPT IT NORMALLY
>>
>>101712286
That's pretty funny, catbox?
>>
>>101712870
Might have found something myself
https://twitter.com/cocktailpeanut/status/1819482952071323788
>>
File: ComfyUI_00135_.png (1 MB, 1024x1024)
>>101714531
It's very simple as >>101714537 says, all I'm using is
>A comic scene with 4 panels. In the 4th panel (the last one), the beautiful anime girl is holding a sign that says "..."
>>
>>101712010
vlms are used to caption their datasets, it's why all these models respond so well to llm sloppa
>>
File: FD_00156_.png (1.39 MB, 1024x1024)
>>101714537
>>
File: landscape_00417_.jpg (738 KB, 2560x1440)
>>
File: ComfyUI_00150_.png (999 KB, 1024x1024)
>>
File: Flux_00323_.png (844 KB, 1024x768)
>>
File: ComfyUI_00154_.png (1.17 MB, 1024x1024)
>>
File: ComfyUI_01695_.png (924 KB, 1152x768)
>>
File: ComfyUI_00172_.png (1.28 MB, 1024x1024)
>>
File: fallsched_1.jpg (185 KB, 1200x1528)
>>101714454
Thanks friend
>>
File: ComfyUI_00184_ (1).png (729 KB, 1024x1024)
>>
>>101714720
Ironic
>>
>>101714750
that's not safe
>>
File: ComfyUI_00178_.png (1.15 MB, 1024x1024)
>>101714751
>>
File: flux.jpg (136 KB, 1024x1024)
Interesting...
https://files.catbox.moe/bdr8uh.png
>>
File: ComfyUI_00109_ (1).png (977 KB, 1024x1024)
>>101714760
>>
File: __00862_.png (410 KB, 768x768)
>>
File: modelo_5.jpg (299 KB, 1024x1024)
>>101709433
What is the best model to generate images for development purposes?

I'm trying to get realistic images and they are very good, I got these results at https://fal.ai/
specifically with fal-ai/realistic-vision

I would like to know which one is the most recommended in terms of cost and ease, I probably want to create images very similar to an image I already have, that is, have the model generate another one with the same person that is in the input image

I can probably fine-tune it and/or deploy it to a server or maybe use an already existing model and use the api to get the image I want

I would like to know the best way to learn
>>
>>101714789
Very christian of you
>>
>>101714901
this reeks of shilling and llm
>>
>>101714912
christianity is when ned flanders
>>
File: FLUX__00057_.png (1011 KB, 1024x1024)
>>
>>101709433
Does anyone have the full size top center image?
>>
>>
Is there an alternate GUI to run flux locally besides comfyui
>>
>>101715019
stable swarm UI, which is an auto1111-like UI on top of the comfy backend
>>
>>101715034
Thank you
>>
File: ComfyUI_01488_.png (1.42 MB, 1024x1024)
>>101714531
That's the magic of good prompt adherence, you can literally just write exactly what you want in what part of the image.
>>
File: iLv4I_LqgLcaYyx8cej_7.png.png (1.02 MB, 1344x1536)
>>
>>101714739
Catbox? Love this
>>
File: 1722749640.png (915 KB, 1024x1024)
>>
Why is SDG lapping us? I thought we were winning?
>>
>
>>
How many steps do you guys do for complicated gens? I just tried one with a really detailed prompt and it followed it better and got the text more accurately with 50 instead of 20 steps. But fuck, it's so slow.
>>
File: ComfyUI_00729_.png (1.31 MB, 1024x1024)
>>
cozy thread
>>
>>101715240
I don't need to 'beat' them, it's just nice to have an alternative imagegen thread without the insufferable avatarfagging and lonely people talking about their personal lives using imagegen as a pretext
>>
File: FD_00167_.png (1.15 MB, 1024x1024)
>>101715290
>>
copey thread
>>
File: 1713726454407.png (477 KB, 1024x1024)
>>
>>101714750
They're gonna shut it down before we get the video model. I can already tell. The video model + this + a local text to speech would be too much for them.
>>
>>101715399
They don't care about dead babies. Only what you can create, share and expose with it.
>>
>>101715377
who is them?
>>
>>101715434
>>101715333
>>
>>101714235
I mean, if you go to their github instructions you can run a limited webUI
>>
File: FD_00173_.png (1.74 MB, 1024x1024)
>>101715377
What is it like to live your life knowing you are completely retarded? Does it hurt? Are you depressed?
>>
you guys had me all afraid I'd need a billion 4090s to run Flux, it works just fine on my 3060
>>
>>101715506
wouldn't hurt tho
>>
>>101715506
What's the gen time for one image? I'm contemplating whether to download the 24 GB blob or not
>>
File: FD_00176_.png (1.73 MB, 1024x1024)
>>101715506
If you want to run it at full power you will need a 4090, but fp8 is good enough and significantly better than XL, Pony, and SD3. If it could make a vagina we'd be able to delete every other model.
>>
>>101715493
>He doesn't get the motive behind (((safety))) and censorship
>>
>>101715518
2.5~ minutes for me
>>
>>101715531
It is though and by a lot
>>
>>101715518
it's about 1 minute for every 10 sampling steps, so 2 mins for the 20 steps comfy's workflow uses
>>101715529
I'm doing everything fp16 (model and clip) and it works, uses almost all my vram though
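For anyone picking a step count, the rule of thumb above (about 1 minute per 10 sampling steps on this card) works out to a simple linear estimate. A rough sketch, assuming sampling time scales linearly with steps and ignoring model load and VAE decode; the function name and the default rate are made up for illustration, so plug in your own measured rate:

```python
def estimate_gen_minutes(steps, minutes_per_10_steps=1.0):
    """Linear estimate of sampling time in minutes.

    minutes_per_10_steps is a measured rate for your own card (assumed
    1.0 here, per the post above); model loading and VAE decode are ignored.
    """
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return steps * minutes_per_10_steps / 10.0


if __name__ == "__main__":
    for steps in (20, 50):
        print(f"{steps} steps: ~{estimate_gen_minutes(steps):.1f} min")
```

By the same estimate, going from 20 to 50 steps costs about 2.5x the wait.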
>>
File: girlfrienduru.png (622 KB, 600x904)
>>
>>101715534
You dumb idiot. It's already out. We can already animate all this flux shit. There's nothing you or your conspiracy theories can do to stop it.
Flux successfully proved you can make a "safe" model not shit.
>>
>>101715574
>almost
Fuck me. On my 4080 it takes 10-20 minutes per image if I run it on full.
>>
File: ComfyUI_00001_.png (995 KB, 1024x1024)
almost got it :(
>>
>>101715580
Retard, where did I say anything about Flux?
>>
>>101715608
Now I'm confused, my vram usage is the same whether I select fp8 or default bfloat16. wtf
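Back-of-envelope for the weights alone, as a sanity check on what to expect. This is a sketch assuming roughly 12 billion parameters in the main model (an assumption, though it lines up with the 24 GB download mentioned upthread at 2 bytes per weight); the names below are hypothetical, and text encoders, VAE and activations are not counted:

```python
# Bytes needed to store one weight at each precision.
BYTES_PER_PARAM = {"fp16": 2, "bf16": 2, "fp8": 1}

# Assumed parameter count for illustration; not taken from any config file.
ASSUMED_PARAMS = 12e9


def weight_gb(num_params, dtype):
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in ("bf16", "fp8"):
        print(f"{dtype}: ~{weight_gb(ASSUMED_PARAMS, dtype):.0f} GB of weights")
```

If both settings show the same VRAM usage, one guess is that the backend is filling whatever VRAM is free and offloading the rest to system RAM either way, so the usage number alone won't tell you which precision is loaded.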
>>
>>101715679
The image you replied to was flux, it's a safe assumption. What do you think they are going to shut down? The internet?
>>
>>101715636
kek
>>
File: out-0 (1).png (980 KB, 1344x768)
>>101715636
HOLY SHIT ANON I WON THE FUCCKING DIFFUSION LATTETY NO WAYYY

>Breaking news on NBC showing a reporter on the left side with Trump making a speech on the right side. The text on the bottom reads "BREAKING NEWS", and below it "HIS IMPERIAL HIGHNESS ANNOUNCES TOTAL NIGGER DEATH"

I SHIT YOU NOT THIS IS A DEV GEN FROM THE API SCALE 3.5 SEED 51646902 (replicate API flux dev, you can test that it's the same, AR 16:9)

This gets the text right like 25% of the time
>>
>>101715518
Trying Flux for the first time on my 2060 12GB, looks like about 4-5 minutes for a single 1024x1024 image (also first time using SwarmUI and don't know how to see precisely how much time is estimated)
>>
>>101715705
catbox your workflow?
>>
>>101715727
KEK
>>
>>101715727
incredible
>>
>>101715727
what a time to be alive

overlay prompt?
>>
>>101715733
Correction, exactly 6 minutes
>>
>>101715776
The whole prompt is there
>>
>>101715776
The prompt I shared is the complete one, I did it on the Replicate API
>>
>>101715740
it's just the default flux workflow suggested by comfyanon https://files.catbox.moe/fpiehr.png
>>
>>101715717
Retard, the US company behind Flux. If it were Chinese it wouldn't matter (except to Xi).
>>
>>101715793
You must be hitting lowvram mode and bleeding into RAM is my guess.
What card?
>>
>>101715795
The US company behind flux that just got 40 million dollars? That company? They're going to shut it down are they?
>>
>>101715803
yeah, it is saying lowvram mode but the speeds aren't bad at all compared to what some people have reported, ~2 mins per gen
RTX 3060
>>
>>101715832
How many steps do you use?
>>
>>101710027
We have no donations here Reimu sorry
>>
>>101715832
How the fuck are you getting 2 minutes per gen? This is me on my 4080
1024x1024 gen fp16
>>
>>101715832
at around 1MP
around 2 it's 8~ minutes
>>
>>101715849
What resolution are you prompting?
>>
>>101715867
1024x1024 I said already
>>
File: out-0 (2).png (1001 KB, 1344x768)
>>101715727
>>
File: Flux_1722753980_0001.jpg (1.78 MB, 2688x1536)
>>101715727
fucking kek
>>
File: 1708778030889150.png (909 KB, 1024x1024)
>>101715877
how are you getting it so clear? my broadcasts are a bit blurry, this is fp8 on a 4080 btw
>>
>>101715889
sorry I'm using replicate api which is probably fp16, dev model.
>>
File: 1696533491210150.png (1.94 MB, 1024x1024)
First Flux proompt, 6 minutes generation time on a 2060 12GB. Going to take a while to figure out how to get the styles I wanted.
>>
File: ComfyUI_00006_.png (1017 KB, 1344x768)
>>101715877
finally got one that works myself :D
>>101715847
1 minute per 10 steps, so 2 mins for a 20 step gen
>>101715849
idk anon... are you on windows? Only thing I can think of is that it might help that I'm on linux, where my vram is almost empty until I start up flux
>>
>>101715910
imagine how many boomers on social media can be tricked with this
>>
Ok it's over I got the perfect one
>Breaking news on MSNBC showing a reporter on the left side with Trump making a speech on the right side, standing next to the flag of the USA. Behind Trump is the White House logo with the text "THE WHITE HOUSE" and "Washington" below it. The text on the bottom reads "BREAKING NEWS", and below it "HIS IMPERIAL HIGHNESS ANNOUNCES TOTAL NIGGER DEATH". On the bottom right is the text "MSBNC" with the logo.
>>
>>101715933
probably easier and more effective to grab any screenshot of a story about trump and edit the headline
>>
File: out-0 (3).png (1000 KB, 1344x768)
>>101715937
Forgot pic, and his face is a bit bad after all, but yeah
>>
>>101715939
is it tho
>>
>>101715939
inpainting with this model like in auto1111 or an inpaint node would be fun too
>>
The difference between fp8 and fp16 is very noticeable.
Fuck. Now I need to buy a 5090. :(
>>
File: 1702392523592883.png (826 KB, 1024x1024)
>>
>>101715965
I've tried both and they're basically indistinguishable. Even when I think the image is screwing up due to being FP8, changing it to FP16 doesn't help.
>>
>>101715979
what about fp4 for the ramlets
>>
File: 1706149311678131.png (828 KB, 1024x1024)
for the scheduler (BasicScheduler), some have said "beta" also gives good results, like euler
>>
File: Flux_00357_.png (1.04 MB, 1344x768)
>>
>>101715889
Set cfg to 1 if it's local
>>
File: out-0 (5).png (969 KB, 1344x768)
It didn't want to say Pooping news, and three s
>>
>/g/ - Technology
>>
baking
>>
File: ComfyUI_00008_.png (1007 KB, 1344x768)
>>
>>101716006
I know Flux is nice but >>>/pol/ is that way
>>
>>101715992
I have a basicscheduler that outputs sigmas (options are scheduler/steps/denoise, no cfg) but if I use ksampler it outputs latent

which node do I need for flux?
>>
File: c77kYbFqM-SO4IPztlJT7.png (1.53 MB, 1024x1024)
>>101715965
vs pro holy shit fuck me
>>
Le Fresh Bread...
>>101715949
>>101715949
>>101715949
>>
>>101716023
nm most of the workflows I see just have the node with 3
>>
File: __00873_.png (1 MB, 600x904)
>>
File: ComfyUI_temp_riblx_00219_.jpg (3.6 MB, 2560x1536)
>>
>>101714445
Can it do kigurumi?
>>
>>101714355
One is a slav and a Dota 2 addict and the other is a massive attentionwhore
Of course only one will keep going long term because his ego depends on it


