/g/ - Technology

File: tmp.jpg (1.03 MB, 3264x3264)
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>101753017

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Kolors
https://gokaygokay-kolors.hf.space
Nodes: https://github.com/kijai/ComfyUI-KwaiKolorsWrapper

>AuraFlow
https://fal.ai/models/fal-ai/aura-flow
https://huggingface.co/fal/AuraFlows

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>
Blessed thread of frenship
>>
File: ComfyUI_02527_.png (1.32 MB, 1152x896)
>>
>>101758249
you dropped these

>>101757907
>>101758130
>>101758169
>>101758185
>>101758191
>>101758347
>>101758353
>>
File: FD_00107_.png (1.81 MB, 1024x1024)
>>101758369
skill issue
>>
File: 55.jpg (550 KB, 1344x768)
>>
Making a discord if anyone is interested. Any rules we should have?
>>
>>101758467
no trannies
>>
>>101758467
Yes there are rules. The rule is fuck off
>>101755962
they already have a dataminercord
>>
File: FD_00111_.png (1.55 MB, 1024x1024)
>>
Coomfy, if you are here please implement this to make flux faster
https://github.com/aredden/torch-cublas-hgemm
>>
>>101758499
Comfy usually hangs out in the troon thread
>>
>>101758241
>So what would make Flux perfect right now?
- inpainting support
- controlnets
- style/face/whatever transfer adapters
That's the bare minimum to get it usable in saner pipelines like Krita Diffusion and make what you want, not what this thing wants.

Tooling should always come first. Too bad Black Forest are AI eggheads and don't understand that. Either a "funny picture generator" is their actual target, or they are genuinely misguided enough to believe that a text prompt is enough.
>>
>>101758499
>>101758508
He is there right now
>>101758424
>>
>>101758452
>>101758498
another centered subject gen? can't flux output anything else?
>>
>>101758452
nta but there's absolutely no sovl in this style
you posted the epitome of the AI slop look, in the apparent belief that you were refuting the slop claim lol
>>
>>101758499
fp16 only, damn it
>>
>>101758568
NTA but you need to post AI stuff that you believe isn't slop to support your claim of Flux being sloppy.
>>
>>101758527
Normies don't care about these tools, anon.
>>
File: file.png (834 KB, 1024x1024)
>>
>>101758587
Thinking of older Dalle-2 stuff mostly. Also some of the stuff I've seen from Pixart.
>>
File: 37177757874.jpg (686 KB, 1320x2304)
>>101758241
Making it able to render hard veiny cocks
>>101758560
coombrains can only 1girlgen
>>
File: file.png (1.29 MB, 1280x864)
>>
>>101758590
Normies don't have the hardware to run Flux so who cares what they care about
Normies use Midjourney or Bing
>>
File: ComfyUI_02533_.png (1.4 MB, 1152x896)
>>
File: file.png (1.33 MB, 1280x864)
>>
File: ComfyUI_02534_.png (1.36 MB, 1152x896)
>>
File: FD_00259_.png (1.55 MB, 1024x1024)
>>101758560
>wtf why is the square image centred around the subject
>>
Anyone notice that when you try to gen nudes, the closer to the actual nudes you get the more blurry the output is? Did they actually just blur nudes in their dataset?
>>
>>101758590
Normies don't care about funny pictures either, especially not for this price. The whole image-gen is a fad without tools.

>>101758527
Also /ldg/ seems misguided too - we had a retard trying to prompt an entire multi-panel comic with text here, instead of prompting it part by part and combining the parts.

To all /ldg/ newfags who don't seem to be picking skills - check on how hard you're missing out by looking at nhentai id 506529. I'm not even into that, but this is a multi-panel multi-page story that has:
- consistent style and characters
- exact poses and scenes the artist wants
all of that with measly SD 1.5 that can't do that no matter how much you try. No current model can do that either - you need controlnets and zero-shot adapters for that.
Same goes for the best AI works on any hentai site that are above the usual slop. They are not just prompting.
>>
File: 00269-4286467286.png (1.97 MB, 1024x1024)
Flux -> Pony is the way to go.
>>
>>101758720
>trying to prompt an entire multi-panel comic with text
NTA, but I don't see the point of condemning someone for trying to do that. It's a new model, and trying to see what's possible with it isn't retarded.
>>
>>101758720
imagen is just a replacement for boorus and maybe some basic stock photography
>>
File: FLUX_00019_.png (1.44 MB, 1152x896)
>>
File: ComfyUI_02539_.png (1.25 MB, 1152x896)
>>
>>101758731
The maker of pony being a massive prima donna has actually turned me off the model and its derivatives.
>>
>>101758720
>To all /ldg/ newfags who don't seem to be picking skills
I agree. we need to start banning posters who aren't improving
>>
>>101758736
Yeah but the anon concluded the model can't do that hence it's shit. It's like trying to make LLMs solve math problems and concluding they're shit. LLMs are bad at math, and image generation models are bad at generating full-fledged comics, something that can be easily done in the simplest image editor.

>>101758740
Yes if you don't have tools with better semantic capacity than text prompts
>>
>>101758786
definitely not what I said
>>
>>101758787
Well then that anon was a retard.
>>
>>101758787
SDXL barely got tooling, don't hold your breath for Flux.
>>
>>101758499
>2 times speed improvement
let's goo, comfy definitely needs to see that shit yeah
>>
>>101758806
sdxl actually has very good controlnets, inpainting and ipadapter now
though they took longer to arrive than they should have
>>
File: FD_00129_.png (913 KB, 1024x1024)
>>101758610
>older Dalle-2
>>
>>101758836
kek'ed
>>
>>101758836
>slop gen that has nothing going for it except coherent text
inadvertent chef's kiss reply
>>
Flux inflated the ego of anon far more than any previous model.
>>
File: ComfyUI_00860_.png (637 KB, 1024x1024)
>Anatomically correct 3D reference sheet depicting woman from front side and back angles
I didn't even prompt brap hog.
Is this model blurring everything even slightly risqué going to be an issue?
>>
>>101758854
see
>>101758836
>>
>click queue prompt using flux
>my computer goes into epileptic seizure and everything goes black
It's over.
>>
>>101758854
You are in a definite minority and almost certainly wearing rose tinted goggles if you think Dalle 2 was preferable to this.
>>
>>101758861
It often blurs pictures with white background for some reason.
>>
>>101758874
GPU sag or power supply
>>
>>101758861
Definitely a skill issue, not even sure how you are having that happen consistently
https://files.catbox.moe/w70kv0.png
>>
File: file.png (953 KB, 1280x864)
>>
>>101758907
*Holds up spork* Deadpool is so randum XD
>>
>>101758861
did you put a CFG > 1 by any chance? when you do something like that you get blurry pictures
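For context, a minimal sketch of classifier-free guidance in plain Python, assuming the usual formulation where the final prediction is the unconditional one pushed toward the conditional one by the CFG scale. The function name and the toy numbers are made up for illustration and are not taken from any UI. At a scale of 1 the result is just the conditional prediction; anything above 1 extrapolates past it.

```python
def apply_cfg(uncond, cond, scale):
    """Combine unconditional and conditional predictions (toy lists of floats)."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]

# Toy "noise predictions"; real ones are large tensors.
uncond = [0.0, 1.0, 2.0]
cond = [1.0, 1.0, 4.0]

print(apply_cfg(uncond, cond, 1.0))  # scale 1: identical to cond
print(apply_cfg(uncond, cond, 3.0))  # scale 3: pushed beyond cond
```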
>>
>>101758241
Better understanding of art styles and/or artist names. But I'd settle for just style understanding alone, without the names.
It should know what impressionism is, what romantic luminism is etc and be able to produce a reasonable approximation of them when asked.
Right now it has close to zero idea what those terms mean, while SD1.5, SDXL, Pixart etc. all do.
>>
File: FD_00137_.png (1.07 MB, 1024x1024)
>>101758907
Oh shit yeah I forgot that it makes great Deadpools.
>>
>>101758720
>The whole image-gen is a fad without tools
This. The models get better, but the usability is improving too slowly. Two years ago everyone thought that by now we would have very streamlined workflows or figured out how to get consistent characters. It feels like there hasn't been a real breakthrough in usability since controlnet.
>>
why is inpainting so slow in comfy? it seems to recalculate everything for no reason
is there an equivalent to "just inpaint area" like in A1111?
>>
File: 00270-4015753750.png (1.72 MB, 1024x1024)
>>101758778
>ponyfag is a faggot
Hardly a surprise.
>>
>>101759029
every time there's a consistent character announcement it goes dead with no code
>>
File: FLUX_00026_.png (903 KB, 1152x896)
flux cannot play pool, or even hold a pool cue
>>
>>101759041
We've known how to make consistent characters for a long time now. The only issue people aren't willing to go through the slight effort needed to curate the dataset for that and instead look for jeety methods like IP adapters and weird detailers in a vain attempt to slap their favorite Instagram model's face over porn.
>>
>>101758778
wym? What happened? My only grudge with him is the artists thing... he is either retarded or lying and I can't decide which one is worse
>>
File: file.png (585 KB, 1280x864)
>>
>>101759029
>usability is improving too slowly
Because people are stuck in their ways. There are people in this very thread who refuse to use any interface except A1111.
We have a shit load of tools available to us, people simply don't use them.
>>
File: FD_00332_.png (1.42 MB, 1024x1024)
>>101759076
Can't play cards properly either. It always shows its hand.
>>
>>101759085
not everything is shitty fanart
what if someone wants to make a visual novel with actually new and unique characters
and those characters wear consistent outfits across scenes and poses
but also can wear different clothes
>>
>>101759109
Guess they better learn to draw those characters first then.
>>
>>101759109
Train a LoRA, it takes 10 minutes
>>
>>101759127
that's what I thought
>>
>>101759031
>"just inpaint area"
I meant, "only masked"
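A rough sketch of the idea behind "only masked", as far as I understand it: find the bounding box of the mask, pad it a bit, and run inpainting on that crop only instead of the whole image. Plain Python on a nested-list mask; `masked_crop_box` and the padding value are made up for illustration and are not A1111's or ComfyUI's actual code.

```python
def masked_crop_box(mask, padding):
    """Return (left, top, right, bottom) around non-zero mask pixels, padded and clamped."""
    height, width = len(mask), len(mask[0])
    rows = [y for y in range(height) if any(mask[y])]
    cols = [x for x in range(width) if any(mask[y][x] for y in range(height))]
    if not rows:
        return None  # nothing masked, nothing to inpaint
    left = max(cols[0] - padding, 0)
    top = max(rows[0] - padding, 0)
    right = min(cols[-1] + 1 + padding, width)    # right/bottom are exclusive
    bottom = min(rows[-1] + 1 + padding, height)
    return left, top, right, bottom

# 8x8 image with two masked pixels
mask = [[0] * 8 for _ in range(8)]
mask[3][4] = 1
mask[4][5] = 1

box = masked_crop_box(mask, padding=1)
print(box)  # (3, 2, 7, 6): a 4x4 crop instead of the full 8x8 image
```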
>>
>>101759138
make a lora of this character >>101759105 see you in 10 minutes
one picture of him sitting at a table
one picture of him laying in bed
one picture of him hitting a beach ball at the beach
>>
>>101759094
I think part of the problem is that the community has gotten way too decentralized. There are a lot of cool projects, but nothing is becoming the standard. Instead of throwing money down the drain by training models that no one used, SAI would have been better off if they had focused on developing tools and tried to monetize that. People are not going to pay for open source models. The bureaucratic mess alone of deciding how to monetize a fine-tune's outputs makes it an annoyance. But I think people would have been willing to pay for a user interface that came with genuinely useful tools to squeeze the most out of the tech.
>>
https://reddit.com/r/StableDiffusion/comments/1eluf6f/workflow_sam2_flux_inpainting_facial_expression/
flux inpainting seems possible now
>>
>>101758778
worst thing about it is that the pony guy is so fucking stupid that the moment he steps out and interacts with anyone outside his cult, or with actual ML people, he gets laughed at and made fun of.
>>
>>101759164
I think it always was. I'm not sure why the guy above thought it wasn't.
>>
File: FLUX_00028_.png (616 KB, 1152x896)
damn, that would have been good
>>
>>101759182
why do you have blurry shit like that? do you use cfg > 1?
>>
>>101759156
I do this all the time, but with real people
The training time is 10 mins, but preparing the dataset is the equivalent of designing your character
If you start with only 3 pictures you are also spending some extra time on making more training data
>>
>>101759206
yawn, what I thought
>>
>>101759204
nope, it just happens every now and again
haven't figured out the pattern yet
>>
File: 1722957494_S1.png (3 MB, 1024x1024)
>>101758452
wrong
>>
>>101759156
>make a lora of this character >>101759105 (You) see you in 10 minutes
You need more than one image to train a LoRA.
This would be a situation where you actually need to pick up a pencil and learn how to draw.
You create 10 images of a consistent character and that would be enough to generate infinite variations of it.
However
Tools like
https://huggingface.co/spaces/InstantX/InstantID
and
https://huggingface.co/spaces/TencentARC/PhotoMaker-V2
can do it with 1 image, no LoRA.
>>
>>101759094
i'm real interested in learning how to use the comfyUI options more effectively right now but the documentation aint great https://docs.comfy.org/get_started/introduction
>>
>>101759235
I'm glad it was so easy and fool proof that you proved it and showed it was extremely high quality, you proved me wrong yessir
>>
What happened to the guy who was training a cunny lora last night on a single 4090? I don't really care for the cunny, I just wanna know how the Lora went.
>>
>>101759210
How is this bad? If you want to make a consistent character and outfit, you have to invest some time getting the look right
When the LoRA is done, you can make 1000s of images with it
That's an excellent tradeoff
>>
File: FD_00149_.png (2.44 MB, 1536x1024)
>>101759236
The best documentation is Anons here.
"Is it possible to do xyz in comfy?" usually elicits an answer.
>>
>>101759262
>just spend hours training a lora that is shitty
>>
>>101758861
I could have someone wearing a burka and it's still fuzzy. The more realistic the image the more blurred it gets.
>>
>>101759257
Not everything is competitive or an argument, Anon. I am simply providing you and anyone else who reads the comment information related to the discussion.
>>
File: sampl444e.jpg (341 KB, 1024x1024)
>>101759232
here is flux PRO
>>
>>101759288
The dude doesn't want to accept that LoRA training is an important part of generating consistent characters.
>>
>>101759277
The more time you invest into the training set, the better the LoRA will be
The training data is basically the character sheet that any drawfag uses for reference, so there's no way around that anyways
If you suck at training LoRAs, you have a skill issue
>>
File: file.png (2.16 MB, 1024x1024)
>>101759266
well i'm happy to lurk and experiment, there's plenty to play around with since the flux model is new.
really just hoping emphasis and de-emphasis in prompts becomes possible, like it was discovered with () and [] a few weeks into the SD 1.5 days. as far as i can tell there's no way to do that with flux at all for now
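For anyone who missed the SD 1.5 era, here is a simplified sketch of how that bracket syntax maps to weights, assuming the A1111-style convention as I understand it: `(text)` multiplies the weight by 1.1, `[text]` divides it by 1.1, and `(text:1.4)` sets an explicit multiplier. `parse_emphasis` is a made-up helper, not the webui's real parser, and it says nothing about whether Flux would respond to such weights.

```python
def parse_emphasis(prompt):
    """Split a prompt into (text, weight) chunks. Simplified: no escapes, no error handling."""
    chunks = []   # list of [text, weight]
    stack = []    # (opening bracket, index of first chunk inside it)
    buf = ""

    def flush():
        nonlocal buf
        if buf:
            chunks.append([buf, 1.0])
            buf = ""

    for ch in prompt:
        if ch in "([":
            flush()
            stack.append((ch, len(chunks)))
        elif ch in ")]" and stack:
            opener, start = stack.pop()
            factor = 1.1 if opener == "(" else 1 / 1.1
            if opener == "(" and ":" in buf:
                # explicit weight such as (text:1.4)
                text, _, number = buf.rpartition(":")
                try:
                    factor = float(number)
                    buf = text
                except ValueError:
                    pass  # not a number, treat the colon as plain text
            flush()
            for chunk in chunks[start:]:
                chunk[1] *= factor
        else:
            buf += ch
    flush()
    return [(text, round(weight, 3)) for text, weight in chunks]

print(parse_emphasis("a (cat) on [grass]"))
print(parse_emphasis("((masterpiece)), (dog:1.4)"))
```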
>>
>>101758249
I'm newish to this. How do you make sure to get the top of an anime girl's head in the picture? Too many of my generations cut off the very top of the head.

I started adding 'ahoge' to my keywords, which seems to have worked, but it's less than ideal.

'Full body' mostly works, but sometimes I want more of a close up with more detail.
>>
>>101759299
>>101759315
no proof lmao
you had 10 minutes yet no proof
so easy, yet no proof
weird
I fucking hate you summer fags using mom's pc
>>
>>101759342
head out of frame in the negatives.
>>
is a flux fp12 possible? fp16 is just too big and fp8 is too small kek
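Back-of-the-envelope numbers for the weights alone, assuming roughly 12 billion parameters for the Flux transformer (an assumption here, not something stated in this thread) and ignoring the text encoders, VAE and activations. "fp12" is hypothetical; as far as I know there is no such standard format.

```python
def weight_size_gb(num_params, bits_per_param):
    """Size of the raw weights in GB (10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

FLUX_PARAMS = 12e9  # assumed parameter count

for bits in (16, 12, 8):
    print(f"{bits}-bit: {weight_size_gb(FLUX_PARAMS, bits):.1f} GB")
```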
>>
Why exactly is it that flux latent previews look so much less noisy than SDXL (pony)'s at the same resolution?
>>
>>101759355
swap between fp8 and fp16 in-between each step
>>
>>101759360
my guess would be that the DiT architecture (flux) is just more stable than a unet architecture (SDXL)
>>
File: FLUX_00033_.png (1.12 MB, 1152x896)
>>
>>101759346
>here is an impossible task that has clearly been defined as impossible
>What, you didn't complete the impossible task? LMAO I WIN
>>
>>101759352
thanks
>>
>>101759029
>It feels like there hasn't been a real breakthrough in usability since controlnet.
Huh? Off the top of my head:
>a ton of face and style transfer adapters
>realtime painting
>relighting
>camera positioning
>ready made 3D rigs for posing, emotions, proportions etc
>neural rendering of the entire scene
>superior workflows to get shit done, like Krita Diffusion
Nowadays, you can simplify your entire workflow to some basic operations in Krita:
- gen or find your look, make a lora
- gen some poses, or find them in google
- copy them to openpose, tweak
- transfer your style/faces/looks
- draw some color smudges
- get some basic prompt
- inpaint and compose
- render
- repeat
That's it, with this you do arbitrarily complex compositions, and never look back to A1111/ComfyUI (let it remain a backend, leave frontend to something competent like Krita). To combine things though, you need better models.

In actual hard cases and for extreme character consistency you can use a full-fledged 3D rig and depth maps.
>>
>>101759346
How do you think the thousands of celeb LoRAs on civit are made you mongrel?
Obviously many are bad, but some people actually know what they are doing
If it works with a photorealistic checkpoint of a real world person, I'm sure I could manage the same for a little comic character
>>
File: FD_00076_.png (1.09 MB, 1024x1024)
>>101759317
>really just hoping emphasis and de-emphasis in prompts becomes possible
Yeah, all this does is add noise to the image, it doesn't actually emphasise anything.
But, while I understand how it would be great, I REALLY don't want to go back to the 1.5 ways of generation.
>>
>>101759395
completely missed the point, classic
>>
>>101759317
Exclamation points.
>>
>>101759346
>no proof
I'm sorry, did you want proof? Am I supposed to collect a few dozen images of a character and make a LoRA for you just to prove to you, the man who is too lazy to use full stops and capital letters, that LoRAs are useful in creating consistent characters?
No.
>>
>>101759410
moron
>>
>>101759407
what would be a good way to catch up for someone who dipped out after SD2 because of stability AI self sabotaging its own models?
i'm fine with just lurking, not really asking for spoon feed
>>
>>101759436
I already know it can't make consistent characters, I just want you to do it so I can laugh at you as the outfit changes every picture.
>>
File: file.png (2.35 MB, 1024x1024)
>>101759431
i think i tried that and it didn't make a huge difference, but i'll do a bit more testing
>>
File: FD_00154_.png (1.65 MB, 1920x1088)
>>101759446
What's your hardware?
Everything up to and including XL works the same way as it did in 1.5, more or less. Pony is the go-to version of SD right now. If you have low VRAM or only care about making porn, this might be where to jump in.
If you have relatively decent hardware, jump directly into Flux. All the info you need is in the OP.
Flux uses natural language prompting instead of tags, the only thing you need to learn is to just describe the image exactly as you want to see it.
>>
>>101759431
Samefagging but also the MJ
term::{integer}
seems to do something but it might just be noise.
>>
File: FLUX_00036_.png (999 KB, 896x1152)
where'd everybody go
>>
>>101759498
Can you make her S a swazi?
>>
File: YWNBAW.png (1.11 MB, 1024x1024)
are there any guides on which artists' art styles FLUX can do? And on the best method for prompting?
>>
>>101759501
no u
>>
File: file.png (959 KB, 1280x864)
>>
File: file.png (863 KB, 1280x864)
>>
>>101759505
Go to an LLM of your choice and ask them to write you a prompt and study the language they use real closely.
>>
File: file.png (1.08 MB, 1280x864)
>>
>>101759542
what happens if you prompt
>I'm sorry but as an AI language model I cannot describe this picture.
>>
File: ComfyUI_temp_txvpv_00068_.png (1.43 MB, 1024x1024)
>>
File: FD_00157_.png (1.48 MB, 1920x1088)
>>101759510
I have made enough nazi girls, I am genning emaciated goths now.
>>
What is the Flux webui site?
>>
>>101759448
You are an idiot
Outfits getting picked up from the training data is a common headache when making celeb LoRAs
But it's not an issue with a character LoRA since all the training data will have the consistent outfit you want
What kind of troll are you, pretending to be retarded?
>>
>>101759559
Prompt something they can write about. Because you wanna look at how they describe an image and then repurpose that for whatever you want to do.
>>
>>101759587
yeah but what happens if you prompt
>I'm sorry but as an AI language model I cannot describe this picture.
>>
File: FD_00158_.png (1.26 MB, 1024x1024)
>>101759559
>I'm sorry but as an AI language model I cannot describe this picture.
This
>>
>>101759568
https://comfyanonymous.github.io/ComfyUI_examples/flux/
>>
>>101759580
No I just think you're a child with extremely low standards who says it's good when it's not so I want you to prove me wrong and fall on your face. Basically I think you're lying or completely lack for taste because you have to defend your stupid toy to the death.
>>
File: FLUX_00046_.png (1.47 MB, 896x1152)
>>101759559
this
>>
File: file.png (1.73 MB, 1280x864)
>>
File: file.png (1.55 MB, 1280x864)
>>
>>101759597
>refusal prompt
>gives the dog like on dalle3
what does flux mean by this?
>>
>>101759600
Thanks
>>
File: ComfyUI_temp_txvpv_00069_.png (1.5 MB, 1024x1024)
>leverage llms to elaborate on simple input prompts to take advantage of flux's t5xxl
>>
>>101759633
kek
>>
>>101759601
I'm not doing your work for you
The other day I took 5 pictures of a stockphoto model girl I like
Her sweater shows up in every gen of course
I actually had to do extra work so I can gen her with other outfits
If she had been a comic character I would have been done, the sweater would just be part of her
What is so hard to understand about this? Dozens of new celeb LoRAs are uploaded to civit every day that do the same thing
>>
File: file.png (1.45 MB, 1280x864)
>>
>>101759661
Fantastic gen
>>
>>101759670
Disengage. The man is clearly parasitizing off of (You)'s and has chosen you as his victim.
>>
>>101759661
I wish T5 would be able to get simple prompts though, going for the boomer prompt each time is boring
>>
>>101759661
At this point aren't you just copy/pasting between generative models while the AI does all the creativity?
>>
>>101759688
that's because they trained on AI slop captions and didn't vary the lengths
>>
>blue and brown
im nooooticing, the pattern recognizing brain is kicking in...
>>
>>101759637
no problem
do we know for sure if flux works better with natural language or is everyone just using LLMs to generate prompts now and we dont actually know that's better for sure?
>>
File: ComfyUI_02549_.jpg (738 KB, 2304x1792)
>>
>>101759696
>that's because they trained on AI slop captions and didn't vary the lengths
that's another reason to make a finetune that adds more trivia (anime characters, celebrities, drawing/painting styles) with simple prompts instead of going for LLMs. you'd kill two birds with one stone because you'd also make the model understand simple prompts
>>
File: FD_00161_.png (1.15 MB, 1024x1024)
>prompt: A self portrait
guys this is what Flux looks like :3
>>
>>101759718
looks like cfg is a touch high
>>
>>101759723
I'll make the logo
>>
File: ComfyUI_02550_.png (1.45 MB, 1152x896)
>>
File: file.png (193 KB, 1280x864)
>>
File: FLUX_00041_.png (924 KB, 896x1152)
>>
File: FD_00162_.png (2 MB, 1920x1088)
>>101759741
Thanks. She's great
>>
File: ComfyUI_temp_txvpv_00072_.png (1.35 MB, 1024x1024)
>>
>>101759772
I am so angry that she is eating the pizza sideways. What a fucking bitch.
>>
>>101759695
Writing a 2500 characters long prompt takes too much time. Besides, 1girl, (best quality, HDR, 4k, by Greg Rutkowski:1.4) isn't the best way to flex your creativity.
>>
https://xcancel.com/HalimAlrasihi/status/1820918388002009363#m
How much vram is this gonna ask? If it also works fine on fp8 like flux then maybe we're back
>>
is it possible that the t5 encoder has some form of content filtering trained into it, making jailbreak prompts like with early Dall-e a possibility
>>
File: file.png (853 KB, 1280x864)
>>
File: FLUX_00042_.png (1.49 MB, 1024x1024)
>>
>>101759772
>anime sucks
>posts anime
>>
>>101759840
nope it's raw, same one you can use on an uncensored Pixart model, all it does is convert the text you write into embeddings that can be used to condition the diffusion model, it's completely uncensored and probably knows all sorts of horrific shit
>>
>>101759840
no, the censorship is directly on the DiT model, not on the text encoders, because that would be easy as fuck to jailbreak otherwise
>>
>>101759857
lol
>>
File: ComfyUI_01002_.png (1.73 MB, 1024x1024)
>A blitzball stadium
Oh man, flux doesn't know anything about final fantasy X :(
>>
File: file.png (1.06 MB, 1024x1024)
>>
File: ComfyUI_00889_.png (1.06 MB, 1280x864)
>>
File: FD_00173_.png (1.07 MB, 1024x1024)
>>101759858
>>
File: FD_00172_.png (1.1 MB, 1024x1024)
>>101759918
Describe a Blitzball stadium without using the word Blitzball.
>>
File: Flux_00203_.png (1.47 MB, 1024x1024)
>>
>>101759594
Pretty far out lol.
>>
File: file.png (1.76 MB, 1024x1024)
>>
File: Flux_00207_.png (1.47 MB, 1024x1024)
>>101759964
>>
File: Flux_00208_.png (1.47 MB, 1024x1024)
>>101759975
>>
File: ComfyUI_01004_.png (1.56 MB, 1024x1024)
>>101759945
>Hatsune Miku performing on a futuristic, spherical water-filled arena suspended in the air. The massive transparent globe is surrounded by stadium seating. Inside the watery sphere, Miku is singing and dancing while suspended in the liquid, her long teal pigtails flowing around her. The scene has a sci-fi sports venue atmosphere, with cheering crowds and colorful lighting effects.
That's better, but that will never be as close as a concept the model learned by itself, I hope the finetunes will focus on adding more trivia into it
>>
>>101759986
getting the trivia and captioning it will be a lot of work
>>
>>101760012
you don't need to caption it in a complex way, some pictures described quickly like "a blitzball stadium" would suffice
>>
File: Flux_00209_.png (1.35 MB, 1024x1024)
>>101759984
>>
Why doesn't flux work with the _cfg_pp samplers?
>>
File: Flux_00210_.png (1.47 MB, 1024x1024)
>>101760037
>>
Is there a way to hide some samplers on comfyUi? flux doesn't work with a lot of them
>>
File: ComfyUI_00900_.png (971 KB, 1024x1024)
>>
File: FLUX_00047_.png (1.38 MB, 896x1152)
not great
>>
File: Flux_00211_.png (1.25 MB, 1024x1024)
>>101760048
>>
>>101759964
>>101759975
>>101759984
Do you think I can run this on my 1200W PSU?
>>
>>101760034
I don't think you understand how infinite trivia is
>>
>>101760072
say sike and mean "1200kW" instead kek
>>
>>101760086
flux knows a lot of trivia, the goal here is to finish the job, not starting from scratch
>>
File: FD_00178_.png (1.29 MB, 1024x1024)
>>101760060
I have laughed at this for several minutes. Good job, Anon.
Flux is a meme machine.
>>
>>101760096
This. These models are pretty good at filling in the gaps once you find a few examples and tag them correctly. Just needs a little training to bring them out with that tag.
>>
File: file.png (882 KB, 1024x1024)
>>101760096
I'll make the logo
>>
File: ComfyUI_00904_.png (958 KB, 1024x1024)
>>101760100
It's fucking impossible to get people around elmo to not appear as muppet parodies of themselves.
>>
File: Flux_00213_.png (1.42 MB, 1024x1024)
>>101760072
probably not bro
>>
>>101760096
Anon, you can't "finish" trivia. That's essentially asking for flux to contain the entire collective knowledge and creative artworks of humanity.
>>
>>101760132
>That's essentially asking for flux to contain the entire collective knowledge and creative artworks of humanity.
come on anon, it's not that deep, even SD1.5 has more trivia than flux, it's a drawback that needs to be fixed

And don't pretend it's impossible either, Midjourney has pretty much all the trivia people are expecting
>>
>>101760121
Try describing the people first and elmo later.
>>
File: file.png (1.57 MB, 1024x1024)
>>
>>101760138
I think loras for these concepts are perfectly adequate. No need to retrain the thing to understand an obscure sport from a 25 year old video game.
>>
>>101760138
Midjourney has people being paid full time to assemble trivia
>>
>>101760158
yet there are some papers showing that the more diversity a model has in it, the better it is overall, loras aren't the solution
>>
>>101760168
I hate normiejourney so much its unreal.
>>
>>101760174
This is the best model we have ever had locally and you are complaining about the lack of diversity about specific subjects that it has.
>>
>>101760190
Way to miss the point, I'll repeat again in case you decide to ignore it again, the more a model has diversity, the better it will be overall. I'll make it more clear so that you can understand it better, you and I will benefit from more trivia for two reasons:
- More trivia lol, seems simple enough, having to download a lora for every single concept you're thinking off is a retarded idea, we should stop relying on loras so much
- The model will be better overall, it means better quality pictures, better understanding and so on, you don't want that anon?
>>
File: file.png (1.76 MB, 1024x1024)
>>
File: 0.jpg (208 KB, 1024x1024)
>>101760070
yep
>>
File: file.png (711 KB, 1024x1024)
>>101760205
>>
>>101760205
NTA but I don't think anyone is arguing that more diversity is a bad thing. I don't think it's very realistic to expect one team to accurately tag every obscure piece of trivia into a model in a timely fashion either. LoRAs are a good compromise in the absence of what would probably be hundreds of millions in funding for tagging and research.
>>
>>101760228
like I said, I don't expect flux to know everything in existence, but c'mon it doesn't know final fantasy X and Picasso, that's a red flag, besides, adding trivia is the simplest way of improving the model's overall quality. That's two birds with one stone, it should be done at some point imo.
>>
File: file.png (912 KB, 1024x1024)
>>101760252
>it's problematic that other people didn't do this work for me
>>
>>101760252
Is me not knowing jack shit about final fantasy a red flag too?
I'm sorry your obscure sport from a game most of the humans on earth have never played is not in the model, guess you'll need to foot the bill to train it back in.
>>
>>101760260
>for me
for everyone you mean?
>>
File: FD_00180_.png (974 KB, 1024x1024)
>>101760228
Pretty much this. MJ has everything curated into 1 model. The local community is split between hundreds of models and thousands of LoRAs.
It's unrealistic to train a bunch of these concepts into a local model, and that's why LoRAs are exceptional, they are optional addons to customise your model.
>>
File: Flux_00212_.png (1.61 MB, 1024x1024)
>>
>>101760263
>he ignored Picasso to make his argument stronger
lol
>>
File: file.png (1.13 MB, 1024x1024)
>>101760266
>>
>>101760263
>final fantasy
>obscure
excuse me?
>>
>>101760275
>He added picasso to make his argument stronger while glossing over the fact only he cared about some random weeb shit from an old game.
>>
>>101760277
So you don't want the model to be better in quality because it will know more trivia? weird take but ok

>>101760284
at least you can admit that an image model not knowing fucking Picasso is a red flag?
>>
File: FD_00025_.png (1.48 MB, 1024x1024)
>>101760284
What about picasso?
>>
>>101760045
cfg pp needs cfg to work.
>>
File: file.png (972 KB, 1024x1024)
>>101760290
>>
>>101760299
no anon, adding boomer prompt to make the model understand cubism isn't the same thing as the model understanding what a picasso Style is
>>
>>101760290
Without knowing how the images were tagged and catalogued I'm not willing to admit anything. I think we can both be absolutely certain picasso is in the dataset, whether it's tagged as that though is another question. Which kind of goes back to the original point. If we want a human to accurately tag an image set full of trivia, someone needs to pay those people. Will you do it?
>>
File: Flux_00218_.png (1.15 MB, 1024x1024)
>>
>>101760317
>Which kind of goes back to the original point. If we want a human to accurately tag an image set full of trivia, someone needs to pay those people. Will you do it?
What is Laion?
>>
>>101760312
>No not like that
>>
File: FD_00028_.png (1.59 MB, 1024x1024)
>>101760290
>So you don't want the model to be better in quality because it will know more trivia? weird take but ok
Nobody is saying that, we are saying it's unrealistic.
>at least you can admit that an image model not knowing fucking Picasso is a red flag?
No because I can mimic his style without naming him.
>>101760312
what's the difference if the outcome is the same?
>>
File: Capture.jpg (254 KB, 2908x1529)
>>101760301
It still doesn't work at cfg = 3
>>
File: file.png (1.36 MB, 1024x1024)
>>101760299
>>
>>101760324
>What is Laion?
Laion is a dataset, the poor tagging of which resulted in both 1.5 and SDXL having subpar prompt understanding. Yes, it tagged artists at the expense of a lot more.
>>
>>101760330
>what's the difference if the outcome is the same?
because it doesn't work all the time, because every artists have a lot of nuances that can't be just described with words, that's a retarded take, and what about celebrities? it doesn't know how to render Messi accurately, how the fuck you're gonna manage to make it work with words? the concept exist for a reason
>>
File: Flux_00219_.png (1.48 MB, 1024x1024)
>>
>>101760344
Clit Eastwood
>>
>>101759661
What model? Have instructions to share?
>>
File: file.png (890 KB, 1024x1024)
>>101760324
LAION is forbidden
>>
>>101760330
>what's the difference if the outcome is the same?
because you're making the model understand what cubism is, not what Picasso's nuances are in it, there's a lot of artists that made cubism, and Picasso's cubism is different from another artist's cubism, and the model won't get that if it doesn't have the Picasso concept in the first place
>>
>>101760346
This really just comes back to the same question over and over again. Are you going to tag the new dataset with the artist and celebrity names to rectify this or do you expect someone to foot that bill for you? We're talking millions maybe billions of images here.
>>
>>101760369
serious question, how many pictures of a concept do you need to make it work? let's say a celebrity for example
>>
>>101760373
Like ten or twenty
>>
File: FD_00184_.png (398 KB, 512x512)
>>101760346
>because it doesn't work all the time
for you maybe
>>
File: images.jpg (6 KB, 198x254)
>>101760382
>for you maybe
Go ahead and make an accurate render of klaus schwab on flux with just words, I'll be waiting anon
>>
File: file.png (717 KB, 1024x1024)
>>101760373
dozens and dozens and like they said, ultimately millions on top of the regularization dataset so you're talking about tens of millions of 1K+ images, properly tagged, oh and they need to be based on anon's specific trivia. So early contemporary Picasso from that one day he had that one bread.
>>
when img2vid animated hentai good?
>>
>>101760367
What exactly are his intricacies and why can't you just describe them?
I honestly think using artists names is a crutch
>>
File: file.png (2.55 MB, 1024x1024)
hi guys i have a really retarded question to bump the thread with, does comfyUI store images somewhere if i want to queue up 10 in a row so i dont have to save each one individually
>>
File: Flux_00185_.png (1.14 MB, 1024x1024)
>>
>>101760401
if using words would be enough to describe any styles, then we wouldn't need loras anon, stop talking with your ass please
>>
File: FD_00192_.png (352 KB, 512x512)
>>101760394
Once a Klaus Schwab LoRA comes out I will.
>>
File: ComfyUI_00024_.png (1.19 MB, 1024x1024)
>>101760405
output directory/folder
>>
>>101760394
>it doesn't know this specific public figure
I think we can play this game forever, anon. Remember how trivia is infinite? Add another 50 images to the millions. How about you start with your list of 100k most important public figures.
>>
File: Flux_00222_.png (1.34 MB, 1024x1024)
>>101760405
it all goes into the output folder in your comfyUI folder.
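If you want to check what a queued batch produced without clicking through the UI, something like this should do it. It's only a sketch: the `ComfyUI/output` path is an assumption about where your install lives, and `newest_images` is a made-up helper name, not part of ComfyUI.

```python
from pathlib import Path


def newest_images(output_dir, n=10):
    """Return up to n PNG files from output_dir, newest first."""
    files = [p for p in Path(output_dir).glob("*.png") if p.is_file()]
    # Sort by modification time so the latest gens come first.
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:n]


if __name__ == "__main__":
    out = Path("ComfyUI/output")  # assumed location, adjust to your install
    if out.is_dir():
        for path in newest_images(out):
            print(path.name)
```

Queue 10 gens, run it afterwards, and the 10 newest files are listed with the latest on top.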
>>
If you read the subtext of the argument. It's just some lazy weeb upset that his favorite cunny artist isn't named in the dataset and isn't willing to admit he's so technically inept that he can't gather the miniscule number of images required to train a LoRA.
>>
>>101760409
It is still a crutch. Look how many prompts to this day still have "By greg rutkowski and alphonse mucha"
I stopped prompting for artists a long time ago
>>
>>101760382
>I can make it work all the time
>>101760411
>I can't make it work all the time
>>
>>101760414
>>101760416
thanks guis
>>
>>101760415
I'm not asking it to know fucking random people, but c'mon it doesn't know Messi and Ronaldo, you think this is some trivia that should be ignored? that's delusional
>>
>>101760428
I can produce cubism constantly
I can't create a person unknown to the model
Also nogen.
>>
>>101760437
Actually you are, you are talking about hundreds of thousands of public figures all completely based on your personal bias. I don't give a fuck about soccer. What about all the gymnasts in the Olympics? Where's their names on your list? What about the swimmers? What about all the astronauts that have flown to space? TRIVIA. IS. INFINITE.
>>
You guys really need to learn when to stop replying.
>>
>>101760437
You're making some wild assumptions about the resources and time available to the team.
The dataset was likely captioned using an LLM. If it doesn't know who the people in the image are, they're not gonna get tagged. If you want those people in the dataset you have two options. Get a job at BFL and manually go through their entire dataset and edit every tag to accurately display artist names and celebrity names over the course of several years or train a LoRA yourself.
I'm sorry you can generate Messi porn right off the bat but the focus of this model was clearly prompt adherence and not tagging artists and celebrities.
>>
File: Flux_00159_.png (1.46 MB, 1024x1024)
>jeets going to the meatgrinder
>>
>>101760452
>TRIVIA. IS. INFINITE.
Stop ignoring the point, it should know mainstream trivia at least, not knowing Messi is a huge red flag
>>
>>101760437
But it doesn't need to if LoRAs can be trained, that's the thing, the base model does not need to contain this information. You can add any trivia you want to a model via LoRAs. That's why they exist, it's their entire purpose.
Explain why the base model absolutely NEEDS to contain this information? And where does it end?
>The model doesn't know who Winston Peters is, dogshit model. Someone should spend $30k training him into the base model instead of $25 making a LoRA of him
>>
File: Flux_00043_.png (1.16 MB, 1024x1024)
>>
i was the first anon to bring up picasso days ago and i love to see how it's taken off into a schizo war.
>>
>>101760452
>I don't give a fuck about soccer
billions of people give a fuck about soccer, Messi is one of the most popular guys in the world, you're sounding like the guy who only wants obscure trivia in the end, how ironic is that?
>>
File: ComfyUI_00012_.png (1.18 MB, 1024x1024)
>LISTEN THIS THING IS SO SHIT IT DOESN'T EVEN KNOW WHAT MY 8TH GRADE MATH TEACHER LOOKS LIKE
>>
File: ComfyUI_Flux_02027_.png (1.19 MB, 1024x1024)
This took a filthy number of tries, and I had to describe as raising the third finger showing the back of her hand, because the model can count after all, but is it just a bad dataset on their end?
>>
>>101760478
>red flag
What does this mean? This is the best model we have had to date, what does a red flag mean to you? Are you going to go on a date with Flux?
>>
>>101760493
in 3 years this is what will be teaching 13 year old kids math
>>
File: Flux_00157_.png (1.3 MB, 1024x1024)
>>
>>101760466
>The dataset was likely captioned using an LLM. If it doesn't know who the people in the image are, they're not gonna get tagged.
it works fine on some people like Trump and Miku, they haven't used LLM all the time, and who cares? we got the model and we see its drawback, our job is to fix that

>I'm sorry you can generate Messi porn right off the bat
what the fuck? you sound like a fucking creep right now
>>
>>101760478
Mainstream to what? To you? Again, I'll say it again, everything is about you. Everything is based on your bias and preferences, you don't give a shit about anything except the trivia you care about. How much K-Pop knowledge is in the dataset? How about the top 100 most popular singers in history? Fuck off, seriously. It's fucking obnoxious seeing someone bitch about something they personally care so much about but at the end of the day they do fucking nothing to solve. I look forward to your Huggingface dataset with the top 100k public figures.
>>
>>101760492
I'll make the logo for your dataset
>>
>>101760509
>Mainstream to what? To you?
To billions of people, Messi is one of the most popular guys in the world, like I said you're the one that is more interested in obscure trivia instead of the obvious kind, not me
>>
>>101760519
I look forward to the work you put in to solve this. In the time you've been bitching you could've prepared and captioned 50 Messi pictures.
>>
>>101760519
Wait for a Lora, then you can train him fucking your mouth or whatever you want to do.
>>
File: Flux_00227_.png (1.15 MB, 1024x1024)
>>
is it possible to merge multiple loras into a model to add all the trivia from those loras?
>>
>>101760504
>our job to fix
you misspelled "your"
that's what you actually mean
let's not forget this conversation is really "anybody but me"
>>
File: FD_00204_.png (172 KB, 512x512)
I love that it's coherent at lower resolutions. Can shit out a fuck load of concepts to find a good prompt style then crank it up to get a full res one
>>
>>101760531
>you can train him fucking your mouth or whatever you want to do.
wtf? you need to take some pills anon

>>101760536
>let's not forget this conversation is really "anybody but me"
if this a joke? there's a lot of complain about how flux doesn't know enough styles, that's not a "me" problem retard
>>
>>101760534
no, loras rape the weights, they bias the weights to generate the subject of the lora
the only way to do le epin trivia fine tune is literally acquiring millions of images and do a full fine tune
>>
is "trivia" some sort of ESL speak? I think the word you want is concept/knowledge.
>>
>>101760546
>no, loras rape the weights, they bias the weights to generate the subject of the lora
what happens when we load multiple loras during inference? if some loras have weights in common, that means one is overwriting the other, right?
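For reference, a toy sketch of what stacking usually looks like, assuming the standard LoRA formulation where each lora adds a low-rank update `scale * B @ A` on top of the frozen weight. Under that assumption, updates that hit the same weight get summed rather than overwritten. Everything here (the 2x2 sizes, the `delta_style` / `delta_char` names, the pure-Python matrix helpers) is made up for illustration and is not any UI's actual loader.

```python
def matmul(X, Y):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def add(X, Y):
    """Element-wise sum of two same-shaped matrices."""
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(X, Y)]


def lora_delta(B, A, scale):
    """Low-rank update scale * (B @ A), as in the standard LoRA formulation."""
    return [[scale * v for v in row] for row in matmul(B, A)]


# Frozen base weight (toy 2x2 identity).
W = [[1.0, 0.0], [0.0, 1.0]]

# Two rank-1 loras (B is 2x1, A is 1x2) that both touch entry [0][0].
delta_style = lora_delta([[1.0], [0.0]], [[2.0, 2.0]], scale=0.5)
delta_char = lora_delta([[1.0], [1.0]], [[4.0, 0.0]], scale=0.25)

# Loading both adds both updates; the load order does not matter.
merged_ab = add(add(W, delta_style), delta_char)
merged_ba = add(add(W, delta_char), delta_style)
```

So in this formulation nothing is overwritten, the shared weight just gets pushed by both updates at once, which is also why stacking several strong loras can drag the output around. Baking loras into a checkpoint is the same sum saved to disk.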
>>
>>101760544
you have a really bad habit of appealing to imaginary people when you actually mean you
this type of argument might work with your parents
>>
>>101760548
>is "trivia" some sort of ESL speak?
trivia is a valid dictionary english term you ESL fuck
https://dictionary.cambridge.org/dictionary/english/trivia
>>
>>101760555
Keep coping nigger, and keep pretending this problem doesn't exist
https://reddit.com/r/StableDiffusion/comments/1els4li/flux_really_bad_at_styles/
https://reddit.com/r/StableDiffusion/comments/1ek412x/maybe_i_found_a_way_to_direct_the_style_in_flux/
https://reddit.com/r/StableDiffusion/comments/1ejilpj/flux_lack_of_artists_use_auto1111_img2img_to/
>>
>>101760569
oh no, anyways
what are you doing about this?
>>
>>101760574
Moving the goalpost I see.
>>
File: 1721466381825210.png (1.4 MB, 1024x1024)
>>101760469
>cletus going to live the rest of his life poor and uneducated while his superiors take over his country
>>
>>101760469
is gore allowed on a blue board?
>>
>>101760578
>please someone do this thing for me everyone cares about it
>>
>>101760594
>me
>everyone
choose one anon
>>
File: Flux_00230_.png (1.22 MB, 1024x1024)
>>101760590
is it gore when the prompt contained "ketchup" and not "blood" ?
>>
>>101760597
You see, you like to exaggerate when making arguments that personally benefit you. It's a bad argument tactic of a weasel.
>>
File: FLUX_00060_.png (1.17 MB, 896x1152)
>>
>>101760605
>that personally benefit you
Johny Johny, telling lies again? >>101760569
>>
File: FD_00214_.png (919 KB, 1024x1024)
>>101760569
>reddit
oh
>>
yeah flux just looks shitty sorry, very greasy and smooth-plastic. limited character knowledge is lame too. has anyone compiled a list of characters it actually does work for? i've seen the same miku, mario, and trump over and over again. making fake products is fun tho
>>
>>101760618
Yes anon. And at the end of the day you're trying to convince people like me to do the work for you. You're not doing anything yourself outside of bitching. You're basically that one asshole who guilt trips someone with a car because all your friends want to go to the mall. Anyways, I'll make the logo.

Last reply.
>>
>>101760605
I don't know why you pretend the lack of style diversity isn't a popular concern, just on this thread alone there's a lot of talk about it
>>101758997
>>101758527
>>101758568
>>101758568
>>
File: Flux_00231_.png (1.17 MB, 1024x1024)
>>101760603
>>
>image models are currently a choice between:
>great coherence and prompt following with a slopped soulless style
>good knowledge of styles and artists but shitty coherence and only vague following of the prompt
grim
>>
>>101760629
this one thread alone is one asshole complaining really loud while also doing no work
>>
>>101760650
>one asshole
>one
insane amount of copium
>>
>>101760658
oh I see you're one of the friends wanting to go to the mall, where's your car again anon?
>>
>>101760646
why can't it be both? desu the work should've been done on the captioner models, they should know to recognize artists and shit instead of going for "a woman" and "a man"
>>
>>101760670
>somebody (who isn't me) should work on the caption models
>>
>>101760667
you don't see the irony there? you're pretending this issue isn't important because you're lying to yourself to the fact there's only one guy concerned by the lack of styles, like I said, keep coping and pretend that flux is perfect, not a lot of people will agree with that stank take but you do you
>>
>>101760678
>if a restaurant serves you dogshit on a plate, you're not allowed to complain about it unless you're a professional chef yourself
>>
>>101760685
>if someone gives me free food I throw it on the ground and ask why it's not gourmet
>>
>>101760678
>no one should be allowed to complain about anything, ever
>>
wtf does guidance do????????
>>
>>101760670
The weakness has always been the captioning because it's expensive to label data consistently and correctly. Finding "high quality images" to train on is comparatively easy.
>>
>>101760684
I'm pointing out that people like you complain about everything and never do anything about it. If it knew Messi you'd complain about Picasso. If it knew Picasso you'd complain it doesn't know William Shatner.
>>
File: Donkey Kong.jpg (59 KB, 1056x516)
style will almost have to be tuned in going forward because local bakers are too scared/lazy to scrape art sites. after the artist backlash none of them want to actually put artist names in their model. if you want styles you'll be waiting around for lora and finetunes.
it's either:
A: styles get removed by bakers intentionally
or B: styles/characters get washed away thanks to primitive local captioners unable to recognize them
then you also have the architecture which has its own 'style' issues built in (see: why every recent model ie sd3, flux, auraflow all look so greasy compared to older ones)
>>
>>101760695
ask in the other thread. we're busy
>>
>>101760690
it's not free, our hype give this company money through investors, they wouldn't get those millions of there wasn't users interested by their product, if it's free, you're the product, can't believe I have to explain that but if I can educate some retards on that matter at least I should try
>>
>>101760701
yeah but imo, the most work should've been done on the captioner, imagine you have a perfect captioner model that can recognize artists, styles and celebrities out of the box, that would mean that you wouldn't need to do the captioning work anymore each time you want to train a new model
>>
>>101760711
there's no other thread, stop fapping to gay furries for one second and answer the question
>>
>>101760684
The issue literally isn't important if we can train LoRAs. That's what LoRAs do, they add trivia to a model.
>>
File: file.png (767 KB, 1024x1024)
>>101760712
>>
File: file.png (1.11 MB, 1024x1024)
>>101760724
>>
>>101760684
Nobody is saying flux is perfect, we wouldn't be talking about fine tunes if it was. The issue comes with the unreasonable expectation that the model be both cohesive and be tagged with every unique concept possible.
The reasonable people understand that work needs to be done to achieve that. The unreasonable people expect it handed to them on a silver platter.
>>
>>101760730
How many times do I have to repeat that? Loras shouldn't be the solution to all our problems, it's a hack
>>101760174
>yet there's some papers showing that the more diversity a model has in it, the better it is overall, loras aren't the solution
adding trivia to a model doesn't just give it more knowledge, it makes it better at image quality because it has more diversity in training, we would get a flux with better quality images if it fucking knew how to render Messi correctly
>>
>>101760724
Yeah but then you need good captioned data to train the captioner on. It's a catch 22
>>
File: file.png (862 KB, 1024x1024)
>>101760758
>>
>>101760755
Flux as far as quality goes is already extremely high quality. What more are you looking for exactly?
>>
>>101760758
yeah but you only do it once on the captioner model, and then you'll be able to use it on each model training, everyone would benefit from that, wanna train a model, just use this perfect captioner to do the captioning job for you and boom
>>
File: FD_00224_.png (1.06 MB, 1024x1024)
>>101760755
>it's a hack
Nigger this whole fucking imagen is a hack what the fuck are you on about.
What you want is a single model that knows every concept known to man to be given to you for free.
I will tell you what it does know, the reddit logo, which is where you need to be. Literally nobody in here agrees with you.
>>
File: file.png (819 KB, 1024x1024)
>>101760774
>>
>>101760773
I never said it's not high quality, I said it could've been even better, who wouldn't want that?
>>
File: file.png (731 KB, 1024x1024)
>>101760784
>>
File: Bt8zq8QCcAAgng1.jpg (40 KB, 475x356)
>>101760779
>Literally nobody in here agrees with you.
>>
>>101760784
We all want that. But you keep brushing off the fact that there is a cost associated with that.
>>
>>101760797
That's a cost he's willing to have you pay.
>>
>>101760797
I never pretended it would be easy, where did I say that? stop saying random stuff on my behalf please?
>>
>>101760805
>I know it's not cheap but I figured your pockets are deep
>>
>>101760724
dalle style knowledge heavily hints at it not being pure autocaptions, as some of them seem so niche that they would never be picked up (blazblue/bomberman/tales artstyle). if it actually was all autocaptioned then they must have the world's most godly tagger. there needs to be some way to preserve the image's origin/character without it being washed away entirely by ai captions. the captions should enhance the original info, not replace it.
>>
>>101760799
>no one is allowed to complain about everything, ever
whatever you say flux shill
>>
>>101760805
So if you know it's difficult, expensive and time consuming, why are you putting the onus on BFL to do that work for you? We got a pretty good model out of their effort, it will probably only get better.
>>
File: FD_00352_.png (2.23 MB, 1024x1536)
>>101760774
You are welcome to manually caption 1 billion images to get it started.
Though I have been thinking about this exact thing, not that I would ever do it, far too lazy.
But what about a site that shows people an image and a caption generated by an LLM. A human either agrees or disagrees with the caption. Any disagrees go to a human moderator for confirmation, and if someone tries to poison the well they get shadow banned, they still see images but they don't get submitted for approval.
Then give it to a bunch of autistic people to curate a dataset.
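Roughly what that review loop could look like, as a sketch only. The `CaptionReview` class and its method names are invented here, and two things are assumptions on top of the description above: an "agree" is accepted without a second check, and a single disagree that the moderator overrules is enough to get shadow banned.

```python
from dataclasses import dataclass, field


@dataclass
class CaptionReview:
    """Toy model of the agree/disagree caption review site (hypothetical)."""
    shadow_banned: set = field(default_factory=set)
    approved: list = field(default_factory=list)
    moderator_queue: list = field(default_factory=list)

    def vote(self, user, image_id, caption, agrees):
        # Shadow-banned users still get to vote, but nothing is recorded.
        if user in self.shadow_banned:
            return "ignored"
        if agrees:
            self.approved.append((image_id, caption))
            return "approved"
        # Any disagreement goes to a human moderator for confirmation.
        self.moderator_queue.append((user, image_id, caption))
        return "escalated"

    def moderate(self, caption_was_wrong):
        """Moderator resolves the oldest escalated vote."""
        user, image_id, caption = self.moderator_queue.pop(0)
        if caption_was_wrong:
            return "needs_recaption"
        # Assumption: one overruled disagree counts as poisoning the well.
        self.shadow_banned.add(user)
        self.approved.append((image_id, caption))
        return "user_shadow_banned"
```

A real version would want some tolerance before banning, since honest reviewers will sometimes disagree with a caption the moderator ends up accepting.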
>>
>>101760817
Actually yes, people hate people who complain and do nothing. You plan on doing nothing, it's obvious now.
>>
>>101760821
>why are you putting the onus on BFL to do that work for you?
again, you're talking random shit on my behalf, did I say it should be BFL to finish the job?
>>
>>101760828
>I thought if I complained enough you would drive me and my friends to the mall, guess I was wrong.
>>
>>101760826
I'll keep complaining and you'll keep seething about it, how about that?
>>
>>101760828
It's VERY strongly implied you expect them to undertake that work on your behalf. And if that wasn't your intention then you're probably a retard.
>>
>>101760846
nigger you've spammed the thread with way more whineposts than the other guy at this point
>>
>>101760840
>guess I was wrong.
good to know you admit that at least
>>
>>101760841
Anon in case you haven't noticed I'm taking the piss out of you because I don't care nearly as much as you do. And you see, I get my way doing nothing. You don't.
>>
>>101760858
>I don't care nearly as much as you do.
you care enough to spend hours arguing with someone who DARES to say flux could be improved, lol
>>
File: file.png (1.1 MB, 1024x1024)
>>101760857
>>
>>101760854
>Waah why didn't BFL tag inverted nipples and clit piercings on their general purpose public use model?! They should do it all again by hand this time!
>>
>>101760872
can't wait for another episode tomorrow where you complain about Picasso
>>
>>101760815
why not just take every image in the dataset, reverse search it, and then ask an LLM to summarize the results if the image contains any figures, characters, or was illustrated by an artist?
>>
>>101760879
nta, take your meds
>>
File: straw-man-argument.jpg (217 KB, 1939x1425)
>>101760879
>anons: "flux doesn't have enough styles"
>you: "what you mean to say is that you want BFL to tag inverted nipples and clit piercings!"
Holy strawman
>>
>>101760872
Saying flux could be improved and what you're doing are not the same thing.
Everyone agrees that flux can be improved. The only difference is you refuse to accept that someone is going to have to foot that bill, and every time that fact is brought to your attention you immediately deflect and say
>But I just want it to be better, someone should tag messi.
>>
>>101760885
can't wait for another episode of you pretending to not care about flux criticism while arguing for hours with them
>>
>>101760903
>Everyone agrees that flux can be improved.
so you also wait for someone to "have to foot the bill" to improve the model, see we're not so different after all
>>
>>101760900
Imagine being this offended by a caricature of your arguments. If you think it's an ugly representation it's only that way because it's an inflated example of what you're already saying
>>
this place used to be chill before all the tourists came here
>>
>>101760919
No, were not. I just didn't walk into this thread expecting the model to know nigger ball from final fantasy 17 or what the fuck ever set you off.
>>
>>101760920
Just admit you did some strawman because you like to argue in bad faith, it's not that hard
>>
>>101760825
It's probably easier just to read a billion AI generated captions yourself lol.
>>
neggles (flux side-dev) posted here before marking this place as a confirmed feedback hub therefore all complaintposting is justified. for even a 1% chance the devs see it, it's worth it. make your feedback known!
>>
>>101760930
but you expect the model to be better, therefore expect someone to burn some GPU money for you, how is that different from me anon? tell me
>>
>>101760935
He posted here until he got spammed links to images of a dead baby.
>>
>>101760941
I'm not upset by the fact that BFL didn't bake those niche concepts in the model in the first place. I understood it and moved on.
>>
>>101760879
They should. It's their shit lol.
>>
File: FLUX_00067_.png (1.07 MB, 896x1152)
>>
File: FD_00229_.png (1008 KB, 1024x1024)
I ran this 10 times and every time it spelled the word "portray" wrong.
>>
>>101760964
>>
>>101760929
That's the worst bit, these people are the regulars.
>>
>>101760919
the difference is you initiated the bitching
>>
>>101760954
for the fourth time, adding detail not only makes the model more knowledgeable, it also enables the model to produce higher quality. Everyone benefits from having a model producing better images overall, why are you against something like that?
https://arxiv.org/abs/1807.01477
>In addition, the typical applications where the diversity technology improved the machine learning performance have been surveyed, including the remote sensing imaging tasks, machine translation, camera relocalization, image segmentation, object detection, topic modeling, and others.
>>
File: file.png (851 KB, 1024x1024)
>>101760988
>please take me and my friends to the mall it's been shown that social activities make people happy
>>
>>101761011
Holy Moving the goalpost.
>>
>>101758249
I'm an old SD user. I'm just getting back into the scene now. Is SD obsolete? Is ComfyUI the way of the future?
>>
>>101760988
What the fuck why won't you just accept I think that is an unreasonably large undertaking for the team at BFL and it would be naive to expect them to tag accurately in that much detail.
No shit better tagging with more concepts would improve the model. Literally nobody here disagrees with that fact. That's not the issue though.
>>
>>101761031
>No shit better tagging with more concepts would improve the model. Literally nobody here disagrees with that fact.
then what are you arguing for? that was my point from the beginning
>>
>>101761022
>studies have shown that people who go to the mall are far happier, are you trying to make me sad anon? just drive the car
>>
the salient visual differences between e.g. two different styles of classical painting are not a "niche" concept for a model designed to generate images
unless you're interested in creating a model that's only useful for photographs or memes/jokes, which to be fair to the BFL team they may have been
>>
File: 5a8bb.png (1.03 MB, 1024x1024)
>>
>>101761037
okay, and what are you going to do about it again? I do enjoy anons who have an external locus of control because they just spin in circles.
>>
>>101761037
Because you keep implying that BFL should have done it from the start and then brush off the cost of doing that. Look. We're back to where we were half an hour ago. This is fucking retarded. I'm done.
>>
>>101760988
Literally nobody is disagreeing that more data in a model is more betterer. We are arguing that doing so is unrealistic, and the more reasonable solution is LoRAs.
You are most welcome to spend the 30K and several months it will take to train these concepts into the model. We would all be extremely appreciative if you do.
I on the other hand, am going to wait for people to figure out LoRA training, then train a bunch of concepts to share with the community for a fraction of the price. Then people can pick and choose what concepts they believe are missing and add them.
>>
File: ComfyUI_02560_.png (1.29 MB, 1024x1024)
>>
>>101761066
>Because you keep implying that BFL should have done it from the start
I didn't imply anything, don't misinterpret stuff and stay on the factual posts, thank you
>>
>>101761074
Are you having a stroke?
>>
>>101761068
>Literally nobody is disagreeing that more data in a model is more betterer. We are arguing that doing so is unrealistic, and the more reasonable solution is LoRAs.
Again, I'm not asking to make a finetune that will add all the trivia in the universe, but it should add some trivia at least, that would help
>>
>>101761024
go away. we're busy
>>
File: file.png (887 KB, 1024x1024)
>>101761087
Well somehow I doubt your request ends with Messi and Picasso. You do another classic manipulation technique "minimization".
>well anon the mall really isn't even that far, it's not like I asked you to take us 500 miles away
>>
>>101761087
I didn't say "every piece of trivia in the universe" did I. Just choose the ones you think are missing and go ahead and fine tune it in to the model.
>>
>>101761087
>Not every concept, just final fantasy and Messi
Jesus Christ. Are you a real human being?
>>
>>101761102
>he keeps missing the point
For the 5th time, adding more trivia will improve the quality, it's not about adding messi or ronaldo, it's about the fact that adding messi and ronaldo will make the model produce better quality images, what's not clicking? >>101760988
>>
File: sd15vsXLvsFlux.png (3.34 MB, 2500x880)
>>101760954
this. i realized on day 2 that flux had limited character/styles and it wasnt as simple as "dall-e 4 at home!". i simply left the complaint elsewhere and moved on to genning the stuff it's good at. threadshitting about how bad it is here doesnt change anything. go make some epic comparison reddit post about what flux fails at compared to midjourney/dalle and maybe the devs will notice or something.
or go back to peacefully waiting for the next toy (bigma)
>>
File: FD_00189_.png (371 KB, 512x512)
>>101761115
Don't forget picasso.
>>
>>101761115
So you're having a stroke because someone wants to add the Pablo Picasso concept on a model? Are you a real human being?
>>
>>101761122
>really anon you're really getting practice driving your car, it's really for you why we're going to the mall
>>
>>101761138
>really anon, who cares about the quality of a picture after all?
>>
File: ComfyUI_00764_.png (1014 KB, 1024x1024)
>>
File: 1722985155727211.jpg (220 KB, 1024x1024)
>>
>>101761137
It's not just Pablo Picasso, the dude clearly has a list of things he considers important and when grilled on exactly what concepts should take priority, it's basically whatever he wants in that moment.

>>101761152
We all want the model to be better.
>>
File: ComfyUI_00927_.png (1.07 MB, 1288x848)
>>
>>101761173
>We all want the model to be better.
good, and a way to make the model better is to add more diversity data into it, so again, what are you arguing for? we both agree with the goal we should have for flux
>>
File: ComfyUI_00767_.png (1.44 MB, 1024x1024)
>>
>>101761137
This push pull thing you do isn't really working anon. You see, at the end of the day your request is a heavy burden. What's great about your examples is they're basically random and from completely different domains; they more or less can be interpreted as infinite, because without reading your mind, a Venn diagram between Messi, Picasso and Blitzball effectively covers millions of trivia concepts. And you do realize that the world is more than just you, right? So any argument about making an image model more knowledgeable, given your examples, can only be read as suggesting the dataset should encompass potentially tens of millions of tagged trivia images.
>>
>>101761183
Agreed, and who is going to do the tuning to improve the model?
>>
>>101761173
>We all want the model to be better.
>>101761189
>who is going to do the tuning to improve the model?
You also want the model to be better, so I'll ask you this question as well, who is going to do it anon?
>>
File: file.png (1.05 MB, 1024x1024)
>>101761189
ideally you, hope you understand
>>
File: never goon.png (2.8 MB, 2048x1024)
>>101761187
personally my domain starts and ends at minions. when will loras be out so i can fix this?
>>
>>101761194
>who is going to do it anon
I don't know, I don't have the computer nor the funds to improve the model. I want it to be better, but I think it will take time and incremental improvements to achieve a better model, because we aren't even really sure how the model was captioned.
>>
>>101761194
yes anon, we understand anybody but you, you're a helpless victim in all this with limited resources, of course you need someone else to take you to the mall, you can't help you need Picasso style pictures of Messi
>>
>>101761196
>I want the model to be better, and when I say it, it's legit and fair, but when another anon says the same exact thing, then he's being an entitled retard
weird flex but ok
>>
>>101761215
The crux of the argument is one side is saying they'd like it, the other side is demanding it be done on their behalf.
>>
File: ComfyUI_00771_.png (1.15 MB, 1024x1024)
>>
>>101761218
>the other side is demanding it be done on their behalf.
I never said anything remotely close to that, why do you say random shit on my behalf, it's getting exhausting
>>
File: file.png (790 KB, 1024x1024)
>>101761226
so you're complaining with no objective? are you a compulsive liar and manipulator?
>>
>>101761226
If we both agree the model can and should be improved, and we both agree it would be expensive, and I decide to leave it there, yet you keep pressing on how the model can be improved, then the implication is that you expect someone to do it for you.
Am I wrong in that assumption?
>>
File: ComfyUI_00772_.png (1.58 MB, 1024x1024)
>>
>>101761234
>so you're complaining with no objective?
like everyone else, including you
>>101761173
>We all want the model to be better.
>>
>>101761239
>If we both agree the model can and should be improved
You want the model to be better without expecting someone to do the job? How does that work? Flux won't improve by itself
>>
File: ComfyUI_00932_.png (1.09 MB, 1288x848)
>>
>>101761262
Who does that job and who pays for it?
>>
>>101761241
>>
File: file.png (1.12 MB, 1024x1024)
>>101761239
No, he didn't say that literally, so any assumption about his motive is bad faith because it makes him look bad. He's a helpless victim in all this, the Flux model doesn't make Messi playing Blitzball in the style of Picasso in the early contemporary period and he has the right to tell everyone how this is bad, and by golly he never meant that someone should do something about it.
>>
>>101761275
I'm asking you this question as well anon, you want the model to be better, who will do the job so that it fits your expectations?
>>
>>101761245
my objective is to make you shut up and go away or start a spreadsheet of characters, concepts and trivia or something
>>
>>101761292
>seething this hard
>>
Everybody wants to improve the model, have you considered fresh bread?

>>101761268
>>101761268
>>101761268
>>
>>101761289
I don't know. I know I WANT it to be better, I have ideas on how it can be better. But I understand there is a practical and financial cost involved with that. That is where my argumentation stops.
>>
somehow this thread managed to go 4 months with retards happily gaslighting themselves into thinking bottom-of-the-barrel chinkslop like hunyuan and pixart was good, so lets all just calm down and wait at LEAST another month for (no) finetunes before threadshitting about flux.
>>
File: ComfyUI_00782_.png (1.18 MB, 1024x1024)
>>
File: file.png (1.06 MB, 1024x1024)
>>101761301
>deflection
Anon, you are a sociopathic manipulator.
>>
>>101761310
>I know I WANT it to be better, I have ideas on how it can be better. But I understand there is a practical and financial cost involved with that. That is where my argumentation stops.
I also have ideas on how it can be better (more data diversity) and I also understand there is a practical and financial cost involved with that. We just spent hours arguing on something we both agree on at the end, isn't that funny?
>>
File: FD_00255_.png (965 KB, 1024x1024)
>>101761205
>>
File: ComfyUI_00783_.png (1.18 MB, 1024x1024)
>>
>>101761327
ok now i see the plastic
>>
>>101761325
Cool, let's end it there.
>>
>>101761327
leg day


