/g/ - Technology

File: tmp.jpg (785 KB, 3264x3264)
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>101905755

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>
SaaS in the OP I did it again
-the rizzler
>>
Blessed thread of frenship
>>
File: 12gb-4hours.png (181 KB, 3360x792)
>>101908297
FOUR HOURS FOR 1600 STEPS, VRAMLET LORA BROS, AND ITS USING AT MAX 7GB OF MY GPU

ABSOLUTELY
DO-ABLE

I'M WALKING ON SUNSHIIIIINE WAAOOOOOHHH
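
As a rough sanity check on the numbers above, the per-step cost follows from simple division. A minimal sketch, assuming the run really was 1600 steps in about four hours; the function names and the 3000-step run are made up for illustration:

```python
def seconds_per_step(total_hours: float, steps: int) -> float:
    """Average wall-clock seconds spent on each training step."""
    return total_hours * 3600 / steps


def projected_hours(sec_per_step: float, steps: int) -> float:
    """Estimated wall-clock hours for `steps` steps at the same pace."""
    return sec_per_step * steps / 3600


pace = seconds_per_step(4, 1600)
print(pace)                         # 9.0 seconds per step
print(projected_hours(pace, 3000))  # 7.5 hours for a hypothetical 3000-step run
```

At that pace, anything up to roughly 4800 steps would still fit inside a 12-hour window.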
>>
can i train flux on 12gb
>>
>>101908529
>>101908519
>>
>>101908529
loras? yes see >>101908519
>>
https://github.com/city96/ComfyUI-GGUF/commit/88fb6fa0014850615ca5b3e0ec1c018f67319237
That new commit is supposed to improve Q8_0 and Q4_0 quality. I did some tests with Q8_0; here's the comparison between Q8_0 (old), Q8_0 (new) and fp16:
https://imgsli.com/Mjg4MDMz/0/1
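
For anyone wondering where the small differences from fp16 come from, here is a minimal sketch of Q8_0-style block quantization in plain Python. It assumes the usual GGML layout (blocks of 32 weights sharing one scale, values stored as signed 8-bit integers) and keeps the scale as a regular float instead of fp16 for simplicity. It is not taken from the linked commit and does not show what that commit changed.

```python
import random

BLOCK_SIZE = 32  # Q8_0 groups weights into blocks of 32 values


def quantize_q8_0(block):
    """Quantize one block to (scale, list of ints in [-127, 127])."""
    assert len(block) == BLOCK_SIZE
    amax = max(abs(x) for x in block)
    if amax == 0:
        return 0.0, [0] * BLOCK_SIZE
    scale = amax / 127  # largest magnitude maps to +/-127
    quants = [max(-127, min(127, round(x / scale))) for x in block]
    return scale, quants


def dequantize_q8_0(scale, quants):
    """Reconstruct approximate weights from the scale and the integers."""
    return [scale * q for q in quants]


rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(BLOCK_SIZE)]  # made-up weights
scale, quants = quantize_q8_0(weights)
restored = dequantize_q8_0(scale, quants)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(f"scale={scale:.6g} max_abs_error={max_err:.6g}")
```

The rounding error per weight is at most half a scale step, which is why Q8_0 output tends to sit very close to fp16 and differences show up only in fine details.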
>>
>>101908519
such terrible utilization
>>
File: file.png (1.44 MB, 768x1152)
>>
>>101908455
>lightning cat made it in
lel
>>
File: FLUX_00016_.png (1.16 MB, 896x1152)
when you failed 9th grade for the 38th time and need to take the edge off
>>
>>101908547
looks like a good improvement, nice
>>
hhhhehehehe
>>
File: file.png (1.65 MB, 896x1344)
>>101908547
i love city
>>
File: fs_0898.jpg (91 KB, 936x1280)
>>
>>101908547
Hmm, is it just me or does the old one still look closer to FP16 than the new?
>>
File: sonic.png (1.08 MB, 1024x1024)
>>101908123
>cute sonic
Aw, mentioning sonic ruins the style completely. Right after I'm at the level where I can just mention sonic to get him in there I realize I have no way to make him appear with Sunflower Hedgehog...
Maybe in another year and a half...
>>
>>101908606
yeah I feel that way too, weird
>>
>>101908549
It is, but I hardly care. If I can make flux loras on my vramlet card under 12 hours I am HAPPY
not hours ago I was ready to give up and thought this would be impossible
you guys have no idea how fucking ecstatic I am right now, I feel like I'm floating
>>
File: mfw.jpg (204 KB, 1024x1024)
>>
File: fs_0920.jpg (97 KB, 936x1280)
>>
>Update Forge
>Now does some weird shit called patching loras between every single generation, basically doubling the generation time
>Roll back to a previous version,
>Generates pics applying loras from start without any of that patching bullshit from scratch between the pics

Why the fuck would anyone make this kind of a change and think it's a superior solution to what was there before.
>>
>>101908677
I wonder if that might be why i couldn't reproduce a single one of my old gens from the previous forge
god what a mind numbingly stupid update should've kept this shit out of date/dead.
>>
File: file.jpg (368 KB, 3744x2096)
>>101908606
definitely not, look at the phone and the weird lines around the hair in the old one
>>
>>101908677
Trust in the plan. The anons on this board have assured me that Ilyas is a genius who can do no wrong.
>>
File: fs_0928.jpg (122 KB, 936x1280)
>>
File: ifx36.png (1.35 MB, 1024x1024)
>>101908623
ImageFX is next lefel for that stuff
>>
>>101908455
Running T5 on cpu takes AGES on comfy.
I want to try forge, but I don't want it redownloading every model I already have on comfy. Can anyone run this in their forge directory so that I don't have to?

find ./ -type f \( -name "*.bin" -o -name "*.safetensors" \) | sed 's|^\./||' | sort | tree --fromfile
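
If `tree` isn't installed, roughly the same listing can be produced with Python's standard library. A minimal sketch, assuming only that model files end in `.bin` or `.safetensors` as in the one-liner above; the function name is made up for illustration:

```python
from pathlib import Path

MODEL_SUFFIXES = {".bin", ".safetensors"}


def list_model_files(root):
    """Return sorted paths of model files under `root`, relative to `root`."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix in MODEL_SUFFIXES
    )


# Usage: run from the UI's install directory, e.g.
#   for rel_path in list_model_files("."):
#       print(rel_path)
```

The output is a flat sorted list rather than a tree, but it shows which subfolder each model lives in.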
>>
>>101908547
It looks better, but I dunno about more accurate to FP16.
>>
>>101908547
I did the same test for Q4_0 and I got the same exact picture, weird, only Q8_0 seems affected by that commit even though he claimed it would also change Q4_0
>>
File: 1700911652652022.png (1.28 MB, 896x1152)
>>
File: 1723743256864466.jpg (183 KB, 936x1280)
>>101908664
why does Flux generate these weird protrusions on the torso, same as SD? Some of Flux's quirks are eerily similar to SD.
>>
>>101908797
that's because comfy gave black forest labs the stability training data after he left
>>
>>101908729
>Running T5 on cpu takes AGES on comfy.
Takes me seconds even with a gigantic prompt that no sane person would ever use.
>>
>>101908808
did he upload it somewhere public?
>>
File: 1701964552301606.png (1.21 MB, 896x1152)
>>
>>101908830
no he made a secret deal with them, it's why he was able to implement flux so quickly. he helped a lot on their own api, hence the deal with x/grok ai happening so soon after release.
>>
>>101908830
It's LAION-5B and LAION-POP for SD3; they have been taken down because of suspected CSAM in them, but there should be mirrors somewhere.
>>
File: Sunsonic.png (2.79 MB, 2048x1024)
>>101908717
Wow! How did you do that? These are the ones I get with my prompt at ImageFX.
>children card illustration vintage style page scan of cute sunflower hedgehog. She is sitting with sonic at the park. kindergarden eyes
>>
>>101908859
CSAM?
>>
>>101908889
Continuous Semantic Abstraction Method, essentially it would make diffusion models operate 10x faster on 10x less vram. However, it was patented by Sam Altman so has only been used in OpenAI's Sora (which is why their ai videos are so far ahead of everyone else).
>>
File: 00010-3605529432.png (2.62 MB, 1280x1920)
>>
File: ComfyUI_00116_.png (1.08 MB, 960x1096)
>>101908817
yeah, actually it was only slow the first time. I guess it only takes ages to load the model. I don't know if it's worth running it on cpu and using the fp16, or using the fp8 and letting it go to gpu.

It was very slow to load the fp16 on cpu.
>>
>>101908906
Nope, you rizz man
It stands for Continuous Sister Ass-fellating Man which is what Sam is good at
>>
File: 1169250435.png (1.62 MB, 896x1152)
>>
>>101908919
if you can run the FP16 on the CPU do it, loading is a one time cost per session
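
The memory side of the fp16 vs fp8 trade-off is just bytes per parameter. A minimal sketch, assuming a text encoder with about 5 billion parameters; that count is a round, hypothetical figure used for illustration, not a measured size of any specific T5 checkpoint:

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}


def weights_size_gb(num_params: int, dtype: str) -> float:
    """Approximate size of the raw weights in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


HYPOTHETICAL_PARAMS = 5_000_000_000  # assumed round number, see note above

for dtype in ("fp16", "fp8"):
    print(dtype, weights_size_gb(HYPOTHETICAL_PARAMS, dtype), "GB")
```

So fp16 needs about twice the memory of fp8, which is why it tends to end up in system RAM while fp8 can fit on the GPU next to the image model.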
>>
>>101908868
loose watercolor of a brown hedgehog in a. rounded cute style holding a sunflower , sat next to him is sonic in another style
>>
>>101908951
Is it worth it over the fp8?
>>
File: ComfyUI_Flux_8971.jpg (257 KB, 1024x1024)
>>
>>101908966
completely, fp8 is retarded compared to fp16
>>
>>101908966
yes, until a Q8 quant for T5 is made
>>
>>101908729
can just point forge towards comfy directories/any directory where you store your models/upscalers/cat gifs, muppet
>>101908943
: )
>>101908966
yes.
>>
anyone using AMD Amuse here? if yes do you know if it is possible to use stable diffusion 1.5 models on it?
>>
>>101908906
apparently its another acronym for CP? Why would the dataset have CP in it? That's fucking retarded.
>>
File: grid-0131.jpg (279 KB, 1536x2688)
Trying some of my old prompts from SD1.4-1.5 in Flux.
I regret deleting a fair number of my old images since I was sure I would never look at them again.
>>
>>101908976
>can just point forge towards comfy directories/any directory where you store your models/upscalers/cat gifs, muppet
That's what I intend to do, but I need to know where each model goes before it tries to download everything during the first run.
>>
>>101908996
Nah, that's a false flag. Twitter liberals/anti-ai Artists banded together to try and normalize CSAM as a word for child porn to get AI language models banned. It's actually got nothing to do with that.
>>
File: grid-0129.jpg (436 KB, 1536x2688)
>>
File: 1699547427735141.png (1.23 MB, 896x1152)
>>
>>101908996
it was created by crawling the web for images with alt text, it was bound to grab something unsavory. Also the CSAM images don't have to be anything the average person would call porn.
>>
>>101909036
the curse of the double mugs
>>
>>101909000
add the paths to your user.bat file before you fire it up (and don't forget the path to python)
>>
>check civit*i for flux loras
>retarded trigger words
hopeless
>>
>>101909074
wait until they train it on unfiltered, raw booru tags... Can't wait for them to make it retarded.
>>
>>101909095
must be nice being a poorfag bitching, no responsibility, lots of complaining
>>
>>101908519
Please post the command
>>
>>101909106
The hat fit that well, huh?
>>
Wow we are so back 3090 bros
https://github.com/kohya-ss/sd-scripts/pull/1374#issuecomment-2287354261
>>
>>101908547
https://imgsli.com/Mjg4MDUy
>Hatsune Miku with dreadlocks and a black skin skateboarding, she is holding a leek on her right hand and has a hat that has "MIGU" written in it, pixel art style, she's located at the left of the picture, over a pedestal. A man, located in the right is happy to see her, his text speech says "OMG IT MIGU".
the differences are really subtle, but I feel like the new Q8_0 seems to win on that one?
>>
File: 2789719938.png (1.78 MB, 1152x896)
>>
>>101909127
well I train shit and you don't :)
>>
File: Flux_00503_.png (896 KB, 1024x1024)
first attempt at flux Yoshiaki Kawajiri lora slop, used the existing booru style tags and might have overbaked it

going to recaption with joy caption, may go for a fine tune and extract lora from that
>>
I'm liking the Ayami Kojima lora in the general style and detailing, but it has an unfortunate side effect of making the environment simpler it seems.

This is the same gen as >>101900142 except with the lora applied and "Tsutomu Nihei" replaced with "Ayami Kojima".
>>
>>101909143
you train shit?
>>
>>101909143
topkek
>>
>>101909133
Wrong link kek https://github.com/kohya-ss/sd-scripts/pull/1374#issuecomment-2287134623
>>
>>101909169
I heard even poorfags can make datasets, where's your dataset? You clearly know how to make them.
>>
>>101909184
>t. booru tag user
No wonder you got offended.
>>
>>101909184
where your dataset? you clearly think you're good shit
>>
>>101909177
it's been 2 days and he hasn't delivered anything, he's probably full of shit anon
>>
>>101909184
Can I obtain one of your shit datasets?
>>
>>101909213
People talking about Booru tags being bad with Flux actually are just talking out of their ass.

>>101909213
You speak as a definitive source of how to train models, so I expect your contribution. You obviously can't train them so I expected you to spend some effort curating a dataset you know works otherwise how can you speak with such authority?
>>
File: 1693876907839806.png (1.98 MB, 896x1152)
>>
File: grid-0136.jpg (417 KB, 2688x1536)
>>
>he linked the same comment twice
>>
File: fp113.jpg (171 KB, 1344x768)
>>
>he only trolls
>>
>>101909272
you can tell by the way he types that he is profoundly autistic
>>
>>101909248
let's see your dataset so we can learn
>>
>>101909050
>we can't release the model publicly because it probably has naked kids and shit in it
why are you putting naked kids in the dataset?
>we didn't its just that we scraped the web and assume its in there
so you are currently in possession of cp?
>no we don't actually know... we just assume... but we aren't actually going to check because that would be too hard
if you released it publicly people could help you identify whether it is there or not
>no because then we would get in trouble for being in possession of cp
????
????
????

Can someone make this make sense.
>>
>>101909248
Raw booru tags are a shit way to train image models. How can you even dispute that?
>Anime girl holding sign written "AI"
vs
>1girl, holding, sign, english text
How can anyone expect bottom option to be feature complete?
>>
>>101909299
no, you said definitively there is a right way so you must have proof
>>
File: 1700175310770746.png (1.92 MB, 896x1152)
>>
>>101909303
>Can someone make this make sense.
no
>>
>>101909050
I can confirm and I won't elaborate, that SD 1.5 recognizes certain old timey onion site terms
>>
>cute tummy
>Alvarado
>>
>>101909312
Because can you believe it the AI learns different concepts from images, somehow the caption "a bedroom with white walls and blue carpet" somehow also captures ideas like "bed" and "dresser" and "closet". It's almost like variations in captions and concepts from a variety of images allows a model to learn many different things. So a model trained on a woman and 1girl somehow learns they mean the same thing.
>>
>>101908455
please sirs where to buy thioktor?
>>101908568
>>101908577
fuck off jew, stop nuking japanese nuclear plants.
>>
>>101909325
how many pedophiles does this general have? shall we take a tally?
>>
Where the fuck is pytorch with rocm6.2? It's been two weeks already.
>>
>>101909344
lol
>>
File: FLUX_00019_.png (1.11 MB, 896x1152)
>>
>>101909367
kek
>>
>>101909303
There are people interested in sabotaging and they don't actually see the CSAM, they use shitty hashing algorithms which can have false positives. It's infinitely more difficult to prove a negative and the people who have to prove a positive purposely use weak testing methods.
>>
>>101909348
if the ip counter wasn't hidden we would know the exact number
>>
>>101909344
You cannot translate "A blue ball on top of an orange box" in booru tags
>>
>>101909344
Yeah, we saw that with pony. Completely forgot basic concepts because it was overwritten to the brim with booru tags.
>>
File: 00018-3966084874.png (1.26 MB, 832x1216)
>>
>>101909388
yet "blue_ball on top of "orange_box" somehow works, almost like it learns disparate concepts
how is it difficult to understand that at a _fundamental level_ a model must not only learn but _infer_ nouns, adjectives and verbs.
>>
>>101908990
>AMD Amuse
>AyyyMD Not Invented Here syndrome strikes again
They really need to sort out their corporate culture if they don't want to become Intel one day.
>>
>>101909348
I am not such thing. I experimented when I heard about the scandal, and it's obviously undertrained to produce anything meaningful with those terms anyway. It probably picked up on something that reached the clearnet somehow. We will never know. But I'm sure they know full well.
>>
>>101909406
Pony is a fucking retard or do you think the rating system was smart? How have you not figured out there isn't a new Pony model yet is because he is incapable?
>>
File: FLUX_00020_.png (1.52 MB, 896x1152)
>>
>>101909443
Yeah so the smart thing to do would be to train it on natural language concepts, not booru tags. Glad we agree.
>>
File: 00021-2883139532.png (1.19 MB, 832x1216)
>>
>>101909465
nah
but feel free to translate all the booru images to natural language accurately
as I said, you can do that at least, poorfag
>>
>>101909418
And if it's trained only like that, it will forget exactly that. There is no syntax in booru tags. If you don't put syntax in, it will forget during training. Booru tags could work, if pre-processed into coherent sentences. Otherwise, it will end up trash, just like pony did.
>>
>>101909384
the assumption that you just come across cp on the web randomly is so absurd. I've literally only seen it on /b/ in the old days once or twice and that's it, out of 25 years browsing the internet. You honestly cannot tell me that chomoshit is so common that you'll pick up more instances of cp when mass scraping tagged images than you would images of things like very common vehicles, items, and places that sd is clearly undertrained on. If theres chomoshit in there (which there probably is), someone intentionally looked for it and added it into the dataset.
>>
File: 00009-2528583272.png (3 MB, 921x1612)
>go on Civitai
>look at flux images
>most are literally identical to SD 1.5 1girl with the same shitty AI render look
Pic fucking related, tell me this shit wasn't made in SD 1.5 and tagged as Flux
>>
>>101909483
wrong
and that's why you don't train exclusively on booru tags
for example, doing half and half or even captioning the same image with both
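
The "half and half" idea can be sketched as a per-sample coin flip between the two caption styles. A minimal sketch, not taken from any particular trainer; `pick_caption` and its parameters are hypothetical names, and the example captions are the ones quoted earlier in the thread:

```python
import random


def pick_caption(tags, sentence, tag_probability=0.5, rng=random):
    """Return the tag-style caption with probability `tag_probability`,
    otherwise the natural-language caption."""
    if rng.random() < tag_probability:
        return ", ".join(tags)
    return sentence


rng = random.Random(0)
tags = ["1girl", "holding", "sign", "english text"]
sentence = 'Anime girl holding sign written "AI"'
for _ in range(4):
    print(pick_caption(tags, sentence, rng=rng))
```

Setting `tag_probability` to 0.0 or 1.0 falls back to training on one style only. Captioning the same image with both styles would instead mean concatenating the two strings.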
>>
File: sunflowerhedgehogandsonic.png (894 KB, 1024x1024)
>>101908964
Thanks! Looks like FLUX still has some catching up to do!
>>
>>101909486
retard talking confidently about things he doesnt understand again
>>
>>101909465
isn't that exactly what PonyXL did? the model card says it was trained on tags and natural language.
>>
File: (((sticks))) knows.png (561 KB, 762x713)
>>101908868
please give me a sticks wearing a yarmulke, burning a israeli flag with a sheepish smile as sonic crosses his arms and glares at her with tired resignation.
>>
>>101909418
...... So training it entirely on booru tags is dog shit because it loses context of what's a noun and what's a verb.
>>
>>101909487
You can tell it's flux because it's got the horrible flux noise artifacts from upscaling kek
>>
>>101909487
gotta crank guidance to get the benefits of the improved prompt comprehension which leads to low quality toasted looking gens
>>
>>101909486
actually you can with fairly innocuous search terms and on "safe" websites like blogspot especially if you go in deep in the results (which robot scrapers do as a rule)
>>
Fuck it. I'm gonna build pytorch from source.
>>
File: grid-0141.jpg (529 KB, 2688x1536)
>>
>>101909486
>You honestly cannot tell me that chomoshit is so common that you'll pick up more instances of cp when mass scraping tagged images than you would images of things like very common vehicles, items, and places that sd is clearly undertrained on
what are you even talking about, retard
LAION was public, some concerned party scanned it for known CSAM and found a match, it says nothing of how many matches there were
https://laion.ai/notes/laion-maintenance/
>>
>>101909549
Bro just sell your AMD trash
>>
>>101909548
No that's just an excuse you are using to cover your ass and claim deniability for having chomoshit YOU looked for on your pc.
>>
>>101909532
I never said you should exclusively train on booru tags but there are certain concepts literally only possible to capture with booru tags and I don't see people like you doing the captions needed
>>
File: ComfyUI_02321_.png (1.74 MB, 1536x640)
>>
>>101909584
is there a lora for that?
>>
>>101909548
This. Ukranian and Russian blogspots, and wordpress and drupal installations.
>>101909576
You can end up there by accident quite easily, but you need to use the internet for things other than 4chan, reddit and insta.
>>
>>101909606
Nope, all base flux prompt autism:

>This is a digital drawing in the style of the anime studio, Studio Shaft and their show "Monogatari". The scene is depicting two characters standing in an industrial warehouse filled with thousands small, red books in disorganized piles. The background features large, white, grid-patterned windows, casting a bright, diffused light over the scene.

>On the left, a male character with semi-long, dark hair that covers his right eye. The hair reaches down to his ears and at the top of his head he has a single strand of hair standing tall. He has a lean build is dressed in a black, buttoned-up japanese school uniform with dark red highlights and black pants. His expression is serious, and he sits on the pile. His face is round.

>On the right, also sitting atop the books, is a female character with short, messy, dark purple hair and a slender physique who is wearing a light pink, short-sleeved button-up top with a light red collar and a darker purple pleated skirt. Her right arm is bandaged, and she is holding her hands together between her outstretched legs. Her head is facing down to the ground, her expression intense and focused.

>The red books in the foreground and background create a stark contrast with the lighter tones of the characters and the background. The overall atmosphere is tense and dramatic, with the characters appearing to be in a state of readiness or conflict. The scene is set against a backdrop of industrial minimalism, emphasizing the raw, gritty environment. There's a moodiness to the characters like they're discussing something very serious, and this is emphasized by the high contrast tones of the foreground combined with the overexposure of light in the background.
>>
File: 00023-4220156906.png (1.05 MB, 832x1216)
>>
>>101909616
It's honestly impressive how flexible flux is, wow
>>
>>101909610
>You can end up there by accident quite easily,
Nope.
>you need to use the internet for things other than 4chan, reddit and insta.
I've been online since before any of those sites existed. I've also viewed a fair amount of porn... there's simply no way you're just randomly coming across it by accident.
>>
File: FLUX_00022_.png (1.45 MB, 896x1152)
damn, that was nearly perfect
>>
How well do SDXL loras work on Pony?
>>
>>101909610
We're just talking about a crawler by the way. If your intention is to download billions of images and you let loose a crawler that starts on cnn.com you will eventually get illegal images.

>>101909670
anon, your Reddit use is irrelevant when you're talking about a scraper that is downloading billions of images, you can just sit down and shut up gramps, do you really think a billion image dataset is some retard like you typing shit into image search?
>>
>>101909625
>just by smashing your face on the keyboard in yandex images
lmao

I use it when I need to search the picture of someone who's not a celebrity since google gimped its imagesearch completely and this is true. It's worse than bing used to be.
>>
>>101909670
/b/ got frozen several times because it was posted there. This was like 12-15 years ago or so though.
>>
gonna try captioning my dataset with
>booru tags
>paragraph
>joy caption's boomer prompt
let's see how it goes.. not sure how it'll work out with the booru tags, since I assume that if I try to shuffle the caption it will fuck up the boomer prompt part?
>>
>>101909682
That's REALLY good. If you could have the one also be slightly "A" ish that would be incredible
>>
hey mr pedo can't you just post the dragons and pikachu gens you do for a while, I'm sick of hiding your shit
>>
>>101909616
>>101909631
fantastic
i did a silly one yesterday "anime girl vampire with golden hair and big golden eyes" which worked pretty well for shinobu
>>
>>101909698
Why would you shuffle if you have paragraphs in your dataset? You know it splits on , right? Your paragraphs turn into gibberish.
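
To see why that happens, here is a minimal sketch of a naive comma-based shuffle. It assumes the shuffle simply splits the caption on "," as described above; `naive_shuffle` is a made-up stand-in, not the trainer's actual code:

```python
import random


def naive_shuffle(caption, rng):
    """Split on commas, shuffle the pieces, and join them back together."""
    parts = [part.strip() for part in caption.split(",")]
    rng.shuffle(parts)
    return ", ".join(parts)


rng = random.Random(1)
tags = "1girl, holding, sign, english text"
paragraph = "On the left, a man with dark hair sits on a pile of books, looking serious."

print(naive_shuffle(tags, rng))       # still a valid tag list, only reordered
print(naive_shuffle(paragraph, rng))  # clauses may come out in a different order
```

Tags survive reordering because each one stands alone; the clauses of a sentence do not.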
>>
File: 1712826560943787.gif (3.53 MB, 498x423)
how do people run flux? it takes me 7 mins to generate an image on fp8.
and that's on 16GB VRAM and 64GB RAM.

can i have some comparison numbers? does this look about right??
>>
>>101909696
>this is true
preaching to the choir
tell that to the retard script kiddie ITT who has never scraped anything in his life and has no idea just how easy it is to find stuff most people would consider """CSAM"""
>>
File: ComfyUI_00128_.png (867 KB, 1160x896)
>>101909631
Outstanding.
>>
File: ComfyUI_01846_.png (2.06 MB, 768x1344)
>>101909731
yeah i got some cool kisso-shottos as well
>>
>>101909741
Those are totally normal numbers assuming you are doing 150 steps
>>
>>101909741
<40s 3090 (25ish steps no neg). with your setup, 7 mins? nope. something very wrong
>>
File: FLUX.png (567 KB, 1024x1024)
>>101909508
Same prompt in FLUX, which can't draw each in a different style.
>>
>>101909696
please don't reply to Teebs
>>
>>101909741
https://comfyanonymous.github.io/ComfyUI_examples/flux/
https://github.com/city96/ComfyUI-GGUF/

I do 22 steps of euler or deis, but I'm still experimenting. Takes about 1 minute per image on 4060Ti16GB with less RAM than you.
>>
>>101909790
you have to think about how such an image in the wild might have been described, it's very rare for someone to mix styles in a real picture
>>
>>101909790
this is why you don't rely on cumfart to bring you the training in a briefcase
>>
>>101909758
very nice, i'll have to try some missy classrep catgirls
>>
>>101909793
I haven't been to the general in months. I don't know about your thread celebrities sorry.
>>
File: forge.jpg (284 KB, 1856x923)
>>101909741
~2 mins (with way too many steps and retardedly big resolution)
with 8gb vram and 32gb ram here
>>
>>101909755
>stuff most people would consider """CSAM"""
typical overreaction to conceal nonce tendencies. family pictures of children at the beach is normal, just because it gives you a boner doesn't mean it's sexualized.
>>
File: file.png (27 KB, 582x230)
>add this node
>VRAM usage down by 9GB
>no discernable performance impact

what?
>>
>>101909824
Flux doesn't use captions found in the wild. You have to think like a caption model.
>>
>>101909524
Huh... I don't want to meet ImageFX's version of the most hated dog...
>>
File: ComfyUI_00114_.png (1.45 MB, 1024x1024)
>>101908729
>>101908817
how do I tell Comfy to run CLIP on the CPU?
>>
Q4_0 vs Q4_1?
>>
>>101909863
because running CLIP and T5 on the CPU is plenty fast or at least faster than swapping if your CPU is shitty
>>101909899
>>101909863
>>
File: ComfyUI_Flux_9017.jpg (224 KB, 1024x1024)
>>
File: ifx39.png (1.28 MB, 1024x1024)
>>101909922
intredasting
>>
>>101909740
that's why I just said I don't think I can use shuffle when I have the boomer prompt in there, so I don't know how good the booru tags will be without shuffle
>>
>>101909899
I used this. Just save on custom nodes.
https://gist.github.com/city96/

First link on the page.
>>
>>101909573
If this fails I will certainly never buy an AMD GPU again.
>>
Any way to repeat the same pic but generate it again? Like a slightly different version
>>
>>101909952
You just shuffle them when you're generating the txts.
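
Pre-shuffling at caption-generation time could look something like this. A minimal sketch, assuming each caption is a list of booru tags followed by one natural-language paragraph; `build_caption` is a hypothetical helper and the example strings are made up:

```python
import random


def build_caption(tags, paragraph, rng):
    """Shuffle only the tags, then append the paragraph unchanged."""
    shuffled = list(tags)  # copy so the caller's list is left alone
    rng.shuffle(shuffled)
    return ", ".join(shuffled) + ". " + paragraph


rng = random.Random(0)
tags = ["1girl", "holding", "sign", "english text"]
paragraph = 'An anime girl is holding a sign that reads "AI".'
caption = build_caption(tags, paragraph, rng)
print(caption)  # this string is what would go into the image's .txt file
```

With the tags already shuffled in the files, the trainer's own shuffle option can stay off and the paragraph is never split.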
>>
>>101909741
on a 3060
36 steps 768x1024 ~1:50
36 steps 1024x1280 ~2:55
>>
>>101909971
it's called Seed Variation in A1111
>>
>>101909971
copy the seed (think forge has a button for it), slightly alter steps, prompt, cfg, etc
>>
https://x.com/OAlexanderDK/status/1824011441516499409
holy kek, elon really is retarded for using Flux
>>
>>101909971
click on extra then put a value in the variation strength box
>>
File: FLUX_00025_.png (1.48 MB, 896x1152)
>>
>>101909974
oh, derp, of course. thanks anon
>>
>>101909994
first amendment country
>>
>>101909994
won't last long
>>
File: ComfyUI_00131_.png (863 KB, 1160x896)
Gentlemen, the age of the AI remasters draws near. Can you feel it? Feel it in the air?
>>
>>101909994
>no the model need to detect my prompts for thought crime
>>
>>101910024
So true anon, that's why all of the other online ai services make no effort to prevent CSEM/etc.
>>
>>101909994
just pictures dude, don't give them so much power over you
>>
>>101910041
I don't personally give a shit, but you're out of your mind if you think this is going to fly lmfao
>>
File: we're so back.png (67 KB, 991x790)
>>101908519
Holy shit it's real. I also managed to run it with --train_batch_size 4 which turned my 37 hours from yesterday into 3.
>>
>>101910036
It's already here and looks like garbage. Look up the 4k releases of true lies and T2. It turned everyone's skin into plastic.
>>
>>101910074
batch 4 looks like ass unless you're training a style
>>
>>101910040
For VC money I assume. Elon can afford to self fund though
>>
>>101910083
That's on Cameron, not the model.
>>
>>101908519
>SD3
lol
lmao, even
>>
File: 1723485716553859.gif (1.73 MB, 326x274)
>>101908455
Just came back to image generation after a year or so and I am completely out of my fucking depth now. What are the best models for generating old style pixel art? I remember some anons using older models to generate some cool looking SNES boss sprites back before I lost interest.
>>
am i able to use t5 on cpu in forge now with quants? What about loras?

cant keep up with this
>>
>>101910083
It's not what I'm talking about. Those are """AI""" upscales. I'm hopeful for controlnet video models that can completely redo whole movies.
>>
File: ComfyUI_02349_.png (1.64 MB, 1024x1024)
>>101908717
closest attempt i could get with base flux
>>
>>101910109
Pony SDXL with some loras probably can do that quite well. Look on civitai. Pony is a basemodel. Also check the thread on >>>/h/hdg
>>
>>101910119
>I'm hopeful for controlnet video models that can completely redo whole movies.
why? It would be artifact laden uncanny slop that looks like the polar express.
>>
>>101910157
>AI will never improve
retarded imbecile
>>
>>101910169
Ai is already at the peak. it won't get better from here
>>
File: 1723579997049651.webm (2.8 MB, 720x1280)
>>101910169
>two more weeks
>>
File: ComfyUI_00135_.png (969 KB, 1160x896)
>>
>>101910191
You're retarded. AI reached its peak 1 year ago, it's actively getting worse.
>>
>>101910204
>not even SOTA
>>
>>101910057
it will fly because the rule will be "you will be banned for violating the TOS of X by posting this content", it's time we grow up and just hold people accountable for outputs, it's not Adobe's job to put a screen recorder AI in Photoshop to make sure you aren't drawing wrongthink
>>
>>101910213
AI peaked inside me last night and I liked it
>>
>>101910087
Thanks for the info.
>>
>>101910218
Sometimes I forget how braindead the people in this general are, then I see posts like this.
>>
I peaked (you)'re mom since 2020.
>>
>>101910242
Handwringers will lose :)
>>
>>101909987
>>101909988
>>101909996
Thanks. Does variation strength matter?
>>
>>101910242
>pencil >THIS IS INSANE. You don't even need to confuse your brain to draw these images!
Releasing this as an actual paid product, without any safety net, is going to result in an expensive legal battle!!
>>101910279
very low amount, as low as 0.01 for tiny changes. depends on sampler, scheduler and step size too.
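
A rough sketch of why tiny values already do something, assuming "variation strength" means the usual variation-seed blend: noise from the base seed is spherically interpolated toward noise from a second seed. The seeds, the vector size and the helper names here are made up for illustration, and real UIs work on latent tensors rather than Python lists.

```python
import math
import random


def gaussian_noise(seed, n):
    """Stand-in for the initial latent noise produced by a given seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def slerp(t, a, b):
    """Spherical interpolation from vector a (t=0) to vector b (t=1)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if abs(sin_omega) < 1e-8:
        # Vectors are (anti)parallel: fall back to plain linear interpolation.
        return [(1.0 - t) * x + t * y for x, y in zip(a, b)]
    weight_a = math.sin((1.0 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


def relative_change(a, b):
    """Size of the difference between a and b, relative to the size of a."""
    diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return diff / math.sqrt(sum(x * x for x in a))


base = gaussian_noise(seed=777, n=4096)       # hypothetical base seed
variation = gaussian_noise(seed=778, n=4096)  # hypothetical variation seed

for strength in (0.01, 0.1, 0.5):
    blended = slerp(strength, base, variation)
    print(strength, round(relative_change(base, blended), 4))
```

The starting noise moves further from the base seed as the strength goes up, and how visible that is in the final image still depends on the sampler, scheduler and step count.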
>>
>>101909487
>no buttchin
Not flux
>>
>>101910204
I love this kek
>>
>>101909984
>>101909838
>>101909795
>>101909785
>>101909771
i'm using an AMD card... maybe that's it. but normally it isn't that much worse. i dunno why it takes this long in this case.
>>
>>101910336
>i'm using an AMD card
you buried the lede
Flux will always be slow for you, anon
>>
>>101910336
amd's performance is unironically exponentially worse. No tensor cores = don't even bother.
>>
>>101910218
it should, but it won't
not yet anyway
>>
>>101910218
a better solution is that everyone who posts on twitter gets banned from participating in society
>>
>>101910154
Thanks anon
>>
>>101910354
the fact there is no safety means it was a choice from Elon, he might have decided to fight on that hill (which he should). He says he's free speech so it's plausible.
>>
File: ComfyUI_02362_.png (1.67 MB, 832x1216)
>>
Is it possible to use loras on GGUF in comfy? It is in forge but it's very hacky and it does it every time you generate
>>
>>101910343
but errything else works anon. i run 70B LLM at useable speeds. and it's not that much slower than say a slower nvidia card. i don't see why flux is that much of an outlier. maybe something is wrong...
>>
>>101910336
Was it worth the extra 50 dollars you saved not getting a Nvidia card instead?
>>
>>101909984
Wait, I have a 12GB 3060 too and it's taking me 25 minutes for a 1024x1024, am I doing it wrong?
>>
File: ComfyUI_00139_.png (822 KB, 1160x896)
>"anon, meet your next opponent"
wyd
>>
>>101910410
ye
i'm getting 3 minutes at 1MP, 26 steps
>>
File: file.png (7 KB, 1004x26)
Soon
>>
>>101910410
Yes
My 3060ti takes about a minute at 1024x1024.
>>
File: a45.jpg (334 KB, 1098x1006)
Should I download flux1-dev-Q4_0.gguf or flux1-dev-Q4_1.gguf?
>>
>>101910410
Check if your GPU is being used. If it is, the model may need more VRAM than you have, so it keeps getting shuffled in and out of system RAM to keep running. Try putting the T5 on CPU and using a lower quant.
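
One way to check, sketched for an NVIDIA card with `nvidia-smi` on the PATH. The query fields are standard `nvidia-smi` ones; the 5% and 95% thresholds and the function names are arbitrary picks for illustration, not official guidance. Run `read_gpu_stats()` while an image is generating.

```python
import subprocess

# Reports memory in MiB and utilization in percent, one CSV line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_gpu_stats(csv_text):
    """Turn nvidia-smi CSV output into a list of per-GPU dicts."""
    stats = []
    for line in csv_text.strip().splitlines():
        used, total, util = (float(field.strip()) for field in line.split(","))
        stats.append({"used_mib": used, "total_mib": total, "util_pct": util})
    return stats


def diagnose(stat, idle_pct=5.0, full_ratio=0.95):
    """Very rough reading of one GPU's numbers (thresholds are guesses)."""
    if stat["util_pct"] < idle_pct:
        return "GPU idle: the generation is probably running on CPU"
    if stat["used_mib"] / stat["total_mib"] >= full_ratio:
        return "VRAM nearly full: likely spilling into system RAM"
    return "GPU busy with VRAM headroom"


def read_gpu_stats():
    """Query the live GPU; call this while a generation is in progress."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_stats(result.stdout)


# Demo on a made-up sample line shaped like a 12GB card under load.
sample = "11890, 12288, 100\n"
for gpu in parse_gpu_stats(sample):
    print(diagnose(gpu))
```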
>>
>>101910413
>The date's over.
>>
>>101910153
Oooh, that's very cool, what was your prompt?
>>
pretty sad thread image wise so far.
>>101910413
can make me food and jerk me off at the same time, win I'd say.
>>
>>101910423
>>101910429
I am following the main tutorial to the letter
>>101910442
Can you share a workflow that demonstrates that? Please and thank you anon.
>>
>>101910336
I ran Flux on a 7900XT. It took around 40s for one image.
This was also running on linux/rocm
>>
>>101910403
it was way more than that and i got 16GB of VRAM. not to mention it was for gaming initially so yeah.

dual booting on linux is honestly the bigger annoyance.

>>101910410
if it's anything like in LLM land, the less of the model you are able to fit in your GPU, the slower it'll be, and significantly so. but then i don't understand how other people with the same tier of cards claim to have faster speeds. not sure if we're doing something wrong here
>>
>>101910399
>16GB VRAM
>70B LLM
why even bother with such low quants?
as for the performance there have been anons with weird issues, one with a 4090 was taking 3 minutes for a default settings Flux image
>>
File: catbox_o2y8sv.png (963 KB, 1160x896)
>>101910463
quants:
https://github.com/city96/ComfyUI-GGUF/
https://huggingface.co/city96/FLUX.1-dev-gguf/tree/main
>>
File: ComfyUI_02120_.png (1.57 MB, 896x1152)
>>101910446

>This is a digital drawing in a cartoon style featuring two characters, Sonic the Hedgehog and an extremely photorealistic rendering of a hedgehog, set against a soft, light yellow background with a bokeh effect. Sonic, on the right, is a blue anthropomorphic hedgehog with large, green eyes, white gloves, red shoes with white soles, and a white belt with a yellow buckle. He has a mischievous grin and is standing upright on two feet. The hedgehog on the left is a larger, realistic depiction of a hedgehog - with a round, plump body covered in spiky brown quills, a light beige underbelly, and a small, round nose. Its eyes are large and expressive, with a gentle, happy expression. The hedgehog is holding a large, vibrant yellow sunflower with a dark brown center, extending its green leaves towards Sonic. The ground is depicted with a few grassy patches with spots of yellow, adding a touch of natural detail. The overall atmosphere is warm and friendly, with the characters interacting in a heartwarming manner. The drawing style is painterly and vibrant, with smooth lines and bold colors, typical of modern digital animation or illustration. The drawing style for the hedgehog is very finely detailed and illustrated to be as close as possible to the real animal.
>>
>>101910279
it's literally a percentage dude
>>
>>101908455
Fuck dude, give me that clear mini laptop.
>>
>>101910438
one barely fits into 8gb vram, the other one goes over.
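
Back-of-envelope for the size gap, assuming roughly 12 billion parameters for Flux dev and the standard GGML block layouts (32 weights per block; Q4_0 stores one fp16 scale per block, Q4_1 stores an fp16 scale plus an fp16 minimum). Real files differ a bit because not every tensor is quantized, and this ignores latents, VAE and text encoders.

```python
BLOCK_WEIGHTS = 32                  # weights per quantization block
Q4_0_BLOCK_BYTES = 2 + 16           # fp16 scale + 32 packed 4-bit weights
Q4_1_BLOCK_BYTES = 2 + 2 + 16       # fp16 scale + fp16 min + 32 packed 4-bit weights
FLUX_DEV_PARAMS = 12e9              # assumption: ~12B parameters


def bits_per_weight(block_bytes):
    """Average storage cost of one weight, including block metadata."""
    return block_bytes * 8 / BLOCK_WEIGHTS


def estimated_gb(params, block_bytes):
    """Approximate model size in GB if every weight used this quant type."""
    return params / BLOCK_WEIGHTS * block_bytes / 1e9


for name, block_bytes in (("Q4_0", Q4_0_BLOCK_BYTES), ("Q4_1", Q4_1_BLOCK_BYTES)):
    print(name, bits_per_weight(block_bytes), "bits/weight,",
          round(estimated_gb(FLUX_DEV_PARAMS, block_bytes), 2), "GB")
```

On an 8GB card whatever is left after the model weights has to hold everything else, so the extra half bit per weight in Q4_1 is enough to tip it over.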
>>
>>101910336
>AMD
lol
>>
File: file.png (417 KB, 1662x1007)
>>101910463
>>101910498
Also the workflow is on catbox if you have the extension
>>
File: 4x1080p.png (4 KB, 592x127)
>>101909984
>roughly 2 and a half fucking minutes for a single 1024x1024

>>101910410
>25 fucking minutes for a single 1024x1024

I had no idea shit was so dire on those cards, I'm doing about 1 minute and 30 seconds for 4 1920x1080 images on a 4090, offloading the clip model. Full fp16 dev model and clip.
>>
>>101910498
>>101910533
I will check it and report back
>>
File: bad example.jpg (107 KB, 1373x269)
>>101910550
Don't take my case as an example, I am probably doing something wrong
>>
>>101910550
>4 1920x1080 images on a 4090, offloading the clip model. Full fp16 dev model
the dev model alone is near 24GB, does 4x1920x1080 latents and everything else really fit?
>>
>>101910550
>offloading the clip model.
you can offload the clip model to ram?
what's the workflow for flux anyway? i just have a very basic one.
>>
>>101910498
Schnell works with 4 steps, right? What are the downsides?
>>
>>101910645
worse quality, more failures
>>
>>101910101
it's flux, the kohya lora training branch is just merged with sd3
>>101910074
noice, we're SO BACK anon
>>101909112
sorry for the delay, here is the command prompt I used:
accelerate launch  --mixed_precision bf16 --num_cpu_threads_per_process 1 "flux_train_network.py" --pretrained_model_name_or_path "C:/Users/B/Downloads/flux/flux1-dev.safetensors" --clip_l "C:/Users/B/Downloads/flux/clip_l.safetensors" --t5xxl "C:/Users/B/Downloads/flux/t5xxl_fp16.safetensors" --ae "C:/Users/B/Downloads/flux/VAE.safetensors" --save_model_as safetensors --sdpa --persistent_data_loader_workers --max_data_loader_n_workers 2 --seed 777 --gradient_checkpointing --mixed_precision bf16 --save_precision bf16 --network_module networks.lora_flux --network_dim 4 --network_args "train_blocks=single" --optimizer_type adafactor --optimizer_args "relative_step=False" "scale_parameter=False" "warmup_init=False" --learning_rate 1e-4 --network_train_unet_only --save_every_n_epochs 1 --output_dir "C:\Users\B\Downloads\ibuki satsuki lora" --output_name flux-lora-name --timestep_sampling sigmoid --model_prediction_type raw --guidance_scale 1.0 --loss_type l2 --train_data_dir "C:/Users/B/Downloads/ibuki satsuki lora-upscale_resized" --resolution "512,512" --cache_text_encoder_outputs --cache_text_encoder_outputs_to_disk --fp8_base --enable_bucket --bucket_no_upscale --split_mode --save_state --save_every_n_epochs=1 --max_train_steps=1600
>>
how do you offload the clip to cpu on forge?
>>
>>101910708
>forgelets

kek
>>
>>101910501
Ah! I kneel.
Looks like my SD prompting days have ruined me. On the other hand, if I manage to learn this skill perhaps I will not need to ask for prompts because I can describe what I see, which would be the dream!
>>
File: ComfyUI_02377_.png (1.47 MB, 896x1152)
>>101910715
Another anon posted this a couple of threads ago, gives me good natural language captions to steal styles and ideas from that an LLM would understand.

https://huggingface.co/spaces/fancyfeast/joy-caption-pre-alpha
>>
File: ComfyUI_00059_.png (1.05 MB, 1216x832)
>>
>>101909567
>public dataset with billions and billions of pictures
>some concerned citizen finds a image society doesn't like
>they take it down because of a single image amidst billions
dear god i hate amerishits so fucking much. image the chinese doing something so fucking gay. you can't, it's impossible
>>
>>101910732
hah
hearty laugh
>>
>>101910682
Thanks!
>>
File: 4090.png (22 KB, 555x213)
>>101910612
It's actually somewhere around 19GB VRAM for FP16 and VAE. The clip models need 9 GB VRAM. So in total you need 28GB VRAM to fit everything perfectly on GPUs. If you're not offloading CLIP then your PC is automatically using shared ram and slowing down the generation.

I have a 4090 and 3090 system, so I offload the clip model to the 3090 but keep the FP16 model and VAE on the 4090.

>>101910624
There's a comfy node for it, you can offload to the RAM (using CPU) or offload to another GPU if you have one. VRAM will always be faster than RAM, so a GPU is preferred.
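
The arithmetic above as a quick sketch, using only the numbers from this post (19GB for the FP16 model plus VAE, 9GB for the text encoders, 24GB per card). The dictionary keys and helper names are made up, and latents and activations are ignored, so real headroom is smaller.

```python
# Rough sizes in GB, taken from the post above.
COMPONENTS_GB = {
    "flux_fp16_plus_vae": 19.0,
    "text_encoders": 9.0,
}


def total_gb(names):
    """Sum the VRAM needed by the named components."""
    return sum(COMPONENTS_GB[name] for name in names)


def fits(names, capacity_gb):
    """True if the named components fit on a card of the given size."""
    return total_gb(names) <= capacity_gb


everything = ["flux_fp16_plus_vae", "text_encoders"]

# Single 24GB card: 28GB needed, so the driver spills into shared RAM.
print("all on one 24GB card:", fits(everything, 24.0))

# Two 24GB cards: model + VAE on one, text encoders on the other.
print("model + VAE on 4090:", fits(["flux_fp16_plus_vae"], 24.0))
print("text encoders on 3090:", fits(["text_encoders"], 24.0))
```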
>>
File: ComfyUI_Flux_9085.jpg (221 KB, 1024x1024)
>>
>>101910800
>If you're not offloading CLIP then your PC is automatically using shared ram and slowing down the generation.
I tried the custom nodes for forcing clip to CPU and could not find any change in VRAM usage, RAM usage, or gen speeds. Maybe something is just broken for me.
>>
File: FLUX_00029_.png (1.18 MB, 896x1152)
>>
>>101910822
hey pedo poster could you just keep posting cool gens like these instead of nonce shit?

kind regards,

anon
>>
File: ComfyUI_00147_.png (821 KB, 1160x896)
You wouldn't hit an old lady, would you, anon?
>>
>>101910842
Check the post times, newfag.
>>101910835
Jesus Christ I can't get over how good the details are with this model.
>>
>>101910835
>my eyes are up here
>sorry, what did you just say? I was distracted
>>
Is the lora fucking up the eyes?
>>
>>101910866
No anon. She's just british.
>>
>>101910856
>check the post times
pedo anon you're flipping through IPs, lmao
>>
I can't believe training flux on 12gb is real, what the fuck. I remember 12gb being the bare minimum for training loras on pony.
>>
>>101910800
>It's actually somewhere around 19GB VRAM
I see. Well, I'm jelly so I hate you now.
>>
>changes filename when people notice
Just give it up debo
>>
File: ComfyUI_Flux_9043.jpg (255 KB, 1024x1024)
>>101910856
pedoanon used my filename format once and now this redditor is attached to me
>>
>>101909961
I'd recommend that instead
https://github.com/neuratech-ai/ComfyUI-MultiGPU
>>
>>101910893
Debo loves to false flag
He'll get bored eventually
>>
>>101910893
kek, it wasn't just the filename format it was the number of gens lining up as well. ComfyUI_Flux_xxxx is a pretty common filename but only you post 1024x1024 jpgs, like you used to do with your nonce shit until people pointed out they can just be filtered.

sorry i figured you out, but im gonna call you out every time i see your gens. even if you change filenames, they're really fucking trite so they're easy to identify
>>
>>101910866
>>101910877
Ok got it, added "perfect eyes"
>>
File: 00098-521248398.png (1010 KB, 896x1152)
>>101909616
HOLY AUTISMOBALLS AUTISMERINO

btw doing my prompts like this from now on
>>
>>101910926
>perfect eyes
>boobs get enhanced
>>
File: 00050-686562007.jpg (988 KB, 2304x1728)
>>
File: ComfyUI_Flux_0107.jpg (798 KB, 1536x2688)
>>101910923
>but only you post 1024x1024 jpgs

wat
>>
uhhh sir...?
>>
>>101910893
>>101910953
>doubles down on it to try and false flag

i said this yesterday, but you're really obvious
>>
>>101910833
I tried the node before and offloading it to the cpu actually made it slower for me. offloading to another gpu is probably the only way to get a benefit from it
>>
You know if you didn't abuse proxies so many times already debo we wouldn't know it was you and add your rentry to OP
>>
File: file.png (40 KB, 782x268)
>>101910970
lol
>>
>>101910971
what's your CPU?
>>
File: ComfyUI_Flux_9046.png (164 KB, 561x562)
That's what I call effort. Made me chuckle, actually.
>>
>>101910992
12900k
>>
File: fp111.jpg (225 KB, 1024x1024)
>>
File: ComfyUI_00149_.png (967 KB, 1160x896)
>honey! kitana has eloped with a mortal again! from earthrealm, no less!
>>
File: file.png (14 KB, 345x99)
>>101910984
are you retarded? do you think anyone would fall for that?
>>
>>101910926
>>101910935
>>
>>101910040
who the fuck makes them do that amerishit?
>>
File: ComfyUI_Flux_8707.jpg (245 KB, 1024x1024)
>>
>>101911020
I've set out on a 10 hour quest to try to get flux to run on my machine (it will fail). But damn is SD still good fun.
>>
File: 00028-806472780.png (1.14 MB, 832x1216)
>>
>>101910731
Can you execute that locally anon?
It keeps erroring on anything risqué.
>>
>>101910815
no, the chinese live in a society, we live in a open air metal institution
>>
>>101911051
I like metal also. My favorite is Iron Madden
>>
>>101911061
Debo can you fuck off?
You did this during the last split and had ritualistic meltdowns until everyone gave up. We're not doing this anymore, we're tired of you as a community
>>
File: ComfyUI_Flux_8945.jpg (188 KB, 1024x1024)
>>
Here we go with the bread...
>>101911066
>>101911066
>>101911066
>>
>>101911000
I miss her bros
Lora when?
>>
>>101910087
>>101910240
You should be able to accumulate gradients across batches as well, if you're worried about the batch not being large enough.
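
A toy sketch of what accumulating means, assuming a one-weight linear model with mean squared error and made-up data (nothing here is the trainer's actual code). Averaging the gradients of several small batches before stepping gives the same gradient as one big batch, as long as the small batches are equal in size.

```python
def mse_grad(w, batch):
    """Gradient of mean((w*x - y)^2) with respect to w over one batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, data, micro_batch_size):
    """Average micro-batch gradients instead of using one large batch.

    Assumes len(data) is a multiple of micro_batch_size so that every
    micro-batch carries the same weight.
    """
    chunks = [data[i:i + micro_batch_size]
              for i in range(0, len(data), micro_batch_size)]
    accum = 0.0
    for chunk in chunks:
        # Scale each micro-batch gradient so the sum equals the overall mean.
        accum += mse_grad(w, chunk) / len(chunks)
    return accum


# Made-up data roughly following y = 3x.
data = [(1.0, 3.1), (2.0, 5.9), (3.0, 9.2), (4.0, 11.8),
        (5.0, 15.1), (6.0, 18.0), (7.0, 20.7), (8.0, 24.3)]
w = 0.5
learning_rate = 0.01

full = mse_grad(w, data)                                  # one batch of 8
accum = accumulated_grad(w, data, micro_batch_size=2)     # four batches of 2
print("full batch grad:", full)
print("accumulated grad:", accum)

# One optimizer step taken only after all micro-batches are accumulated.
w_next = w - learning_rate * accum
print("updated w:", w_next)
```

Only one micro-batch has to sit in memory at a time, which is the whole point on a 12GB card.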
>>
>>101909357
https://download.pytorch.org/whl/nightly/rocm6.2
>>
>>101909889
It's not descrimination, its representation! you cant silence one of sonics most cherished jewish figures!!!!!
>sticks what did I tell you about looking into your heritage again?


