/g/ - Technology

File: Flux_00213_.png (1.34 MB, 1024x1024)
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>101758249

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Kolors
https://gokaygokay-kolors.hf.space
Nodes: https://github.com/kijai/ComfyUI-KwaiKolorsWrapper

>AuraFlow
https://fal.ai/models/fal-ai/aura-flow
https://huggingface.co/fal/AuraFlows

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>
File: delux_flebo_00001_.png (1.36 MB, 1216x832)
>mfw
>>
File: file.png (624 KB, 1024x1024)
>>
>>101761274
>debo is hijacking /ldg/
it's over
>>
File: FD_00247_.png (1.4 MB, 1024x1024)
>>
im losing faith in bigma. they either sold it all to nvidia or the model simply won't compete with flux thanks to chinese using shitty datasets. has flux raising the bar effectively killed all other local attempts? i heard auraflow already dropped out realizing it would be too expensive to develop anything at flux's level.
>>
>>101761268
>7090

Hopefully Nvidia will finally decide to give it more than 24GB VRAM.
>>
File: file.png (797 KB, 1024x1024)
>>
File: FD_00258_.png (1.06 MB, 1024x1024)
>>101761335
>>101761205
>>
File: ComfyUI_00936_.png (1020 KB, 1288x848)
>>
File: ComfyUI_00784_.png (1.49 MB, 1024x1024)
>>
>>101761357
Bigma just needs to be the biggest possible while also being finetuneable on 24 GB VRAM without compromises. Flux likely won't see real work for a year or more.
>>
>>101761357
>i heard auraflow already dropped out realizing it would be too expensive to develop anything at flux's level.
yeah, flux set the bar so high now that every new model that isn't at flux level will be discarded, imo that's a good thing, it will make model creators work harder
>>
File: spread positive vibes.png (1.31 MB, 1024x1024)
>>101761371
It still doesn't have that aura. It's getting better though!
>>
>>101761380
I'm still holding my breath to see the practicality of finetuning this thing.
>>
File: FD_00029_.png (1.49 MB, 1024x1024)
>>101761375
Very cool
>>
>>101761394
What we need is a 4B or 6B model which will be the sweet spot.
>>
File: FD_00244_.png (458 KB, 512x512)
>>101761392
Just takes some prooompting, it doesn't have the aggressive aesthetic prompt altering that DallE does.
But I have already moved on to other gens
>>
File: ComfyUI_00785_.png (1.07 MB, 1024x1024)
>>
File: ComfyUI_00939_.png (1.17 MB, 1288x848)
>>
>>101761406
unfortunately, the quality won't be as good as the 12b one, if flux got so much hype it's because it's a great model, and it's great because it's big, I could be wrong though and someone will make a 4b model in flux quality, but I'm not convinced yet
>>
>>101761372
holy shit yakuza, prompt?
>>
>>101761417
I'm of the mind that everything flux does so well is basically a result of its large parameter size. This is probably the size we should get used to working with in the future because it seems to be the point where hands stop getting fucked up.
>>
>>101761427
Gameplay footage screen shot of Yakuza 0.
featuring Donald Trump in a white jacket and pants with red shirt combo. He is fighting Yakuza thugs on the sidewalks of Osaka. Trump can be seen flying through the air delivering a kick to one thugs face and shattering his glasses. There is a logo on the bottom right of the image which reads "Like a Dragon: 7"
The subtitles read "That's the Donald of Dojima, GET HIM."
>>
>>101761446
ty
>>
File: FD_00262_.png (1.39 MB, 1024x1024)
>>101761417
>>101761433
It's for sure because of the params. We will see people dropping prunes soon, and we can compare just how much we lose.
>>
I think ipadapter for Flux would solve any style issues without the need to finetune or use loras
ipadapter in style transfer mode using an example image with the style you want generally works extremely well (and with the node in style transfer mode the composition of the example image is usually correctly ignored)
>>
File: ComfyUI_00788_.png (1.43 MB, 1024x1024)
>>
>>101761380
Flux is just ... okay. I'm not sold on AI art as a general idea until someone is brave enough to make a model that can effortlessly create a naked woman who is holding up 10 fingers and staring directly at the camera with both eyes focused and pointing in the same direction. And if you think being able to draw a naked chick is only useful for porn, you're playing yourself on levels that shouldn't be possible.
>>
>>101761461
>We will see people dropping prunes soon
Oh yeah I totally forgot about that, it's true that there are "pruned" versions of the SD models, I wonder how well it will work with flux, probably better because DiT models are more resilient to pruned/quantized shit than unet
>>
>>101761476
>if you think being able to draw a naked chick is only useful for porn, you're playing yourself on levels that shouldn't be possible.

elaborate
>>
>>101761476
no no I don't disagree with you here, but flux can already do convincing naked women, the anatomy is there, we just need to add better and more consistent nipples and genitals and we'll be eating good, it'll be way easier to add those concepts compared to base SDXL, which was a disaster on that topic
>>
>>101761490
misogynymaxxing is the endgame of all silicon
>>
File: ComfyUI_00790_.png (1.34 MB, 1024x1024)
>>
File: ComfyUI_00944_.png (1.03 MB, 1288x848)
>>
File: Flux_00276_.png (1.31 MB, 1024x1024)
>>
>>101761433
nta but I'm of the opposite mind. I think Flux could fit in 8b with a theoretically perfect dataset. I mean, do you think if SD3 had a 12b version it would somehow suddenly be at Flux's level and not a completely broken shitheap? Data is what SAI decided to completely nuke with their safety team, and data is what allowed Flux to finally do more than 1girl standing. Even better data should allow smaller models to do more with less.
>>
>>101761523
kek
>>
File: ComfyUI_00794_.png (1.51 MB, 1024x1024)
>>
>>101761527
>I mean, do you think if SD3 had a 12b version that it would somehow suddenly be Flux's level and not a completely broken shitheap?
absolutely not, the guys who made Flux have way more freedom in training than they had at SAI, that's precisely why they decided to leave SAI in the first place.
https://xcancel.com/EMostaque/status/1819777262071226611#m
>fwiw I don’t think bfl team could have built/released Flux when at Stability (contrast aha) for… reasons
>>
>>101761546
What happened to SAI that caused their talent to leak out like that? It can't have been a money issue. So what part of the company was so rotten from within that everyone of merit decided to leave and how did it begin?
>>
File: ComfyUI_00793_.png (1.56 MB, 1024x1024)
>>
>>101761565
>What happened to SAI that caused their talent to leak out like that?
the """safety""" shit, I would be depressed aswell if I was working hard on a model just to be told to destroy it with some cucking DPO by some retarded higher ups managers who doesn't know 3rd grade math calculus
>>
File: FD_00271_.png (1.35 MB, 1024x1024)
>>101761468
>ipadapter
>>
>>101761468
>I think ipadapter for Flux would solve any style issues without the need to finetune or use loras
does ipadapter work on anime characters? I used InstantID and that one only worked on real humans
>>
File: Flux_00233_.png (1.42 MB, 1024x1024)
>>101761268
woah my pic made it into the OP feels good man
>>
>>101761497
Yeah, it's definitely not far off in the human physiology department. I think someone is bound to make a full NSFW spin just due to popular demand...
>>
>>101761497
Speaking of. Here's a workflow to draw them back on.
https://files.catbox.moe/h8avmw.png
>>
File: Flux_00234_.png (1.34 MB, 1024x1024)
>>101761616
>>
>>101761629
you use a SDXL refiner to do that?
>>
File: Flux_00241_.png (1.42 MB, 1024x1024)
>>101761632
>>
>>101761612
sounds like you're thinking of faceid ipadapter which is a more specialized thing, I meant regular ipadapter which is general purpose and designed to work with any kind of images rather than just faces
>>
File: Flux_00242_.png (1.34 MB, 1024x1024)
>>101761645
>>
File: Flux_21_.png (1.35 MB, 1368x768)
is this the limit of 24gb VRAM?
I just wish Flux had better anime styles (soon-tm)
>>
File: Flux_00239_.png (1.31 MB, 1024x1024)
>>101761660
>>
>>101761629
my autistic workflow is growing
>>
File: end_gen~5.jpg (303 KB, 1448x1280)
>>
>simpletuner flux lora training doesn't work with >1 GPU when using quantization
>dev says "oops haha i just assumed it would work i haven't actually tested it yet"
I swear I'm the only person that has >1 3090 and actually uses them for training things. Nothing ever fucking works right with multi GPU, in basically every training script I've tried.
>>
File: 1721997851162164.png (1 MB, 1024x1024)
>>
>>101761741
kek'ed hard
>>
File: FD_00277_.png (870 KB, 1024x1024)
>>101761664
It knows Junji Ito, what more do you need?
>>
>>101761756
it doesn't know Studio Ghibli, I cri everitim
>>
>>101761639
Yeah. Well I was using 1.5 but goblin guy made it better.
>>101761705
Team work makes the dream work. Have you made the nippler more autistic?
>>
>>101761741
prompt? that's cool it can do animal crossing
>>
>>101761741
jfc lol
>>
>>101761733
that's because no one needed multi gpu until now, it'll be improved now that flux exists I think
>>
File: ComfyUI_00792_.png (1.53 MB, 1024x1024)
>>
File: Flux_00260_.png (1.19 MB, 1024x1024)
>>
File: ComfyUI_00803_.png (1.55 MB, 1024x1024)
>>
>>101761733
are you the cunny trainer?
>>
File: FD_00279_.png (1.19 MB, 1024x1024)
>>101761741
>>
File: ComfyUI_00802_.png (1.44 MB, 1024x1024)
>>
>>101761830
No, but I do have my own NSFW dataset of 1000+ images that I want to try training. But I have a 4x3090 machine, I'm not even gonna attempt it with just one GPU and suffer 1/4 the training speed.
>>
File: Flux_00263_.png (977 KB, 1024x1024)
>>
>>101761770
not really, it works quite well in my workflow right now. But I added OCR for text detection and another pass through pony lol.
at this point it might be better to gen first in flux, then do OCR and then nipping or a pass through pony
files.catbox.moe/uar5kp.png
>>
>>101761863
What do you do for a living that lets you buy and maintain a 4 4090 machine? Or do you just live beyond your means for gooning?
>>
>>101761883
3090 moran
>>
>>101761786
Gameplay footage screen shot of animal crossing, donald trump talking to his mexican mouse neighbor wearing a sombrero, top down view, animal crossing dialogue box saying "You have to go back!" with the name "Trump" as the speaker, trump smiles smugly while pointing and the mouse looks shocked with his mouth agape
>>
File: ComfyUI_00806_.png (1.72 MB, 1024x1024)
>>
>>101761756
Every single cunny artist on pixiv known to man
>>
>>101761890
That makes sense since you can't NVlink 4090s
What PSU?
>>
>>101761895
>prompt is literally just a description of the image
I wonder when people will stop asking for prompts
>>
File: ComfyUI_00814_.png (1.6 MB, 1024x1024)
>>
>>101761932
vocabularlylets and ESL
>>
>>101761937
that's what claude is for
>>
File: Flux_00304_.png (1.21 MB, 1024x1024)
>>
File: Flux_00254_.png (1.75 MB, 1024x1024)
>>
>>101761883
I work at a big tech company and make a little over 200k / yr. And it's used 3090s, not even all that expensive (compared to a lot of other hobbies).
>>101761913
I don't even nvlink, you can't do it 4-way and it doesn't matter for basic distributed data parallel training. Machine uses two 1600W PSUs.
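
For anyone wondering why this works without nvlink, here's a rough sketch of how basic data parallel training splits an epoch. It assumes a strided round-robin split similar in spirit to what PyTorch's DistributedSampler does; `shard_indices` is a made-up helper for illustration, not part of any real training script.

```python
def shard_indices(num_samples: int, world_size: int, rank: int) -> list[int]:
    """Return the dataset indices that one GPU (rank) handles in an epoch.

    Assumption: a plain strided split with no shuffling or padding.
    """
    return list(range(rank, num_samples, world_size))


# 1000 images over 4 GPUs, matching the numbers mentioned in this thread.
shards = [shard_indices(1000, 4, rank) for rank in range(4)]
print([len(shard) for shard in shards])
```

Each GPU keeps a full copy of the model and only sees its own quarter of the data; the GPUs then only need to exchange gradients once per step, which is why plain PCIe is usually enough for this kind of setup.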
>>
>>>/aco/8431464
does anyone know what anon means?
>>
File: ComfyUI_00818_.png (1.53 MB, 1024x1024)
>>101762000
it means being anonymous
>>
>>101762000
I wonder if the safety stuff really is only as deep as the T5 model and it actually is a lot more NSFW than we expected.
>>
File: Flux_00267_.png (1.46 MB, 1024x1024)
>>101761986
sounds like a cool setup.
how do you mount all these FAT af GPUs?
like what kinda case does any of that shit fit in?
>>
File: Flux_00307_.png (1.06 MB, 1024x1024)
>>
>>101762031
make him a minion
>>
File: ComfyUI_00816_.png (1.6 MB, 1024x1024)
>>
File: ComfyUI_00822_.png (1.42 MB, 1024x1024)
>>
>>101762024
Crypto mining frame with PCIE risers. This is by far the least hassle way to do it.
>>
File: Capture.jpg (174 KB, 3830x1300)
How do you save an XYZ plot into one giant picture that has all the grids?
>>
Are there any tricks with doing img2img with Flux? It seems like the results are very binary. Either there's a minor alteration to the original image or it's a completely new image.
>>
File: Flux_00268_.png (1.42 MB, 1024x1024)
>>101762061
what kinda sound they make at full power?
must sound like a 747 on takeoff lmao
>>
File: ComfyUI_00823_.png (1.58 MB, 1024x1024)
>>
File: 2024-08-07_07-35.png (97 KB, 814x630)
>>101762068
it seems to just werk for me with this sampler and scheduler setup
>>
File: ComfyUI_00965_.png (998 KB, 1288x848)
>>
>>101762094
It's me frfr
>>
File: Flux_00309_.png (1.37 MB, 1024x1024)
>>101762037
good idea
>>
File: ComfyUI_00966_.png (1.06 MB, 1288x848)
>>
File: Flux_00271_.png (1.3 MB, 1024x1024)
>>
Can anon please share a prompt for the elmo pictures of past threads? I love that style that would fool me into thinking it's not an AI picture.
>>
>>101762087
Thanks, I'll try that.
>>
>>101762081
When the fans spin up while training full blast it's reasonably loud. Worse is that it pumps out a lot of heat. But this is all okay because the machine sits in a different room. I just SSH into it from my desktop and use NFS to have remote access to the whole filesystem.
>>
File: ComfyUI_00932_.png (1.09 MB, 1288x848)
>>101762181

https://files.catbox.moe/snqbe1.png
>>
File: 1717621704673324.png (1.1 MB, 1400x1024)
>>
File: 1702518976375515.jpg (1.05 MB, 2752x2002)
Running flux schnell on Ipad Pro M4, 100 secs per gen at 4 steps
Apple won
>>
Hi, so I'm curious if there's any actual difference between nvidia and amd in SD (except access to cuDNN)
>>
File: ComfyUI_00827_.png (1.47 MB, 1024x1024)
>>
File: ComfyUI_00815_.png (1.81 MB, 1024x1024)
>>
File: Flux_00312_.png (991 KB, 1024x1024)
well this is kinda uncanny
>>
>>101762250
that's creepy
>>
File: ComfyUI_00809_.png (1.36 MB, 1024x1024)
>>
>>101762250
Prompt please. I wanna try that.
>>
>>101762232
If there's any difference, I don't think it would be significant enough to matter since the underlying algorithms are the same.
>>
File: Flux_00313_.png (1.07 MB, 1024x1024)
>>101762280
>House doorbell security cam footage photo. Fisheye lens effect creates rounded distortion. It is night time and the colors are monochrome with a night-vision effect. At the door, an extremely muscular ripped minion from Despicable Me is standing outdoors. The minion has yellow skin, one big eye, and sparse black hairs. The minion wears a skintight black shirt emblazoned with white text that reads "NEVER GOON". The minion is smoking a cigar and wields an AR-15 in his arms as he looks at the camera. In the background of the man, there is a very dark empty city street. Overall, the image conveys a sense of absurdity and humor mixed with intimidation.
>>
File: Flux_00280_.png (1.39 MB, 1024x1024)
>>101762204
sounds comfy, are your 3090s all from the same brand?
>>
>>101762232
What you save at the checkout with AMD, you pay ten times over in frustration at home.
>>
File: ComfyUI_00975_.png (1002 KB, 1288x848)
>>
>>101762211
Got it, thanks! :D
>>
File: ComfyUI_00835_.png (1.18 MB, 1024x1024)
>>
File: Flux_00283_.png (1.08 MB, 1024x1024)
>>
>>101762302
Nope, 2 ASUS 1 EVGA 1 Gigabyte. It doesn't matter if it's different brands. I mean maybe nvlinking different brands does some weird shit (probably not, I bet it works) but I don't do that.
>>
>>101761822
I want to make flux stop tarting up faces.
>>
File: ComfyUI_00836_.png (1 MB, 1024x1024)
>>
File: ComfyUI_00979_.png (879 KB, 1288x848)
>>
File: 1704150803373177.png (1.29 MB, 1920x1080)
war has changed...
>>
>>101762296
Lmao. Excellent.
>>
>>101762384
>guy behind her
don't open that door!
>>
File: ComfyUI_00981_.png (886 KB, 1288x848)
>>101762389
>>
Well, I really wanted to know if Flux could draw a proper humanized Miku, but "teenage girl" fails the other way...
>>
File: Flux_00288_.png (1.14 MB, 1024x1024)
>>
>>101762426
Nice cookie monster elmo hybrid
>>
The only thing that matters for me in these models is how well it can generate anime girls getting brutally raped, nothing else
>>
>>101762296
>>
We need a federal gooner police.
>>
File: Flux_00289_.png (1.19 MB, 1024x1024)
>>
File: ComfyUI_00983_.png (898 KB, 1288x848)
>>
File: FLUX_00013_.png (1.1 MB, 1152x896)
>>101762453
>>
File: 1705088386914019.png (1.23 MB, 1920x1080)
>>
can Flux modify images? it isn't respecting an encoded input.
>>
File: Flux_00316_.png (1.3 MB, 1024x1024)
>>101762461
>>101762472
gonna goon to these
>>
>>101762296
>>101762393
Also, cool prompt hack is to just type: "Overall, the image conveys [thing] mixed with [something else]" if you want to generate something just based on vibes. Generative AI is pretty good at reasoning backwards from summary statements like that.

> Overall, the image conveys a sense of humor and surrealism mixed with palpable dread.
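
If you want to reuse the trick across a batch of prompts, a tiny helper along these lines would do it. This is just a sketch; `vibe_suffix` and `with_vibe` are hypothetical names, and the only thing taken from the post is the sentence template.

```python
def vibe_suffix(first: str, second: str) -> str:
    """Build the closing 'vibes' sentence from the template in the post."""
    return f"Overall, the image conveys {first} mixed with {second}."


def with_vibe(description: str, first: str, second: str) -> str:
    """Append the vibes sentence to an existing image description."""
    return f"{description.rstrip()} {vibe_suffix(first, second)}"


print(vibe_suffix("a sense of humor and surrealism", "palpable dread"))
```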
>>
>>101762503
> Overall, the image conveys a sense of fear and feminine seduction.
>>
File: ComfyUI_00985_.png (1.85 MB, 1288x848)
>>
File: Flux_00317_.png (1.03 MB, 1024x1024)
>>
File: 2024-08-07_00064_.png (1.77 MB, 1024x1024)
>>
>>101762523
>>
File: t1.jpg (626 KB, 2150x3226)
>>
>>101761878
>OCR
what is ocr?
>>
File: Flux_00291_.png (1.16 MB, 1024x1024)
>I'm already prompting for 16 hours or so
>>
>>101762024
This is the first time I have seen a coherent guitar from a local model.
>>
>>101762303
Do you have a few examples of problems i could encounter?
>>
>>101762617
No
>>
>>101762617
Yes
>>
Alright bros, so why does it slow down on ComfyUI? I'm asking for the second time. On Windows 11, I have over 22GB of VRAM free on my 3090, but for some reason Flux slows down over time, from 1.26-1.4 sec/it to 10 sec/it. I do cancel my server every now and then because I do things, not sure if that influences it, but I also may load upscale models and that triggers it. Once it happens, the only way to fix is to restart my computer, otherwise rely on luck when I restart ComfyUI.
>>
>>101762617
Maybe
>>
File: 1721214075330507.png (1.07 MB, 1920x1080)
>>
File: Flux_00300_.png (932 KB, 1024x1024)
>just one more gen and I go to sleep
>>
>>101762636
just load it in fp8 and you won't have any issues.
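
Back-of-envelope math for why fp8 might help, assuming the "12b" parameter count quoted earlier in the thread. It only counts the main model's weights and ignores the text encoders, the VAE and activations, so treat it as a rough sketch rather than a diagnosis of the slowdown; `weight_gb` is a made-up helper.

```python
def weight_gb(params: int, bytes_per_param: int) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


FLUX_PARAMS = 12_000_000_000  # assumption: the "12b" figure from the thread

print(weight_gb(FLUX_PARAMS, 2))  # 16-bit weights
print(weight_gb(FLUX_PARAMS, 1))  # fp8 weights
```

At 16 bits the weights alone are already about the size of a 24 GB card, so anything else loaded afterwards (an upscale model, for example) could plausibly push things out of VRAM; at fp8 there is a lot more headroom.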
>>
File: ComfyUI_temp_eihaf_00083_.png (749 KB, 1024x1024)
>>101762126
>>101762094
>>
>>101762691
NGL, if it weren't for the hair and the fatass attached to them, those breasts looking pretty squeezable.
>>
File: Flux_00303_.png (928 KB, 1024x1024)
>>
File: bad.png (1.33 MB, 1024x1024)
>>101762531
Okay, what I want is a Captioner where I send a picture and then I get text that would reproduce a similar picture if I used it as a prompt for Flux. I tried your picture here:
https://huggingface.co/spaces/gokaygokay/FLUX.1-dev-with-Captioner
Which seems to be state of the art? And I got this:
>The image is a page from a video game. It is divided into two sections.
>On the left side of the page, there is a black and white illustration of a group of three characters standing in a grassy field. The character on the left is a man wearing a white t-shirt and black pants, and the character in the middle is a woman with long blue hair and a black dress. She is standing in front of a large rock formation.
>In the center of the image, there are two other characters, one wearing a black suit and the other wearing a blue dress. They are standing next to each other and appear to be posing for the photo. The background is a dark and eerie landscape with a castle in the distance. The text on the page is in Japanese and translates to "Mecis" in English.
Which gives me picrel.
...
Ugh, so I can use online captioners and image generators for free but what's the point if I can't even recreate some pictures? How far away is this dream?
>>
File: 2024-08-07_00077_.png (1.66 MB, 1280x720)
>>
File: Flux_00319_.png (978 KB, 1024x1024)
finally managed the cartoonish minion look but using negative prompting that slows shit down by 3x
>>
>>101762717
That caption is horrible tbhdesu, even claude 3 haiku/gpt-4o mini describe images better
>>
File: FD_00291_.png (1.15 MB, 1024x1024)
>>
File: 2024-08-07_00078_.png (1.61 MB, 1280x720)
>>
>>101762717
this is what gpt-4o gave me. It didn't name miku directly but you can fix it. It also hallucinated the "LOOK: Enemy Berserk", and put the Miku on the left page above the two dudes:
>The image is a two-page spread from what appears to be a magazine or a guidebook. The left page features a mix of text and images. At the top left, there is a heading labeled "Meclis" followed by a column of text in Japanese. Below the text, there is an image of a character with long turquoise twin-tails, wearing a black and turquoise outfit with a short skirt and thigh-high boots. To the right of this, there are two more images: one shows a menacing skull with glowing red eyes labeled "LOOK: Enemy Berserk," and below it, a screenshot from a game featuring two characters, one bald man in a white shirt and another person in black attire, with status bars above them. The right page displays a large, detailed image of the same turquoise twin-tailed character from the back, standing in a grassy, rocky environment with a status bar at the top labeled in Japanese. The character's shadow is prominently cast on the ground, and the setting appears to be a dimly lit outdoor area at night.
>>
>>101762800
And this is 3.5 Sonnet, actually better:
>The image appears to be a two-page spread from a Japanese video game magazine or guide book. On the left page, there's a large block of Japanese text under the heading "Meclis". In the top right corner of this page is a small image of a menacing skull-like creature with glowing red eyes. Below the text are two character models: a bald man in a white t-shirt and a woman in a black outfit. The right page is dominated by a 3D render of an anime-style character with long teal twin-tails, wearing a black and teal outfit reminiscent of Hatsune Miku. This character is shown in a dynamic pose against a grassy background. Above this image is what looks like a game interface element showing a health or progress bar. The bottom of the left page contains smaller images of the teal-haired character from different angles. Throughout the spread, there are various UI elements and text boxes that appear to be from the game itself, giving it the look of a strategy guide or character showcase.
>>
File: 2024-08-07_00082_.png (1.47 MB, 1280x720)
>>
File: Oh.png (1.04 MB, 1024x1024)
>>101762755
Here's what I got from this one:
https://huggingface.co/spaces/thobuiq/GPT-4o
>The image is a two-page spread from an anime magazine. The left page features a large illustration of a character named "Meclis". The character is depicted with long, blue hair and is wearing a black dress. The right page provides more details about the character. It includes a smaller illustration of the character, a description of their personality, and a list of their abilities. The text is in Japanese, and the character's name is written as "メクリス". The overall layout of the pages suggests a focus on the character and their role in the anime series.
Hmm, well, at least the pic is good. Looks nothing like the original, but it's impressive.
>>
You will never be an artist, even an 8 year old can generate infinite AI slop on his tablet.
>>
>>101762842
Me on the right
>>
>>101762856
That's not GPT-4o though, but some random open weight model in a HF space named "GPT-4o". Both GPT-4o and Claude 3.5 Sonnet are commercial models, there might be some HF demos that give you free access to them, but I doubt it.
>>
>>101762717
https://huggingface.co/spaces/opencompass/open_vlm_leaderboard
>>
File: 1705284146428081.png (47 KB, 950x343)
>>101762856
Yeah, it's using mixtral and whatever this model is, the autistic space creator just named it "Open GPT-4o" for whatever reason
>>
>>101762866
>You will never be an artist
Thanks for the whitepill
>>
File: FD_00297_.png (1.68 MB, 1024x1024)
>>
File: 1694499319905534.png (2.04 MB, 1920x1080)
>>
>>101762894
this but unironically
>>
>>101762894
you can though
just not pathetic caveman shit like a woman choking on cock
>>
>>101762087
>>101762202
Found out it just requires additional steps if using schnell.
>>
File: FD_00298_.png (1.49 MB, 1024x1024)
>>101762866
And neither will you, because even an 8 year old can generate infinite AI slop on his tablet.
>>
>>101762910
catbox now
>>
File: 2024-08-07_00090_.png (1.44 MB, 1280x720)
>>101762868
kek
>>
>>101763009
Flux makes terrible asses
>>
>>101762994
For porn? Scroll up and open the goblina link
>>
>>101762994
I think you misread that post by missing the word "not" or something
>>
so why does lowering the guidance make the model better at doing paintings and reduce the slopped look?

like what's going on architecturally with the model that causes that to work
>>
>>101763022
that's inpainting and hardly porn
>>101763040
I did not, thank you. Don't talk to me again, retard.
>>
File: 1716844133588489.png (1.05 MB, 1024x1024)
>>
>>101763050
Because it follows your specific prompt less and has more "creativity"?
>>
>>101763050
>so why does lowering the guidance make the model better at doing paintings and reduce the slopped look?
the model has a bias on style, decreasing the guidance decreases this bias, but it also decreases the prompt following ability unfortunately
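
For reference, here's a minimal sketch of how classic classifier-free guidance mixes the two predictions, assuming the plain CFG formula. Flux dev is guidance-distilled as far as I know, so its guidance value is fed to the model as an input rather than computed like this; `cfg_combine` and the toy numbers are made up for illustration.

```python
def cfg_combine(uncond: list[float], cond: list[float], scale: float) -> list[float]:
    """Classic CFG: start from the unconditional prediction and move
    `scale` times along the direction toward the conditional one."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0]  # toy "no prompt" prediction
cond = [1.0, 3.0]    # toy "with prompt" prediction

print(cfg_combine(uncond, cond, 1.0))  # same as the conditional prediction
print(cfg_combine(uncond, cond, 3.0))  # pushed further along the prompt direction
```

Higher scales exaggerate whatever the prompt-conditioned prediction adds, which fits the tradeoff described above: more prompt following, but also more of the model's baked-in style. Real CFG also needs a second model evaluation per step, so it costs extra time.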
>>
>>101763060
prompt for this one?
>>
File: 1705751148772828.png (1.21 MB, 1024x1024)
>>
havent been here since day 1 of flux, anyone figure out how to coax different styles out of it yet?
>>
File: g3nknskzw6hd1.png (3.61 MB, 2048x1024)
I have a challenge for you, can flux reproduce those 2 pictures? They have been made by Midjourney
>>
File: output.png (1011 KB, 1024x1024)
>>101762800
Ooh, this was actually very nice, and it didn't even need Miku in there! I'm a slow reader so I'll get to your next message now, I gotta admit this is clearly a new level in image generation I hadn't experienced before!
>>
>>101763137
Are you Japanese?
>>
>>101763131
Maybe, give the mj prompt and I will reverse engineer it into flux
>>
>>101763149
I found it on leddit and it doesn't have any prompt unfortunately
https://reddit.com/r/midjourney/comments/1em57b5/album_covers/
>>
File: 2024-08-07_00072_.png (1.43 MB, 1280x720)
>>101763018
ya, depends on what you prompt tho.. but using "by Junji Ito" surely doesnt make them pleasing for the standard gooner, but thats not the point of these gens .. its about the monsters and without Ito in the prompt they are just boring

>>101763118
yes you have to be precise with the prompt, put the styles up in the front, it wont do magic like MJ by just adding an artist .. you have to specify paint/draw style (water color, inked, drawn, painted, oil paint, acrylic, etc.), and artist if it knows them and a genre if you know it (Impressionist, Manga, Modern Art, Pop-art whatever), and the styles should be up front .. then it works quite well, pic related
>watercolor style, inked in red black and white, style Yusuke Murata, beach, cosmic horror, one punch man

overall its more difficult to prompt, but its extremely exact if you know what you do, MJ is definitely easier to coerce into some pleasing art style that is popular or widely known, but FLUX is so exact it will make ppl kinda mad cause they realize they have no idea of art history, if you do tho its great fun
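
The "styles up front" ordering can be sketched as a small prompt builder. This is only an illustration of the ordering described above; `style_first_prompt` is a hypothetical helper, and the example values are the ones from the prompt in this post.

```python
def style_first_prompt(styles: list[str], subject: list[str]) -> str:
    """Join prompt fragments so every style term comes before the subject terms."""
    return ", ".join([*styles, *subject])


prompt = style_first_prompt(
    styles=["watercolor style", "inked in red black and white", "style Yusuke Murata"],
    subject=["beach", "cosmic horror", "one punch man"],
)
print(prompt)
```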
>>
>>101763131
Honestly I think that the original MJ prompts for those didn't contain even 1/5 of the things present in the image, it's probably just MJ generating on its own, as usual, so replicating would be quite hard.
>>
File: test.jpg (784 KB, 3840x1713)
I'm doing an XY plot of the samplers + schedulers with this prompt:
>Hatsune Miku with dreadlocks and a black skin showing her fists
So far, it seems like beta is a good scheduler if you want a better prompt understanding
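A minimal sketch of how such an XY grid can be enumerated. The sampler names are placeholders assumed for illustration (only the `beta`, `karras` and `sgm_uniform` schedulers are named in this thread), and `xy_grid` is a hypothetical helper, not part of any UI.

```python
from itertools import product

def xy_grid(samplers, schedulers):
    """Return every (sampler, scheduler) pair, one row per sampler."""
    return list(product(samplers, schedulers))

# Assumed example values; swap in whatever your UI actually offers.
samplers = ["euler", "dpmpp_2m"]
schedulers = ["beta", "karras", "sgm_uniform"]

grid = xy_grid(samplers, schedulers)
for sampler, scheduler in grid:
    # One cell of the plot: same prompt and seed, different sampler/scheduler.
    print(f"{sampler} + {scheduler}")
```

Keeping the prompt and seed fixed across all cells is what makes the comparison meaningful.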
>>
>>101763171
You should've made the prompt a bit more complex and specific, isn't this one too generic?
>>
File: 2024-08-07_00095_.jpg (693 KB, 2560x1440)
>>101763171
pls post when done! thank you
>>
>>101763180
it's really hard to get miku to change skin and hair, this character is so overcooked I have to go to fucking cfg 3 + DynamicThresholding to make it work in the first place

And flux likes boomer prompts, so using a "simple" prompt is a harder test for it
>>
File: miku.jpg (700 KB, 1792x1024)
>>101762800
this prompt gets cut off by bing but results in picrel
can you try this version that hits the bing token cutoff point?
>The image is a two-page spread from what appears to be a magazine or a guidebook. The left page features a mix of text and images. At the top left, there is a heading labeled "Meclis" followed by a column of text in Japanese. Below the text, there is an image of a character with long turquoise twin-tails, wearing a black and turquoise outfit with a short skirt and thigh-high boots. To the right of this, there are two more images: one shows a menacing skull with glowing red
>>
File: moreoutput.png (1.29 MB, 1024x1024)
>>101762819
Thanks! And here's this one, it seems GPT-4o was actually closer.
And with this it just hit me that all Stable Diffusion models have become obsolete. Maybe a challenge to make a picture with those that Flux can't do would be worthwhile (I don't care about NSFW), but these new toys are amazing!
>>
>>101763195
Did you try multiple times? With such prompts Flux won't give you same results on every generation.
>>101763194
Not sure what you meant by that, are you asking me to generate the cutoff prompt with Flux or something?
>>
>>101763195
>I don't care about NSFW
get out
>>
>>101763108
>Ivan Kramskoi painting of a parrot
But the style is more from experimenting with Karras scheduler than the prompt
It needs way more steps to converge than sgm_uniform, at least 50, and sometimes even 50 isn't enough. The behaviour is weird
But when it works you get something very busy and "arty"
I don't really get it because on SDXL karras is _faster_ at converging than other schedulers, not slower. But machine learning is voodoo
>>
>>101763195
for someone who doesn't know how to read japanese, this book looks legit as fuck kek
>>
File: 1692037382471018.png (1.22 MB, 768x1344)
>>101763213
Anon, even for English Flux dev gens look legit until you actually look into the letters
>>
>>101763203
yeah, throw that same prompt where bing cut it off. there is a way of getting past the token limit but dalle starts ignoring tokens, i want to know if the same happens with flux. iirc dalle on bing is 450 tokens
>>
>>101762875
Oh dear, suddenly I feel like I've been living under a rock, the last thing I knew about was CLIP Interrogator, ha.
>>
>>101763160
Then no probably not.
>>
>>101763227
also can you limit the output of those llms so they describe an image while respecting a certain number of tokens?
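One crude way to enforce a budget after the fact, as a sketch only: keep whole sentences until a word count is reached. Whitespace words are used here as a stand-in for tokens, which is an assumption; a real tokenizer will count differently. `truncate_to_budget` is a hypothetical helper.

```python
import re

def truncate_to_budget(caption, max_tokens):
    """Keep whole sentences while the running word count stays within max_tokens."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", caption.strip())
    kept, used = [], 0
    for sentence in sentences:
        n = len(sentence.split())  # word count as a rough token proxy
        if used + n > max_tokens:
            break
        kept.append(sentence)
        used += n
    return " ".join(kept)

short = truncate_to_budget("A cat sits. A dog runs fast. The end.", 5)
print(short)
```

Cutting at sentence boundaries avoids the mid-sentence cutoff seen in the bing prompt above.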
>>
first two FLUX loras appeared on civitai

>https://civitai.com/models/628576/silver-dust-gmr?modelVersionId=702754
>https://civitai.com/models/628763/androflux?modelVersionId=702968
one is some silver dust style, one is gay porn (puke)

but well it started
>>
>>101763227
6 results to compare, exactly the same prompt
>The image is a two-page spread from what appears to be a magazine or a guidebook. The left page features a mix of text and images. At the top left, there is a heading labeled "Meclis" followed by a column of text in Japanese. Below the text, there is an image of a character with long turquoise twin-tails, wearing a black and turquoise outfit with a short skirt and thigh-high boots. To the right of this, there are two more images: one shows a menacing skull with glowing red
https://files.catbox.moe/gj75mi.jpg
https://files.catbox.moe/99dgfd.jpg
https://files.catbox.moe/7eliyn.jpg
https://files.catbox.moe/qkjyup.jpg
https://files.catbox.moe/45fcyt.jpg
https://files.catbox.moe/va07l5.jpg
>>
>>101763171
been waiting for this
>>
>>101763245
>>https://civitai.com/models/628763/androflux?modelVersionId=702968
Holy fuck that "penis" looks atrocious, but hey at least there's one and it's from a shitty lora, it means that it will be easy to bring back genitals on flux with a finetune
>>
>>101763245
>they're for Schnell, not Dev
ugh PLEASE don't let the trainer community decide to rally around Schnell, it's not really much better than a good SDXL finetune
Dev is the only one worth using
>>
>>101763264
B-but the license, anonie, the retarded loramakies won't be able to earn that sweet donation money from their shitty loras for dev...
>>
>>101763264
the gay one is for schnell, the silver dust one is for dev
>>
File: miku2.jpg (624 KB, 1792x1024)
>>101763247
thanks for the comparison, it seems to be more accurate but less creative, it might be because of the lack of cfg
>>
>>101763264
>ugh PLEASE don't let the trainer community decide to rally around Schnell, it's not really much better than a good SDXL finetune
>Dev is the only one worth using
yeah, I don't want people to focus on that inferior model, but they're too interested in the license: dev doesn't have an Apache 2.0 license like schnell
>>
>>101763272
*vice versa, gay for dev, silver for schnell
>>
>>101763274
You can lower guidance down to 1, I'm using the default 3.5 on the API
>>
>>101763274
>it seems to be more accurate but less creative, it might be because of the lack of cfg
you can try some "high" cfg with this, it works fine and I'm using it to make my XY plot work >>101763171
https://new.reddit.com/r/StableDiffusion/comments/1ekgiw6/heres_a_hack_to_make_flux_better_at_prompt/
>>
>>101763272
what are you talking about, the silver dust one clearly says it's for schnell too
>>
>>101763293
It's the other way around, gay dev, silver schnell
>>
File: 1700875549057577.png (1.02 MB, 1024x1024)
vgh...
>>
>>101763293
>>101763305
Will a lora from Schnell work for Dev?
>>
File: pjzpe96z37hd1.jpg (1.69 MB, 4096x1664)
https://reddit.com/r/FluxAI/comments/1em5t0k/evokesneo/
how did he manage to get that style on flux?
>>
>>101763333
By lowering guidance and actually PROOMPTING? FFS anon just try doing it yourself instead of searching for "magic" tokens.
>>
>>101763293
ya ya sorry mixed em up, the silver is schnell, the gay is dev
>>
>>101763333
ask him? probably knew the artstyle and the artist who paints like that
>>
>>101763320
reminds me of my frustrations trying to get a swastika in sd1.5 and all it could ever do was incoherent checker patterns, we have come a long way
>>
File: 00000-3752555607.jpg (519 KB, 1814x1210)
>>
File: 1692764872050898.png (90 KB, 791x556)
>>
>>101763406
>reminds me of my frustrations trying to get a swastika in sd1.5 and all it could ever do was incoherent checker patterns, we have come a long way
Isn't there a 16ch VAE for 1.5? I would think something like that could actually make it happen
>>
>>101763424
you'd have to train the model to output a 16ch latent
>>
>>101763431
You couldn't use something like this https://github.com/city96/SD-Latent-Interposer ?
>>
File: file.png (743 KB, 818x635)
I'm actually really impressed with Fooocus out of the box with an incredibly vague prompt.
>>
>>101763448
think about it, anon
if 1.5 is outputting a bad swastika then that little converter model has to do the fixing
do you think that makes sense?
I think you're putting too much weight on the vae's influence over large patterns like the swastika would be
>>
File: ComfyUI_Flux_11.png (1.14 MB, 1344x768)
So is fp8 model MUCH worse at handling text than the fp16 version? I can't tell from my experience of switching between them because the full version often fucks up all the same. I tried playing around with guidance between 2.0 and 3.5 and sometimes it helps but usually not. Using fp16 clip only btw.
Also the sampler and scheduler from >>101762087 seem to improve prompt comprehension dramatically.
>>
>>101762610
optical character recognition
>>
>>101763474
>Using fp16 clip only btw
is there even a fp8 CLIP_L anywhere? it's tiny, why would there be. the clip loader doesn't have an option to load in fp8 either
>>
File: 1723017770.png (619 KB, 1000x1000)
>>
wait so why is no one talking about how flux is actually a 12+11B model since you need T5XXL?
>>
>>101763498
*bites you*
:3c
>>
File: FLUX_00130_.png (830 KB, 1024x768)
>>101763320
think big
>>
File: 1723018419.png (750 KB, 1000x1000)
>>
>>101763498
I'm surprised T5xxl is an 11B model, it's only using 9.3 GB of VRAM in fp16
>>
>>101763195
I pretty much wiped all of my SD 1.5 folder. If I want to recreate an image in Flux I still have my gens. As for XL I kept just 2 models.
>>
>>101763551
well idk the exact size but its big
>>
>>101763563
yeah you're right, it's an 11.3B model, but like I said it's only asking for 9.3 GB of VRAM in fp16
https://huggingface.co/google/flan-t5-xxl
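Back-of-envelope check, taking the 9.3 figure as gigabytes and assuming fp16 weights at 2 bytes per parameter. One possible explanation for the gap (an assumption, not confirmed in this thread) is that only the encoder part of T5 gets loaded for text conditioning.

```python
BYTES_PER_PARAM_FP16 = 2  # fp16 stores each weight in 2 bytes

def weights_gb(n_params, bytes_per_param=BYTES_PER_PARAM_FP16):
    """Size of the raw weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

def params_from_gb(gb, bytes_per_param=BYTES_PER_PARAM_FP16):
    """Parameter count implied by a given weight size in GB."""
    return gb * 1e9 / bytes_per_param

full_model_gb = weights_gb(11.3e9)   # what all 11.3B params would need
implied_params = params_from_gb(9.3) # what 9.3 GB actually corresponds to

print(f"full model in fp16: {full_model_gb:.1f} GB")
print(f"9.3 GB implies about {implied_params / 1e9:.2f}B params")
```

So 9.3 GB covers well under half of 11.3B parameters, which fits a partial load better than the full model.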
>>
File: 1723018804.png (734 KB, 1000x1000)
>>
File: ComfyUI_Flux_12.png (1.04 MB, 1344x768)
>>101763491
My bad, I meant fp16 t5xxl.
>>
>>101763509
pp activate... form of MONSTER ENERGY DRINK!
>>
File: FLUX_00135_.png (922 KB, 1024x768)
>>101763669
>>
>>101763680
the swastika got fucked
>>
>>101763683
it's symbolism
>>
>>101763245
tried em, the first one (the silver one) just doesn't load
the second one loads but produces dogshit body horror

but that was to be expected, the first loras for SDXL were shit too. now that they're appearing I guess we will see em flood into civitai all day long, and the day just started. will be exciting to see if something good appears

anyone got the silver one working? I am getting
>lora key not loaded: up_blocks.1.attentions.2.transformer_blocks.1.attn2.processor.to_v_lora.down.weight
>lora key not loaded: up_blocks.1.attentions.2.transformer_blocks.1.attn2.processor.to_v_lora.up.weight
>...
error msgs from comfy
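Those keys start with `up_blocks`, which is the naming diffusers uses for UNet-based models, so the file may be in a layout the loader doesn't map onto Flux (a guess from the key names only). A quick sketch to summarize which top-level prefixes the unloaded keys have; `top_level_prefixes` is a hypothetical helper, and the list holds only the two keys quoted above.

```python
from collections import Counter

def top_level_prefixes(keys):
    """Count unloaded keys by their first dotted component."""
    return Counter(key.split(".", 1)[0] for key in keys)

# The two keys from the error output above; paste the full list to see more.
unloaded = [
    "up_blocks.1.attentions.2.transformer_blocks.1.attn2.processor.to_v_lora.down.weight",
    "up_blocks.1.attentions.2.transformer_blocks.1.attn2.processor.to_v_lora.up.weight",
]

counts = top_level_prefixes(unloaded)
print(counts)
```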
>>
>>101763680
make them niggers now with scars on their backs
>>
File: 1713599218568223.png (70 KB, 871x684)
How can HF be giving away so much free compute and still be profitable (they confirmed that they are)?
>>
>>101763537
fuck you emad
>>
>>101763705
cause they already have the GPUs and electricity doesn't cost them much, paid spaces have priority on them, ZeroGPU just uses the ones not doing anything
>>
>>101763324
seems that way, the gay lora loaded on schnell and dev, the silver didn't load at all on either (broken comfy?)
>>
>https://github.com/huggingface/diffusers/releases/tag/v0.30.0
>>
>>101763734
I guess that makes sense, thanks.
>>
>>101763474
Did some experiments and concluded that fp8 t5xxl is similar for short prompts but worse for long ones when it comes to text rendering at least, compared to fp16
>>
>>101763333
He shared the prompt in the comments btw. As usual, just proompting:
>This abstract expressionist painting features three stylized human figures against a dark background, using a stark contrast of black, white, and orange. The composition is filled with geometric shapes, particularly circles and curved lines. The style evokes neo-expressionist art, reminiscent of Basquiat, with a mix of primitive and complex symbolic forms. The painting's texture appears rough, with scratch-like marks enhancing its expressive quality. It conveys a mysterious and unsettling mood through distorted faces, symbolic elements like eyes and spirals, and the intense contrast of colors. Overall, the artwork presents a visually striking and emotionally charged piece that invites interpretation and engagement from the viewer.
>>
File: 1704040656984229.png (1.58 MB, 1344x768)
>>101763821
Me and who?
>>
>>101763748
>Perturbed Attention Guidance (PAG)
>We already had community pipelines for PAG, but given its usefulness, we decided to make it a first-class citizen of the library.
Isn't PAG something that can make negative prompt possible even at CFG = 1? Will it work on flux?
https://mybyways.com/blog/testing-new-pag-and-perp-neg-nodes-in-comfyui
>Perturbed-Attention Guidance significantly enhances the sample quality of diffusion models without requiring external conditions, such as class labels or text prompts, or additional training. This proves particularly valuable in unconditional generation settings, where classifier-free guidance (CFG) is inapplicable.
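For context, the usual classifier-free guidance update, with $s$ the CFG scale, $c$ the prompt and $n$ the negative prompt (notation assumed here, not taken from the linked post):

```latex
\hat{\epsilon} = \epsilon_\theta(x_t, n) + s\,\bigl(\epsilon_\theta(x_t, c) - \epsilon_\theta(x_t, n)\bigr)
```

At $s = 1$ the two $\epsilon_\theta(x_t, n)$ terms cancel and $\hat{\epsilon} = \epsilon_\theta(x_t, c)$, so the negative prompt has no effect with plain CFG. Whether PAG fills that gap on flux isn't settled here.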
>>
>>101763821
I hate these LLM prompts and I hate that that's likely how the model was trained if they used generated captions.
>>
>>101763841
Why do you hate them? They actually let you represent a lot of details and relationships between the objects. They store much more information than just tags.
>>
Is REACTOR still the best face replacement technique?
>>
>>101763841
I don't mind boomer prompts, but I also wish the model was able to understand simple concepts as well, like "a painting of X"
>>
>>101763849
It's not the length or the detail, it's how they are written, things like it starting with "This x is y"
Oh and "Overall, the artwork presents a visually striking and emotionally charged piece that invites interpretation and engagement from the viewer." is just fluff.
>>
File: Capture.jpg (298 KB, 3562x1478)
>>101763837
>>101763748
C'mon they haven't made a FluxPAG
>>
>>101763869
>Oh and "Overall, the artwork presents a visually striking and emotionally charged piece that invites interpretation and engagement from the viewer." is just fluff.
Yeah, I know that this one is fluff, CogVLM etc don't usually add those things, it came from the DALL-E days
>>
>>101763875
>it came from the DALL-E days
I was there in the dall-e days and disagree with that
>>
File: 36c.jpg (238 KB, 1319x1313)
>>101761375
>>
>>101763889
Why do you disagree? Anon, when DE3 came out, ChatGPT generated prompts that looked exactly like this. It still does. I'm specifically talking about Dalle 3 prompt rewriting.
>>
>>101763841
>>101763869
>>101763875
It's not fluff. It's here to add semantic capacity during the training. Read the goddamn paper.
You don't have to use it by the way, if your prompt is sufficiently detailed.
>>
>>101763916
>Read the goddamn paper.
which one
>>
>>101763920
>which one
The one ChatGPT generated
>>
>>101763748
>Minimum 8bit quantisation is required for a 24G card to train this model
Are we back or what?
>>
>>101763957
Anon, this has been known for like 1.5 days now, yes, you can train a lora for Flux on a single 24GB VRAM card.
>>
File: 1700051058108636.png (1.5 MB, 1920x1080)
>>
>>101763969
what's with the squares?
>>
>>101763958
Thanks, haven't been browsing for the last few days
>>
>>101763977
This is an artifact of the Flux VAE, apparently they fucked it up. You can easily see it if you gen an image with 1 step
>>
>>101763983
I don't see anything of the sort with dev at 20 steps.
>>
>>101763990
it depends on the image, try to gen an image with 1 step and you'll see it
>>
>>101763994
but who does that
>>
>>101764011
Anon, are you messing with me? I just want you to do it so you can easily see the artifacting. It's an actual VAE issue.
>>
>>101764020
all kinds of issues happen at just 1 step
>>
>>101764026
So you're just messing with me, got it.
>>
>>101764028
I'm not, it wasn't trained for 1 step gen, the VAE having issues with 1 step latents is hardly "fucking it up"
>>
File: FLUX_00145_.png (1.05 MB, 1152x896)
>>
>>101763498
you run T5 on the CPU and it only takes a couple of seconds to process the prompt unless you're a total CPU/RAMlet
>>
File: file.png (1.57 MB, 1022x1021)
>>101761375
>>101763891
>>
>>101764062
Schnell gens in 1.3 sec total on Replicate, and you're saying that I should cope with having to waste 2 seconds just on the text encoding?
>>
File: 1695051819957907.png (2.09 MB, 1920x1080)
>>101763977
I put "bad anti-aliasing" in the prompt and it fucked up my images
>>
>>101764087
that fuck-up will also happen without it in the prompt. It simply occurs at higher resolutions
>>
File: 1706839501194370.png (4 KB, 268x67)
>update Forge for the first time since getting it a few months ago
>it fucking breaks
Bros any help? I get this when I click "Generate". My GPU is a GTX1660S and I have 16GB ram.


