/g/ - Technology

File: Flux_00213_.png (1.34 MB, 1024x1024)
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>101758249

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Kolors
https://gokaygokay-kolors.hf.space
Nodes: https://github.com/kijai/ComfyUI-KwaiKolorsWrapper

>AuraFlow
https://fal.ai/models/fal-ai/aura-flow
https://huggingface.co/fal/AuraFlows

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>
File: delux_flebo_00001_.png (1.36 MB, 1216x832)
>mfw
>>
File: file.png (624 KB, 1024x1024)
>>
>>101761274
>debo is hijacking /ldg/
it's over
>>
File: FD_00247_.png (1.4 MB, 1024x1024)
>>
im losing faith in bigma. they either sold it all to nvidia or the model simply won't compete with flux thanks to chinese using shitty datasets. has flux raising the bar effectively killed all other local attempts? i heard auraflow already dropped out realizing it would be too expensive to develop anything at flux's level.
>>
>>101761268
>7090

Hopefully Nvidia will finally decide to give it more than 24GB VRAM.
>>
File: file.png (797 KB, 1024x1024)
>>
File: FD_00258_.png (1.06 MB, 1024x1024)
>>101761335
>>101761205
>>
File: ComfyUI_00936_.png (1020 KB, 1288x848)
>>
File: ComfyUI_00784_.png (1.49 MB, 1024x1024)
>>
>>101761357
Bigma just needs to be the biggest possible while also finetuneable on 24 GB VRAM without compromises. Flux likely won't see real work for a year or more.
>>
>>101761357
>i heard auraflow already dropped out realizing it would be too expensive to develop anything at flux's level.
yeah, flux set the bar so high that every new model that isn't at flux level will be discarded, imo that's a good thing, it will make model creators work harder
>>
File: spread positive vibes.png (1.31 MB, 1024x1024)
>>101761371
It still doesn't have that aura. It's getting better though!
>>
>>101761380
I'm still holding my breath to see the practicality of finetuning this thing.
>>
File: FD_00029_.png (1.49 MB, 1024x1024)
>>101761375
Very cool
>>
>>101761394
What we need is a 4B or 6B model which will be the sweet spot.
>>
File: FD_00244_.png (458 KB, 512x512)
>>101761392
Just takes some prooompting, it doesn't have the aggressive aesthetic prompt altering that DallE does.
But I have already moved on to other gens
>>
File: ComfyUI_00785_.png (1.07 MB, 1024x1024)
>>
File: ComfyUI_00939_.png (1.17 MB, 1288x848)
>>
>>101761406
unfortunately, the quality won't be as good as the 12b one, if flux got so much hype it's because it's a great model, and it's great because it's big, I could be wrong though and someone will make a 4b model in flux quality, but I'm not convinced yet
>>
>>101761372
holy shit yakuza, prompt?
>>
>>101761417
I'm of the mind that everything flux does so well is basically a result of its large parameter size. This is probably the size we should get used to working with in the future because it seems to be the point where hands stop getting fucked up.
>>
>>101761427
Gameplay footage screen shot of Yakuza 0.
featuring Donald Trump in a white jacket and pants with red shirt combo. He is fighting Yakuza thugs on the sidewalks of Osaka. Trump can be seen flying through the air delivering a kick to one thugs face and shattering his glasses. There is a logo on the bottom right of the image which reads "Like a Dragon: 7"
The subtitles read "That's the Donald of Dojima, GET HIM."
>>
>>101761446
ty
>>
File: FD_00262_.png (1.39 MB, 1024x1024)
>>101761417
>>101761433
It's for sure because of the params. We will see people dropping prunes soon, and we can compare just how much we lose.
>>
I think ipadapter for Flux would solve any style issues without the need to finetune or use loras
ipadapter in style transfer mode using an example image with the style you want generally works extremely well (and with the node in style transfer mode the composition of the example image is usually correctly ignored)
>>
File: ComfyUI_00788_.png (1.43 MB, 1024x1024)
>>
>>101761380
Flux is just ... okay. I'm not sold on AI art as a general idea until someone is brave enough to make a model that can effortlessly create a naked woman who is holding up 10 fingers and staring directly at the camera with both eyes focused and pointing in the same direction. And if you think being able to draw a naked chick is only useful for porn, you're playing yourself on levels that shouldn't be possible.
>>
>>101761461
>We will see people dropping prunes soon
Oh yeah I totally forgot about that, it's true that there are "pruned" versions of the SD models, I wonder how well it will work with flux, probably better because DiT models are more resilient to pruned/quantized shit than unet
>>
>>101761476
>if you think being able to draw a naked chick is only useful for porn, you're playing yourself on levels that shouldn't be possible.

elaborate
>>
>>101761476
no no I don't disagree with you here, but flux can already do convincing naked women, the anatomy is there, we just need to add better and consistent nipples and genitals and we'll be eating good, it'll be way easier to add those concepts compared to base SDXL, which was a disaster on that topic
>>
>>101761490
misogynymaxxing is the endgame of all silicon
>>
File: ComfyUI_00790_.png (1.34 MB, 1024x1024)
>>
File: ComfyUI_00944_.png (1.03 MB, 1288x848)
>>
File: Flux_00276_.png (1.31 MB, 1024x1024)
>>
>>101761433
nta but I'm of the opposite. I think Flux could fit it in 8b with the theoretical world's perfect dataset. I mean, do you think if SD3 had a 12b version that it would somehow suddenly be Flux's level and not a completely broken shitheap? Data is what SAI decided to completely nuke with their safety team and data is what allowed Flux to finally do more than 1girl standing. Even better data should allow smaller models to do more with less
>>
>>101761523
kek
>>
File: ComfyUI_00794_.png (1.51 MB, 1024x1024)
>>
>>101761527
>I mean, do you think if SD3 had a 12b version that it would somehow suddenly be Flux's level and not a completely broken shitheap?
absolutely not, the guys who made Flux have way more freedom over the training than they had at SAI, that's precisely why they decided to leave SAI in the first place.
https://xcancel.com/EMostaque/status/1819777262071226611#m
>fwiw I don’t think bfl team could have built/released Flux when at Stability (contrast aha) for… reasons
>>
>>101761546
What happened to SAI that caused their talent to leak out like that? It can't have been a money issue. So what part of the company was so rotten from within that everyone of merit decided to leave and how did it begin?
>>
File: ComfyUI_00793_.png (1.56 MB, 1024x1024)
>>
>>101761565
>What happened to SAI that caused their talent to leak out like that?
the """safety""" shit, I would be depressed aswell if I was working hard on a model just to be told to destroy it with some cucking DPO by some retarded higher ups managers who doesn't know 3rd grade math calculus
>>
File: FD_00271_.png (1.35 MB, 1024x1024)
>>101761468
>ipadapter
>>
>>101761468
>I think ipadapter for Flux would solve any style issues without the need to finetune or use loras
does ipadapter work on anime characters? I used InstantID and that one only worked on real humans
>>
File: Flux_00233_.png (1.42 MB, 1024x1024)
>>101761268
woah my pic made it into the OP feels good man
>>
>>101761497
Yeah, it's definitely not far off in the human physiology department. I think someone is bound to make a full NSFW spin just due to popular demand...
>>
>>101761497
Speaking of. Here's a workflow to draw them back on.
https://files.catbox.moe/h8avmw.png
>>
File: Flux_00234_.png (1.34 MB, 1024x1024)
>>101761616
>>
>>101761629
you use an SDXL refiner to do that?
>>
File: Flux_00241_.png (1.42 MB, 1024x1024)
>>101761632
>>
>>101761612
sounds like you're thinking of faceid ipadapter which is a more specialized thing, I meant regular ipadapter which is general purpose and designed to work with any kind of images rather than just faces
>>
File: Flux_00242_.png (1.34 MB, 1024x1024)
>>101761645
>>
File: Flux_21_.png (1.35 MB, 1368x768)
is this the limit of 24gb VRAM?
I just wish Flux had better anime styles (soon-tm)
>>
File: Flux_00239_.png (1.31 MB, 1024x1024)
>>101761660
>>
>>101761629
my autistic workflow is growing
>>
File: end_gen~5.jpg (303 KB, 1448x1280)
>>
>simpletuner flux lora training doesn't work with >1 GPU when using quantization
>dev says "oops haha i just assumed it would work i haven't actually tested it yet"
I swear I'm the only person that has >1 3090 and actually uses them for training things. Nothing ever fucking works right with multi GPU, in basically every training script I've tried.
>>
File: 1721997851162164.png (1 MB, 1024x1024)
>>
>>101761741
kek'ed hard
>>
File: FD_00277_.png (870 KB, 1024x1024)
>>101761664
It knows Junji Ito, what more do you need?
>>
>>101761756
it doesn't know Studio Ghibli, I cri everitim
>>
>>101761639
Yeah. Well I was using 1.5 but goblin guy made it better.
>>101761705
Team work makes the dream work. Have you made the nippler more autistic?
>>
>>101761741
prompt? that's cool it can do animal crossing
>>
>>101761741
jfc lol
>>
>>101761733
that's because no one needed multi gpu until now, it'll be improved now that flux exists I think
>>
File: ComfyUI_00792_.png (1.53 MB, 1024x1024)
>>
File: Flux_00260_.png (1.19 MB, 1024x1024)
>>
File: ComfyUI_00803_.png (1.55 MB, 1024x1024)
>>
>>101761733
are you the cunny trainer?
>>
File: FD_00279_.png (1.19 MB, 1024x1024)
>>101761741
>>
File: ComfyUI_00802_.png (1.44 MB, 1024x1024)
>>
>>101761830
No, but I do have my own NSFW dataset of 1000+ images that I want to try training. But I have a 4x3090 machine, I'm not even gonna attempt it with just one GPU and suffer 1/4 the training speed.
>>
File: Flux_00263_.png (977 KB, 1024x1024)
>>
>>101761770
not really, it works quite well in my workflow right now. But I added OCR for text detection and another pass through pony lol.
at this point it might be better to gen first in flux, then do OCR and then nipping or a pass through pony
https://files.catbox.moe/uar5kp.png
>>
>>101761863
What do you do for a living that lets you buy and maintain a 4 4090 machine? Or do you just live beyond your means for gooning?
>>
>>101761883
3090 moran
>>
>>101761786
Gameplay footage screen shot of animal crossing, donald trump talking to his mexican mouse neighbor wearing a sombrero, top down view, animal crossing dialogue box saying "You have to go back!" with the name "Trump" as the speaker, trump smiles smugly while pointing and the mouse looks shocked with his mouth agape
>>
File: ComfyUI_00806_.png (1.72 MB, 1024x1024)
>>
>>101761756
Every single cunny artist on pixiv known to man
>>
>>101761890
That makes sense since you can't NVlink 4090s
What PSU?
>>
>>101761895
>prompt is literally just a description of the image
I wonder when people will stop asking for prompts
>>
File: ComfyUI_00814_.png (1.6 MB, 1024x1024)
>>
>>101761932
vocabularlylets and ESL
>>
>>101761937
that's what claude is for
>>
File: Flux_00304_.png (1.21 MB, 1024x1024)
>>
File: Flux_00254_.png (1.75 MB, 1024x1024)
>>
>>101761883
I work at a big tech company and make a little over 200k / yr. And it's used 3090s, not even all that expensive (compared to a lot of other hobbies).
>>101761913
I don't even nvlink, you can't do it 4-way and it doesn't matter for basic distributed data parallel training. Machine uses two 1600W PSUs.
>>
>>>/aco/8431464
does anyone know what anon means?
>>
File: ComfyUI_00818_.png (1.53 MB, 1024x1024)
>>101762000
it means being anonymous
>>
>>101762000
I wonder if the safety stuff really is only as deep as the T5 model and it actually is a lot more NSFW than we expected.
>>
File: Flux_00267_.png (1.46 MB, 1024x1024)
>>101761986
sounds like a cool setup.
how do you mount all these FAT af GPUs?
like what kinda case does any of that shit fit in?
>>
File: Flux_00307_.png (1.06 MB, 1024x1024)
>>
>>101762031
make him a minion
>>
File: ComfyUI_00816_.png (1.6 MB, 1024x1024)
>>
File: ComfyUI_00822_.png (1.42 MB, 1024x1024)
>>
>>101762024
Crypto mining frame with PCIE risers. This is by far the least hassle way to do it.
>>
File: Capture.jpg (174 KB, 3830x1300)
How do you save a XYZ plot into a giant picture that has all the grids?
>>
Are there any tricks with doing img2img with Flux? It seems like the results are very binary. Either there's a minor alteration to the original image or it's a completely new image.
>>
File: Flux_00268_.png (1.42 MB, 1024x1024)
>>101762061
what kinda sound they make at full power?
must sound like a 747 on takeoff lmao
>>
File: ComfyUI_00823_.png (1.58 MB, 1024x1024)
>>
File: 2024-08-07_07-35.png (97 KB, 814x630)
>>101762068
it seems to just werk for me with this sampler and scheduler setup
>>
File: ComfyUI_00965_.png (998 KB, 1288x848)
>>
>>101762094
It's me frfr
>>
File: Flux_00309_.png (1.37 MB, 1024x1024)
>>101762037
good idea
>>
File: ComfyUI_00966_.png (1.06 MB, 1288x848)
>>
File: Flux_00271_.png (1.3 MB, 1024x1024)
>>
Can anon please share a prompt for the elmo pictures of past threads? I love that style that would fool me into thinking it's not an AI picture.
>>
>>101762087
Thanks, I'll try that.
>>
>>101762081
When the fans spin up while training full blast it's reasonably loud. Worse is that it pumps out a lot of heat. But this is all okay because the machine sits in a different room. I just SSH into it from my desktop and use NFS to have remote access to the whole filesystem.
>>
File: ComfyUI_00932_.png (1.09 MB, 1288x848)
>>101762181

https://files.catbox.moe/snqbe1.png
>>
File: 1717621704673324.png (1.1 MB, 1400x1024)
>>
File: 1702518976375515.jpg (1.05 MB, 2752x2002)
Running flux schnell on Ipad Pro M4, 100 secs per gen at 4 steps
Apple won
>>
Hi, so I'm curious if there's any actual difference between nvidia and amd in SD (except access to cuDNN)
>>
File: ComfyUI_00827_.png (1.47 MB, 1024x1024)
>>
File: ComfyUI_00815_.png (1.81 MB, 1024x1024)
>>
File: Flux_00312_.png (991 KB, 1024x1024)
well this is kinda uncanny
>>
>>101762250
that's creepy
>>
File: ComfyUI_00809_.png (1.36 MB, 1024x1024)
>>
>>101762250
Prompt please. I wanna try that.
>>
>>101762232
If there's any difference, I don't think it would be significant enough to matter since the underlying algorithms are the same.
>>
File: Flux_00313_.png (1.07 MB, 1024x1024)
>>101762280
>House doorbell security cam footage photo. Fisheye lens effect creates rounded distortion. It is night time and the colors are monochrome with a night-vision effect. At the door, an extremely muscular ripped minion from Despicable Me is standing outdoors. The minion has yellow skin, one big eye, and sparse black hairs. The minion wears a skintight black shirt emblazoned with white text that reads "NEVER GOON". The minion is smoking a cigar and wields an AR-15 in his arms as he looks at the camera. In the background of the man, there is a very dark empty city street. Overall, the image conveys a sense of absurdity and humor mixed with intimidation.
>>
File: Flux_00280_.png (1.39 MB, 1024x1024)
>>101762204
sounds comfy, are your 3090s all from the same brand?
>>
>>101762232
What you save at the checkout with AMD, you pay ten times over in frustration at home.
>>
File: ComfyUI_00975_.png (1002 KB, 1288x848)
>>
>>101762211
Got it, thanks! :D
>>
File: ComfyUI_00835_.png (1.18 MB, 1024x1024)
>>
File: Flux_00283_.png (1.08 MB, 1024x1024)
>>
>>101762302
Nope, 2 ASUS 1 EVGA 1 Gigabyte. It doesn't matter if it's different brands. I mean maybe nvlinking different brands does some weird shit (probably not, I bet it works) but I don't do that.
>>
>>101761822
I want to make flux stop tarting up faces.
>>
File: ComfyUI_00836_.png (1 MB, 1024x1024)
>>
File: ComfyUI_00979_.png (879 KB, 1288x848)
>>
File: 1704150803373177.png (1.29 MB, 1920x1080)
war has changed...
>>
>>101762296
Lmao. Excellent.
>>
>>101762384
>guy behind her
don't open that door!
>>
File: ComfyUI_00981_.png (886 KB, 1288x848)
>>101762389
>>
Well, I really wanted to know if Flux could draw a proper humanized Miku, but "teenage girl" fails the other way...
>>
File: Flux_00288_.png (1.14 MB, 1024x1024)
>>
>>101762426
Nice cookie monster elmo hybrid
>>
The only thing that matters for me in these models is how well it can generate anime girls getting brutally raped, nothing else
>>
>>101762296
>>
We need a federal gooner police.
>>
File: Flux_00289_.png (1.19 MB, 1024x1024)
>>
File: ComfyUI_00983_.png (898 KB, 1288x848)
>>
File: FLUX_00013_.png (1.1 MB, 1152x896)
>>101762453
>>
File: 1705088386914019.png (1.23 MB, 1920x1080)
>>
can Flux modify images? it isn't respecting an encoded input.
>>
File: Flux_00316_.png (1.3 MB, 1024x1024)
>>101762461
>>101762472
gonna goon to these
>>
>>101762296
>>101762393
Also, cool prompt hack is to just type: "Overall, the image conveys [thing] mixed with [something else]" if you want to generate something just based on vibes. Generative AI is pretty good at reasoning backwards from summary statements like that.

> Overall, the image conveys a sense of humor and surrealism mixed with palpable dread.
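
If you want to bolt that closing sentence onto prompts in bulk, a tiny helper does it. This is only a sketch: the function names `vibe_suffix` and `with_vibe` are made up for illustration, and nothing here is specific to Flux or any UI.

```python
def vibe_suffix(thing: str, something_else: str) -> str:
    """Build the summary sentence from the template in the post above."""
    return f"Overall, the image conveys {thing} mixed with {something_else}."


def with_vibe(prompt: str, thing: str, something_else: str) -> str:
    # Append the summary sentence to an existing scene description.
    return f"{prompt.rstrip()} {vibe_suffix(thing, something_else)}"


print(with_vibe(
    "House doorbell security cam footage photo.",
    "a sense of humor and surrealism",
    "palpable dread",
))
```

Swap the two arguments for whatever mood you're after and keep the scene description as short or as long as you like.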
>>
>>101762503
> Overall, the image conveys a sense of fear and feminine seduction.
>>
File: ComfyUI_00985_.png (1.85 MB, 1288x848)
>>
File: Flux_00317_.png (1.03 MB, 1024x1024)
>>
File: 2024-08-07_00064_.png (1.77 MB, 1024x1024)
>>
>>101762523
>>
File: t1.jpg (626 KB, 2150x3226)
>>
>>101761878
>OCR
what is ocr?
>>
File: Flux_00291_.png (1.16 MB, 1024x1024)
>I'm already prompting for 16 hours or so
>>
>>101762024
This is the first time I have seen a coherent guitar from a local model.
>>
>>101762303
Do you have a few examples of problems i could encounter?
>>
>>101762617
No
>>
>>101762617
Yes
>>
Alright bros, so why does it slow down on ComfyUI? I'm asking for the second time. On Windows 11, I have over 22GB of VRAM free on my 3090, but for some reason Flux slows down over time, from 1.26-1.4 sec/it to 10 sec/it. I do cancel my server every now and then because I do things, not sure if that influences it, but I also may load upscale models and that triggers it. Once it happens, the only way to fix is to restart my computer, otherwise rely on luck when I restart ComfyUI.
>>
>>101762617
Maybe
>>
File: 1721214075330507.png (1.07 MB, 1920x1080)
>>
File: Flux_00300_.png (932 KB, 1024x1024)
>just one more gen and I go to sleep
>>
>>101762636
just load it in fp8 and you won't have any issues.
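
Back-of-the-envelope math for why fp8 might help, taking the 12b parameter count mentioned earlier in the thread. It's weights only, so it ignores the text encoders, VAE and activations, and it's a guess (not a diagnosis) that the slowdown comes from weights getting pushed out of VRAM once something else loads.

```python
# Rough weight-only memory estimate; ignores text encoders, VAE and activations.
BYTES_PER_PARAM = {"fp16/bf16": 2, "fp8": 1}


def weight_gib(params_billion: float, dtype: str) -> float:
    """Size of the weights alone in GiB for a given parameter count and dtype."""
    return params_billion * 1e9 * BYTES_PER_PARAM[dtype] / 2**30


for dtype in BYTES_PER_PARAM:
    print(f"12B weights in {dtype}: {weight_gib(12, dtype):.1f} GiB")
```

At 16 bits the weights alone come to roughly 22 GiB, which leaves almost no headroom on a 24 GB card, while fp8 halves that.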
>>
File: ComfyUI_temp_eihaf_00083_.png (749 KB, 1024x1024)
>>101762126
>>101762094
>>
>>101762691
NGL, if it weren't for the hair and the fatass attached to them, those breasts looking pretty squeezable.
>>
File: Flux_00303_.png (928 KB, 1024x1024)
>>
File: bad.png (1.33 MB, 1024x1024)
>>101762531
Okay, what I want is a Captioner where I send a picture and then I get text that would reproduce a similar picture if I used it as a prompt for Flux. I tried your picture here:
https://huggingface.co/spaces/gokaygokay/FLUX.1-dev-with-Captioner
Which seems to be state of the art? And I got this:
>The image is a page from a video game. It is divided into two sections.
>On the left side of the page, there is a black and white illustration of a group of three characters standing in a grassy field. The character on the left is a man wearing a white t-shirt and black pants, and the character in the middle is a woman with long blue hair and a black dress. She is standing in front of a large rock formation.
>In the center of the image, there are two other characters, one wearing a black suit and the other wearing a blue dress. They are standing next to each other and appear to be posing for the photo. The background is a dark and eerie landscape with a castle in the distance. The text on the page is in Japanese and translates to "Mecis" in English.
Which gives me picrel.
...
Ugh, so I can use online captioners and image generators for free but what's the point if I can't even recreate some pictures? How far away is this dream?
>>
File: 2024-08-07_00077_.png (1.66 MB, 1280x720)
>>
File: Flux_00319_.png (978 KB, 1024x1024)
finally managed the cartoonish minion look but using negative prompting that slows shit down by 3x
>>
>>101762717
That caption is horrible tbhdesu, even claude 3 haiku/gpt-4o mini describe images better
>>
File: FD_00291_.png (1.15 MB, 1024x1024)
>>
File: 2024-08-07_00078_.png (1.61 MB, 1280x720)
>>
>>101762717
this is what gpt-4o gave me. It didn't name miku directly but you can fix it. It also hallucinated the "LOOK: Enemy Berserk", and put the Miku on the left page above the two dudes:
>The image is a two-page spread from what appears to be a magazine or a guidebook. The left page features a mix of text and images. At the top left, there is a heading labeled "Meclis" followed by a column of text in Japanese. Below the text, there is an image of a character with long turquoise twin-tails, wearing a black and turquoise outfit with a short skirt and thigh-high boots. To the right of this, there are two more images: one shows a menacing skull with glowing red eyes labeled "LOOK: Enemy Berserk," and below it, a screenshot from a game featuring two characters, one bald man in a white shirt and another person in black attire, with status bars above them. The right page displays a large, detailed image of the same turquoise twin-tailed character from the back, standing in a grassy, rocky environment with a status bar at the top labeled in Japanese. The character's shadow is prominently cast on the ground, and the setting appears to be a dimly lit outdoor area at night.
>>
>>101762800
And this is 3.5 Sonnet, actually better:
>The image appears to be a two-page spread from a Japanese video game magazine or guide book. On the left page, there's a large block of Japanese text under the heading "Meclis". In the top right corner of this page is a small image of a menacing skull-like creature with glowing red eyes. Below the text are two character models: a bald man in a white t-shirt and a woman in a black outfit. The right page is dominated by a 3D render of an anime-style character with long teal twin-tails, wearing a black and teal outfit reminiscent of Hatsune Miku. This character is shown in a dynamic pose against a grassy background. Above this image is what looks like a game interface element showing a health or progress bar. The bottom of the left page contains smaller images of the teal-haired character from different angles. Throughout the spread, there are various UI elements and text boxes that appear to be from the game itself, giving it the look of a strategy guide or character showcase.
>>
File: 2024-08-07_00082_.png (1.47 MB, 1280x720)
>>
File: Oh.png (1.04 MB, 1024x1024)
>>101762755
Here's what I got from this one:
https://huggingface.co/spaces/thobuiq/GPT-4o
>The image is a two-page spread from an anime magazine. The left page features a large illustration of a character named "Meclis". The character is depicted with long, blue hair and is wearing a black dress. The right page provides more details about the character. It includes a smaller illustration of the character, a description of their personality, and a list of their abilities. The text is in Japanese, and the character's name is written as "メクリス". The overall layout of the pages suggests a focus on the character and their role in the anime series.
Hmm, well, at least the pic is good. Looks nothing like the original, but it's impressive.
>>
You will never be an artist, even an 8 year old can generate infinite AI slop on his tablet.
>>
>>101762842
Me on the right
>>
>>101762856
That's not GPT-4o though, but some random open weight model in a HF space named "GPT-4o". Both GPT-4o and Claude 3.5 Sonnet are commercial models, there might be some HF demos that give you free access to them, but I doubt it.
>>
>>101762717
https://huggingface.co/spaces/opencompass/open_vlm_leaderboard
>>
File: 1705284146428081.png (47 KB, 950x343)
>>101762856
Yeah, it's using mixtral and whatever this model is, the autistic space creator just named it "Open GPT-4o" for whatever reason
>>
>>101762866
>You will never be an artist
Thanks for the whitepill
>>
File: FD_00297_.png (1.68 MB, 1024x1024)
>>
File: 1694499319905534.png (2.04 MB, 1920x1080)
>>
>>101762894
this but unironically
>>
>>101762894
you can though
just not pathetic caveman shit like a woman choking on cock
>>
>>101762087
>>101762202
Found out it just requires additional steps if using schnell.
>>
File: FD_00298_.png (1.49 MB, 1024x1024)
>>101762866
And neither will you, because even an 8 year old can generate infinite AI slop on his tablet.
>>
>>101762910
catbox now
>>
File: 2024-08-07_00090_.png (1.44 MB, 1280x720)
>>101762868
kek
>>
>>101763009
Flux makes terrible asses
>>
>>101762994
For porn? Scroll up and open the goblina link
>>
>>101762994
I think you misread that post by missing the word "not" or something
>>
so why does lowering the guidance make the model better at doing paintings and reduce the slopped look?

like what's going on architecturally with the model that causes that to work
>>
>>101763022
that's inpainting and hardly porn
>>101763040
I did not, thank you. Don't talk to me again, retard.
>>
File: 1716844133588489.png (1.05 MB, 1024x1024)
>>
>>101763050
Because it follows your specific prompt less and has more "creativity"?
>>
>>101763050
>so why does lowering the guidance make the model better at doing paintings and reduce the slopped look?
the model has a bias on style, decreasing the guidance decreases this bias, but it also decreases the prompt following ability unfortunately
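
For intuition only, here's what the guidance knob does in textbook classifier-free guidance. As far as I know Flux dev is guidance-distilled and takes the guidance value as an input rather than running this formula, so treat it as an analogy and not a description of Flux internals. The function name and the toy numbers are made up.

```python
def cfg_combine(uncond, cond, scale):
    """Textbook CFG: move the prediction away from the unconditional one."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0]  # toy "no prompt" prediction
cond = [1.0, 3.0]    # toy "with prompt" prediction

print(cfg_combine(uncond, cond, 1.0))  # scale 1 just returns the conditional prediction
print(cfg_combine(uncond, cond, 3.5))  # higher scale exaggerates the difference
```

The higher the scale, the harder the output gets pushed toward whatever the model associates with the prompt, which fits the observation that lowering it relaxes both the style bias and the prompt adherence. In real CFG the two predictions also mean two model evaluations per step.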


