/g/ - Technology


Thread archived.
You cannot reply anymore.




File: grid-0161.jpg (485 KB, 1536x2688)
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>101911066

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>
File: de_fl_00012_.jpg (98 KB, 1216x832)
>mfw
>>
Blessed thread of frenship
>>
>>101914526
bert and ernie time travel meme
>>
File: 1694215938786.png (12 KB, 411x137)
NAI v3 in 5 days
>>
File: ifx45.png (1.16 MB, 1024x1024)
>>
>>101914561
>>101914501
>>Maintain thread quality
>https://rentry.org/debo
>>
>>101914562
Is it local? If not I don't care.
>>
>>101914562
go back
>>
>>101914562
Wait what?
>>
>>101913200
>>101914478
a lora?
>>
>>101914596
It will be, in 5 days
>>
bakers who don't do last thread collages are spitting on the spirit of /ldg/
>>
File: ComfyUI_00626_.png (1.35 MB, 832x1216)
>>
>>101914615
the very first collage was one from a single anon :D but tru desu
>>
>>101914604
trooncord retards spamming fake shit in multiple threads
>>
>>101914612
nothing ever happens
>>
File: ComfyUI_00008_ (3).png (1.46 MB, 1248x1824)
>>
File: cafe.jpg (137 KB, 832x1216)
>>
>>101914663
Very neat. Prompt?
>>
File: ComfyUI_00012_.png (2.65 MB, 1248x1824)
>>101914725
https://files.catbox.moe/8jqhqh.png
>>
>>
File: ComfyUI_00084_.png (1.11 MB, 1216x832)
>>
File: flux_00014_.png (1.16 MB, 1160x896)
>>
File: FD_00001_.png (1.27 MB, 1024x1024)
>>
File: FD_00002_.png (1.32 MB, 1024x1024)
Prompt: Lyrics to Blue by Eiffel 65
>>
File: edgy.png (1.56 MB, 896x1152)
>>
File: ComfyUI_00104_.png (1.08 MB, 1216x832)
>>
>>101914806
>>101914849
kek
>>
>>101914849
this vfx is skyrim heal spells lole
>>
>>101914838
very nice
>>
Are there any new tips for improving Flux's art style knowledge yet? Beyond the old ones about lowering guidance and splitting the clip-l/t5 prompts.
>>
File: download (96).jpg (722 KB, 1536x2432)
>>
>>101915043
anons have been using this to recreate images, does an okay job with art styles.

https://huggingface.co/spaces/fancyfeast/joy-caption-pre-alpha

I've also had a good amount of success from pasting shit into OpenAi and asking for a prompt that would recreate the style.
>>
File: 2024-08-15_00004_.png (2.66 MB, 1280x1024)
>>101915043
Yeah, using dynamic thresholding.
>>
>>101915109
Workflow: https://files.catbox.moe/xay1ip.png
>>101915043
>>
>>101915043
It really needs IPAdapter.
>>
File: flux_00027_.png (1.11 MB, 1160x896)
>>
File: download (97).jpg (989 KB, 1536x2432)
>>
>>101915148
Isn't someone working on it? I saw it on HF but it was empty.
>>
I am actually surprised at how easy it is to train LoRAs with flux.
>>
File: 22.png (163 KB, 334x274)
which one of you sars did this?
>>
>>101915228
How did that shit get past 10 views?
>>
>>101915241
ai companies scraping youtube for training data
>>
>>101915192
I didn't see that but I hope so. Being able to just show the model an image in the art style you want should solve most of the problems. Since the model DOES know how to do them, it just doesn't know what they're called.
>>
>>101915269
This is the empty repository I saw, no clue who they are, so maybe they're not working on it, who knows.
https://huggingface.co/InstantX/FLUX.1-dev-IPadapter
>>
>>101915313
ngl, instant x's controlnets suck. I want xinsir to make one.
>>
>>101915329
SDXL finally has really good controlnets now but it took like a year
Hopefully Flux doesn't take that long
>>
>>101915336
The SDXL ones by xinsir are almost flawless, probably because of their massive dataset. But they mysteriously do not have an IP adapter.
>>
>>101915361
the existing ipadapterplus model for sdxl works good already so I assume xinsir didn't see any reason to bother
>>
flux was fun but now i see the flux-isms in every gen and it became boring. base slop style is far too strong and loras are a crutch.
>>
>>101915376
Not so good for Pony though. While the xinsir ones actually did an okay job with pony.
>>
File: 2024-08-16_00003_.png (1.45 MB, 768x1024)
>>101915138
Looks like shit.
>>
>>101915380
>and loras are a crutch.
I swear there's only one person who hates LoRAs and it's you. Are controlnets a crutch too?
>>
>>101915413
Every concept in the world should be inside the model by default. If I can't make Messi playing Blitzball in the style of Picasso without LoRAs it failed.
>>
>>101915431
Haha! Well memed, friend.
>>
>>101915361
almost 60k downloads for the (very good) xinsir tile model, nice
>>101915380
just gen flux for a bit and then switch to sdxl/derivatives, that is what I do.
>>
>>101915380
At least now with the lightweight flux versions you can make a workflow that loads SDXL+xinsir's controlnet afterwards to do a style transfer, without oom or massive model swap slowdown
>>
>>101915329
Because it is an alpha and just was found to have bugs. Look at the hugging face page right now.
https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Union-alpha
> Found some bugs, currently fixing them. Please do not download until the fixes are applied.
>>
I'm not gonna lie bros... these piss weak rank 16 LoRAs that barely tickle the output aren't doing much for me.
>>
>>101915487
Bugs or not, they're still not as good desu.
>>
File: 1692830893080519.jpg (131 KB, 767x1024)
good morning sirs, did the pedotroon finally get banned?
also when are we getting one of those "QR code" finetunes for Flux? i want to make heckin' funny images like picrel
>>
>aquiline nose
>hooked nose
>curved nose
>downturned nose
>jutting nose
>bulbous nose
>crooked nose
None of these work. I just want to generate a woman that looks middle-eastern and every character has an upturned button nose. It's ridiculous.
>>
File: 00252-AYAKON_12404472.png (3.72 MB, 1536x2560)
>>
>>101915491
Give it some time anon
I agree tho desu
>>
>>101915508
>did the pedotroon finally get banned?
He's been banned multiple times. The problem is he has a dynamic IP address so the moment he gets banned, he need only hit the reset button on his router and clear his cache to get back here.
>>
>>101915520
turn down the thing that's at 3.5 by default, can't think of the name.
>>
>>101915533
I think I can train at a higher rank if I limit the size to 512x512, I'll do that next if my output isn't good enough when this training is done.
>>
>>101915530
*jaw drops to floor, eyes pop out of sockets accompanied by trumpets, heart beats out of chest, awooga awooga sound effect, pulls chain on train whistle that has appeared next to head as steam blows out, slams fists on table, rattling any plates, bowls or silverware, whistles loudly, fireworks shoot from top of head, pants loudly as tongue hangs out of mouth, wipes comically large bead of sweat from forehead*
>>
>>101915520
This is my main issue with Flux atm. All the women look like generic Dreamshaper XL 1girls, if I see that goddamn butt chin one more time
>>
>>101915548
I have Flux guidance at 2 for positive, 2.5 for negative. That seems to play best with the style I want.
>>
>>101915241
obviously coomers and weebs
>>
>>101915530
>>101915553
>ponyslop
>>
>>101915413
the more shit you have to stack on to get good results the more of a crutch it is. controlnet is also garbage, same with character loras. the more of this nonsense you stack, the LESS the model can do. want to add a character lora? good job, now the rest of your subjects in the image have leaking details from said lora. want to rectify it with regional prompter? now you're stuck with boring separated left/right compositions. NOTHING is a substitute for base model knowledge. you can cope all you want about it but it's true. if loras were an actual substitute for base model knowledge nobody would've ever left 1.5, because you can simply train a lora for everything.
flux is a decent architecture but the model itself will be diluted into generic 1girl garbage with finetunes, with each passing finetune losing more and more of the base model's knowledge. the base model simply isn't creative enough to remain interesting and finetune datasets are simply way too narrow and underwhelming to improve it without massive tradeoffs

lora cannot inject creativity into a creatively dead model like flux. lora is not an upgrade. lora is not an improvement. lora is not an expansion pack. it's quite literally copium designed to hard-steer the model towards what you want at the cost of everything else. loras are fun to play with and can inject some crazy things no base model would ever understand, but lets not pretend a model's character knowledge starting and ending at mario and sonic is somehow a good thing.
>>
File: ComfyUI_00029_ (2).png (1.96 MB, 1248x1824)
>jeets using flux to spam synthetic looking 1girl slop instead of kino schizo gens
what a shame
>>
Man Forge has so many UI bugs right now and the developer is mostly focusing on working on backend. Even if I made issues I don't get the feeling they would get read since there are 498 open ones lol. Sucks because it is working the best for my tiny VRAM amount when generating for whatever reason.
>>
File: FD_00003_.png (765 KB, 1024x1024)
>>101915612
Why not both?
>>
>>101915611
Here we go with the unreasonable expectations for the model again.
LoRAs are a perfectly acceptable band aid for a model's shortcomings and controlnet is an absolute necessity for things like inpainting, upscaling and style transfer.
I don't want to hear about your theoretical omni model that knows all concepts and characters. It doesn't exist. Stop letting the perfect be the enemy of the good.
>>
File: ComfyUI_00008_ (5).png (2.16 MB, 1248x1824)
>>101915651
now thats what we need
>>
>>
>>101915611
Controlnets let you do shit you could never achieve otherwise tho. Like weird poses that you could never get via prompting because they are too unusual in the model training data. I dunno how that's not an upgrade. Same for Loras really, they are literally additional weights that let you generate new things unlike textual inversion that is just doing things the model already can do.
>>
File: Untitled.png (1.26 MB, 1241x609)
Training my first LoRA for Flux on a 3090.

Image on the left is the baseline and right is at 500 steps. The subject of the LoRA is a random Japanese woman who likes to post nudes a lot on twitter so it's got the asian part down, still not seeing the resemblance yet, but still, things are happening.
>>
>generate taylor swift pissing on my face with a base model
>go ahead ill wait
>>
>>101915611

We can all wish for a perfect local model, but it's not like we're getting any open source model that recognizes characters and celebrities along with being able to do nudes and all kinds of wacky concepts. No one wants any legal liability of any kind, even if it's arguable that there is any to begin with.
We're basically stuck with Loras for the foreseeable future.
I agree that the models get diluted every time they're messed with in any way, but so far Flux is a radical improvement in prompt understanding and adherence over anything we have had before.
Only thing we can do is use what's given to us and make the best of it with every aid that can be used.
It's a miracle we even got anything this year after the complete failure of SD3. If anything it'll just get worse from here as things go more corporate by the day.
It'll be like this until hardware is strong enough to train our own models from scratch locally and it'll be a long while until that day comes.
>>
File: ComfyUI_temp_mvfio_00014_.png (1.48 MB, 1024x1024)
It's too late you fool, she's already cast Child Support
>>
>>101915679
>The subject of the LoRA is a random Japanese woman who likes to post nudes a lot on twitter
>>
>>101915678
I wouldn't engage with this guy. His whole thing is making unreasonable demands of the base model and expecting it to understand everything and do everything through prompting alone. I can't tell if he's doing it on purpose or he's autistic, but at this point there's not much difference.
>>
>>101915695
Yes.
>>
>>101915694
Kek
>>
"reduce your desires and you will never yearn for a lora" - ancient chinese proverb
>>
>>101915712
"Why download a LoRA when you can train your own and have all the naked Japanese ladies you want?" -Me
>>
File: Capture.jpg (127 KB, 3162x1506)
Is it just me or does the "check update" button not work in ComfyUI Manager? There are definitely some nodes that need to be updated yet it gives me "no result"
>>
>>101915728
Comfy UI has been a little bit fucky lately with the new updates.
>>
>>101915542
every german internet user who isnt using telekom (so most of them) gets a new ip address assigned every time they switch off their router for a few seconds and we can see how reluctant jannies are to range-ban his ip address. maybe it's the same in other countries too, I don't know.
>>
>>101915712
>learn to live without want and you will never need
>>
File: 1697343101129557.jpg (240 KB, 1024x1024)
>>101915655
how is any of that unreasonable for a base model if dall-e can do it fine as a base model? it's not some ancient secret technology, it's simply a matter of not using shitty slopped datasets.
>>
>>101915740
The country I'm in has fixed IPs for all users unless you specifically ask for a dynamic one. When I get banned I have to sit out for like 3 days.
I'm still salty over the one time I posted a spooky forest monster and got banned because it had a "penis". It wasn't a penis.
>>
>>101915679
>Japanese woman with an upturned nose
REEEEEEEEE!
But anyway, it looks like you're getting somewhere.
>>
>>101915550
Do post progress imgs
>>
Is this shit incapable of putting someone further away in the background? It's like they always have to be within 5 meters of the camera.
I've tried a plethora of prompts and it keeps ignoring them and putting my main character right up front despite me asking literally the opposite.
>>
>>101915750
>Dalle knows these characters, so all models must know all characters
Now make the brown one face sit on the white one.
You don't have enough time on this earth for someone to curate and train the dataset for the perfect omni model you're asking for. You do have the time and can easily muster up the dataset for something more specific.

You know what I'm saying is true. There are practical limitations to what you can train a model in a reasonable time.
>>
>>101915761

75 more steps and I'll post the next sample image.
>>
>>101915750
because OpenAI has people being paid money to curate datasets correctly and you bitch here for free?
>>
Remember that? >>101908547
He reverted his commit because he noticed the quality decreased, as I suspected:
https://github.com/city96/ComfyUI-GGUF/commit/d4c75a69c9a7777fae7e26ee945435133a878320
Now we're back to Q8_0(old) and I think he made the right choice, that one is definitely closer to fp16: https://imgsli.com/Mjg4MDMz/0/1
>>
File: ComfyUI_00540_.png (2.03 MB, 1152x1536)
>>
File: 0.jpg (280 KB, 1024x1024)
>>
>>101915789
are there any characters flux knows that dall-e doesn't?
>>
Where the fuck does kohya_ss install models? I tried a couple of its captioning options, they suck and now I have like 30gb of blip models and I don't know where they are
>>
>>101915756
interesting. all I gotta do is switch off my router. also, fucking ancient gens, sorry
>>
>>101915818
I don't know, I haven't meticulously tested every character in a prompt yet
>>
>>101915787
Yeah, it needs a lot of work with that. I've had some luck mitigating it with something like "This is a full-body portrait, and the ground beneath the subject is visible in the foreground. The environment is the focus of the image, with the subject taking up only a small portion." Most of the time I can get the full body barely in the frame, but getting them further out than that is very rare.
>>
>>101915400
>While the xinsir ones actually did an okay job with pony.

For the realistic or 2D/2.5D Pony models? The realistic Pony models don't seem to work well with anything, not even loras created for them.
>>
>>101915679

Here we are at 759 steps. I just hope the Ponyfied images I slipped into the dataset where she has massive breasts and a giant cock don't ruin the output in unpredictable ways.
>>
>>101915872
I'm pretty sure the realistic pony models are literally just LoRA merges on top of the original pony. It would make sense that they're fucked beyond belief. No actual training went into making those. Unless something changed since last time I looked at realistic pony models.
>>
File: 0.jpg (272 KB, 1024x1024)
>>
>>101915889
>No actual training went into making those
this is 99% of checkpoints on civitai. you can count on one hand the amount of actual relevant finetunes with training put into them. you can even go onto the flux category and see for yourself. it's why actual 'realistic' porn on stablediffusion has always looked disgusting and deep-fried. it was always just stacking and merging loras on top of anime finetunes.
>>
File: flux2.png (1.01 MB, 1024x1024)
>>
>>
File: flux5.png (863 KB, 1024x1024)
>>
File: flux3.png (922 KB, 1024x1024)
>>
File: 0.jpg (277 KB, 1024x1024)
>>
>>
File: 19240467.jpg (21 KB, 460x460)
>>
File: flux6.png (901 KB, 1024x1024)
>>
File: 00000.png (3.97 MB, 2560x1440)
Flux is pretty good at this futuristic Frutiger Aero style design.
>>
File: egypt.png (1.46 MB, 1024x1024)
>>101915683
is there any combination of models and loras that could do that? asking for my friends' uncle
>>
>>
File: download (98).jpg (998 KB, 1536x2432)
>>
>>101915987
sick
>>
File: ComfyUI_00014_.png (829 KB, 832x1216)
>>
>>101915987
But can it do Messi playing blitzball in that style?
>>
https://civitai.com/models/645943/fluxunchained-artful-nsfw-capable-fluxd-tuned-model-by-socalguitarist
Is this the very first finetune of flux?
>>
File: flux7.png (969 KB, 1024x1024)
>>
>>101915983
Nice. Now it could use a subject in there, like Miku, Trump, or some other well-liked character.
>>
>>101916028
Huh, the outputs look pretty good.
>>
>>101916028
Damn. Are we back? And do loras made with the base work with that as well?
>>
>>101916028
>What a time to be alive! I created this model by block merging my low weight LoRA trainings over multiple passes (very similar to how I created my SDXL series models) to the base flux.d model.
just another jeetmix like every single flux 'finetune' so far
>>
>>101916056
Of course, it's just a finetune, loras work as well on a finetune as on the base model
>>
>>101916058
>I created this model by block merging my low weight LoRA trainings over multiple passes
Oh...
>>
>>101916058
Oh so it's not a real finetune then, what a bummer
>>
>>101916047
Maybe mario, princess peach, or ummm some anime girl that looks vaguely like a touhou character? at least loras will fix it!
>>
File: 0.jpg (265 KB, 1024x1024)
>>
Speaking of fine tunes... what happened to the guy who said he had figured out 24gb card finetunes for flux?
Did he get silenced by NVIDIA?
>>
>>101915984
>cocktail waitress at the luxor hotel and casino takes a shit on her break
>>
File: flux8.png (1.27 MB, 1024x1024)
>>
>>101916092
He was full of shit, what did you expect? We can barely run a fp16 on a 24gb card and you think we could finetune the same fp16 with it? kek
>>
>>101916103
>b-but if we just load half the model and then do some magic voodoo shit.
>>
>>
File: flux9.png (1.04 MB, 1024x1024)
Too bad the handlebar fucked up
>>
File: ComfyUI_00012_ (2).png (2.29 MB, 1248x1824)
>>
File: file.png (3.94 MB, 1394x1080)
Anyone know what's causing these grid patterns?
I'm using that 1970s illustration by Ralph Bakshi prompt (slightly edited) with 20 steps, distilled cfg 3.5, normal cfg 1, 1920x1080 res on flux dev q4
didnt have the issue with schnell
>>
>>
File: flux10.png (1.56 MB, 1024x1024)
>>
>>101916092
you 'can' finetune at any amount of memory, that doesn't mean you should. this was a massive issue with ponydiffusion loras, with people underbaking on shit settings just so they could have something to post on civitai. not everyone deserves to train. the internet would be better off with zero finetunes than 500 half-baked int4 quantized garbage ones. there is only so much that can be done before someone does the sneedful and raids nvidia hq to secure us the hardware that's needed. you can only cope so much with 24gb.
>>
>>101916214
you're not using karras scheduler right?
karras is weird with Flux, it needs a ton of steps to converge and you get fucked up shit below at least 50 steps
>>
File: Capture.jpg (40 KB, 670x752)
>>101916214
>distilled cfg 3.5
does Forge have a negative distilled cfg like on ComfyUI?
>>
>>101916237
im using the beta scheduler which worked earlier when i tried it... maybe i should just restart the program to see if it persists
>>101916244
no idea im new to forge and diffusion in general
>>
>>101916235
>someone does the sneedful and raids nvidia hq to secure us the hardware

Surely I'm not the only one who's fantasized about stealing a pallet full of enterprise gpus
>>
File: ComfyUI_00013_ (3).png (1.75 MB, 1248x1824)
>>
File: flux11.png (1.18 MB, 1024x1024)
>>
File: download (99).jpg (998 KB, 1536x2432)
>>101916025
No.
>>
>>101916224
>>101916302
oh no, it's the attack of the eleven foot tourist who needs to go back to wherever the fuck she came from!
>>
>>101916279
yeah beta should work fine, not sure what's up then
>>
File: ComfyUI_00015_.png (3.12 MB, 1248x1824)
>>
Bros, what is stopping us from starting an AI hentai studio and getting rich together?
>>
>>101916214
Care to show a screen of forge, that has the picture and all the settings (cfg, distilled guidance...) in it?
>>
>>101916327
That market is saturated already.
>>
>>101916327
It's called Anlatan
>>
>>101916333
They said the same about porn and then onlyfans came, bro. It's all about marketing. There's a lot of hentai there, but it's for the japanese market, at least in terms of income for the studios. We can directly target western coomers.
>>
>civitai now allows on-site flux finetuning
Oh great, cant wait for 90% of loras to be literal dogshit
>>
>>101916370
You'll be able to do real finetunes and not just loras right?
>>
>>101916097
you're not far away from the truth
>>
File: ComfyUI_13372_.png (1.12 MB, 1024x1024)
>>101915811
Yeah, sorry about that. The format is still changing. It's also GGUF so expect to requantize 5-10 times a year kek.
Also pushed this so it shouldn't fucking OOM every 12 seconds anymore while taking 32GBs of system ram:
https://github.com/city96/ComfyUI-GGUF/commit/de5f6e9dcc7aac4af6ef8e06ddacbecc6549672a
Now to somehow figure out LoRAs.
>>
File: download (100).jpg (220 KB, 1024x1024)
>>101916361
>>
File: flux12.png (1.12 MB, 1024x1024)
>>101916316
rude
>>
>>101916370
please saar redeem the emma watson lora
>>
>>101916414
>Yeah, sorry about that. The format is still changing. It's also GGUF so expect to requantize 5-10 times a year kek.
Yeah I'm a /lmg/ fag so I know exactly what you mean by that kek

>Now to somehow figure out LoRAs.
you could "inspire" yourself from forge's loras' implementation on GGUF :^)

I hope you'll also implement the _K quants, Q8_0 is really close to fp16 but not quite, I suspect that going for Q8_K will be the final nail in the coffin, that one will probably be "lossless"
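For context on why Q8_0 lands so close to fp16: as implemented in llama.cpp, Q8_0 splits the weights into blocks of 32 and stores each block as int8 values plus one scale. Below is a minimal sketch of that round trip. It assumes a plain Python float for the scale (the real format stores it as fp16) and uses random numbers standing in for a model tensor. The function names are made up for illustration.

```python
import random

BLOCK = 32  # Q8_0 groups weights into blocks of 32 values


def quantize_q8_0(values):
    """Quantize a flat list of floats into (scale, int8 list) blocks."""
    assert len(values) % BLOCK == 0, "length must be a multiple of the block size"
    blocks = []
    for i in range(0, len(values), BLOCK):
        chunk = values[i:i + BLOCK]
        amax = max(abs(v) for v in chunk)
        # One scale per block, chosen so the largest value maps to +/-127.
        # Simplification: kept as a Python float here, fp16 in the real format.
        scale = amax / 127.0 if amax > 0 else 0.0
        quants = [round(v / scale) if scale else 0 for v in chunk]
        blocks.append((scale, quants))
    return blocks


def dequantize_q8_0(blocks):
    """Rebuild approximate floats from (scale, int8 list) blocks."""
    return [scale * q for scale, quants in blocks for q in quants]


random.seed(0)
# Stand-in data: small Gaussian values, not taken from any real model.
weights = [random.gauss(0.0, 0.02) for _ in range(4 * BLOCK)]
restored = dequantize_q8_0(quantize_q8_0(weights))

max_err = max(abs(a - b) for a, b in zip(weights, restored))
max_abs = max(abs(w) for w in weights)
print(f"max abs error: {max_err:.2e} (largest weight: {max_abs:.2e})")
```

The worst case per weight is half a quantization step, i.e. the block's largest magnitude divided by 254, which is why the output ends up close to fp16 but not identical.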
>>
File: ComfyUI.png (258 KB, 1904x1630)
ComfyUI is now more comfy.
>>
File: flux13.png (1.55 MB, 1024x1024)
>>
>>101916383
Just LoRAs
>>
>>101916446
Trained on 15 youtube screencaps
>>
>>101916447
>you could "inspire" yourself from forge's loras' implementation on GGUF :^)
I think he dequantizes, applies the LoRA, then requantizes it. I'd much rather store the LoRA in full precision and apply it to the dequantized weights. I think that should be both easier and faster, at the cost of using more VRAM (as much as the LoRA weights are, realistically) or being slower (assuming I just leave it on CPU and pull them in one by one since the weights are tiny).
>>101916447
>_K quants
I have put my best associates on the job, so far one of them managed to create a 46GB FP32 gguf before giving up
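The second approach described above (leave the quantized weights untouched, keep the LoRA in full precision, and add it onto the dequantized weights at use time) can be sketched roughly like this. It is not the ComfyUI-GGUF or Forge code. The per-row int8 quantizer is a toy stand-in for a real GGUF quant type, and every name and number here is hypothetical.

```python
def quantize_rows(w):
    """Toy per-row int8 quantization, standing in for a real GGUF quant type."""
    out = []
    for row in w:
        amax = max(abs(v) for v in row)
        scale = amax / 127.0 if amax else 0.0
        out.append((scale, [round(v / scale) if scale else 0 for v in row]))
    return out


def dequantize_rows(qw):
    """Turn (scale, int8 list) rows back into approximate float rows."""
    return [[scale * q for q in quants] for scale, quants in qw]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_linear(x, qw, lora_down, lora_up, strength=1.0):
    """Compute (dequant(W) + strength * up @ down) @ x without requantizing W."""
    w = dequantize_rows(qw)            # (out, in), approximate base weights
    delta = matmul(lora_up, lora_down)  # (out, r) @ (r, in) -> (out, in)
    patched = [[wv + strength * dv for wv, dv in zip(w_row, d_row)]
               for w_row, d_row in zip(w, delta)]
    return [sum(wv * xv for wv, xv in zip(row, x)) for row in patched]


# Made-up example values: a 2x3 weight matrix and a rank-1 LoRA.
W = [[0.5, -1.0, 0.25], [2.0, 0.0, -0.5]]
qW = quantize_rows(W)                  # stored once, never modified
lora_down = [[0.1, 0.2, 0.3]]          # (r=1, in=3), full precision
lora_up = [[1.0], [-2.0]]              # (out=2, r=1), full precision
x = [1.0, 2.0, 3.0]

base = lora_linear(x, qW, lora_down, lora_up, strength=0.0)
patched = lora_linear(x, qW, lora_down, lora_up, strength=1.0)
print("base:   ", base)
print("patched:", patched)
```

The tradeoff matches the post: the LoRA matrices have to live somewhere in full precision, but the base weights only ever go through quantization once, so no extra rounding error is stacked on top.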
>>
>>
>>101916458
Should I learn to use this? It seems difficult for me.
>>
>>101916501
>I think that should be both easier and faster, at the cost of using more VRAM (as much as the LoRA weights are, realistically
yeah, that's the good method, it's the same one used on LLMs (even though loras are possible on the llm ecosystem no one uses them kek)
>>
>>101916493
not mumbai enough. needs to be trained on 1000 ponyrealism outputs downloaded indiscriminately off civitai
>>
>>101916319
>>101916329
did some more testing and i think it was just my 3070 screaming for help cause it couldnt handle such outrageous resolutions as 1920x1080 :/
the problem disappears as soon as i lower it a bit
so in the end it was >>101916292
>>
>>101916458
Is this an addon or just how it is now?
>>
>>101916420
I will be!
>>
File: 00007-714976943.png (1.01 MB, 896x1152)
>>
File: 00012-2019140598.png (1.25 MB, 896x1152)
>>
Watching these promptlets be promptlets in real time is painful
https://www.youtube.com/watch?v=sUxh1J8oAZQ
>>
>>101916466
nice one anon, what prompt?
>>
File: ComfyUI_00019_.png (537 KB, 1216x832)
>>
File: ComfyUI_00002_ (5).png (2.12 MB, 1248x1824)
>>
File: Capture.jpg (84 KB, 1085x1281)
>>101916501
I see that you implemented the "_1" variants like Q4_1, could you also implement a Q8_1? For those with 24gb they don't mind if the model is a bit bigger than Q8_0, there's plenty of room to spare (10 gb of vram left)
https://huggingface.co/kcoopermiller/aya-101-GGUF
>>
>>
File: flux14.png (942 KB, 1024x1024)
>>101916730
>A 1985 japanese magazine scan of an advertising featuring a nerdy 18yo young japanese woman smiling with a dimple, wearing round glasses in a coastal environment. The image is a vintage analog photograph featuring film grain, and is slightly worn. The woman is a cute japanese idol, and has a 1980s style straight hair, and she is sitting on the top of a skyscraper while holding an ice cream, looking attentively at the cars below her, passing by in vice city ocean drive. The image has a nostalgic 80s japan feel, with a vintage retro look, with chromatic aberration, film grain, and slight blur. Present in the background are the ocean, palmtrees, cars on the road, and tall skyscrapers in art deco style and pastel colors.
>>
>>101916821
I absolutely love that style, thanks a lot for the prompt anon
>>
best/simplest workflow for transforming a reference image into a lengthy (but accurate!) boomer prompt that i can feed back into Flux to start working from?
>>
Should we add this to the OP? It's very useful.
https://huggingface.co/spaces/fancyfeast/joy-caption-pre-alpha
>>
File: flux15.png (1014 KB, 1024x1024)
>>101916838
You're welcome. If you can, generate some with Dev (mine so far were with Schnell because I was too lazy to change models lol)
>>
File: ComfyUI_00007_.png (562 KB, 1216x832)
>>
>>101916865
>>101916871
>>
>>101916871
is it local yet?
>>
>>101916887
it is
>>
File: 1715559633480899.png (985 KB, 1024x1024)
>>
>>101916871
Is there any option to supply a prompt along with the image to tell the model what to focus on in its caption? E.g. "focus on describing the art style rather than the content"?
>>
>>
>>
>>101916383
There is no way in hell I am uploading my dataset to civitAI
>>
>>101916800
I don't think that's a valid format for actual quants
https://github.com/ggerganov/llama.cpp/blob/2a24c8caa6d10a7263ca317fa7cb64f0edc72aae/gguf-py/gguf/constants.py#L1197-L1233
>>
>>101916919
No, what I often do with it though is take the caption it generates then take that caption to an LLM and tell it to embellish it, or change the subject of it.
>>
File: Capture.jpg (52 KB, 600x1376)
>>101916964
It's there though?
>>
>>101914501
anyone using special hardware? like IBM has dedicated AI chips but everyone does stuff on NVIDIA gamer GPUs
>>
>>
>>101915542
I love how you typed that shit out like you think it's some new and novel technology far beyond anything you've encountered before.

Like I'm watchin fuckin CSI and policehackerman says "It's no good boss.. he CHANGED HIS IP!" and then there's a slow zoomin and a musical sting and then cut to commercial
>>
>>101916977
Yeah for quantization type, but you need to set the file type as well in the metadata when saving, and it doesn't have a value assigned for that.
>>
>>101916097
Can't post shit here or else you'll get banned.
>>
>>101916995
Oh ok, oh well, there's still Q8_K then right?
>>
>>101917008
Yes, K quants are still planned.
>>
>>101917003
if you put a catbox link it would be ok though, people do that to show some NSFW outputs
>>
>>101916370
Where can I download your loras?
>>
>>101916992
>I love how you typed that shit out like you think it's some new and novel technology far beyond anything you've encountered before.

I'm sorry, how was I supposed to type that in a way that didn't offend you? He literally changes his IP by resetting the router. There is no other way to say it.
>>
>>101917016
thanks, legend o/
>>
>>101917030
>gets upset that he's new to computers and ip addresses
KEK
get even more pissy that you think you're being gangstalked by le ebul ip changer lmao
>>
>>101917068
I think we found the pedoposter you guys
>>
>>101917072
>HELP HELP I'M BEING GANGSTALKED
>WHAT'S AN IP ADDRESS?
>>
File: Capture.jpg (174 KB, 1387x1519)
>>101916964
>>101917016
Uh... I think Q8_K is also not on that list, so it's ova? :(
>>
>>101917085
meds
>>
>>101917068
How many calories you burned getting up to reset your router this week?
>>
>>101917055
kek'ed
>>
>>101917098
>HELP HELP I'M BEING GANGSTALKED
kek
all i have to do is observe one schizo retard being retarded and it sets him off and derails a thread
>>
>>101916893
>Access to model meta-llama/Meta-Llama-3.1-8B is restricted and you are not in the authorized list. Visit https://huggingface.co/meta-llama/Meta-Llama-3.1-8B to ask for access.

really?
>>
File: 00016-665355279.png (1.2 MB, 896x1152)
>>
>>101917055
nice, did you prompt for the eyes being askew or was that the model's idea
>>
>>101917143
>This image is a photorealistic scene depicting a large, blue anthropomorphic cookie monster. It holds a tobacco pipe made of glass, with a cookie inside the bulb in its right hand, the steam clearly visible. In its left hand, the cookie monster holds a lighter with “Cookie Dough” written in clear black letters.
>>
File: 1698758464367.png (100 KB, 1517x578)
>>
>>101917151
damn that's wild, so it did the eyes on its own as if it recognized the intent of the scene (smoking the cookies as a metaphor for getting high)
>>
File: 00022-732215967.png (1.01 MB, 896x1152)
>>
File: 00024-1237099497.png (1.24 MB, 896x1152)
>>
>>101917171
I like this
>>
File: 00027-1146659331.png (1.36 MB, 896x1152)
>>
File: 00028-1502491003.png (739 KB, 896x1152)
>>
File: RetroXL0006.jpg (186 KB, 1320x1320)
>>
>>101917209
oh my
>>
File: 00031-1962172314.png (1.37 MB, 896x1152)
>>
https://reddit.com/r/StableDiffusion/comments/1estj69/remove_the_blur_on_photos_with_tonemap_an/
After running some more serious tests (X = GuidanceNeg | Y = tonemap_multiplier): https://files.catbox.moe/97if24.png

The sweet spot seems to be CFG 6 + tonemap_multiplier 0.21 + GuidanceNeg 5: https://imgsli.com/Mjg4MTkz
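For anyone wanting to repeat this kind of X/Y test, here is a minimal sketch of enumerating the grid cells. The swept values are hypothetical (only CFG 6, tonemap_multiplier 0.21 and GuidanceNeg 5 come from the post), and the dictionary keys are illustrative names, not actual node inputs.

```python
from itertools import product


def xy_grid(x_values, y_values, fixed):
    """Yield one settings dict per cell of an X/Y comparison grid.

    X is GuidanceNeg and Y is tonemap_multiplier, matching the axes in the
    post. Key names are assumptions made for this sketch.
    """
    for y, x in product(y_values, x_values):
        yield {**fixed, "guidance_neg": x, "tonemap_multiplier": y}


# Hypothetical sweep ranges centred on the reported sweet spot.
cells = list(xy_grid([3, 5, 7], [0.15, 0.21, 0.27], {"cfg": 6}))
for cell in cells:
    print(cell)
```

Each dict is one render to queue with a fixed seed, so only the two swept values differ between cells.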
>>
File: 00046-4053438203.jpg (204 KB, 1728x1344)
>>
>>101917229
wow.. i want to be there
>>
>>101917229
long woman is long
>>
File: 00038-2130975747.png (1.51 MB, 896x1152)
>>
>>101917170
That's just how Cookie Monster's eyes are
>>
File: 00002-3400989322.jpg (188 KB, 1728x1344)
>>
>>101917287
any amateur cookie monsterologist could tell you that
>>
lmao i can't believe how good the captioning is, using https://github.com/StartHua/Comfyui_CXH_joy_caption/tree/main
>>
>>101916430
>Twin Peaks the Return
>>
File: FD_00066_.png (486 KB, 768x1216)
>>
>>101917303
Can it caption porn images?
>>
>>101917126
>really?
yes >>101917303
>>
File: 1699078871057.png (350 KB, 898x486)
>>101916458
I dislike comfyui and I don't understand the people who use it. What is wrong with automatic1111? what are all these weirdass lines going everywhere, why do you need that

how the fuck does this mean anything to anyone
>>
>>101917318
Yes
>>
File: 1697275263799626.png (20 KB, 493x285)
>>101917323
really?
>>
>>101917318
It can, but for SFW stuff I still recommend GPT4V, it's still the best caption model
>>
File: ComfyUI_00055_.png (773 KB, 1216x832)
>>
>>101917318
example caption (hopefully not ban...)

This is a close-up, highly detailed photograph of a human vulva, taken from a medical perspective. The image shows the labia majora and labia minora, with the clitoral hood partially covering the clitoris. The skin texture is smooth and slightly wrinkled, with natural folds and variations in pigmentation. The labia are pinkish in color, with the inner labia being slightly darker and more moist, indicating natural lubrication. The surrounding skin is a light tan, suggesting Caucasian ethnicity. The person's hands, with neatly trimmed nails, are gently spreading the labia to reveal the inner structures more clearly. The photograph is sharply focused, capturing fine details such as skin texture and natural variations. The lighting is even, ensuring that no shadows obscure the anatomical features. There are no other objects or background elements visible, emphasizing the anatomical focus of the image. The photograph is clinical in nature, likely intended for educational or medical purposes, providing an unfiltered view of the vulva for medical understanding and awareness.
>>
File: 00006-2295642715.jpg (202 KB, 1728x1344)
kek at the expression
>>
>>101917335
they're retarded, you need to remove the chink shit at the end of the link
https://huggingface.co/unsloth/Meta-Llama-3.1-8B-bnb-4bit
>>
File: ComfyUI_00078_.png (2 MB, 1248x1824)
Reminder that all "photorealistic" ai images made with flux look like synthetic dogshit and you should not redeem them
>>
>>101917331
Purple lines are model, yellow is clip loader, orange is prompt output, red is vae, blue is image, pink is latent, greens are primitive strings.
>>
>>101917365
Please try something more pornographic, if it always says it's "medical" it's borderline useless.
>>
>>101917377
why do you need that shit
>>
>>101917371
oh, it's me who's a retard as well

sick thanks dude
>>
>>101917390
To create custom workflows, lots of shit is simply impossible to do in WebUI
>>
>>101917391
it's not your fault, it's them kek, and you're welcome, have fun with it o/
>>
>>101917402
i highly doubt that
>>
File: 00007-921263325.jpg (220 KB, 1728x1344)
>>
>>101917410
I don't really care what you doubt. It's just true.
>>
File: 1719131219330.png (625 KB, 1493x594)
spooked
>>
>>101917424
I doubt that it's true. I think you're just not good at programming and need colored lines everywhere for reasons that have not been adequately explored.
>>
>>101917425
the problem is it missed the entire context of this image, it should be classified as a screenshot of an analog recording from the late 80s, early 90s, the style of the image is way more important than the contents, this caption really fails the test
>>
>>101917438
I really don't care what you doubt. If you don't like comfy don't use it, but I gave you a valid explanation as to why people do.
>>
File: ComfyUI_00056_.png (786 KB, 1216x832)
>>
File: 00012-136076223.jpg (213 KB, 1728x1344)
>>
File: FD_00071_.png (590 KB, 768x1216)
>>
>>101917468
I hope more effort is spent in the future on capturing the purpose of an image, basically the value, what makes it interesting, especially its inherent style and technical execution. A lot of time is spent on the who and what without any of the why. Also they're terrified of captioning skin color and ethnicity... still. Should be standard to have the race and sex.
>>
File: 00047-2017429721.png (1.16 MB, 896x1152)
>>101917494
yus do want
>>
File: 00050-3496364762.png (1.42 MB, 896x1152)
>>
File: ComfyUI_00552_.png (2.39 MB, 1536x1152)
>>
>>101917483
You didn't give a valid explanation. You made an untrue statement.
>>
File: 00052-1783300751.png (1.3 MB, 896x1152)
>>
>>101917618
>I have no capacity for imagination therefore what you said is untrue
>>
>>101917603
Looks like Laura Bailey
>>
It always takes ~4 months for actually good finetunes of a new base model to start dropping, that's how long it took with SD1.5 and SDXL so I expect it'll take that long for Flux too

I hate the wait though
>>
>>101917641
i was thinking Ana De Armas, but yeah i guess a bit of both
>>
DAE play fallout new vegas while generating pics?
>>
File: 1706621354524.png (1.82 MB, 1497x1246)
>>101917468
i gave it one last chance, rip
>>
File: 1701097806911244.png (1015 KB, 1024x1024)
>>
>>101917675
No
>>
>>101917718
kek. I've been playing zero resource browser games like Universal Paperclips
>>
File: 00023-136076223.jpg (232 KB, 1344x1728)
>>
>>101917468
>>101917686
You can just ask an LLM to alter the description.
>>
>>101917734
that's a fun game
>>
File: file.png (2.31 MB, 1280x720)
>>
File: ComfyUI_01561_.png (871 KB, 1024x1024)
LoRA training went fine after about 2000 steps for anyone that cared.
There was a lot of AI slop in the dataset, so the skin got smoothed out and I think it fucks up the hands more as a result, but overall the LoRA looks like the subject and text works. I did this with no tagging.
>>
>>101917803
>There was a lot of AI slop in the dataset,
Why do you do that in the first place? There's billions of real life images ready to be used
>>
>>101917803
Did you use kohya or something else?
>>
File: 00055-605945291.png (1.01 MB, 896x1152)
>>
How are you guys using joy caption locally and getting around the Hugging Face repo access gate for Meta-Llama-3.1-8B?
>>
>>101917822
There are not billions of images of the subject. So I had a few images in different poses that I made from a previous LoRA.
>>
>>101917378
switched to using this version, still trying to figure out the best prompt for it: https://huggingface.co/Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2/tree/main
>>
>>101917839
AI-toolkit on a 3090, took a few hours but overall not painful at all to set up and get running.
>>
>>101917843
just use this or any other fine tune, put it all in a folder, change the path to point to that folder locally
https://huggingface.co/unsloth/Meta-Llama-3.1-8B-bnb-4bit/tree/main
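A minimal sketch of the "point it at a local folder" idea, assuming the files from the repo above were downloaded into one directory. The helper name and the example folder are hypothetical; the only thing checked is that the folder contains a config.json, which is what a Hugging Face style model directory normally has.

```python
from pathlib import Path


def resolve_model_path(local_dir: str, fallback_repo: str) -> str:
    """Return local_dir if it looks like a downloaded model folder,
    otherwise fall back to the repo id.

    Hypothetical helper for illustration, not part of the caption node.
    """
    folder = Path(local_dir).expanduser()
    if (folder / "config.json").is_file():
        return str(folder)
    return fallback_repo


# "models/llama-3.1-8b-bnb-4bit" is an assumed location, adjust to your setup.
model_path = resolve_model_path(
    "models/llama-3.1-8b-bnb-4bit",
    "unsloth/Meta-Llama-3.1-8B-bnb-4bit",
)
print(model_path)
```

The resulting string is what you would put wherever the node currently has the gated repo id; loaders such as transformers' `from_pretrained` accept either a repo id or a local directory.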
>>
>>101917854
Thanks fren.
>>
>>101917851
>>101917839
Lemme link to the repo
https://github.com/ostris/ai-toolkit

I think Kohya will be coming out with fine tunes and LoRAs soon
>>
>>101917846
A lora is already an approximation of a full training, and AI pictures are also an approximation of real life pictures. No wonder so many loras suck ass, you are so fucking lazy
>>
>>101917869
not sure where finetune sits but Kohya has lora training working already, the branch is called 'SD3' but has the flux lora network .py
>>
It's bread time...
>>101917856
>>101917856
>>101917856
>>
>>101917851
>>101917869
Ah, I'll check it out. I've been trying Kohya: it completes training and spits out sample images that kind of look like my subject, but loading the lora into comfy doesn't seem to do anything
>>
alright so I think Joy Caption is a bit fucked, or I'm a retard.

I've been changing the prompt to stuff like "tell me a story about this image" but I think something about what it is doing is baked into the wpkklhc6 image_adapter.pt
>>
I badmouthed the comfyui "PixelArt Detector" custom nodes yesterday, i wish to half retract my rudeness. The core node (called "PixelArt Detector (Image->)") does in fact silently automatically detect if the input seems to map to a lower resolution and downscales it appropriately, with k-centroid as the algorithm. the all-in-one converting/saving node that only outputs webp and jpeg still seems like nonsense, but at least the important functionality is cleanly accessible. Example >>101917518 this was a 4x downscale/re-upscale. It loses certain fine detail from the original like text shapes, the implied short/spiky hair, and the line patterns on the shoulder pads, which sort of implies the AI was mistaken in trying to put such detail into an image of that pseudo "resolution" to begin with, but in return it fixes up the various subpixel fades, fried bits of the controls that look like jpeg artifacts, etc. I guess it's more appropriate for really low res pixel art like in console games rather than what anon posted which is i guess a little bit post-pc98 era. also works better the better a job the model you're using does of actually creating pixel art style output, e.g. suggesting shapes through few pixels rather than trying to draw detail explicitly.
>>
>>101917906
If the samples are spitting out correct it's probably a comfy issue. Did you do a before after on the same seed to check?
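One way to make the before/after check less eyeball-dependent, as a sketch: render the same seed with and without the LoRA and compare pixel values. The function below is a made-up helper that works on any two equal-length sequences of 0-255 values, for example the bytes from Pillow's `Image.open(path).convert("L").tobytes()`. The 1.0 threshold is an arbitrary assumption, and some setups are not perfectly deterministic, so treat a tiny difference as "probably not applied" rather than proof.

```python
def mean_abs_diff(pixels_a, pixels_b) -> float:
    """Mean absolute per-pixel difference between two same-size images.

    Inputs are flat sequences of ints (bytes objects work too).
    """
    if len(pixels_a) != len(pixels_b):
        raise ValueError("images must have the same size")
    if len(pixels_a) == 0:
        return 0.0
    total = sum(abs(a - b) for a, b in zip(pixels_a, pixels_b))
    return total / len(pixels_a)


def lora_seems_applied(pixels_a, pixels_b, threshold: float = 1.0) -> bool:
    """Heuristic: same seed, noticeably different pixels, LoRA did something."""
    return mean_abs_diff(pixels_a, pixels_b) > threshold


# Toy data standing in for two renders of the same seed.
without_lora = bytes([10, 20, 30, 40])
with_lora = bytes([12, 25, 30, 60])
print(mean_abs_diff(without_lora, with_lora))
```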


