/g/ - Technology




File: tmp.jpg (696 KB, 3264x3264)
696 KB
696 KB JPG
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>102101429

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Flux
https://huggingface.co/spaces/black-forest-labs/FLUX.1-schnell
https://comfyanonymous.github.io/ComfyUI_examples/flux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>GPU performance
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Maintain thread quality
https://rentry.org/debo

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/c/kdg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/u/udg
>>>/trash/sdg
>>
Might be the worst collage yet
>>
File: flux0313.jpg (2.57 MB, 2528x2000)
2.57 MB
2.57 MB JPG
>>
So what GPU is worth buying for Flux Dev?
>>
>>102106706
4090 or 3090
>>
>>102106694
Be the change you wish to see
>>
>>102106706
Any Nvidia GPU with 24gb (or more) of vram
>>
>>102106694
you need to make better gens to beat these. Also how dare you, one of those is mine.
>>
>>102106717
That's 2000$

>>102106710
Aren't used 3090s breaking fast?
>>
>>102106726
I hope for your sake it's the bottom left
>>
>>102106736
Buy from a reputable used GPU seller. Or buy new.
>>
File: 00033-2957713204.png (1.56 MB, 832x1216)
1.56 MB
1.56 MB PNG
>>102106737
Take a guess
>>
alright we cookin that urushihara floox lora now that clip_l is supported
>>
>>102106764
The blurry 1girl then
>>
File: ComfyUI_02148_.png (1.98 MB, 1488x1024)
1.98 MB
1.98 MB PNG
>>102106690
That's a very strong indicator that he's here too.
>>
I don't get devs that spend a shit ton of time making new features and then don't document any of it with a basic workflow. "I added this, this is what it does, this is how you use it". Simple.
>>
>>102106766
Crazy, I just got started on my clip L LoRA using Kohya's latest push. I'll report back and see if it can finally recognize novel tokens.
>>
Blessed thread of frenship
>>
File: 2024-08-27_00135_.jpg (1.13 MB, 3840x2160)
1.13 MB
1.13 MB JPG
>>102106115
may I suggest realSR_BSRGAN_DFOWMFC_s64w8_SwinIR-L_x4_GAN then?

pic related
>>
>>102106803
Most of these people are autists that do it for their own pleasure and the documents are written in their wake.
>>
>>102106815
No, they clearly do it so people will use it. OneTrainer, AI Toolkit, etc, they want people to use their trainers, it's definitely not for their own pleasure entirely.
>>
>>102106827
99% sure Kohya just wants to generate anime girls.
>>
I just want to report back that the amateur photography LoRA does work really well for making varied and realistic people, and it plays reasonably nicely with my style LoRA. Thanks, anon.
>>
File: image.png (472 KB, 640x1024)
472 KB
472 KB PNG
>>
File: 00045-3356989398.png (1.6 MB, 896x1152)
1.6 MB
1.6 MB PNG
>>102106772
That's mostly because it got cropped I guess.
>>
>>102106813
yeah I got that one. flux is easy mode anyways, its so fantastic with the resampling you gotta try real hard to make the result look bad.
>>102106856
some of the loras are really nice for "unique" faces but sadly quality degradation is very real
>>102106799
FURKAN PROLAPSE PIC CATBOX or GTFO
>>
>>102106764
Why does the bust have lipstick too?
>>
>>102106934
Didn't you know the statues had color paint. It's just missing the rest.
>>
File: 00039-327398409.png (1.09 MB, 832x1216)
1.09 MB
1.09 MB PNG
>>102106934
just a bit of vandalism
>>
File: ComfyUI_01514_.png (1.06 MB, 768x1360)
1.06 MB
1.06 MB PNG
>>
File: 00040-327398410.png (1.14 MB, 832x1216)
1.14 MB
1.14 MB PNG
>>
File: 00041-327398411.png (1.1 MB, 832x1216)
1.1 MB
1.1 MB PNG
>>102106963
Nice hip shape with a thigh gap
>>
The worst thing that happened to /ldg/ was becoming more popular than /sdg/ and by virtue becoming /sdg/
>>
>>102107012
This is the kind of comment you would see in sdg, please provide images.
>>
>>102106694
my classy unique 1girls gets ignored like always. one day..
>>
>>102107034
I think the collage should be decided by an algorithm of (You)s on the post.
The collage maker's personal tastes are reflected in the collages too often; the collage maker must relinquish control of the collage's makeup to an unbiased method of choosing the images.
>>
File: 00034-2957713205.png (1.26 MB, 832x1216)
1.26 MB
1.26 MB PNG
>>102107034
post image, and tell OP to put it in the next one
>>
>>102107064
but some images have loads of yous and are obviously bad, like some crappy meme or something

I think whoever makes the OP can choose whatever they like. That's the basic simple rule.
>>
This text-to-something slop is a slippery slope and a great way for an AI to turn humans into dopamine addicted labrats.
>>
>>102107064
>>102107080
I was just being cheeky. there have been several collages where half of the pics are my gens, no worries
>>
>>102107005
so ronery
>>102107034
after 2 dozen times it gets boring
>>
>>102107099
especially when video and VR simulated worlds come easier
>>
>>102107098
I think collage based on (You)'s would be full of the kind of images that ask for the prompt mixed in with controversial images. It would provide a good spread of images that excite a response.
>>
>>102107119
Yep, I'm never touching this shit. Not even chatgpt. I prefer thinking and programming on my own.
>>
File: 00019-1086920223.png (989 KB, 1216x832)
989 KB
989 KB PNG
>>102107103
That's fine anon, same here I got 2 in a row with 2 images once, I'm just so damn amazing.
>>
>>102107159
Okay bye.
>>
>>102107159
It will get harder to resist.
Often times humans will do what everyone else does, so many of the people that don't like it will convert when all their friends and family are doing it.

Or when your life has a need for it, like how it's getting harder to use cash now, because using cards etc is more convenient.
>>
File: image-19.jpg (247 KB, 1024x1024)
247 KB
247 KB JPG
Blackforest could raise money by guaranteeing inclusion into the training set for a small fee. Surely that would work out well.
>>
>>102107188
I'm simply never touching it. Very simple. I know it would cause me misery if I did.
>>
>>102107211
then what compels you to be here still?
>>
>>102107229
Discussion.
>>
>>102107208
They are literally the sole provider of image generation services to Twitter. I think they're good on money right now.
>>
>>102107252
Discussion leads to you trying out text/image gen as you try to understand it and what others see in it. Leave before it is too late.
>>
>>102107252
What can you discuss if you don't use the said thing being discussed. It's like discussing Baldur's gate 3 and never playing it.
>>
>>102107266
Good point, bye.
>>
>>102107273
>>>/v/
>>
>>102107273
It's like discussing Baldur's Gate except you don't even play games.
>>
https://github.com/bmaltais/kohya_ss/tree/sd3-flux.1

Clip training is now available on Kohya. Nobody knows if it even works. Are you gonna wait around for cerfukin to tell you or find out?
>>
>>102107261
They're not the provider as such, they aren't providing a saas, they licensed the model. Licensing is their entire business model yet people really think they won't go after anyone who releases a full finetune of dev and tries to profit off it lol
>>
>>102107159
you could have SD generate patterns and waves. its very good at those things. fascinating stuff
>>
>>102107292
That's the pleb gui. I think it's just the main fork for now
>>
>>102107297
>people really think they won't go after anyone who releases a full finetune of dev and tries to profit off it lol
Until someone actually does get a bill from BFL for using their model as a base, their ability to actually collect is up to debate.
Also are they actually licensing the model or providing through their API?
>>
>>102107211
Before you leave, yeah eventually the only way to avoid this is to somehow live in the countryside and just focus on enjoying real life and nature (and programming for fun)
>>
>>102107314
Oh shit, I don't know why I linked the pleb GUI.
https://github.com/kohya-ss/sd-scripts/tree/sd3
>>
File: 00047-992114957.png (1.66 MB, 832x1216)
1.66 MB
1.66 MB PNG
>>
>>102107099
>The irony of posting this on 4chan for (Yous)
>>
File: 00051-992114961.png (1.64 MB, 832x1216)
1.64 MB
1.64 MB PNG
dat hair
>>
>>102107366
1st one is a magnificent face. I like it very much.
>>
I don't see anybody talking about the fact that GGUFs are slower than sft on 24gb. Is that just me? Do I need to do some updates? Or is it just that most people with the VRAM aren't using GGUFs? After a couple LoRAs I OOM so I need the extra space.
>>
>>102107453
quantized GGUFs do extra math on the fly so they are slower, also means loras have to be applied on the fly slowing it down even further and city96 isn't smart enough to solve the problem
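A rough sketch of the "extra math" described above. This is not the actual GGUF loader or the ComfyUI-GGUF node code; it is a simplified 8-bit block scheme in the spirit of Q8_0 (32 weights sharing one scale), and every function name here is made up for illustration. It only shows why a quantized weight costs a multiply per use, and why a LoRA delta ends up being added after dequantizing rather than merged once.

```python
# Illustrative only: simplified 8-bit block quantization (Q8_0-like).
# All names are hypothetical; real GGUF kernels are more involved.
BLOCK = 32  # weights per block, each block shares one scale


def quantize_block(weights):
    """Store one float scale plus small integers instead of full floats."""
    amax = max(abs(w) for w in weights)
    scale = amax / 127 if amax else 1.0
    return scale, [round(w / scale) for w in weights]


def dequantize_block(scale, qs):
    """The extra work: repeated whenever the weights are needed."""
    return [scale * q for q in qs]


def weights_with_lora(scale, qs, delta, strength=1.0):
    """LoRA delta added after dequantizing, so it is redone on each use."""
    base = dequantize_block(scale, qs)
    return [w + strength * d for w, d in zip(base, delta)]


weights = [i / (BLOCK - 1) - 0.5 for i in range(BLOCK)]
scale, qs = quantize_block(weights)
restored = dequantize_block(scale, qs)
```

With unquantized weights the LoRA can be folded into the tensor once up front, which is roughly why the full-precision file is faster when it fits in VRAM.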
>>
File: epic.png (83 KB, 1066x302)
83 KB
83 KB PNG
>>
File: 00023-518698999.png (2.67 MB, 1920x1152)
2.67 MB
2.67 MB PNG
>>
>>102107453
is dogshit if you got the vram. flux GGUF quant w/ 1 lora goes from 1.0x s/it to 1.6ish s/it on my gpu aka trashbin. plus what anon above me said
>>
>>102107502
>The turk again
What's this in relation to?
>>
very nice Furkan
https://www.pokemonpets.com/
>>
>>102107510
>>102107477
That sucks because two LoRAs push me above 24gb and my s/it skyrockets worse than if I use GGUF. Can we merge Flux LoRAs yet?
>>
>>102107612
PHD btw
>>
>>102107670
Mersin btw
>>
>>102107621
punch clip into ram, bit of a delay after each gen but steady speed and no overflow.
>>
Flux can't into blowing kisses. How lame.
>>
File: 00061-3492905394.png (1.36 MB, 1216x832)
1.36 MB
1.36 MB PNG
>>102107436
:)
>>
fuck yea.. my System RAM upgrade arrived. Loading times for the fp16 model went from a mind-numbing 40 seconds to an acceptable 4 seconds! happi
>>
any good flux finetunes for porn yet?
>>
https://civitai.com/models/692255/showergirls-flux
Wetgirlbros, we fuckin won
>>
File: 00069-4216691847.png (1.2 MB, 1216x832)
1.2 MB
1.2 MB PNG
>>
File: 00105-859389595.png (1.3 MB, 1440x1080)
1.3 MB
1.3 MB PNG
>>102107826
>>
>>102108101
not really it, is it? also can you generate her shopping at Ikea or something?
>>
>>102108101
Holy shit, is there a lora of her available?
>>
is 8gb 3070 enough to run flux checkpoints? They said SDXL wouldn't be possible but I get like 15s/render with it now with forge or comfy.

>>102107351
catbox?
>>
File: kiss.jpg (107 KB, 1024x1024)
107 KB
107 KB JPG
>>102107826
You can sometimes just describe the action explicitly. Yeah it's not quite there here.
>>
>>102108174
Sure

I stole the prompt and added a bit about Japanese and latex (which didn't really appear lol)

So don't blame me for the crazy prompt!

https://files.catbox.moe/i37c0x.png

Also wait, I think I have something that makes you see the prompt without catbox.
>>
>>102108174
I think 8gb is enough, you have to choose one of the weaker checkpoints. Maybe Nf4 not sure.

https://github.com/lllyasviel/stable-diffusion-webui-forge/discussions/1050
>>
File: 00074-4216691852.png (1.16 MB, 1216x832)
1.16 MB
1.16 MB PNG
>>
https://github.com/chrisgoringe/cg-mixed-casting
Thoughts?
>>
seeing those mutated fingernails quite often. shrug.
>>
File: file.png (2.53 MB, 1728x968)
2.53 MB
2.53 MB PNG
Asuka
>>
>>102108214
I know but expected it to know the expression since it is common
>>
>>102108325
were you successful with your lora? & thanks again for bringing the inksketch lora to my attention.
>>
>>102108349
>haven't posted in a few threads
>instantly recognized who I am

HOW AAAAAAA

Anyways, if you are referring to the WoH lora, here:https://civitai.com/models/676564
>>
File: 00071-4216691849.png (1.21 MB, 1216x832)
1.21 MB
1.21 MB PNG
>>
>>102108367
NTA but there are very recognisable personalities in this thread.

You, the Asuka poster
The guy who calls Hatsune Miku Miku Hatsune and makes warcraft gens of her
The guy with the LoRA troubles
Debo
The weird skinny girl poster
The guy that drops in and posts a random pony gen then leaves
The guy who really likes tight vinyl ass shots of women
The guy who reports on random meme merges on civit

There's more but I can't remember them off hand.
>>
File: 00109-26991113.png (1.91 MB, 1024x1440)
1.91 MB
1.91 MB PNG
>>102108127

>>102108147
There's a couple lizzie loras on civitai. I trained this one mostly to test if I could caption outfits and have them show up with just the tag. It didn't work
>>
File: 00082-1964955537.png (1.2 MB, 1216x832)
1.2 MB
1.2 MB PNG
>>
>>102106813
link? cant find it on hf
>>
File: file.png (2.54 MB, 968x1728)
2.54 MB
2.54 MB PNG
Anyways sd1.5 still slaps

>>102108419

I only posted like 1 though...

I also posted bocchi the c++
>>
>>102108419
>tfw not on the list
>>
>>102108322
haven't tried it myself (i have 24GB VRAM) so this is a hot take. Useful for determining what your system can handle, and/or where your patience level is, at different quant levels without downloading multiple gguf files at different quants. Once you've found an appropriate quant level with this custom node, download that quant and continue prompting without the node to save on the realtime casting penalty.

If it's actually useful to cast from full to quant everytime then sure keep using it.
>>
File: file.png (2.69 MB, 968x1728)
2.69 MB
2.69 MB PNG
>>102108440
>tfw on the list

is that good
>>
File: 00111-2731475149.png (1.5 MB, 1024x1320)
1.5 MB
1.5 MB PNG
>>102108421
>>
>>102108419
I guess I'm debo?
>>
>>102108472
and i think I'm the vinyl guy, but I post all kinds of images really.
>>
>>102108419
don't forget the guy that posts n64/ps2 gens and is physically butthurt because they weren't included in the collage.
>>
>>102108367
ah no I thought you were the anon with the failed asuka lora. I asked about the gen (4xgrid) and the inksketch lora was being mentioned. you shared the WoH lora fresh off your HD, I remember. NICE
>>102108425
really cool glasses
>>
Please don't try to make yourself into a thread personality and stop trying to identify posters
>>
File: fs_0012.jpg (114 KB, 912x1280)
114 KB
114 KB JPG
>>
File: 2024-08-27_00147_.jpg (1.03 MB, 3840x2160)
1.03 MB
1.03 MB JPG
>>102108428
here you go
>https://huggingface.co/uwg/upscaler/blob/main/SwinIR/003_realSR_BSRGAN_DFOWMFC_s64w8_SwinIR-L_x4_GAN-with-dict-keys-params-and-params_ema.pth
>>
>>102108493
Which one is this guy?
>>
If you want to become a thread personality and celebrity head on over to /sdg/ and spam the same gen 200 times over and interact with everybody in the thread.
>>
say i wanted to do a "manual upscale" in comfy, ie. split the pic into custom-picked 1024x1024 tiles and upscale them individually, then merge the result in photoshop
how would i replicate the ultimate SD upscale node's operation? is it just a simple upscale model then img2img with 0.2-0.25 denoise? or are there some other tricks going on too?
>>
>>102108488
Yeah I only realised after you mentioned it but those are some nice glasses.
>>
File: 00112-3311145557.png (1.55 MB, 1024x1320)
1.55 MB
1.55 MB PNG
>>102108456
Last one
>>
>>102108488
Thanks anon
Sometimes I just suddenly have an idea for a bizarre lora and do it. I have done a few coomer artist loras that I don't share anywhere but the WoH lora seemed good enough. Other times I just post random shit

Pic related is another SDXL lora I trained the other day with some coomer weeb game CG
>>
>>102108530
great gens and lora
>>
>>102108488
What's this lora?
>>
>>102108523
>>102108493
This is the hard core /ldg/, /sdg/ split maximalist. He gets upset at the notion of people having discernible differences. He wears a gray tracksuit, he is completely bald and sits in a scantily decorated white room illuminated by bright LED lights. His face is a constant scowl.
>>
File: fs_0022.jpg (136 KB, 1280x1280)
136 KB
136 KB JPG
>>
>>102108599
>projecting this hard
>>
>>102108591
nta, but looks like the World of Horror lora
>https://civitai.com/models/198729/world-of-horror
>>
>>102108616
I KNOW HER!!
>>
>>102108541
I should get into lora making.
>>102108528
you could always go the manual route. gen image, save, open in program of your choosing, splice it up, dump back into comfy in a simple load image>upscale with model>downsize>resample pipeline
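The "splice it up" step of the manual route can be sketched like this. It is only a guess at sensible tiling, not how the Ultimate SD Upscale node works internally; the function name, the 1024 tile size and the 64 px overlap are assumptions picked for the example. The overlap is there so the seams can be blended when the tiles are merged back together.

```python
# Hypothetical helper: compute crop boxes for a manual tiled upscale.
# Tile size and overlap are example values, not taken from any node.
def tile_boxes(width, height, tile=1024, overlap=64):
    """Return (left, top, right, bottom) boxes that cover the whole image."""
    if overlap >= tile:
        raise ValueError("overlap must be smaller than the tile size")

    def starts(size):
        if size <= tile:
            return [0]
        step = tile - overlap
        positions = list(range(0, size - tile, step))
        positions.append(size - tile)  # last tile sits flush with the edge
        return positions

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


boxes = tile_boxes(3840, 2160)
```

Each box can be cropped in any image editor or script, run through upscale and a low-denoise resample, then pasted back at the same (scaled) offset.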
>>
>>102108627
>>102108591
sorry I mean the flux version ofc
>https://civitai.com/models/676564/world-of-horror-flux-d
>>
>>102108509
thank you :)
>>
File: 00000-3875963766.jpg (102 KB, 896x1152)
102 KB
102 KB JPG
>>102108530
Haha nice
>>
File: ComfyUI_01427_.png (1.14 MB, 1216x832)
1.14 MB
1.14 MB PNG
>>102108627
>>102108645
>>102108656
It's not this one. Mine is for flux
>>
File: fs_0026.jpg (157 KB, 1280x1280)
157 KB
157 KB JPG
>>102108632
cool pls tell her hi for me, and that I'm sorry for the nonconsensual depictions I make
>>
>>102108618
I'm not projecting, anon, I'm mocking you.
>>
>>102108683
>doesn't even realize it
>>
Kohya is tinkering with the idea of training LoRAs for the t5 model. Thoughts?
>>
>>102108694
You are debo
>>
>>102108714
waiting for my boy with the phd to do his tests first
>>
>>102108714
More variety .. but hm. What will it achieve? Maybe he wants to talk Japanese to it. Even the smarter nerds in Japan have trouble with Engerrish
>>
>>102108731
Keep going schizo, what else am I?
>>
>style lora
>with a trigger word
Please stop
>>
File: 2024-08-27_00159_.jpg (1.07 MB, 3840x2160)
1.07 MB
1.07 MB JPG
>>
>>102108746
I think the issue is that unique tokens for concepts aren't being reflected in the training well, if at all. Now that we have clip training working as well, we can see if that helps with the training but I suspect it's not actually doing much.

That's probably why Kohya is now looking at T5 training as an avenue to improve training results.
>>
How to make unet loader gguf create a model name string?
>>
File: grid-0039.jpg (297 KB, 1792x2400)
297 KB
297 KB JPG
>>
>>102108804
>I think the issue is that unique tokens for concepts aren't being reflected in the training well
People should stop using unique tokens.
If you do "el1z4b3th" instead of "Elizabeth" you deserve to be shot.
>>
File: 00331-667938033.png (2.14 MB, 1024x1440)
2.14 MB
2.14 MB PNG
>>102108349
>>102108419
there are multiple asuka posters bro
>>
>>102108856
No I need to train it like SD 1.5, I want my outputs to look like SD 1.5. Don't ask me why I'm using Flux.
>>
File: file.png (3.67 MB, 1024x1504)
3.67 MB
3.67 MB PNG
>>
File: 00014-3045120459.png (1 MB, 1216x832)
1 MB
1 MB PNG
>>102108816
Noice
>>
>>102108805
open up the code and add a string output
>>
>>102107332
What API?
>>
>>102107332
Anon, a cease and desist is itself enough to ruin your day. You really don't want to get into legal trouble.
>>
File: sisyphus.png (24 KB, 1457x121)
24 KB
24 KB PNG
>>102108926
ty man, it's from cyber city oedo 808
>>
>>102106681
>Maintain thread quality
>https://rentry.org/debo
we /sdg/ again sisters!
>>
we certainly aren't troon
>>
>>102108645
>>102108627
Somehow...I like SDXL better? At least the example images.
Are we being bamboozled by Flux?
>>
>>102109103
If all you want is "1girl, big titties", and hit the random button 100 times, sure, SDXL is better. And people said the same thing about SDXL vs SD 1.5 when SDXL came out. It was obvious that SDXL was clearly better but it took months for it to gain a foothold because full finetuning is ultimately the requirement for these models.
>>
File: 00085-224693973.png (1.89 MB, 1024x1440)
1.89 MB
1.89 MB PNG
>>102109103
hard to explain, in flux you can get specific images much easier. In XL you can put in a simple prompt and get the style with little control over the content. Once we get more flux finetunes comparing XL and flux will seem like a joke.
>>
>>102109100
right! you are valid sister!!!
>>
>>102109150
For 1girl+titties I'd take 1.5
>>
▲
▲ ▲
>>
File: file.png (119 KB, 640x562)
119 KB
119 KB PNG
>"k let's see if there's some performance upgrade in comfy"
>click upgrade all
>"UNINSTALLING TORCH 2.4"
>"FETCHING TORCH 2.2"
>"INSTALLING TORCH 2.2"
>"UNINSTALLING TORCH 2.2"
>"FETCHING TORCH 2.4"
>"INSTALLING TORCH 2.4"
>>
File: 2024-08-27_00164_.jpg (931 KB, 3840x2160)
931 KB
931 KB JPG
>>
>>
▲
▲▲
>>
>>102109384
Congrats, you lose
>>
ironically, comfyui, is, not very comfy at all. you could say it is uncomfy, even.
>>
>still no sdxl + t5xxl
>still no absolute size slider
>still no syntax to specify and compose styles
its over
>>
>>102109538
>still no sdxl + t5xxl
won't happen .. thats what they did with SD3
>still no absolute size slider
the fuck you need that for?
>>
Ankha box
>>
>>102109587
is that the marge simpson lora?
>>
File: 2024-08-27_00169_.jpg (1.27 MB, 3840x2160)
1.27 MB
1.27 MB JPG
>>
>>102109049
oh nice
>>
>>102109597
no its just some pony merge
https://civitai.com/models/685087?modelVersionId=766761

Pony knows Ankha quite well
>>
>>102109538
You have to train the whole model to do T5 and at that point just train a new model on better architecture. SDXL's dual clip is ass.
>>
▲
▲ ▲
>>
File: 1girl.jpg (429 KB, 1024x1536)
429 KB
429 KB JPG
>>
>>102109538
>>still no sdxl + t5xxl
I talk with some people who work for companies that do similar stuff to XL. Idk what exactly, because they can't say due to business contracts, but there's stuff that can be done to XL to improve its prompt understanding (like "a dark skinned woman wearing a white beanie and a blonde man wearing a black hat" kind of stuff). We are missing out on a lot of stuff we don't even know about :/
>>
File: FLUX_00028_.png (1.2 MB, 1152x896)
1.2 MB
1.2 MB PNG
roll on halloween
>>
>>
>>102109690
the random guy on the left.. lol
>>
>>102109556
>the fuck you need that for?
consistency

>>102109628
yeah. I mean clip sucks and I want: sdxl - clip + t5.
I like the datasets of some sdxl checkpoints
>>
>>102109690
>>102109729
>>
>>102109732
Any change to SDXL breaks your Loras. SDXL + T5 is a new model.
>>
https://huggingface.co/THUDM/CogVideoX-5b
CogVideoX-5b weights got released
>>
>>102109729
>>102109736
its me
>>
>>102109777
bet it can't even do boobs so I sleep
>>
>>
File: 00115-2243074266.png (1.24 MB, 832x1216)
1.24 MB
1.24 MB PNG
>>
>>102109777
only 5b? wtf i thought video gen would take way more, looks impressive. since it's a transformer it should be quantable too, right?
>>
>>102109997
>since it's a transformer it should be quantable too, right?
of course, it can be GGUF'ed as well
>>
>>102109997
good video gen takes more, this is Luma tier
>>
File: 00008-4024025922.png (1.04 MB, 1216x832)
1.04 MB
1.04 MB PNG
>>
>>
File: 00121-588678283.png (1.49 MB, 832x1216)
1.49 MB
1.49 MB PNG
>>
>>102108966
How?
>>
>>102110051
More of her?
>>
File: 1718571080290345.webm (2.23 MB, 1024x1024)
2.23 MB
2.23 MB WEBM
>>102110036
>this is Luma tier
Wouldn't that still be a good thing if it was true?
>>
>>102110051
would. mega would.
>>
File: 1718770902862679.webm (1.1 MB, 1024x1024)
1.1 MB
1.1 MB WEBM
>>102110109
Like this is still pretty cool if we can have this local. Not all the gens turn out this well, but if you can get even some, then it's a win.
>>
>>102110109
Luma is gen and pray.
>>
>>
>>102109777
>no img2vid
Useless for us.
>>
>>102110141
I remember the Newgrounds hentai of her...
>>
File: 1718381468763268.webm (1.69 MB, 720x972)
1.69 MB
1.69 MB WEBM
>>102110127
>>
>>102110147
I think it's possible to make it do img2vid, no? like our image models, they weren't made to do img2img but we managed to make it work anyway
>>
>>102110168
fucking nightmare fuel
>>
File: 1718825951783614.webm (632 KB, 1024x1024)
632 KB
632 KB WEBM
>>102110168
>>
>>102110036
a 5b luma, any vramlet could run this. pretty cool.
>>
>>102110168
nice
>>102110189
lmaoooooo
>>
File: 00006-4024025920.png (1.07 MB, 1216x832)
1.07 MB
1.07 MB PNG
>>102110092
I wish. I can't really replicate her. But maybe I'll try again.
>>
>>102110189
>>
>>102109997
>only 5b? wtf i thought video gen would take way more,
it's asking for 11gb at fp8, we could go for 10b and still make it work on a 24gb card at Q8_0
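Back-of-envelope version of that estimate, counting weights only. The 8.5 bits per weight for Q8_0 assumes the usual layout of 32 int8 values plus one fp16 scale per block; the function name is made up. Text encoder, VAE and activations are not counted, which is presumably part of why the quoted requirement is higher than the raw weight size.

```python
# Rough weight-only memory estimate; everything else (text encoder, VAE,
# activations) comes on top and is not modelled here.
BITS_PER_WEIGHT = {
    "fp16": 16.0,
    "fp8": 8.0,
    "q8_0": 8.5,  # assumes 32 int8 weights + one fp16 scale per block
}


def weight_gib(params_billion, fmt):
    """Approximate size of the weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * BITS_PER_WEIGHT[fmt] / 8
    return total_bytes / 2**30


five_b_fp8 = weight_gib(5, "fp8")
ten_b_q8 = weight_gib(10, "q8_0")
```

Under these assumptions a 10b model at Q8_0 comes out just under 10 GiB of weights, which leaves headroom on a 24gb card for the rest.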
>>
File: 00127-588678289.png (1.55 MB, 832x1216)
1.55 MB
1.55 MB PNG
>>
>>102110203
hey hey people
>>
File: 1722040243184285.webm (3.58 MB, 1024x1024)
3.58 MB
3.58 MB WEBM
>>102110203
>>
>>102110203
oh vey
>>
File: 00001-3892022416.png (1.02 MB, 896x1152)
1.02 MB
1.02 MB PNG
>>
>>102110175
>they weren't made to do img2img but we managed to make it work anyway
There is about zero work to implement simple img2img in an image model. It's the same process when starting from noise or from an image.
That's not the case for these kinds of video models that generate all frames at once.
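A toy illustration of why img2img is nearly free for image models. The linear blend below is the flow-matching-style formulation and is an assumption here; other model families weight image and noise differently, and the function names are invented for the example. The point is only that the sampler does not care whether it was handed pure noise or a partly noised image.

```python
# Toy sketch: img2img as "start the same sampler from a noised image".
# Linear blending is an assumed (flow-matching style) noising rule.
import random


def start_latent(image_latent, denoise, seed=0):
    """Blend an encoded image with noise; denoise=1.0 is plain txt2img."""
    if not 0.0 <= denoise <= 1.0:
        raise ValueError("denoise must be between 0 and 1")
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in image_latent]
    return [(1.0 - denoise) * x + denoise * n
            for x, n in zip(image_latent, noise)]


def steps_to_run(total_steps, denoise):
    """Only the tail end of the schedule is sampled for img2img."""
    return int(total_steps * denoise)


untouched = start_latent([1.0, 2.0], denoise=0.0)
light_pass = steps_to_run(20, 0.25)
```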
>>
>>102109150
>>102109197
>flux finetunes
Yeah. The impossibly expensive finetunes on the model with a license that won't let you even take donations to pay for them.
>>102109238
This. And for inpainting as well. 1.5 is king.
>>
>>102110300
Why aren't you in SDG spamming SDXL gens?
>>
File: 00004-2888243161.png (1.04 MB, 896x1152)
1.04 MB
1.04 MB PNG
>>
File: 1718306882394022.webm (2.49 MB, 1024x1024)
2.49 MB
2.49 MB WEBM
>>102110246
>>
>>102110317
I got a bit tired of genning. I'm just checking what's up (nothing seems to be up).
>>
File: 00005-4242892882.png (1.1 MB, 896x1152)
1.1 MB
1.1 MB PNG
>>
File: flux_01000_.png (1.31 MB, 968x1240)
1.31 MB
1.31 MB PNG
>>
File: 1655783628694.jpg (375 KB, 1248x1868)
375 KB
375 KB JPG
Which models, loras, techniques, tags, etc should I know about to generate the most beautifully huge boobs in high res with realistic skin and stuff?
Bonus if able to generate them with good fluid physics going on like getting grabbed or something where you can see the folds well and unusual angles.
>>
File: 1718579407679115.webm (1.15 MB, 1024x1024)
1.15 MB
1.15 MB WEBM
>>102110339
>>
>>102110398
big boobs lora and realism lora.

wow that was hard.
>>
File: hu75f.webm (174 KB, 1024x1024)
174 KB
174 KB WEBM
about a 3 day queue time on luma. def worth the wait.
>>
>>102110417
this killed me
>>
File: 1718728687233377.webm (1.28 MB, 1024x1024)
1.28 MB
1.28 MB WEBM
>>102110403
>>
>LOCAL diffusion
>some nigger spamming non-local videoslop
>>
>>102110440
racist cunt
>>
>>102110409
Point me the links please. Also which checkpoints are the best for photorealistic boobs?
I already had fun generating some stuff but I feel like it could be better if I had more overall guidance. My prompting skills are also very basic as of now
>>
File: ComfyUI_04588_.png (1.52 MB, 832x1216)
1.52 MB
1.52 MB PNG
>>102108419
you forgot the stinky feet fetish guy and the hatsune miku with dreadlocks cfg guy
>>
>>102110465
you're one lazy fuck
https://civitai.com/models/656458/big-boobs-flux?modelVersionId=734465
https://civitai.com/models/631986?modelVersionId=706528
>>
File: 1719305948356779.webm (1.8 MB, 1024x1024)
1.8 MB
1.8 MB WEBM
>>102110403

>>102110440
The point is demonstrating what local could be, if the claim that "it's luma-tier" was true. The potential to get good gens outweighs everything else. If we had this local, then it would be worth letting it churn out a gorillion gens to get a good one.
>>
>>102109633
>>102109452
>>102109301
lmao gottem
>they removed the ability to triforce, now we're all newfags
>>
File: 00129-588678291.png (1.54 MB, 832x1216)
1.54 MB
1.54 MB PNG
>>
>um actualyl it's okay for me to spam my slop instead of genning with the loc-ACK
>>
File: 00130-588678292.png (1.61 MB, 832x1216)
1.61 MB
1.61 MB PNG
>>
>>102110479
you're recommending Flux to a porn addict? you're retarded
>>
>>102110514
no one said hardcore
youtube is awash with porn, try-on hauls, ASMR roleplay and shit
>>
File: Flux_00514_.png (1.38 MB, 1024x1024)
1.38 MB
1.38 MB PNG
still cooking dat loora
>>
this is weird but in testing my style lora, certain prompts simply will not take to the style and keep that weird flux realism, even at epoch 30, or at the point where the style is so cooked in that hands are garbled in other images. I tried 4 different training settings, 2 different dataset sizes, and two captioning schemes:
1) both danbooru tagging and boomer prompting in one dataset
2) a wildcard mix of boomer or danbooru tagging in the other
I fucking hate that I'm saying this, but maybe a trigger word would've fixed that...? I don't really even want to test that lmao, such is my deep disdain for trigger words..
give me a second to get everything together and I'll post images to better display what I mean. did a big batch overnight for testing and need to sort through
>>
File: 1722000644390427.webm (2.3 MB, 1536x768)
2.3 MB
2.3 MB WEBM
Found some local ones that I remembered. For static scenes, AnimateDiff can produce some pretty videos. Though still, a local version of Luma would be a game changer. Whether the Cog model can get there idk, but it would be nice.
>>
File: 1721993432161910.webm (2.75 MB, 1536x768)
2.75 MB
2.75 MB WEBM
>>102110591
Also none of these are mine, idk if anyone legitimately thought I was posting personal gens.
>>
File: grid-0044.jpg (329 KB, 1792x2400)
329 KB
329 KB JPG
>>102110565
I would try one batch with caption dropout every second epoch. If that doesn't work perhaps another optimizer, change rank, make the dataset smaller etc.
>>
File: 1721991176828137.webm (3.2 MB, 1536x768)
3.2 MB
3.2 MB WEBM
>>102110609
>>
File: 00005-1866866540.png (1.04 MB, 1216x832)
1.04 MB
1.04 MB PNG
>>
File: 1721996115329985.webm (2.81 MB, 1536x768)
2.81 MB
2.81 MB WEBM
>>102110623
>>
File: 00006-1866866541.png (1.03 MB, 1216x832)
1.03 MB
1.03 MB PNG
>>102110623
I likey like.
>>
>>102110632
>>102110642
Great. Portrait please.
>>
File: newalya1.png (3.65 MB, 1280x3072)
3.65 MB
3.65 MB PNG
>>
File: newalya2.png (2.73 MB, 1280x3072)
2.73 MB
2.73 MB PNG
>>
File: newalya3.png (2.92 MB, 1280x2904)
2.92 MB
2.92 MB PNG
>>
File: grid-0106.jpg (438 KB, 2304x1024)
438 KB
438 KB JPG
>>
File: grid-0027.jpg (606 KB, 2688x2016)
606 KB
606 KB JPG
>>
>>102110565
Of course you need a trigger word, you are, after all, describing a style. Why wouldn't there be a style keyword?
>>
>>102110467

>>102110687
>>102110705
>>102110721
>>
>>102110770
neat
>>
File: ComfyUI_03974_.png (973 KB, 1024x1024)
973 KB
973 KB PNG
>>102110467
>the hatsune miku with dreadlocks cfg guy
finally I'm still remembered!
>>
>>102110800
>segmented thumb
>>
File: ComfyUI_Flux_3.png (1.22 MB, 1344x768)
1.22 MB
1.22 MB PNG
>>
>>102111042
nice gun
>>
>>102111042
What you gonna do, shoot me?
>>
>>102110565
You know, I've actually gotten some use before out of using LORAs without their trigger words.
Sometimes LORAs like half work without the trigger word and sometimes this actually produces a more desirable output than with the trigger word.
So trigger words can be useful.
>>
File: ComfyUI_Flux_6.png (1.31 MB, 1344x768)
>>102111072
Thanks.
>>102111080
(You)'re already dead.
>>
File: ComfyUI_03884_.png (2.51 MB, 1440x1280)
>>
I posted a couple days ago about the multi-step workflow I'm doing to caption images for flux. Well, everything is done now, I have the final captions. Here's an update.

I have 2000 real images, of women peeing, desperate to pee, or wetting themselves (yes I know it's weird, this is my fetish). I have manually made booru-style tags for each image that capture the core concepts but not any details.

The workflow is as follows: caption the images using InternVL-40b, giving the model the tags as additional context to help it. Do the same with joycaption, also giving it the tags as context. Finally, pass both captions + tags into Mistral-Large (q4_k_m running locally), with a rather complex prompt describing what certain tags mean, how to combine the information, and the desired writing style.

Here is a random sample of the results, not cherrypicked: https://files.catbox.moe/t6t2au.txt
It shows the tags, each input caption, and the final combined caption.

Seems to work pretty well tbhdesufam. But you need four 3090s to run this all locally like I did. With a smaller LLM, I think 3 3090s could make it work (the minimum for internvl-40b).
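
A rough sketch of the final merge step described above, assuming the two VLM captions and the hand-made tags are already available as plain strings. `CaptionInputs`, `build_merge_prompt`, `tag_notes`, and the prompt wording are all made up for illustration; the actual prompt used is not shown in the post, and the call to the LLM itself is left out.

```python
from dataclasses import dataclass


@dataclass
class CaptionInputs:
    tags: list[str]   # hand-written booru-style tags
    caption_a: str    # first VLM caption (e.g. from InternVL)
    caption_b: str    # second VLM caption (e.g. from joycaption)


def build_merge_prompt(inputs: CaptionInputs, tag_notes: dict[str, str], style: str) -> str:
    """Assemble the text prompt handed to the merging LLM."""
    # Only explain tags that actually appear on this image.
    notes = [f"- {tag}: {tag_notes[tag]}" for tag in inputs.tags if tag in tag_notes]
    sections = [
        "Combine the two captions below into one caption.",
        f"Writing style: {style}",
        "Tags (treat as ground truth): " + ", ".join(inputs.tags),
    ]
    if notes:
        sections.append("Tag meanings:\n" + "\n".join(notes))
    sections.append("Caption A:\n" + inputs.caption_a)
    sections.append("Caption B:\n" + inputs.caption_b)
    return "\n\n".join(sections)


example = CaptionInputs(
    tags=["1girl", "outdoors"],
    caption_a="A woman stands in a park.",
    caption_b="A woman in a red coat stands near some trees.",
)
prompt = build_merge_prompt(example, {"outdoors": "the scene is outside"}, "plain, literal")
```

Keeping the tags as a separate "ground truth" section is one way to let the LLM resolve disagreements between the two captions; whether the original prompt does this is a guess.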
>>
>>102111177
>But you need four 3090s to run this all locally like I did
god damn my dude I guess that's the price you have to pay to gen peeing women
>>
File: ComfyUI_03885_.png (2.62 MB, 1280x1440)
>>
>>102110565
ok, here are some examples of the issue. the pure boomer prompts I stole from civitai are the culprits

>example of my dataset captioning for reference: https://desuarchive.org/g/thread/102013088/#102014906

prompts used with success previously (not pictured in picrel, but posted before) :
>a man with long white hair and Chinese style clothing, 1boy, long hair, white hair
>a woman with long white hair and Chinese style clothing, 1girl, long hair, white hair
>a man and a woman in a romantic embrace. the man, on the left, has white hair and a soft smile. the woman, on the right, has black hair and her eyes are closed in peaceful contemplation. there are sakura blossom trees in the background and chinese architecture. 1girl, 1boy, long hair, white hair, black hair, asian architecture

my current hypothesis is that it overlearned the booru tagging and now doesn't perform well without at least one booru tag, despite use of the wildcard arg. kinda gay, I really don't want to rent more gpu time to re-run the 500 dataset, might just live with it as-is and insist on using at least 1 booru tag while prompting. obviously the other optimizers/lrs/dataset sizes have the same issue, I'm just not bothering to put an image from each in my collage
>>102110612
>If that doesn't work perhaps another optimizer, change rank, make the dataset smaller etc.
one dataset is 500ish images, one dataset is 150 images. I have tried lion-cosine, rex-came, adamw8bit-cosine, 8dim/8alpha, 32dim/32alpha, 3e-4 LR, 4e-4 LR, 5e-4 LR (5e-4 was a failure, 4e-4 has messed hands). I actually think one of the runs might have trained without captions mixed into the wildcards because I accidentally left paragraph spaces in the captions instead of just linebreaks, but I can't say this for certain and will have to try proper caption dropout on future loras
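
A toy illustration of the suspected blank-line problem, assuming a wildcard-style setup where each line of the caption file is one variant and one line is picked at random per step. How a real trainer treats blank lines is not confirmed here; `caption_variants` and `pick_caption` are hypothetical helpers.

```python
import random


def caption_variants(text: str) -> list[str]:
    # One variant per line; blank lines are kept on purpose to show the failure mode.
    return text.split("\n")


def pick_caption(text: str, rng: random.Random) -> str:
    # Pick one variant at random, the way a wildcard option might.
    return rng.choice(caption_variants(text)).strip()


clean = "a man with long white hair\n1boy, long hair, white hair"
spaced = "a man with long white hair\n\n1boy, long hair, white hair"

rng = random.Random(0)
# Count how often the paragraph-spaced file yields an empty caption.
empty_picks = sum(pick_caption(spaced, rng) == "" for _ in range(300))
```

Under these assumptions a paragraph break between variants becomes an empty variant, so some steps would train with no caption at all, which is an accidental form of caption dropout.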
>>
File: file.png (9 KB, 357x179)
Do I touch any of these?
>>
>>102111273
If you're in doubt, don't use the advanced loader.
>>
>>102111177
You are like little babby.
>>
>>102111243
unironically once we have perfect AI porn people would've paid 10x that much 50 years ago
>>
>>102111303
or paid a woman $100 to do it
they still do that btw
>>
I've had women pee in my mouth for free and I still believe that training peeing LORAs is a worthy endeavor.
>>
>>102111243
It's not that expensive in the grand scheme of things, at least if you work a high paying tech job like I do. Other hobbies cost way more. I've done so much image model and LLM fine tuning that it easily has been worth. Would have been worth it even if it was double the price.

Also 2x3090 is easy to do even on a desktop computer, without having to do server bullshit with a separate machine. With that you could do joycaption only, and a good 70b LLM would probably be enough to combine the tags and caption. So this is mostly doable in the same way with 2x3090, which is kinda the bare minimum for serious AI dev work.
>>
wheres the scat lora guy now
>>
>>102111253
I wonder if you could reduce the 150 dataset to 50, just the best ones, and run with 2e-4 LR, batch size 2. Just as a test
>>
>>102111351
Have you had to modify your home's breakers?
>>
>>102111354
in hospital with dysentery
>>
>>102111342
This. I've listened to cute girl tummies IRL, and I'm still working on tummy rumbling LORAs. Both are really nice.
>>
>>102111326
sounds like a waste of money
>>
>>102111370
He'll be out shooting buffalo in no time.
Let's just hope he doesn't attempt to ford any rivers.
>>
>>102111363
at this point I highly doubt the issue is related to LR/batch size/dataset size considering it works flawlessly as long as some form of booru tag is included, but I will consider it for the sake of testing (sadly I'm a 12gb vramlet and can't run batch 2 so I will need to rent to run the test, shouldn't take long if I do though)
>>
>>102111367
Kind of, the machine has 2 power supplies that need to be plugged into 2 different 15A circuits or else the breaker trips.
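
Back-of-the-envelope numbers for why one 15A circuit trips, as a sketch only. Assumed and not from the post: 120 V mains, the common 80% rule of thumb for continuous load, roughly 350 W per 3090 at stock power limit, and a made-up 300 W for the rest of the system.

```python
def circuit_budget_watts(amps: float, volts: float = 120.0, continuous_factor: float = 0.8) -> float:
    """Continuous power budget for one circuit under the assumed derating factor."""
    return amps * volts * continuous_factor


gpu_watts = 350.0             # assumed per-GPU draw
gpu_count = 4                 # four 3090s, per the earlier post
rest_of_system_watts = 300.0  # hypothetical CPU, drives, fans, PSU losses

total_watts = gpu_count * gpu_watts + rest_of_system_watts
one_circuit = circuit_budget_watts(15)

fits_on_one_circuit = total_watts <= one_circuit
# With two power supplies on separate circuits, assume the load splits evenly.
fits_on_two_circuits = total_watts / 2 <= one_circuit
```

With these assumed figures the full load is over budget for one circuit and comfortably under it when split across two.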
>>
File: ComfyUI_33100_.png (1009 KB, 1024x1024)
>>
File: flux_01002_.png (1.46 MB, 968x1240)
>>102111177
Those look like pretty good prompts. You should document your workflow and post it online.
>>
>>102111435
what the FUCK is that thing..........
>>
File: ComfyUI_03886_.png (2.36 MB, 1280x1440)
>>
File: grid-0577.jpg (408 KB, 2304x1792)
>>
>>102111443
Hatsune Gigu
>>
File: grid-0552.jpg (432 KB, 2304x1792)
>>
File: grid-0563.jpg (427 KB, 2304x1792)
>>
File: ComfyUI_33099_.png (1.1 MB, 1024x1024)
>>102111443
You never had moe octopi steal tomatoes from your farm?
>>
>>102111494
interesting style
>>
Is anyone reverse engineering the hashed tokens in flux dev yet?
>>
>>102111630
yeah some turkish dude that has a phd is
>>
Best way to caption anime/manga images?
Combine ViT-L and ViT-bigG tags?
>>
File: grid-0597.jpg (743 KB, 1792x2304)
>>
File: 00187-1714176828.jpg (797 KB, 1440x1920)
>>
File: ComfyUI_33105_.png (1.05 MB, 720x1280)
>>
File: grid-0595.jpg (753 KB, 1792x2304)
>>
File: 00196-1714176829.jpg (667 KB, 1440x1920)
>>
File: flux_01009_.png (1.55 MB, 968x1240)
>>
Come and get it, your own fresh loaf of...
>>102111793
>>102111793
>>102111793
>>
File: grid-0581.jpg (865 KB, 1792x2304)
I've always liked Frank Frazetta LoRAs, but his style often adds weird unwanted stuff.
On the bright side it makes the outputs more fun.
>>
File: ifx262.png (1.33 MB, 1024x1024)
>>
>>102111802
>page 2
>>
>>102111802
stop baking when we're already eating bread
123 images, lmao
>>
>>102111630
hashed... tokens?
>>
>>102111913
replies limit tho
>>
>>102111913
we hit post limit
>>
>>102111913
people in a rush to show the images they spent 5 minutes making xd
>>
>>102111928
Yes. There are apparently a bunch of tokens they used during training and simply hashed to hide. It ensures you censor your model without fucking up your training, but if we can reverse engineer them somehow we are then able to use them.

An example is celebrities. They're hiding in latent space somewhere.
>>
>>102112074
>>102112102
"bump limit, it's a bump limit"

anyways: shifting this limit stifles threads. the late-thread territory gets larger, and discourages posts
>>
>>102112134
>"bump limit, it's a bump limit"
replies make thread not go up anymore
>>
>>102112134
humm hermmm.... zzz... zzzzzzzzzzzzzzz...........
>>
File: ComfyUI_33113_.png (789 KB, 1280x720)
>>
>>102112133
the tokens available whatsoever are the ones in T5 and CLIP. all of those are in the vocabs.
there is nothing to hide on a token-level for the transformer, and there are no digests available anywhere.
merges on the other hand could be custom strings, that could be digests - but that would be token-inefficient.
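
A toy sketch of the token-inefficiency point, using a made-up greedy longest-match tokenizer and a tiny hypothetical vocab. This is not the actual CLIP BPE or the T5 tokenizer; it only shows that a hex digest has no long pieces in a natural-language vocab, so it falls apart into many short tokens.

```python
import hashlib


def greedy_tokenize(text: str, vocab: set[str]) -> list[str]:
    """Greedy longest-match split; anything not in the vocab falls back to single characters."""
    tokens = []
    i = 0
    max_len = max(len(v) for v in vocab)
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


toy_vocab = {"hatsune", "miku", " ", "with", "dread", "locks"}
plain = "hatsune miku with dreadlocks"
# First 16 hex characters of a SHA-256 digest, standing in for a "hashed" trigger string.
digest = hashlib.sha256(plain.encode()).hexdigest()[:16]

plain_tokens = greedy_tokenize(plain, toy_vocab)
digest_tokens = greedy_tokenize(digest, toy_vocab)
```

Here the 28-character plain prompt takes 8 tokens while the 16-character digest takes 16, one per character, which is the inefficiency described above.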
>>
bump
>>
File: ComfyUI_33110_.png (768 KB, 1280x720)
sage
>>
>>102110154
hehe


