/g/ - Technology


File: tmp.jpg (1.12 MB, 3264x3264)
1.12 MB
1.12 MB JPG
Discussion of free and open source text-to-image models

Previous /ldg/ bread : >>101558992

>Beginner UI
EasyDiffusion: https://easydiffusion.github.io
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
SD.Next: https://github.com/vladmandic/automatic
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Model Ranking
https://imgsys.org/rankings

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://www.modelscope.cn/home
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Kolors
https://gokaygokay-kolors.hf.space
Nodes: https://github.com/kijai/ComfyUI-KwaiKolorsWrapper

>AuraFlow
https://fal.ai/models/fal-ai/aura-flow
https://huggingface.co/fal/AuraFlows

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>View and submit GPU performance data
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>
File: 0.jpg (275 KB, 1024x1024)
275 KB
275 KB JPG
>>
Blessed bred of frenship
>>
File: 00001-4046871599.jpg (561 KB, 1232x1528)
561 KB
561 KB JPG
The significance of the passage of time
>>
File: Sigma_09560_.png (1.9 MB, 1024x1024)
1.9 MB
1.9 MB PNG
Oooh this bread is searchable and blessed!
>>
File: Sigma_09599_.png (2.13 MB, 1024x1024)
2.13 MB
2.13 MB PNG
>>
File: Sigma_09611_.png (1.66 MB, 1024x1024)
1.66 MB
1.66 MB PNG
The new stability "4d video" is pretty neat. I don't have use for 3d assets, but people who buy them sure do.
>>
File: Sigma_09623_.png (1.71 MB, 1024x1024)
1.71 MB
1.71 MB PNG
>>
File: Sigma_09636_.png (1.99 MB, 1024x1024)
1.99 MB
1.99 MB PNG
>>
File: Sigma_09668_.png (1.79 MB, 1024x1024)
1.79 MB
1.79 MB PNG
>>
File: Sigma_09676_.png (2.2 MB, 1024x1024)
2.2 MB
2.2 MB PNG
>>
File: Sigma_09689_.png (2.21 MB, 1024x1024)
2.21 MB
2.21 MB PNG
>>
File: Sigma_09704_.png (1.84 MB, 1024x1024)
1.84 MB
1.84 MB PNG
Later
>>
File: x7bi5yrmj2f51.jpg (10 KB, 480x360)
10 KB
10 KB JPG
havent made the collage in a while
>>
>>101566261 >>101566128 >>101566356
beautiful
>>
File: Sigma_09712_.png (1.6 MB, 1024x1024)
1.6 MB
1.6 MB PNG
>>101566407
Fret not about the taste of the baker and consider what you yourself would like to see.

>>101566495
ty
>>
File: Sigma_09716_.png (1.67 MB, 1024x1024)
1.67 MB
1.67 MB PNG
>>
File: Sigma_09720_.png (1.83 MB, 1024x1024)
1.83 MB
1.83 MB PNG
>>
File: Sigma_09729_.png (1.73 MB, 1024x1024)
1.73 MB
1.73 MB PNG
>>
File: Sigma_09734_.png (1.38 MB, 1024x1024)
1.38 MB
1.38 MB PNG
>>
File: Sigma_09748_.png (2.16 MB, 1024x1024)
2.16 MB
2.16 MB PNG
>>
File: Sigma_09754_.png (1.74 MB, 1024x1024)
1.74 MB
1.74 MB PNG
>>
File: Sigma_09761_.png (2.26 MB, 1024x1024)
2.26 MB
2.26 MB PNG
>>
File: Sigma_09771_.png (2.09 MB, 1024x1024)
2.09 MB
2.09 MB PNG
>>
File: Sigma_09778_.png (2.14 MB, 1024x1024)
2.14 MB
2.14 MB PNG
>>
File: Sigma_09775_.png (1.59 MB, 1024x1024)
1.59 MB
1.59 MB PNG
>>
File: Sigma_09783_.png (1.9 MB, 1024x1024)
1.9 MB
1.9 MB PNG
>>
File: Sigma_09790_.png (2.08 MB, 1024x1024)
2.08 MB
2.08 MB PNG
>>
File: Sigma_09795_.png (2.04 MB, 1024x1024)
2.04 MB
2.04 MB PNG
>>
File: Sigma_09796_.png (2.14 MB, 1024x1024)
2.14 MB
2.14 MB PNG
>>
File: Sigma_09805_.png (1.73 MB, 1024x1024)
1.73 MB
1.73 MB PNG
>>
File: Sigma_09807_.png (1.7 MB, 1024x1024)
1.7 MB
1.7 MB PNG
>>
File: Sigma_09809_.png (1.91 MB, 1024x1024)
1.91 MB
1.91 MB PNG
>>
File: Sigma_09817_.png (2.07 MB, 1024x1024)
2.07 MB
2.07 MB PNG
>>
File: Sigma_09828_.png (1.99 MB, 1024x1024)
1.99 MB
1.99 MB PNG
>>
File: Sigma_09835_.png (1.12 MB, 1024x1024)
1.12 MB
1.12 MB PNG
>>
>>101566407
its okay anon we believe in you
>>
File: Sigma_09842_.png (1.56 MB, 1024x1024)
1.56 MB
1.56 MB PNG
1girl 1bump
>>
Steady as she goes
>>
File: Sigma_09856_.png (2.07 MB, 1024x1024)
2.07 MB
2.07 MB PNG
>>101568548
Your post as the prompt
>>
File: Sigma_09861_.png (1.91 MB, 1024x1024)
1.91 MB
1.91 MB PNG
>>101568096
Your post as the prompt
>>
File: Sigma_09862_.png (2.25 MB, 1024x1024)
2.25 MB
2.25 MB PNG
>>
File: Sigma_09866_.png (1.78 MB, 1024x1024)
1.78 MB
1.78 MB PNG
>>
File: Sigma_09874_.png (1.93 MB, 1024x1024)
1.93 MB
1.93 MB PNG
>>
File: Sigma_09876_.png (1.46 MB, 1024x1024)
1.46 MB
1.46 MB PNG
Ciao for now
>>
>>101569012
>the voices

>>101569379
Caio
>>
>>101565034
TY for the blessings, anon
>>
>>101565003
nice pyramid
>>
File: ComfyUI_temp_blzgp_00062_.jpg (1.81 MB, 1792x2304)
1.81 MB
1.81 MB JPG
>>101561519
i tested a lot of euler variants, i might test others eventually but i have reason to believe euler is the best here. euler a and the cfg++ variant are very close. i think cfg++ may handle complex backgrounds better especially if you use RAUnet on the cross attention blocks. euler a regular may do shit like hands better, not obvious from a low sample size. euler negative from the koishi-star package is also interesting. schedulers: prefer ays_30 but gits is interesting

picrel is a 12 step gen, 14 seconds straight at 1792x2304
https://files.catbox.moe/kptpa3.png
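if you want to replicate the A/B outside comfy, here's a rough diffusers sketch of the same test (checkpoint path is a placeholder; cfg++ and RAUnet are comfy-node things, so this only covers plain euler vs euler a at a fixed seed):

import torch
from diffusers import AutoPipelineForText2Image, EulerDiscreteScheduler, EulerAncestralDiscreteScheduler

pipe = AutoPipelineForText2Image.from_pretrained("your/checkpoint", torch_dtype=torch.float16).to("cuda")  # placeholder model
prompt = "1girl, detailed complex background"

for name, sched, cfg in [("euler", EulerDiscreteScheduler, 2.8),
                         ("euler_a", EulerAncestralDiscreteScheduler, 4.5)]:
    pipe.scheduler = sched.from_config(pipe.scheduler.config)
    g = torch.Generator("cuda").manual_seed(1234)  # same seed so only the sampler/CFG differ
    pipe(prompt, num_inference_steps=12, guidance_scale=cfg, generator=g).images[0].save(name + ".png")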
>>
File: ComfyUI_temp_blzgp_00064_.jpg (1.71 MB, 1792x2304)
1.71 MB
1.71 MB JPG
>>101570720
and just out of robo-curiosity, the same seed but with regular euler a (samplers have different CFG curves, 4.5 in pic related and 2.8 using cfg++ in the above gen)
>>
File: Sigma_09885_.png (1.98 MB, 1024x1024)
1.98 MB
1.98 MB PNG
>>101569412
Your entire post as the prompt

>>101570720
12 steps is quite low. I tried gits, but maybe too many steps. Interesting findings anon, ty
>>
>>101571197
i mean more steps than 12 is more quality, but ive clicked 12 step rough drafts and been perfectly satisfied many times now. a few months ago i wouldve thought it impossible even with lightning memes
>>
File: Sigma_09914_.png (1.94 MB, 1024x1024)
1.94 MB
1.94 MB PNG
>>101571221
Yep 12 steps too low. 24 steps dpm++2sa 4.2CFG is my go-to for Sigma
>>
>>101560949
>>101570720
>>101570801
I opened this thread only for C.C. Thank you and make more.
>>
Sigma IPadapter when
>>
>>
>>
File: ComfyUI_01677_.png (1.28 MB, 1344x1728)
1.28 MB
1.28 MB PNG
>>
File: 00001-1824550978.png (1.4 MB, 896x1152)
1.4 MB
1.4 MB PNG
Good evening
>>
I got into the OP
>>
File: Sigma_09956_.png (1.5 MB, 1024x1024)
1.5 MB
1.5 MB PNG
>>101570720
What do you use for gits coefficient?

>>101571873
Evening
>>
File: Sigma_09968_.png (1.62 MB, 1024x1024)
1.62 MB
1.62 MB PNG
>>101572136
>>
File: catbox_prnkbi.png (3.91 MB, 1344x1728)
3.91 MB
3.91 MB PNG
>>101572136
i dont know what that is, i only tested gits as a scheduler preset provided by some custom node

>>101571405
>>
File: Sigma_09974_.png (1.26 MB, 1024x1024)
1.26 MB
1.26 MB PNG
1girl 1bump

>>101572212
No problem. It's an option if you choose the scheduler as a node
>>
android Laura
>>
File: Sigma_09982_.png (1.47 MB, 1024x1024)
1.47 MB
1.47 MB PNG
>>
File: Sigma_09985_.png (1.48 MB, 1024x1024)
1.48 MB
1.48 MB PNG
>>
File: Sigma_09882_.png (2.14 MB, 1024x1024)
2.14 MB
2.14 MB PNG
So peaceful here compared to sdg :)
>>
File: Sigma_09990_.png (1.44 MB, 1024x1024)
1.44 MB
1.44 MB PNG
>>
writing your own scheduler >>>
>>
File: Sigma_10002_.png (1.36 MB, 1024x1024)
1.36 MB
1.36 MB PNG
>>101572434
Your post as the prompt
>>
>>101564986
>>
>>101572577
imagined with ai watermark.. you made that on meta.ai
>>
>>101572512
accurate portrayal
>>
>>101572620
Yeah, can I run a local model with my machine? I have a RTX 3070 (8GB) and 16GB of RAM?
>>
>>101572662
>RTX 3070 (8GB) and 16GB of RAM?
You can make so much coom with that
>>
File: 2451814453-2395616658.png (1.43 MB, 896x1152)
1.43 MB
1.43 MB PNG
>>101572662
I have a GeForce GTX 1060 3GB, your pc can handle it for sure.
>>
>>101572830
>>101572697
Thanks I'm gonna try it.
>>
Hunyuan packaged into safetensors for use with ComfyUI
>https://huggingface.co/comfyanonymous/hunyuan_dit_comfyui/tree/main
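and if you'd rather script it than use a UI, diffusers has a HunyuanDiT pipeline too; minimal sketch, repo id from memory so double-check the exact name on the hub:

import torch
from diffusers import HunyuanDiTPipeline

pipe = HunyuanDiTPipeline.from_pretrained("Tencent-Hunyuan/HunyuanDiT-Diffusers", torch_dtype=torch.float16).to("cuda")
image = pipe("a watercolor painting of a fishing village at dusk", num_inference_steps=30, guidance_scale=5.0).images[0]
image.save("hunyuan.png")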
>>
File: 0.jpg (460 KB, 1024x1216)
460 KB
460 KB JPG
>>
File: 0.jpg (210 KB, 768x1024)
210 KB
210 KB JPG
>>
File: 0.jpg (182 KB, 768x1024)
182 KB
182 KB JPG
>>
>>101573176
Nice.
>>
File: Sigma_10052_.png (1.76 MB, 1024x1024)
1.76 MB
1.76 MB PNG
>>101573176
Wooow
>>
File: Sigma_10072_.png (1.34 MB, 1024x1024)
1.34 MB
1.34 MB PNG
This would be a cool job
>>
File: Sigma_10061_.png (1.53 MB, 1024x1024)
1.53 MB
1.53 MB PNG
>>101573058
Comfy just merged native support for Hunyuan too
>>
File: Sigma_10123_.png (1.4 MB, 1024x1024)
1.4 MB
1.4 MB PNG
>>101573954
>>
File: Sigma_10077_.png (1.84 MB, 1024x1024)
1.84 MB
1.84 MB PNG
>>
File: Sigma_10102_.png (1.82 MB, 1024x1024)
1.82 MB
1.82 MB PNG
>>
File: Sigma_10106_.png (1.69 MB, 1024x1024)
1.69 MB
1.69 MB PNG
>>101573176
Still going back to this. Really cool
>>
File: grid.jpg (1010 KB, 1664x2432)
1010 KB
1010 KB JPG
>studio anime screencap, landscape,
>>
File: Sigma_10111_.png (1.78 MB, 1024x1024)
1.78 MB
1.78 MB PNG
>>
File: Sigma_10131_.png (1.7 MB, 1024x1024)
1.7 MB
1.7 MB PNG
>>101574259
3/4 Ghibli
>>
>>101573058
>>101573986
Seems like there are a few finetunes already
>>
File: Sigma_10145_.png (1.7 MB, 1024x1024)
1.7 MB
1.7 MB PNG
>>101574304
A shame I don't know Chinese
>>
File: Sigma_10158_.png (1.83 MB, 1024x1024)
1.83 MB
1.83 MB PNG
>>
File: Sigma_10175_.png (1.62 MB, 1024x1024)
1.62 MB
1.62 MB PNG
Make Sigma less difficult to use, we must
>>
File: 00001-1308769665.png (1.29 MB, 1024x1024)
1.29 MB
1.29 MB PNG
Are there any interesting non SD finetunes yet?
>>
>>101574738
Yes, quite a few https://civitai.com/search/models?baseModel=PixArt%20E&sortBy=models_v9
>>
>>101574738
https://civitai.com/models/573014/900m-pixart-sigma
https://civitai.com/models/435669/bunline-2k1024512-pixart-sigma
https://civitai.com/models/575978/thelaboratory

https://civitai.com/models/593660/900m-pixart-sigma-animeonly
https://civitai.com/models/590464/anime-sigma-768px
https://civitai.com/models/490203/booru-madness

Could be more on other model sites desu
>>
>>101574738
whats your definition of interesting?
>>
>>101565003
Look guys, I JUST want to make porn of women in my life that I want to fuck but can't ever. Which AI is best for that?
>>
File: 0.jpg (308 KB, 768x1024)
308 KB
308 KB JPG
>>101573317
>>101573912
thanks
>>
>>101574941
Sigma still struggles with basic anatomy and Hunyuan looks like a render. Afraid you've coom to the wrong place
>>
>>101574941
https://civitai.com/images/20254452
>>
File: Sigma_10195_.png (1.42 MB, 1024x1024)
1.42 MB
1.42 MB PNG
>>
File: Sigma_10213_.png (1.41 MB, 1024x1024)
1.41 MB
1.41 MB PNG
>>
is it just me or is 1.5 better at photo realism than pony? im looking at sdxl checkpoints too and they look a bit off like pony does from the images people are making. but epicrealism for 1.5 is amazing
>>
File: Sigma_10230_.png (1.77 MB, 1024x1024)
1.77 MB
1.77 MB PNG
>>
File: Sigma_10239_.png (1.85 MB, 1024x1024)
1.85 MB
1.85 MB PNG
>>
File: Sigma_10246_.png (1.68 MB, 1024x1024)
1.68 MB
1.68 MB PNG
>>
File: Sigma_10249_.png (1.43 MB, 1024x1024)
1.43 MB
1.43 MB PNG
>>
>>101575120
XL is better when it's better but 1.5 has more face variety
>>
File: Sigma_10257_.png (1.73 MB, 1024x1024)
1.73 MB
1.73 MB PNG
>>
File: Sigma_10266_.png (1.86 MB, 1024x1024)
1.86 MB
1.86 MB PNG
>>
File: Sigma_10274_.png (1.69 MB, 1024x1024)
1.69 MB
1.69 MB PNG
>>
File: Sigma_10289_.png (2.02 MB, 1024x1024)
2.02 MB
2.02 MB PNG
>>
>>101571603
Can you catbox this? I've always liked cartoonish AI renders.
>>
File: Sigma_10290_.png (2.06 MB, 1024x1024)
2.06 MB
2.06 MB PNG
>>
File: Sigma_10303_.png (1.59 MB, 1024x1024)
1.59 MB
1.59 MB PNG
>>
File: Sigma_10312_.png (1.34 MB, 1024x1024)
1.34 MB
1.34 MB PNG
Good night anon
>>
>>101575120
>epicrealism for 1.5 is amazing
Not really
>>
>>101575531
NTA but red pill me - I am using epic realism SD 1.5 with usually pretty solid results with some proompting
>>
>>101575591
It'll give you marginal results that few will comment on and everyone will know you're using that model.
>>
>>101564986
probably should include >>101573058 since it makes it easier to use
>>
can someone convert a model into a gguf for me?
>>
>>101572866
how did it work out
>>
File: grid.jpg (1.58 MB, 1408x2688)
1.58 MB
1.58 MB JPG
>>
File: 16ch SDXL.jpg (508 KB, 1760x1176)
508 KB
508 KB JPG
>>
File: grid.jpg (1.53 MB, 2112x2688)
1.53 MB
1.53 MB JPG
>>
>>101576249
Does 16ch vae for 1.5 work yet?
>>
Someone please make Kolors text encoder 4bit
>>
File: 14175.png (825 KB, 832x1152)
825 KB
825 KB PNG
>>
>>101577194
How much memory would it consume if it were 4bit?
>>
>>101577264
5-6 GB
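rough idea of what that would look like with bitsandbytes; the repo/subfolder layout here is an assumption (check the Kwai-Kolors page), and whether the ChatGLM remote code plays nice with bnb is untested on my end:

import torch
from transformers import AutoModel, BitsAndBytesConfig

bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_compute_dtype=torch.float16)
enc = AutoModel.from_pretrained("Kwai-Kolors/Kolors", subfolder="text_encoder",  # assumed repo layout
                                quantization_config=bnb, trust_remote_code=True)
print(enc.get_memory_footprint() / 1e9, "GB")  # weights only; add a couple GB for activations on long prompts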
>>
official pixart bigma and lumina 2 waiting room
>>
File: 1721979588687225.jpg (140 KB, 1350x1290)
140 KB
140 KB JPG
>>101575213
>XL is better when it's better but 1.5 has more face variety
That's probably because SD1.5 is the only good model that was being trained with only real pictures
https://www.nature.com/articles/s41586-024-07566-y
>>
>>101577554
synthetic data and its consequences have been a disaster for local image gen, the worst part is the 10 star retards that train their model on SD1.5 and sdxl gens of all things.
>>
>>101577626
Can't agree more with you on this one, it's a disaster and we should avoid synthetic data like the plague when training models, this crap is literally poison.
>>
Kling is insane!
https://reddit.com/r/StableDiffusion/comments/1ecfu3g/girl_turning_around_for_the_camera_showing_her/
>>
>>101577684
will they be releasing the model weights?
>>
>>101577725
we'll never get a local model at this level, no way
>>
>>101577730
aw :(
>>
>>101577684
https://reddit.com/r/StableDiffusion/comments/1ecg544/a_girl_emerging_from_the_hot_springs_adjusting/
bro, hentai is gonna be amazing in the next few years
>>
File: ldg_intel_seems_bad.png (49 KB, 1494x294)
49 KB
49 KB PNG
hello /g/en friends
Is anyone running intel ARC for this hobby? How is it doing?

Was considering getting one but from the results shared in the op in picrel they seem pretty bad, but I thought they were meant to be good at AI workloads?
>>
>>101578365
If you haven't already got a GPU then buy an Nvidia one with as much VRAM on it as you can afford. A cheap option would be a 3060 12GB, which will be around 300ish new depending on where you live, less if used. Try not to go below 8GB VRAM if you can't afford something with 12GB or more on it.
>>
>>101578404
the thing is price performance for intel is better if taking gayming into account and i plan to make good use of intel qsv

so I'm wondering why they're so shitty in those benchmarks, is it just poor support for a first gen product or what?
>>
>>101577644
Anon remembers when we had a full thread (or two) about this very thing. LDG ahead of the curve as always.
>>
>>101578691
synthetic data will keep on haunting us until training data requirements get a lot cheaper, it's why even megacorps like meta train their llms on 15t tokens of synthetic reddit. image gen is in a much luckier place though, good images aren't that hard to scrape and stuff like pixart brings the requirements down significantly. what we really need is a good image captioner. meta.. where my multimodal llm... meta...
>>
>>101578815
>what we really need is a good image captioner. meta.. where my multimodal llm... meta...
I really think a great AI captioner will be the hardest thing to achieve, one that can also guess which character or celebrity is in the picture; if it can't do that, then we'll always have to rely on human labeling, and that sucks
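fwiw for plain captions (no character/celebrity names, like you said) a local captioner already gets you most of the way; minimal transformers sketch, the model id is just an example, swap in whatever VLM you prefer:

from pathlib import Path
from transformers import pipeline

captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-large", device=0)  # example model

for img in Path("dataset").glob("*.png"):
    caption = captioner(str(img))[0]["generated_text"]
    img.with_suffix(".txt").write_text(caption)  # sidecar .txt captions, what most trainers expect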
>>
>>101576249
VAEry interesting.. odd the expected scaling ratio isn't working for Ostris.. Cool that it works at all!
>>
>>101578437
>the thing is price performance for intel is better if taking gayming into account
I put two charts together (from this review of 7900xtx that includes an A770 amongst other gpus) https://files.catbox.moe/q040jh.png

Look at where the A770 and the surrounding cards are in 1080p gaming on the left chart and SD performance in the right chart. If you've got an intel CPU then that will also have hardware support for qsv.
>>
>>101578915
>we'll always have to rely on human labeling
Have you not seen the VLM advancements? Local is better than GPT4o in some cases with InternVL
>>
>>101578942
I mean, AI captioning has 2 issues right now:
- Doesn't seem to do NSFW or is really bad at it
- Doesn't add character/celebrity names to the pictures that need them, and that one is a huge issue
>>
>>101578815
The only issue with synthetic data is that we like our bias more than the transformer's. That paper is a "slippery slope" of recursively re-using only synthetic data generated without human oversight
>>
>>101578959
anon pls
>>
>>101578930
thanks for sharing
maybe this is a stupid question but "relative performance" is performance relative to price right?
>>
>>101578959
Human oversight will merely delay the inevitable. There is no need to use AI images in training.
>>
>>101578992
Microsoft wouldn't release Phi if synthetic data was toxic. Humans control the content, style, and flow of information there. This paper can be boiled down to: "It's proven harmful to let Jesus take the wheel"
>>
>>101579047
Every cellphone photo you use is an AI image. Every photoshopped image too
>>
I don't understand why anon brings up LLM training when discussing txt2img models. I couldn't care less if your ERP makes you ejaculate quicker if you train on AI sloppa.
>>
>>101579051
nta but nobody likes phi, it sucks. slopped to hell and back. gemma2, mistral nemo and mistral 2 large aren't gpt slopped at all and people love them.
>>
>>101572251
Sick!
>>
>>101579074
Can you elaborate on this because I think you're meaning to say many images already use AI enhancements but we are not talking about that, we are talking about completely synthetic images.
I want to refrain from labeling you a retard but you're making it very difficult.
>>
>>101579051
>microsoft wouldn't do dumb thing
kek good one
>>
>>101579089
It doesn't do well outside of the domain it was trained on. Synthetic data does induce that, which is why you don't let Jesus take the wheel.

>>101579121
>Can you elaborate on this

Every pixel is synthetically rendered by your cellphone. What you see rendered is an approximation of the world, no matter how happy you are with the result. Cellphone sensors are tiny garbage that let AI do the heavy lifting. Things are not shaped how they appear in the photo and it changes drastically without you realizing it. You can move the goalpost and say retard all you want. I don't see anyone suggesting only using RAW photos fully white-balanced, which is as close to non-synthetic as you're going to get. Literally the paper's point is that if you regurgitate data ad nauseam, then you will isolate the major features that the model biases to.
>>
>>101579206
No goal posts moved, it simply sounds like you're saying "it's okay to use dalle images in training because when you really think about it all images are AI". If that's not what you're saying please correct me.
>RAW photos fully white-balanced,
That would be interesting.
>>
And if that IS the case then YOU are the one moving
>its okay to use AI images in training because real images also have flaws
>well okay real flaws are better than AI flaws, but every real image is actually AI so it doesn't matter anyway
>>
File: 0.jpg (171 KB, 1024x512)
171 KB
171 KB JPG
>>
Training on synthetic data is a necessary part of dreambooth and is why it works.

>>101579242
That sounds sloppa af. I'm saying people speaking in absolutes about synthetic data being bad are missing the forest for the trees. Adding Dalle3 inputs won't ruin your model, but depending on some unknown resource will. You see the difference?
>>
>>101579041
>"relative performance" is performance relative to price right?
no, when sites like these do reviews for cpu or gpus the relative figure here is to the main product being reviewed. So for example in that chart the A770 performs in gaming at 39% of the main 7900xtx (which is why that says 100%).

If you want a performance per dollar then look at this page from the same review https://www.techpowerup.com/review/xfx-radeon-rx-7900-xtx-magnetic-air/32.html
>>
>>101579259
The best data is the data that fits our bias too. No matter the source, we're trying to get out what we expect.
>>
>>101579282
>7900xtx
Beware there are cuda dragons here
>>
>>101579308
sure but I wasn't suggesting him get a 7900xtx. I just grabbed the newest GPU review TPU did so I could show him their SD performance chart that shows a bunch of other cards because he's thinking about getting an Intel Arc 770
>>
>>101579320
You did good anon. I'm telling the newfag the full story
>>
>>101579272
>Training on synthetic data is a necessary part of dreambooth and is why it works.
Can you speak more on this?
>missing the forest for the trees
That's a fair point. A very fair point. But I've never seen a model trained with AI images that "looked good". Every dalle clone looks like utter ass. Moreover, you can blatantly see when a model is trained with a heavy use of AI images like https://civitai.com/models/477673/extramode-pixart-sigma
It seems like common sense to say "the more AI images used in training, the more it will look like AI", which at the end of the day is what we're all trying to avoid, no?
>>101579290
I think the majority bias prefers real images over slop. If you prefer the latter by all means go for it but let's not ruin everyone else's fun by including it in pretraining.
>>
>>101579272
>That sounds sloppa af.
more sloppa than training with le heckin meme ai images scraped from reddit and twitter?
>>
>>101579352
>dreambooth
The loss function in Dreambooth training typically consists of two main components:
a) Reconstruction Loss: This measures how well the model can reconstruct the input images of the target subject. It's calculated as the difference between the original image and the model's reconstruction of that image after adding and then removing noise.
b) Prior Preservation Loss: This helps maintain the model's ability to generate diverse images within the same class as the target subject.

The model aims to minimize a weighted sum of the reconstruction loss on the subject-specific images and the prior preservation loss on the class regularization images.

This balance is crucial:
* Too much focus on subject images can lead to overfitting, where the model can only generate the specific subject.
* Too much focus on regularization images can result in underfitting, where the model fails to capture the unique characteristics of the subject.
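in torch terms the objective above is roughly this (sketch only; the unet call and the noisy latents/targets are assumed to come from a normal diffusers-style training loop):

import torch.nn.functional as F

def dreambooth_loss(unet, noisy, t, cond, target, p_noisy, p_t, p_cond, p_target, prior_weight=1.0):
    # reconstruction loss on the subject images
    recon = F.mse_loss(unet(noisy, t, cond).sample, target)
    # prior preservation loss on the class regularization images (gens from the base model itself)
    prior = F.mse_loss(unet(p_noisy, p_t, p_cond).sample, p_target)
    # weighted sum: prior_weight trades off overfitting the subject vs. forgetting the class
    return recon + prior_weight * prior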
>>
>>101579426
You'll have to be more explicit with
>Training on synthetic data is a necessary part of dreambooth
because I don't see in your reply where that's implied.
>>
>>101579271
cool
>>
>>101579352
What I'm saying is that synthetic data has its uses. This paper is actually a good case against AGI being able to take over

>>101579491
All of the class images are gens from the model you're training (with the same caption). Improves fidelity and reduces forgetting when training against what the model already has.
>>
>>101564986
Any animatediff wizards here? Seems as though everything new for video doesnt involve animatediff, cant even use TensorRT with it. How do I speed things up? How do I use 2 controlnets without it taking hours to render 3 seconds? Any tips on dynamic movement? Seems even openpose struggles sometimes. I've used LCM, T2I, lowered frames, loading the smallest models, etc. still slow as all dicks
>>
>>101579520
Do you have an idea as to why https://civitai.com/models/477673/extramode-pixart-sigma looks like slop and https://civitai.com/models/435669?modelVersionId=665122 looks so much better other than the former uses more AI images in training than the latter?
>>
>>101579594
kek idk captions/learning rate/taste/etc. play a huge role. Maybe extramode just looks like what the author wanted it to and bunline looks like they wanted it to.

They both seem to be an improvement on the base model
>>
>>101579594
bunline is a tastier sounding name so it adds more flavor to the model
>>
>>101579282
>>101579320

Thanks for sharing all this, I was planning to pair with an amd cpu so I wasn't going to have intel qsv on there, but intel gpus seem to have a fair share of toothing problems normal in a first gen product. Terrible idle power and the SD performance isn't bad but it's underwhelming...


I'm not in a hurry to upgrade so I'll wait and see what nvidia 5000 and battlemage bring.
>>
localjeets will soon be training on kling outputs as they claim 1000 niggabytes worth of youtube videos isnt enough data.
just keep falling for the synthetic psyop!
>>
>>101579780
kek https://www.404media.co/runway-ai-image-generator-training-data-youtube/
>>
>>101579854
Non-paywall version https://archive.is/RAXi4
>>
File: ComfyUI_AuraFlow_00093_.png (1.38 MB, 1216x832)
1.38 MB
1.38 MB PNG
Pizza theft.
>>
File: 0.jpg (550 KB, 1736x1024)
550 KB
550 KB JPG
>>101579509
thanks
>>
>>101579890
Hoping Simo fixes AuraFlow
>>
File: Sigma_10011_.png (1.78 MB, 1024x1024)
1.78 MB
1.78 MB PNG
Good morning

>>101579594
ty, bunline author here. anyMODE took the feedback on extramode and made realmode. I probably have 20x more images as well, as he is much more selective.
>>
File: sdxl_14.jpg (227 KB, 1480x1328)
227 KB
227 KB JPG
>>
File: Sigma_09908_.png (1.59 MB, 1024x1024)
1.59 MB
1.59 MB PNG
>>
File: Sigma_09875_.png (1.91 MB, 1024x1024)
1.91 MB
1.91 MB PNG
Come to ldg Quokka anon. We know you aren't using saas to gen
>>
File: Sigma_09860_.png (1.86 MB, 1024x1024)
1.86 MB
1.86 MB PNG
>>
>>101578815
with image gen we get safety cucking instead, preventing the dataset from reaching its full potential
>>
File: Sigma_09827_.png (2.24 MB, 1024x1024)
2.24 MB
2.24 MB PNG
>>
File: Sigma_09819_.png (2.05 MB, 1024x1024)
2.05 MB
2.05 MB PNG
>>
File: Sigma_09813_.png (2 MB, 1024x1024)
2 MB
2 MB PNG
Bye for now
>>
>>101580159
morning anon
>>101580379
see you anon
>>
File: 116775249263549038-SD.png (2.65 MB, 1080x1600)
2.65 MB
2.65 MB PNG
hello anons
hows it goin?
>>
>>101580455
nothing much, just waiting. wbu? nice gen btw.
>>
File: 116775249263549043-SD.jpg (2.04 MB, 1496x1496)
2.04 MB
2.04 MB JPG
>>101580455
>>101580476
been a while since I visited here, chill weekend, hopefully get a nice rest after this hectic week, genrelated haha
>>
>>101580512
that's a really nice gen. catbox, if you don't mind?
>>
hello m u p p e t s
thread split, ahaha. well, whatever. good day and inspiration to everyone. also, I am back in business.
>>101580455
I am easing back into SD. just redid various comfy workflows, and learned a thing or two. yay. nice bazunevkahs !
>>
File: 00008-2152262941.jpg (327 KB, 1280x1384)
327 KB
327 KB JPG
>>101580713
>thread split,
Uh, where?
>>
>
>>
File: Sigma_09784_.png (2.02 MB, 1024x1024)
2.02 MB
2.02 MB PNG
>>101580455
Good about to grab lunch, hbu?
>>
File: Sigma_09713_.png (1.57 MB, 1024x1024)
1.57 MB
1.57 MB PNG
Om nom
>>
Comfy bred
>>
File: Sigma_09646_.png (1.87 MB, 1024x1024)
1.87 MB
1.87 MB PNG
>>
>>101580851
Cool gen
>>
>>101581956 (me)
>>101581674
Also cool gen
>>
File: 00001-2052907823.jpg (305 KB, 1304x1472)
305 KB
305 KB JPG
>>101581956
Thanks
>>
File: stablediffusion10.jpg (287 KB, 1552x1200)
287 KB
287 KB JPG
>>101581634
Talk about thin skinned
>>
File: Sigma_09382_.png (1.42 MB, 1024x1024)
1.42 MB
1.42 MB PNG
>>101581956
>>101580851
>Cool gen
Agreed

>>101580512
Also very good

>>101581967
ty
>>
File: Sigma_09381_.png (1.45 MB, 1024x1024)
1.45 MB
1.45 MB PNG
>>101582002
Don't say that too loud. He'll crumple if he hears you
>>
File: Sigma_09359_.png (1.68 MB, 1024x1024)
1.68 MB
1.68 MB PNG
>>
File: Sigma_09154_.png (1.43 MB, 1024x1024)
1.43 MB
1.43 MB PNG
>>
2k model printer goes BRRRRR
>>
>>101582358
2 iq here, what are you doing?
>>
>>101582375
Training the Sigma 2k model to make a new finetune of bunline
>>
>>101582404
>>101582358
did you notice a significant difference in quality between the "normal" one and the 2k one?
>>
File: Sigma_09137_.png (1.83 MB, 1024x1024)
1.83 MB
1.83 MB PNG
>>101582404
Forgot pic
>>
https://reddit.com/r/StableDiffusion/comments/1ecjpvw/combining_sd_animatediff_tooncrafter_viggle_and/
That's really impressive, probably one of the first short films that actually looks like it could be professionally produced.
>>
>>101582404
>>101582423
looks delicious
>>
File: Sigma_09116_.png (2.12 MB, 1024x1024)
2.12 MB
2.12 MB PNG
>>101582421
Yes, it's way more "clear", but not enough better to justify 4x the gen time imo. I'm mostly making sure the 2k bunline doesn't get left behind
>>
>>101582453
I think the combination of 2k and a 16ch VAE will make the hands way better, models just need more pixels to work with I guess, that's why it works fine when something is zoomed in, but not so much when it's far away
>>
>>101582431
It's not bad, but they need to learn Film 101: they break the 180-degree rule multiple times. What I'm really excited about is when some of the traditional 2D animators learn that AI can speed them up 10x. AI is most powerful right now in the hands of someone who can do key frames and wants some inbetweens.
>>
>>101582453
pixart is so good with water reflections, imagine what it could do with more parameters and a bigger dataset. bigma cannot come any sooner.
>>
bigma updates plz
>>
>>101582575
nuthin at all :(
>>
File: file.png (180 KB, 256x256)
180 KB
180 KB PNG
>>101582575
2girl, ballroom, formal dresses, fine_art
>>
>>101582593
spooky ghost face! two of em! EEK!
>>
File: file.png (142 KB, 256x256)
142 KB
142 KB PNG
>>101582614
It does lots of cursed images now. It does a lot of people like n64 / low poly where you have a low fidelity silhouette with a face glued on.
>>
>>101582593
>>101582633
More please and thank you
>>
>>101582633
it looks like she has layers to her face
>>101582593
this one is strangely beautiful, really interesting to see the model slowly start learning concepts. who is the mother?
>>
File: file.png (118 KB, 256x256)
118 KB
118 KB PNG
>>101582646
1girl, drow, silver hair, woman, beautiful, sinister, holding a dagger
>>
abstract posters are going to have a field day when this releases
>>
>>101582668
>who is the mother?
What do you mean?
>>
>>101582719
the model is like a little baby, you can be the father. can i be the mother?
>>
>>101582670
I love these so much
>>
>>101582737
I made it with the force
>>
File: Sigma_10202_.png (1.88 MB, 1024x1024)
1.88 MB
1.88 MB PNG
>>101582516
The SDXL VAE really eats up low-res faces, and more pixels definitely helps.

>>101582562
Yes please
>>
File: Sigma_09933_.png (528 KB, 1024x1024)
528 KB
528 KB PNG
>>101582593
>>101582633
>>101582670
HYPE
>>
>>101582770
kek
>>
>>101582794
It cost $200 in electricity last month to make those images.
>>
>>101582773
we already have a generalist 16ch VAE ready to be used, it just needs to be pre-trained further to fit into those new base models, I'm sure it'll happen for Sigma
>>
>>101582827
drop a link and ill throw you some coin (not bitcoin kek)
>>
>>101582845
I'll wait until I'm more official, it'd be nice to pull a Pony and get the scratch for an A100/H100 computer.
>>
File: Sigma_09901_.png (1.84 MB, 1024x1024)
1.84 MB
1.84 MB PNG
>>101582827
Way less than renting. LET'S GO LOCAL, LET'S GO

>>101582836
We shall see what Ostris can do with his VAE. If not him, pixart should support it in the next release.
>>
>>101582889
Ignoring the cost to build the dataset computer of course. I'm like $4500 in. But chump change for a dumbass hobby project like this. Guys my age / profession are puttering around in their fishing boats.
>>
File: Sigma_09884_.png (2.12 MB, 1024x1024)
2.12 MB
2.12 MB PNG
>>101582915
Borrowed parts from old mining rigs have gotten me far. That plus electricity will still likely have you under the costs of anything else being released
>>
>>101582981
prompt?
>>
>>101582997
kek literally this:
>>101569012 (You)
>the voices

>>101569379 (You)
Caio
>>
>>101583010
KEKE
>>
File: Sigma_09864_.png (2.14 MB, 1024x1024)
2.14 MB
2.14 MB PNG
>>
>>101582981
Yeah the main price I'm paying is time and my computer. I'm still doing captioning and downloading more images but the main dataset I want to train on is complete. But a lot of this is prep for the next model and getting everything before the internet / archives get nuked. Only a matter of time until you can't download anything in bulk.
>>
>>101582836
im sure there are people who know much more and i hope to be corrected
but from what i understand the problem is that a vae is two parts, an encoder and a decoder; when training, the images are encoded into the vae's latent space and the model is trained on those encoded latents
and then when you go generate with the trained model, the model generates latents, and those latents are then scaled with the decoder portion
so the damage is kinda already done because the model is trained with a fixed amount of dimensions, even if you modify the blocks/layers of the model or translate the outputs of the model to what the larger channel vae can understand
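toy sketch of the channel mismatch you're describing (the checkpoints are placeholders, but the shapes are the point: the unet/dit convs were built for 4 latent channels, the new encoder hands it 16):

import torch
from diffusers import AutoencoderKL

vae4 = AutoencoderKL.from_pretrained("old/4ch-vae")    # latent_channels = 4, what the model was trained on
vae16 = AutoencoderKL.from_pretrained("new/16ch-vae")  # latent_channels = 16

x = torch.randn(1, 3, 512, 512)
print(vae4.encode(x).latent_dist.sample().shape)   # (1, 4, 64, 64)
print(vae16.encode(x).latent_dist.sample().shape)  # (1, 16, 64, 64) -> the model's in/out convs no longer match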
>>
File: Sigma_09859_.png (1.95 MB, 1024x1024)
1.95 MB
1.95 MB PNG
>>101583036
>prep for the next model and getting everything before the internet / archives get nuked
Almost every day a new complaint about major AI companies scraping too hard on twitter
>>
File: Sigma_09831_.png (2.05 MB, 1024x1024)
2.05 MB
2.05 MB PNG
>>101583045
SD 1.5 and SDXL mapped over to 16ch nicely but Sigma gave issues.

>model generates latents, and those latents are then scaled with the decoder portion
They're scaled after encoding in Sigma (1). You should only need to train a few layers or an adapter. Ostris was experimenting with a different scale factor (2), which might be the magic we need to get cooking with 16 channels.
1. https://github.com/PixArt-alpha/PixArt-sigma/blob/master/train_scripts/train.py#L155
2. https://x.com/ostrisai/status/1816064676096086030
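for reference, this is roughly how those scale factors get derived in the first place: encode a pile of training images and take 1/std so the latents sit near unit variance. sketch only, the vae and dataloader are assumed to already exist:

import torch

@torch.no_grad()
def estimate_scaling_factor(vae, loader, n_batches=16):
    stds = []
    for i, images in enumerate(loader):
        if i >= n_batches:
            break
        z = vae.encode(images.to(vae.device, vae.dtype)).latent_dist.sample()
        stds.append(z.flatten().std())
    return float(1.0 / torch.stack(stds).mean())

# scaling_factor = estimate_scaling_factor(vae16, loader)
# then finetune the first/last blocks (or a small adapter) against z * scaling_factor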
>>
>>101583210
yeah what i mean is that its not that easy to convert already existing specific models like pony or animagine just by training the vae, since the weights of those models exist based on the originally encoded latents, so they would either need to be retrained completely or finetuned to get benefits of the new vae
>>
>>101579890
abomination
kek
>>
>>101583210
>>101583276 (me)
oh maybe i get what is going on now, but still wouldnt it be a kinda uphill battle to force the vae to do a lot of the work of improving the output of the models, since its much smaller and would probably be fighting against what is actually encoded there
>>
>>101582423
Why isn't the image 2k?
>>
>>101582562
current sigma (finetunes) are so good but always room for improvement
indeed, bigma cannot come any sooner
>>
File: Sigma_09808_.png (1.95 MB, 1024x1024)
1.95 MB
1.95 MB PNG
>>101583441
Can't say definitively what the 16ch problem is with Sigma or I'd have fixed it

>>101583506
The local anon curse. Training and no VRAM left to gen with. Now's a good time to validate though
>>
File: Sigma_10315_.jpg (1.4 MB, 2048x2048)
1.4 MB
1.4 MB JPG
>>101583506
Needs more steps but here
>>
File: Sigma_10323_.jpg (2.65 MB, 2048x2048)
2.65 MB
2.65 MB JPG
Trees are still fuzzy. We have a while to go yet
>>
File: Sigma_10327_.jpg (1.16 MB, 2048x2048)
1.16 MB
1.16 MB JPG
And the .png's for 2k are always above the image size limit
>>
>>101584232
could your make one of these pictures but prompt for it to be made out of nuts or something?
>>
File: Sigma_10328_.jpg (920 KB, 2048x2048)
920 KB
920 KB JPG
>>101584252
>>
>>101584289
thats nuts
>>
>>101584289
im going this lesbian couple
>>
>>101584306
to eat
>>
File: Sigma_10330_.jpg (898 KB, 2048x2048)
898 KB
898 KB JPG
>>101584298
lul

>>101584306
A detailed photo of two female souls joining as one made of nuts. The background is the sky
>>
>>101564986
Local or bust
>>
>>101584605
a nut
>>
I tried using 1.5 for the first time (I started on Pony), and it's so fucking shitty.

I wanted to use it because there's so many celebrity loras on civitai, however after this experience I'm quite confident you should just delete them all off the website.
>>
>>101583916
very cool colors
>>
>>101576120
>>101576838
And this is also an awesome style
>>
File: Sigma_10335_.jpg (2.33 MB, 2048x2048)
2.33 MB
2.33 MB JPG
>>101584783
Sigma does better colors w/ SDXL VAE than SDXL can. Doesn't make sense
>>
File: it per s.jpg (160 KB, 1082x475)
160 KB
160 KB JPG
what causes wild changes in generation speeds? same model and sampler for all
>>
Getting near the end of the thread again, oven fresh bread ready to roll...

>>101585073
>>101585073
>>101585073
>>
>>101577684
>>101577754
any saves? it's gone
>>
>>101585035
If that's Windows, shared VRAM?
>>
>>101585127
We could have had it all!
>>
>>101585035
a1111 does this to me sometimes, using a controlnet basically guarantees the slowdown, after that my gens will take like 10 minutes and i need to restart the server
>>
>>101585127
Rolling in the deep
>>
>>101585234
yeah it is on windows. is there a recommended way to force stable diffusion to stick to dedicated vram instead of spilling into shared memory or something?
>>
>>101583916
So it's 2k native but also can gen 1024 without problem?
>>
any starter video card suggestions? I have a 1050 ti which obviously isn't working out with 4gb vram. hopefully less than $500
>>
>>101588327
I think it's still typically a 3060 12GB.
>>
>>101588826
thanks, was looking at that one. The used price is just a hair below new though...
>>
File: 1709165416199169.png (1.15 MB, 1280x720)
1.15 MB
1.15 MB PNG
>>
File: 1695882937161397.png (929 KB, 1280x720)
929 KB
929 KB PNG
>>
File: 1713984758917085.png (1.17 MB, 1216x832)
1.17 MB
1.17 MB PNG
>>
File: 1721131118966184.png (1.12 MB, 1312x736)
1.12 MB
1.12 MB PNG
>>
File: 1719352799054938.png (995 KB, 1280x720)
995 KB
995 KB PNG
>>
File: 1699565792909087.png (1000 KB, 1216x832)
1000 KB
1000 KB PNG


