/g/ - Technology




File: tmp.jpg (1.12 MB, 3264x3264)
General dedicated to creative use of free and open source text-to-image models

Previous /ldg/ bread : >>101329150

>Beginner UI
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
ComfyUI: https://github.com/comfyanonymous/ComfyUI

>Auto1111 forks
SD.Next: https://github.com/vladmandic/automatic
Anapnoe UX: https://github.com/anapnoe/stable-diffusion-webui-ux

>Kolors
https://gokaygokay-kolors.hf.space
Nodes: https://github.com/kijai/ComfyUI-KwaiKolorsWrapper

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Animation
https://rentry.org/AnimAnon
https://rentry.org/AnimAnon-AnimDiff
https://rentry.org/AnimAnon-Deforum

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>View and submit GPU performance data
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Share image prompt info
https://rentry.org/hdgcb
https://catbox.moe

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>
Blessed thread
>>
File: PA_0071.jpg (923 KB, 2560x1536)
>>101344420
Post https://imgsys.org/rankings
next time.
>>
File: 0.jpg (708 KB, 2048x1024)
>>
https://github.com/lllyasviel/Paints-UNDO/issues/7

STOP THIS RIGHT NOW. TAKE THIS DOWN
EVIL
>>
>>101344943
ELI5?
>>
>>101344968
>Someone creates model that makes 30 second videos that fake steps to drawing an image
>Artists pissing farting and cumming with rage because they used the steps in making the art as a way to certify the image was not AI generated
>>
>>101345027
Nice...
>>
File: PA_0001.jpg (792 KB, 3328x1152)
>>
File: file.png (3.86 MB, 1408x2816)
>cumming with rage
>>
File: file.jpg (1.43 MB, 1408x2816)
>>
File: PA_0007.jpg (757 KB, 3328x1152)
Which one is better. Type 1
>>
>>101345100
>>101345109
true art
>>
File: PA_0005.jpg (663 KB, 3328x1152)
>>101345169

Type 2
>>
>>101345169
>>101345182
type 1 but you should combine elements of both in post to get the best of both worlds
whats the difference workflow wise?
>>
File: PA_0008.jpg (627 KB, 3328x1152)
>>101345197
about 40 seconds extra for type 1
>>
File: PA_0009.jpg (590 KB, 3328x1152)
I mean Type 2
>>
File: PA_0016.jpg (466 KB, 2560x1536)
Type 1
>>
File: PA_0017.jpg (443 KB, 2560x1536)
Type 2
>>
CHOOSE YOUR DIABETES
>>
>>101345302
This one
>>
File: PA_0029.jpg (584 KB, 2560x1536)
>>
File: PA_0028.jpg (526 KB, 2560x1536)
>>
>>101345422
>>101345431
Not going to specify which one is what. Choose one please.
>>
>>101345450
kek first one is overly detailed to the point of being wonkier
>>
File: kolors_00532_.png (1.49 MB, 1024x1024)
>>
File: PA_0031.jpg (523 KB, 2560x1536)
>>101345450
Well then, now I need to figure out how to shave off 40 seconds and fix this problem every 2nd image
>>
File: file.png (108 KB, 256x256)
>>101344629
We simply just wait for other models, AI art models aren't going out of fashion. I do believe Pixart is going to be the 3B/7B model of the local AI art space, it has a ton of potential and is attainable to train. Things are progressing with my 1.3B from scratch model even though it's slow. But keep in mind with the 50 series coming soon it will be even easier to train with local hardware. We used to wait years for improvement in AI stuff. I still remember when the first art models came out and how crude they were.
>>
>>101345644
Hype
>>
File: Grid.jpg (2.01 MB, 4096x1920)
>>
File: PA_0033.jpg (573 KB, 2560x1536)
>>101345498
Fixed. Lost 3~5% quality, gained 10 seconds.
>>
File: PA_0036.jpg (561 KB, 2560x1536)
>>
File: PA_0038.jpg (861 KB, 3328x1152)
>>
File: PA_0040.jpg (950 KB, 3328x1152)
Night everybody!
>>
>>101345981
bai
>>
File: ComfyUI_00808_.png (3.36 MB, 1096x2392)
>>
>get bad hands
>add negative prompts to help
>get good hands
>with 6 fingers
It's like it's mocking me.
>>
File: ComfyUI_00815_.png (3.48 MB, 1096x2392)
>>
>>101345182
>>101345169
Type 1 feels more ethereal, eldritch, otherworldly, etc. Type 2 feels like a cringy band t-shirt or something.
>>
I see you kek
>>
>>101346035
Remove the extra with gimp or IOPaint
>>
File: file.jpg (1.27 MB, 1920x2048)
>>
>>101346069
kek
>>
>https://github.com/GAIR-NLP/anole
they finetuned the image gen capabilities back to meta's chameleon model. i am not sure if these models can be quanted like llms.
>>
>>101346069
TJD

>>101346613
>We have provided open-source model weights, code, and detailed tutorials below to ensure that each of you can reproduce these results, and even fine-tune the model to create your own stylistic variations.
Love it
>>
small dataset maybe 30-40 images, high res images, sometimes very wide or tall ratios.

for Lora training, is it worth manually zoom/cropping 1:1, 3:4, and 4:3 to capture more detail and artificially increase dataset? If so, any other ratios I should consider cropping to?
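
For anyone wanting to try the crop idea, here is a minimal sketch of how the crop boxes could be computed. It assumes centered crops and only the three ratios named above; `center_crop_box` and `crop_plan` are made-up helper names, not part of any trainer.

```python
def center_crop_box(width, height, ratio_w, ratio_h):
    """Return (left, top, right, bottom) of the largest centered crop
    with aspect ratio ratio_w:ratio_h inside a width x height image."""
    target = ratio_w / ratio_h
    if width / height > target:
        # Image is wider than the target: keep full height, trim the sides.
        crop_h = height
        crop_w = round(height * target)
    else:
        # Image is taller than (or equal to) the target: keep full width.
        crop_w = width
        crop_h = round(width / target)
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)


# Ratios taken from the question; add more here if you want to test others.
RATIOS = [(1, 1), (3, 4), (4, 3)]


def crop_plan(width, height):
    """Map each ratio label to its centered crop box."""
    return {f"{w}:{h}": center_crop_box(width, height, w, h) for w, h in RATIOS}


plan = crop_plan(4000, 1000)  # a very wide source image
```

The boxes use the (left, upper, right, lower) order, so they can be handed to an image library's crop call. Centered crops are only one choice; for very wide or tall images, sliding the box along the long side would capture more of the picture.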
>>
official pixart bigma and lumina 2 waiting room
>>
>>101345644
the fact this is possible at all is incredible. good stuff
>>
>>101346792
Soon
>>
>>101345644
I heard a rumor about the 5090 only having 28GB VRAM, hope it has more but I don't expect good things.
>>
File: file.jpg (1.12 MB, 2688x1536)
>>
>>101346758
It's best to have most or all of the images in the same resolution. Another way to artificially increase the dataset is to do img2img at low denoise. The more dissimilar the images are, the more training time is required in order for the concept to be learned correctly.
>>
File: file.jpg (932 KB, 2688x1536)
>>
I KEEP MAKING THE OP COLLAGE ITS NUTS WOOOO
>>
File: stablediffusion12.jpg (323 KB, 1552x1200)
>>
File: file.jpg (1.12 MB, 1920x2048)
>>
>>101346613
>>101346731
Anyone tested it yet?
>>
can it finally make good hentai?
>>
When I was little I was told I was bad at drawing. Of course I was never told you could learn. Anyway, I often daydream about a machine that would read my thoughts and generate images. And now that dream is basically reality.
It's so strange when I think about it. It felt impossible but some people were able to make it possible.
>>
>>101348072
been able to for awhile now
>>
>>101346613
>Specifically, Anole-7b-v0.1 was developed using a small amount of image data (5,859 images, approximately 6 million image tokens) and was fine-tuned on just a few parameters (less than 40M) in a short time (around 30 minutes on 8 A100 GPUs). Despite this, Anole-7b-v0.1 expresses impressive image generation capabilities.
woah..
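
A quick back-of-the-envelope check of the quoted numbers, as a sketch. It uses only the figures in the quote; the 6 million figure is "approximately", so the results are rough.

```python
# Figures from the quoted Anole-7b-v0.1 description.
images = 5859
image_tokens = 6_000_000   # "approximately 6 million image tokens"
gpus = 8
hours = 0.5                # "around 30 minutes"

tokens_per_image = image_tokens / images   # roughly 1024 tokens per image
gpu_hours = gpus * hours                   # total A100-hours spent

print(f"~{tokens_per_image:.0f} tokens per image, {gpu_hours} A100-hours")
```

That works out to about 1024 tokens per image and 4 A100-hours in total. The per-image number looks like a fixed token budget per image, but that is an inference from the arithmetic, not something the quote states.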
>>
>>101348982
They made a LoRA with ~6k images in 30 mins + merged it? lul
>>
File: 0.jpg (506 KB, 1024x1024)
>>
File: PA_0001.jpg (637 KB, 2048x2048)
Morning
>>
>>101347109
The other rumor is it's 50% faster than the 4090. The 4090 is significantly faster than the 3090. If they have the same leap, that alone is insane. It's not all about VRAM and nowadays you can shard your extremely large models but no one really does that and certainly not faggots like SAI that want to ensure there's an 80GB minimum on fine tuning.
>>
File: PA_0007.jpg (1002 KB, 3328x1152)
>>
File: PA_0016.jpg (907 KB, 2432x1664)
>>
File: PA_0021.jpg (1024 KB, 2432x1664)
>>
File: PA_0027.jpg (582 KB, 2432x1664)
>>
File: 00006-1977336930.jpg (255 KB, 1576x1120)
>>
File: PA_0031.jpg (744 KB, 2432x1664)
>>
RIP
>>
>>101350932
https://www.ft.com/content/7b8d2057-2687-45b3-bae4-1488a75ac5b2

Might be good news
>>
>>101351012
AMD could, you know, pay several developers $5 million and just get CUDA compatibility.
>>
>>101349182
neat
>>
Does anyone have a good rotoscoping tutorial? I want to understand it rather than the minimum to get it to work.

>>101349603
>not all about VRAM and nowadays
>you can shard your extremely large models
>SAI that want to ensure there's an 80GB minimum

I feel like you wanted to have 3 conversations here.

>>101351034
nvidia would hire them away when the project was at 70%
>>
>>101352063
I can have a multifaceted point about this.

Large models are unnecessary and has been shown multiple times now, a small model can beat a large model in a single domain/discipline. Large models excel at being jacks of all trades but master of none and it's done with huge diminishing returns. A 2B model trained only on anime is going to be better at anime than Dalle-3.

Sharding is efficiency, you don't need to make something bigger just because you introduce efficiency. This means you could train your smaller model faster with the same hardware.

SAI is a shit company that purposely hamstringed local AI training.
>>
>>101352063
>nvidia would hire them away when the project was at 70%
Not everyone is as unscrupulous as you and quit projects just because they got an opposing bid. I realize you'll sell your mom for a dollar but that's just projection.
>>
>>101352206
>Large models are unnecessary and has been shown multiple times now, a small model can beat a large model in a single domain/discipline......
Agreed. Too bad investors can't get behind the idea of doing one thing well, instead of adapting the mega thing to do the one thing.

>Sharding is efficiency, you don't need to make something...
you have a very specific definition of efficiency. Sharding decreases hardware efficiency to save you on human efficiency.

>SAI is a shit company
relax dude, we all know, there is nothing new that can be said at this point.

>>101352227
Your entire post was projection. It's okay anon, life will get better.
>>
>>101352361
>Your entire post was projection. It's okay anon, life will get better.
Life must suck as a nihilist assuming the worst of people all the time.
>>
>>101352385
Your entire post was projection. It's okay anon, life will get better.
>>
>>101346029
>>101346051
neato
>>
>>101349605
>>101349705
I like
>>
what image model do you guys suggest for western fantasy art, I'm talking like mtg art. I want to generate goblins and dragons.
>>
>>101349603
Just use tensorrt for comfyui on your favourite model for a 50% increase in gen speed. The current version has a memory leak but it's not terrible and will get more dev time allocated if more people use it.
>>
>>101353175
No one cares about generation speed.
>>
>>101353189
So you're rocking along using your cpu, good for you.
>>
>The powerful LI-DiT-10B will be available through the online platform and API after further optimization and security checks.
bros.....
>>
>>101353202
This conversation is not about inference speed, but it's cute you use some distilled model anon to generate corpo approved images.
>>
>>101352206
I don't want to have to juggle a mess of different models and loras to synthesize multiple disparate concepts. No future there.
>>
>>101344943
>wastes time training a model that has no other purpose than to flood the internet with fake process videos
SAARS HE IS BASED ! LOOK AT HOW MAD THE ARTIST
>>
ded
>>
>>101354077
work, sleep, work, sleep.. No time to even gen anything
>>
Generating for sake of just Generating is not good practice
>>
>>101344420
The girl in the middle looks hot

Any more gens of her?
>>
>>101355203
You solved a captcha to post this
>>
>>
File: 0.jpg (369 KB, 832x1216)
>>101351567
thanks
>>
File: oeoeoo_00348_.png (2.18 MB, 1280x1280)
Really frustrating that SD3 basically refuses to produce figures that aren't in a neutral standing position facing forward or backward. It can produce really interesting detail and actually does know a lot of significant artists' names/styles.
>>
File: PA_0036.jpg (513 KB, 2560x1536)
>>101355265
This one?
>>
File: 00000-3064829863.png (2.45 MB, 1248x1824)
>>
File: 00001-1477610561.png (1.77 MB, 1216x1216)
>>
File: PA_0038.jpg (741 KB, 1664x2432)
>>101356894
>>
File: PA_0039.jpg (778 KB, 1664x2432)
>>
>>101356027
Prompt?
>>
File: PA_0041.jpg (725 KB, 1664x2432)
>>
File: 00003-3792186198.png (2.4 MB, 1248x1824)
>>101358206
>>101358206
Kinda reminds me of game of thrones, only missing the purple eyes but in the tv series they all blue eyed anyway kek
>>
File: SDXL_0004.jpg (754 KB, 1664x2432)
>>101359185
We'll never see winds of winter release date.
>>
File: 00006-3089850714.png (1.09 MB, 832x1216)
>>101359279
I dunno, depending on grrm we could at least see what he wrote for Winds of winter, a dream of spring is a pipedream though so I gave up on that at least house of the dragons is kinda nice even if I disagree with many things, the velaryons being one of them but far from the main one
>>
File: SDXL_0008.jpg (399 KB, 1664x2432)
>>101359463
Refuse to watch that on the grounds of how GoT ended
>>
>>101355203
You Generate for fame and glory?
>>
File: SDXL_0009.jpg (623 KB, 1664x2432)
>>
File: SDXL_0013.jpg (661 KB, 1664x2432)
>>
File: SDXL_0015.jpg (697 KB, 1664x2432)
>>
File: SDXL_0016.jpg (648 KB, 1664x2432)
>>101359791
>>
File: SDXL_0017.jpg (873 KB, 1664x2432)
>>
File: SDXL_0018.jpg (860 KB, 1664x2432)
>>
File: SDXL_0019.jpg (801 KB, 1664x2432)
>>
File: 00014-2354484914.png (1.4 MB, 1024x1024)
>>
File: SDXL_0021.jpg (722 KB, 1664x2432)
>>
File: SDXL_0025.jpg (1.01 MB, 1664x2432)
>>
File: SDXL_0026.jpg (610 KB, 1664x2432)
>>
File: SDXL_0027.jpg (677 KB, 1664x2432)
>>
File: PA_0053.jpg (961 KB, 2560x1536)
>>
File: PA_0055.jpg (942 KB, 2560x1536)
>>
File: PA_0056.jpg (1.13 MB, 2560x1536)
>>
File: PA_0058.jpg (1017 KB, 2560x1536)
Descending into madness
>>
File: PA_0064.jpg (927 KB, 2560x1536)
>>
I've become convinced that "synthetic data" is how SD3 ended up fucked. Especially the CogVLM captions.

I've been fucking around with image-to-text, and all the "best" models have the same kinds of problems. It's always "The image depicts X. The overall composition of Y suggests a tone of Z." That's not a damn caption, it's a conversation with a digital assistant trying to hit a word count, and typically making shit up to get there. Cut it all down to just the "X" part, and you'd be fine.

So I eventually just found myself using BLIP-2. It's ancient, by the pace we're going at, but it clearly wasn't trained on output from some LLM, it doesn't hallucinate, it doesn't try to hold a conversation, and it isn't overly verbose. I doubt I'll ever find another model that straight-up generates "an anime girl is getting fucked by several men".
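
The "cut it down to just the X part" idea can be sketched as a small post-processing step for chatty captioner output. This is an assumption-heavy sketch: the opener patterns are guesses at typical boilerplate, `trim_caption` is a made-up name, and the sentence split is naive (it will trip on abbreviations like "Dr.").

```python
import re

# Assumed boilerplate openers; extend for whatever your captioner emits.
_OPENER = re.compile(
    r"^(the|this)\s+(image|picture|photo)\s+(depicts|shows|features)\s+",
    re.IGNORECASE,
)


def trim_caption(text):
    """Keep only the first sentence and strip the 'The image depicts' opener."""
    # Split once on whitespace that follows sentence-ending punctuation.
    first = re.split(r"(?<=[.!?])\s+", text.strip(), maxsplit=1)[0]
    first = _OPENER.sub("", first)
    return first.rstrip(".!?").strip()


verbose = (
    "The image depicts a red dragon perched on a cliff. "
    "The overall composition of warm colors suggests a tone of menace."
)
short = trim_caption(verbose)
```

This only removes padding. It cannot fix a first sentence that is already hallucinated, so it is no substitute for a captioner that is terse to begin with.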
>>
File: PA_0065.jpg (875 KB, 2560x1536)
>>
>>101360663
The problem is they trained *only* on long synthetic captions when in reality it should be a mix of short, medium and long captions.
>>
>>101360663
If you want a verbose model:

https://civitai.com/models/502468
>>
File: PA_0076.jpg (1.32 MB, 2560x1536)
>>101345100
>>101345109
I hope you enjoy this piece
>>
File: PA_0078.jpg (1.11 MB, 2560x1536)
Good night.
>>
File: 00000-2010508896.png (3.2 MB, 1536x1536)
Good night
>>
Prompt challenge?
In the course of centuries, Man has devoured the Earth itself. The Machine Age has dried up the seas of oil. Industry has consumed the heartlands of coal. The Atomic Age has plundered the rare elements — uranium, cobalt, plutonium — leaving behind worthless deposits of lead and ashes. Starvation is at hand. Only here, in the void of space, is there a new source of atomic power. Above us, in the debris of the solar system, in the meteorites and asteroids, are the materials needed to drive the reactors. Yet in their distant, silent orbits, these chunks of matter are beyond the reach of man, beyond the reach of human hands, but not beyond the reach of human minds. Driving along a country road in an ordinary car is a modest man: Harold J. Finley, quiet and profound...
>>
File: 1720670820684.jpg (83 KB, 626x626)
>>101344968
I'm diffoosin
>>
What are the best prompts for benchmarking/comparisons?
>>
>>101360663
synthetic data is an utter plague upon this field and has been horribly overused beyond applications where it makes sense.
>>
File: 00064-1783398307.png (1.89 MB, 1024x1280)
This is jungle fever

testing stealth PAPA NOVEMBER GOLF info transmission on channel Alpha

please acknowledge
>>
SD3 or Kolors?
Both lobotomized?
>>
>>101361645
catbox or prompt
>>
File: 13212313212.png (51 KB, 1321x595)
>>101361714
repeat - info transmission on channel alpha
>>
File: Untitled.png (54 KB, 408x1080)
what the FUCK, the Crystools node is slowing down all my generations by 4.5 seconds.

Uninstall that shit right now if you've got it.
>>
>>101361754
https://github.com/neggles/sd-webui-stealth-pnginfo
>>
File: ComfyUI_Kolors_00892_.png (1.84 MB, 1216x832)
>>
>>101360779
I like >>101360814 this one :D
>>
File: zzz9_00011_.jpg (181 KB, 1200x1200)
I made a super nice Mariana Cordoba Lora but civtai won't host it because it's a real person with nudity
but she's fucking dead!
only a dozen or so people got to delete it before mods yeeted it
is there any non-scared of lawsuits from dead people places to host the file?
>>
>>101360582
These ones are cool too
>>
>>101363509
aitracker.art
>>
>>101360663
It's not so much the captions but the images themselves being synthetic. Pixart uses LLM captioned images and works quite well.
But SD3 is more than that, from insiders: the safety training is what fucked it up the most.
>>
>>101360138
I want to have her babies
>>
>>101363509
it's kinda funny how civitai essentially becomes a hentai only website because of rules like this.
Like AI just becomes completely synonymous with hentai, all the professional researchers and scientists become associated with hentai, with billions of dollars going into funding the ability to make better and better hentai.
>>
>>101362753
how do you get cpu, ram, etc stats like that
>>
>>101363938
Those stats are all added by Crystools. Sorry I wasn't clear. But yeah those bars reporting data slow shit down.

I've also noticed other nodes slowing shit down, like the comfyui-profiler. You don't want to be running that all the time either.
>>
File: file.jpg (1.37 MB, 2176x1920)
>>
>>101360663
>So I eventually just found myself using BLIP-2.
Same here 2.7b version + wd v3 tagger. I combine interrogations and it has been really good
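
A minimal sketch of what "combining interrogations" could look like: one natural-language caption plus a list of booru-style tags merged into a single training caption. The merge rule here (caption first, then tags that the caption does not already contain) is an assumption, not necessarily what the post above does, and `combine_interrogations` is a made-up name.

```python
def combine_interrogations(caption, tags, max_tags=20):
    """Merge a sentence caption with tagger output into one caption string."""
    caption = caption.strip().rstrip(".")
    caption_lower = caption.lower()
    seen = set()
    kept = []
    for tag in tags:
        # Normalize booru-style tags: lowercase, underscores to spaces.
        norm = tag.strip().lower().replace("_", " ")
        if not norm or norm in seen:
            continue
        seen.add(norm)
        # Skip tags the caption already states verbatim.
        if norm in caption_lower:
            continue
        kept.append(norm)
        if len(kept) == max_tags:
            break
    return ", ".join([caption] + kept)


merged = combine_interrogations(
    "a girl with long hair standing in a forest.",
    ["1girl", "long_hair", "forest", "outdoors", "1girl"],
)
```

The substring check is crude (a tag like "cat" would match inside "catalog"), so a word-boundary match may be worth swapping in for larger datasets.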
>>
File: PA_0002.jpg (1.27 MB, 2560x1536)
>>101363484
>>
File: 00014-3613041320.jpg (277 KB, 1400x1400)
>>
>>101364657
mind dropping a workflow?
>>
File: 33380292.jpg (566 KB, 2048x1024)
>Kolors
>chink models are the future
well, so be it then, I fully embrace our new chink ai overlords
>>
File: 0.jpg (127 KB, 1024x512)
>>
File: 00442-3037234057.jpg (967 KB, 1440x1920)
>>101365232
I use this
>https://github.com/jhc13/taggui/
it also supports internlm-xcomposer2-vl-7b-4bit if you have beefy gpu and want to tag porn
>>
File: 0.jpg (672 KB, 2048x1024)
>>
File: 1girl_marin.jpg (56 KB, 512x768)
takigawa marin
>>
Hello anon
>>
>https://github.com/derrian-distro/LoRA_Easy_Training_Scripts
Dev branch is great, highly recommend
>>
Any ideas why LDG is so comfy compared to other AI threads?
>>
File: 1692594552804026.jpg (1.96 MB, 9367x1050)
how do i do a 2d grid for random seeds in swarmui? i did my cfg 4.5,5,..,7 and now i want to dig through say 4 random sneeds for each. ofc i can set seeds for the second field but that aint random. the image count in the default field next to cfg doesnt apply to grids.
>>
>>101360814
>>101363414
>>101364169
>>101365428
>>101359883
>>101359829
cool
>>
>>101366808
Does swarm allow you to set grid seeds to -1?
>>
>>101366861
that works but kinda defeats the purpose because it generates a random sneed every time instead of running cfg permutations of each random number.
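
The behaviour being asked for (pick N random seeds once, then run every CFG value against each of them) can be sketched outside the UI like this. It is a generic sketch, not SwarmUI's API: `build_grid` is a made-up helper, and the printed seeds would still have to be pasted into the grid's seed axis by hand.

```python
import itertools
import random


def build_grid(cfg_values, n_seeds, rng=None):
    """Draw n_seeds random seeds once, then pair each with every CFG value."""
    if rng is None:
        rng = random.Random()
    # Seeds are drawn up front so every CFG value reuses the same ones.
    seeds = [rng.randrange(2**32) for _ in range(n_seeds)]
    return [
        {"seed": seed, "cfg": cfg}
        for seed, cfg in itertools.product(seeds, cfg_values)
    ]


cfgs = [4.5, 5.0, 5.5, 6.0, 6.5, 7.0]
# A seeded Random makes this example repeatable; drop it for fresh seeds.
jobs = build_grid(cfgs, n_seeds=4, rng=random.Random(0))

print(sorted({job["seed"] for job in jobs}))  # seeds to paste into the grid
```

With 6 CFG values and 4 seeds this yields 24 jobs, and each row of the grid compares CFG on an identical seed, which is the part that setting the seed to -1 loses.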
>>
Is there any promising new model with 16ch VAE?
Now that I understand the difference it makes with SD3, I don't want to go back. Too bad SD3 is so uncooperative.
>>
>>101363509
Just upload only non-nude images with the LoRA. There are tons of pornstar and celebrity LoRAs on civit, and everyone knows what they're used for.
>SFW only boys haha wink wink nudge nudge
Sure, no online genning, but I'm sure local genners would still want it.
>>
>>101363509
They could be sued by an estate of a dead person, I mean at least that's as likely as any other lawsuit

I thought nudity in the training data was fine, you just can't upload generations with nudity
>>
>>101367445
1.5 and XL versions. Last I heard they were /close/ to being finished.
>>
>>101368071
>1.5 + 16ch vae
Please I'll take it
>>
>>101366806
No avatars.
>>
>>101345027
Those are fake artists anyway with their undo button and unlimited colors.
I'm an oil painter and I love diffusion because it makes me painting ideas
>>
>>101368911
>I'm an oil painter and I love diffusion because it makes me painting ideas
Next month I have to go photograph one dude's whole gallery and make a finetune/LoRAs out of it. I'm free to use that dataset any way I want, which is nice
>>
File: 00000-3904996473.jpg (1.82 MB, 1960x2620)
>>
>>101369009
>photograph
No scans available?
>>
File: 00001-3904996473.jpg (1.17 MB, 1960x2620)
>>
>>101369198
>No scans available?
Nope. 20+ framed paintings. Tripod + camera will do, it should be alright. Decent natural light
>>
>>101344420
>2 days ago
Isn't it time to wrap up the splitting experiment?
>>
>>101369748
I don't want to associate with the faggots in SDG. If you want to hang out with the middle schoolers sharing TikTok dances go ahead.
>>
>>101365332
Thank you and welcome
>>
>>101369748
>>101355203
>>101355492
>>
>>101369773
I don't frequent imagegen and I pop in these threads time to time when I feel like baking pictures. It's not really obvious to me what's wrong with sdg, and I don't see any tiktok dances even in a non literal sense. But when I post it's probably going to be in one general or the other, can you make a case why this exists? It seems like a slower general that absorbs some quality discussion. I also see charges against sdg in ldg but not the other way around.
>>
>>101370091
forgot to include, not that anon. I'm obviously asking what is the reason for "the splitting experiment"
>>
>>101370091
The general is where all the high IQ anons are. SDG is a bunch of faggots spamming 1girls. You wonder why their is quality discussion here and not there? Can you apply yourself?
>>
>>101370113
SDG is a circle jerk around one specific image generation company (SAI). They are actively hostile to other models even though a year from now no one will remember SAI because they will be irrelevant.
>>
>imagegen
>>
>>101370116
>You wonder why their is quality discussion here and not there?

I said the first part of that sentence, not the second half

I also see an equal amount of butthurt even when asking polite questions in a charitable way. I think you've explained everything, thanks
>>
File: 1717204019374258.webm (2.43 MB, 856x1280)
I've been playing around with liveportrait and it works really well as long as the head does not move. wish they could figure out incorporating head movements though.
>>
>>101370149
You'd have to be pretty stupid to look at the generic 1girl slop and Discord-tier social fagging in sdg and think it's equal to here.
>>
File: 00003-3904996473.jpg (1.44 MB, 1960x2620)
t: architect
>>
>>101370130
This is not true or outdated. I've seen nothing but contempt for their censoring and ruining SD3 with it, and talent is jumping ship
>>
>>101370176
Haha okay, anyways, you can go hang out in SDG now. This general isn't going away.
>>
>>101370162
Nice. Very interesting.
>>
>>101370167
I'd rather scroll past low quality content than see butthurt that fits on screen
>>
>>101370198
You can hang out with your underaged banned buddies now having middle school tier conversations.
>>
>>101370182
It's fine that this general exists, I just wanted to know if it was worth typing ldg in the catalog
>>
>tfw memeanon isnt here to point out b8 posts
>>
>>101370216
You talk about middle school a lot and have your own lunch table, don't grow up too fast
>>
>>101370221
If you want actual tech discussion this is the only place where it happens. SDG is just a Discord server, that's the difference.
>>
File: media_GSIv_PWaUAURhiQ.jpg (161 KB, 1024x1024)
https://xcancel.com/cloneofsimo/status/1811062800695054500#m
Looks like this mf is making a really uncensored model based on the SD3 architecture
https://github.com/cloneofsimo/minRF
>>
>>101370329
It's still censored because it's using censored datasets, it's also square cropped which is a shame. The white pill of all this is one guy did it.
>>
>>101370553
Fuck... looks like we have to rely on the chinks (Kolors, Pixart, Hunyuan...) to move the imagegen community forward. That's how far the west has fallen.
>>
>>101370582
It doesn't mean it won't be trainable or another good base model. But you're unlikely to get anything too crazy out of it out of the box.
>>
>>101370582
>Kolors
If this model was trained on DiT instead of the SDXL architecture, it would've been midjourney tier, I'm not kidding, they know how to train their models well
>>
File: 00006-3904996473.jpg (2.86 MB, 1960x2620)
>>
>>101370582
You can't sue China
>>
>>101370614
>DiT
Why is this so much better than previous architectures? Explicitly please.
>>
>>101370676
dunno, but Sora uses that and it's a fucking beast
https://www.youtube.com/watch?v=lKM-QMnZ3yY
>>
File: 00007-3904996473.jpg (757 KB, 1646x2201)
>>
File: SDXL_0002.jpg (644 KB, 1664x2432)
>>
>>101370676
It's just a better architecture than unet
https://arxiv.org/abs/2212.09748
>We explore a new class of diffusion models based on the transformer architecture. We train latent diffusion models of images, replacing the commonly-used U-Net backbone with a transformer that operates on latent patches. We analyze the scalability of our Diffusion Transformers (DiTs) through the lens of forward pass complexity as measured by Gflops. We find that DiTs with higher Gflops -- through increased transformer depth/width or increased number of input tokens -- consistently have lower FID. In addition to possessing good scalability properties, our largest DiT-XL/2 models outperform all prior diffusion models on the class-conditional ImageNet 512x512 and 256x256 benchmarks, achieving a state-of-the-art FID of 2.27 on the latter.
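
To make "a transformer that operates on latent patches" and "increased number of input tokens" concrete, here is a small sketch of how the token count falls out of image size and patch size. It assumes an 8x downsampling VAE and a square image; the "/2" in DiT-XL/2 is the patch size. `dit_token_count` is a made-up helper name.

```python
def dit_token_count(image_size, vae_downsample=8, patch_size=2):
    """Number of transformer input tokens for a square image.

    The image is first encoded to a latent of side image_size / vae_downsample,
    then cut into patch_size x patch_size patches, one token per patch.
    """
    latent_side = image_size // vae_downsample
    if latent_side % patch_size:
        raise ValueError("latent side must be divisible by patch size")
    per_side = latent_side // patch_size
    return per_side * per_side


tokens_256 = dit_token_count(256)                # 32x32 latent -> 16x16 patches
tokens_512 = dit_token_count(512)                # 64x64 latent -> 32x32 patches
tokens_256_p4 = dit_token_count(256, patch_size=4)
```

Halving the patch size quadruples the token count, which is one of the two knobs the abstract names for raising Gflops (the other being depth and width).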
>>
File: Silverlight.jpg (616 KB, 1536x1536)
>>
I am trying to remove a logo from a shirt via inpaint, and all I want is it to be black, like the rest of the shirt. But every time I try to inpaint it, it keeps spitting out more logos despite me putting it in the negatives, having my prompt simply be "black shirt", etc. What can I do to not be retarded and get this stupid shit fixed? Trying latent noise and latent nothing only switched the logo from red to white and made it an even bigger eyesore.
>>
>>101370925
First use Photoshop/GIMP/whatever to crudely paint it black, then run it through inpaint for the fine details. Inpaint does not want to change the image too much; it tries to give you a variation of what's already there.
>>
File: 00000-3904996473.jpg (819 KB, 1646x2201)
>>
>>101370939
Thanks.
>>
>>101370925
p h o t o s h o p
>>
File: 00001-3904996473.jpg (943 KB, 1646x2201)
>>
>>101371097
too little too late eh
>>
File: SDXL_0008.jpg (551 KB, 2432x1664)
>>
>>101368071
Anyone know if it's correct that the existing 1.5/XL models will need to be partially retrained to adapt them to the 16 channel VAE?
>>
>>101371168
From my understanding that is correct but I feel like I also heard, once they figure it out, it wont be "that much work".
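
A rough sketch of why some retraining is needed at all: the denoiser's first and last layers are sized for the latent channel count, so going from 4 to 16 channels changes the tensor it has to read and write. This assumes an 8x downsampling VAE for both cases; `latent_shape` is a made-up helper.

```python
def latent_shape(width, height, channels, downsample=8):
    """Shape (C, H, W) of the latent the denoiser sees for one image."""
    return (channels, height // downsample, width // downsample)


old = latent_shape(1024, 1024, channels=4)    # 4-channel VAE
new = latent_shape(1024, 1024, channels=16)   # 16-channel VAE

# Same spatial grid, four times as many values per latent position.
values_ratio = (new[0] * new[1] * new[2]) / (old[0] * old[1] * old[2])
```

The spatial size is unchanged, so in principle most of the network could be kept and only the input and output layers replaced, but the rest still has to adapt to a differently distributed latent. How much training that takes is the open question in this exchange.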
>>
File: 00002-3904996473.jpg (1.04 MB, 1646x2201)
>>
>>101370770
I love these girls
>>
>>101371710
just a few thousand more and I'll be tired of making them
>>
File: kek.jpg (134 KB, 965x1489)
>A unet architecture model is more trending than a DiT model
that's how bad SD3M is
>>
>>101371934
>a dit model
I mean, you can train a 1 param dit model and it also wont trend, good architecture cant salvage garbage dataset, hyperparams and whatnot
>>
>>101372056
of course, that's why it's insane how much they destroyed SD3M in the sake of the """safety""" religion, the chinks don't give a fuck about that, they just want to make a great product, period
>>
File: ComfyUI_00526_.png (1.6 MB, 1024x1024)
to fluff or not to fluff
>>
>>101372107
sd3?
>>
File: ComfyUI_00572_.png (1.45 MB, 1024x1024)
>>101372253
no, SD3 can't do this.
>>
File: IMG_6991.jpg (738 KB, 2432x1664)
>>
File: IMG_6996.jpg (605 KB, 2432x1664)
>>
File: 0.jpg (657 KB, 2048x1024)
>>
>>101372373
it can't do what?
>>
File: SDXL_0013.jpg (748 KB, 2432x1664)
>>
newfag here.

i want to use AI to make portraits for characters for my superhero campaign.

should i use a pony model or stick to 1.5? i'm looking mostly for ease of use, something i could learn a few tricks and be decent.
thanks in advance
>>
>>101373025
/tg/ or /qst/?

Type of quality of image you're looking for?

Do you care about hands or feet?

Just portraits nothing else?
>>
>>101373025
If you don't need any NSFW or females in revealing outfits, then I can suggest Dall-E 3.
https://www.bing.com/create

It's still easily the best and very noob friendly.
>>
>>101373059
Local Diffusion General
>Dall-E 3
Come on man, you can't be serious with this.
>>
>>101373046
1. Tg

2. essentially decent quality is good enough

3. yes, i know my players will point it out so the better looking the better

4. portraits are the most important, but i will try to work on scenery or background, tho that is of lesser importance

also, i have some ideas with Civitai, i'm using forge, and got some idea with loras and negatives, beyond that i'm willing to look into plugins for hands and stuff like that
>>
>>101373074
(me)
also should i go back to A1111? i just noticed forge is no longer in the main OP post like it used to be
>>
>>101373074
You can start here
>>101344687
https://imgsys.org/rankings

Or I can point you to a basic local one that will just do the job to get you started.
https://civitai.com/models/119229
with
https://github.com/vladmandic/automatic
>>
>>101373109
forge is still fine for now and will be fine as long as you never update past the current version
the changes they're making will break compatibility with most extensions, but they still work as of now
>>
>>101373125
https://github.com/LostRuins/koboldcpp/releases/tag/v1.69.1
with
https://huggingface.co/froggeric/WestLake-10.7B-v2-GGUF/tree/main

grab the Q8 version to bounce ideas off of.
>>
>>101373172
okay, i was following until this, what are those? models? add ons?
>>
>>101373172
WestLake is quite familiar with Mystery Men RPG and can help you with being a good DM

as an addition to chat overall I would recommend https://sillytavernai.com/
so you can test some of your characters there instead of bare bones koboldcpp
>>
>>101373226
Chat (not pictures) like ChatGPT but local no data sent to a random fag in the cloud
>>
>>101373269
ohhh, i get it now, thanks
>>
Trying that new fal model that popped up an hour ago.
>>
File: adfasdfasdf.jpg (870 KB, 2387x1128)
>>101373648
>they fell for the text meme
>>
File: 0.jpg (607 KB, 1024x1024)
>>101373648
>>
>>101373728
A smart enough model gets text for free unless you go out of your way to remove any images with text.
>>
File: file.png (1.82 MB, 1024x1024)
>>
>>101373648
What's its token limit tho
>>
File: file.png (2.03 MB, 1024x1024)
>>
>sfw prompt
>random seed
>pic rel
kek
>>
>>101374111
I figured another company would take up the low hanging fruit. Obviously AF is heavily monetized but SAI was fucking retarded thinking they couldn't be displaced in the API/Server For Hire "Open" Art model space.
>>
File: SDXL_0015.jpg (742 KB, 2432x1664)
>>
File: PA_0007.jpg (874 KB, 2560x1536)
>>
File: file.jpg (1.09 MB, 2176x1920)
>>
File: AuraFlow_00020_.png (1.02 MB, 1024x1024)
>>101374111
I'm kind of confused, does this model just spit out images of cats if it thinks the image isn't sfw?
This wasn't a NSFW prompt at all.
>>
File: AuraFlow_00022_.png (1.27 MB, 1024x1024)
>>101374877
Here's what I was trying to generate btw. If it's going to berate me for "not being safe" with a cat at every turn the model may have already sunk itself.
>>
>>101374877
>>101374902
I presumed it was a post inference thing.
>>
File: file.jpg (1009 KB, 1536x2560)
>>
>>101373069
There are three reasons to use local over Dall-E 3.

>Uncensored, for NSFW
>Total control over every aspect of the generation process
>No logs, no logging in

If all you care about is pretty images with the least amount of effort and you have no need for NSFW and don't care if Microsoft can see what you gen, then Dall-E 3 is the best there is. I was just being honest.
I do a bunch of shit locally, but I also use DE3, because it just works.
>>
>>101372107
seems fluffable to me
>>
Since it's basically come to light recently that anyone with a dataset and a few GPUs for hire can make models on par with or surpassing SD3 (Kolors, AuraFlow, Pixart Sigma (soon bigma), etc.), when is someone going to ask the question of what the fuck SAI did with the hundreds of millions of dollars it was afforded in venture capital and forgiven loans?
>>
File: ComfyUI_00101_.png (1.23 MB, 1024x1024)
Towards the right is a cartoon dragon on top of a cliff, to the left is a anthromorphic fox wearing armor riding a horse. The horse is standing on top of a blue cube. In the background there is a flying eagle holding a sun. The sun has a angry face on it.
>>
is there as much of a difference between seed 1 and seed 2, as there is between seed 1 and seed 1000000?
>>
>>101375237
Yes
>>
File: plinko-stake.gif (1.25 MB, 636x640)
>>101375237
Think of a seed being a big game of plinko and the number you choose being where the ball starts.
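A quick way to check the "Yes" above: the seed only initializes the random number generator, so neighboring seeds give noise that is no more alike than distant ones. The sketch below uses Python's standard-library generator as a stand-in (the UIs use their own framework's generator, so this is an illustration, not their actual code), and the helper names are made up.

```python
import random


def noise(seed, n=10_000):
    """Draw n Gaussian samples from a generator initialized with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def correlation(a, b):
    """Pearson correlation between two equal-length sample lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


base = noise(1)
near = correlation(base, noise(2))          # seed 1 vs seed 2
far = correlation(base, noise(1_000_000))   # seed 1 vs seed 1000000
print(f"seed 1 vs 2:       {near:+.4f}")
print(f"seed 1 vs 1000000: {far:+.4f}")
```

Both correlations should land near zero, meaning seed 2 is just as unrelated to seed 1 as seed 1000000 is.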
>>
>>101375199
Comment all you want about the shitty art. The model nailed the prompt pretty much flawlessly.
>>
File: AuraFlow_00040_.png (979 KB, 1024x1024)
a shiny red ball reflecting the face of a man with a scraggly beard and glasses looking at the viewer with rage on top of a green cube, the green cube has a window on it and within the window is a sign that says "test" on it. The image is in a 3D cgi style,
>>
>>101375199
>>101375414
how much if at all do you have to cherry pick
>>
File: AuraFlow_00039_.png (781 KB, 1024x1024)
>>101375429
Not much. Here's the previous where the face kind of bled out
>>
File: ComfyUI_00102_.png (1.25 MB, 1024x1024)
>>101375429
That was first try. This was 2nd. It got the eagle holding the sun part wrong.
>>
File: ComfyUI_00122_.png (1.21 MB, 1024x1024)
(dramatic reptilian alien portrait photo taken on the bridge of a spaceship overlooking a alien planet:1.4), (it is wearing a translucent helmet, it's eyes are glowing:1.4), (dark background:1.2), (fantasy vibe:1.2), rich colors, high contrast, hard focus, intricate details, natural light, ethereal, expressive, intimate, elegant, vibrant bloom, whimsical, dramatic shadows, medium close-up, 85mm lens, f/2.8, atmospheric, moody, evocative, luxurious, textured, artistic, surreal, detailed, otherworldly
>>
File: file.jpg (847 KB, 2560x1536)
>>
File: Grid.jpg (1.03 MB, 6144x2560)
>>
>>101375414
>>101375443
damn, it adhered to your prompt really well, wonder how sd3's prompt adherence compares to it
>>
>>101373648
wait he released the model? where?
>>
>>101375558
https://huggingface.co/fal/AuraFlow

Best prompt following model so far.
>>
>>101375564
it works on comfyUI? can it do nfsw?
>>
>>101375576
https://files.catbox.moe/01hhe0.json
Comfyui workflow, make sure to update to latest comfy and get the model from >>101375564
>>
>>101375576
>can it do nfsw?
not sure but it's very underbaked so temper your expectations. it's basically a v0.1 beta
>>
>>101375564
>16.4 GB
oof that's a big boy, does it run on a 24gb vram card? that's probably a 5b or a 6b right?
>>
>>101375576
Kind of, the same way pixart can with careful prompting. But it's clearly super undertrained still. Will be barbie doll like, doubt it has much nudity in it if at all.
>>
>>101375595
I'm running it on a 4090 so yes. 6.8B.
>>
>>101375599
does he intend on finishing the pretraining or is he expecting us to finish the job?
>>
>>101375603
how much vram does it ask for? if that's barely 24gb it means that we won't be able to run sd3-8b
>>
>>101375604
https://blog.fal.ai/auraflow/

>We are not done training! This model is an initial release to kickstart some community engagement. We will continue training the model and apply our learnings from this first attempt. We also noticed that smaller models or MoE’s might be more efficient for consumer GPU cards which have a limited amount of compute power, so follow closely for a mini version of model that is still as powerful yet much much faster to run. In the meantime, we encourage the community to experiment with what we are releasing today.

>Our goal is to make this model a standard backbone that other innovative work can be built on top of. We look forward to community contributions. If you want to train finetunes, IP-Adapters, or quantizations of the current model, we are happy to support you in any way we can. There is already a vibrant community around fal and Aura models in our Discord. We invite you to join if you want to get involved.
>>
>>101375414
the prompt following is impressive, how did he do that? he used CogVLM or something better?
>>
>>101375611
maybe he's our savior, I hope he's not on the side of the "muhhh safety" freaks though and that he'll add human nudity in the pretraining as well
>>
>>101375606
I'm seeing around 16GB vram ish?
>>
>>101375623
Weird that we know practically nothing about him, his methods, the price of the training so far, how much money he has nor his ultimate unstated goals.
>>
>>101375199
holy fuck, it's probably even better than dalle3 on prompt understanding, please tell me the licence is good too?
>>
>>101375646
It's the guy who made loras for text to image basically.
>>
>>101375642
The size of the file is the size it will take up in vram
>>
>>101375655
Apache License
>>
File: GQr-cr1W8AA6Ns-.jpg (86 KB, 1004x1004)
>>101375665
that sounds too good to be true!
>Great licence
>Great prompt understanding
>DiT architecture
>Big model (~6b) that can run on a 24gb card
>Not a single statement on "muhhh safety"
Maybe this is it anons, we are probably back
>>
Okay okay, AuraFlow is cool and all. But what about training? That model is a big boy. Can it be cut down and made more manageable? Can Loras be trained on a consumer GPU? If the answer is no, it may be kind of dead.
>>
>>101375658
not at all, SDXL is "only" 6gb big but when I do a 1024x1024 inference it asks for 11gb of vram (A1111)
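As a back-of-the-envelope check on the two posts above: the weights alone take roughly parameter count times bytes per parameter, and inference adds activations, the text encoder and the VAE on top of that, which is why observed usage can exceed the file size. The 6.8B figure is the one quoted in this thread; the function name and the precision list are assumptions for illustration only.

```python
def weights_gb(num_params, bytes_per_param):
    """Approximate size of the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


AURAFLOW_PARAMS = 6.8e9  # parameter count quoted in the thread

for name, width in (("fp32", 4), ("fp16/bf16", 2), ("8-bit", 1)):
    # Weights only: activations and other components are not included
    print(f"{name:>9}: {weights_gb(AURAFLOW_PARAMS, width):5.1f} GB")
```

At 16-bit precision that is about 13.6 GB for the weights, before any working memory for a 1024x1024 generation.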
>>
>>101375690
Don't be a pussy, the /lmg/ community are training their 70b models on cloud, this model is 10 times lighter than that
>>
>>101375564
You can test the demo here
https://fal.ai/models/fal-ai/aura-flow?share=45041643-4b84-4603-b6c8-b76be7869c4f
>>
This video and the comments are the funniest thing I've seen in weeks


https://www.youtube.com/shorts/kqgH2judLTw
>>
Le Fresh Bread
>>101375708
>>101375708
>>101375708
>>
>>101375702
I see you didn't answer my question about training it on a consumer GPU.
>>
>>101375734
it won't work on a consumer GPU, it's too big, the future is the cloud, that's the price to pay if you want your model to be midjourney/dalle3 tier, you won't beat that with a small model
>>
>>101375745
Oh cool, another thing I have to "rent" to remain competitive. Fuck renting, fuck datacenters.
>>
>>101375730
i gained absolutely nothing from clicking that link
>>
File: file.jpg (1.16 MB, 2560x1536)
>>101375732
I didn't see top left when it was posted. I love it when the collage has ones I missed.
>>
>>101375776
I see you ignored the part where I said that big models are required if you want to get competitive against API models, don't whine at us, whine at fucking Nvidia for not providing us enough VRAM, and it's being said that their 5090 will only be 28gb, fuck them
https://www.youtube.com/watch?v=_36yNWw_07g
>>
File: file.jpg (1.18 MB, 2560x1536)


