/g/ - Technology


Thread archived.
You cannot reply anymore.




File: tmp.jpg (1.07 MB, 3264x3264)
General dedicated to creative use of free and open source text-to-image models

Previous /ldg/ bread : >>101312179

>Beginner UI
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
ComfyUI: https://github.com/comfyanonymous/ComfyUI

>Auto1111 forks
SD.Next: https://github.com/vladmandic/automatic
Anapnoe UX: https://github.com/anapnoe/stable-diffusion-webui-ux

>Kolors
https://gokaygokay-kolors.hf.space
Nodes: https://github.com/kijai/ComfyUI-KwaiKolorsWrapper

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Animation
https://rentry.org/AnimAnon
https://rentry.org/AnimAnon-AnimDiff
https://rentry.org/AnimAnon-Deforum

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>View and submit GPU performance data
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Share image prompt info
https://rentry.org/hdgcb
https://catbox.moe

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>
File: 0.jpg (318 KB, 1024x1024)
>>
I'm in XY Sampler/Scheduler softlock for this model >>101318340
can't post images
>>
bred
>>
File: 0.jpg (377 KB, 1024x1024)
>>
File: Bread2.0.png (1.33 MB, 832x1024)
>>
File: night1.jpg (141 KB, 1032x1352)
>>
/sdg/ is so fucking bad for this type of advice so I'll try asking here

Is there any SDXL model that is both light-NSFW capable and can do both anime and photorealism? The photorealism is more important than anime, it just has to not be completely shit at anime.

The reason is that I'm trying to train a realistic NSFW concept model / lora. I have a massive number of anime images, and I've already trained an anime-only lora that is extremely good. I have a much smaller number of photos of the concept that don't have complete "coverage" of it. I've done experiments where I started with the anime lora, merged it into a realistic pony model, then trained another lora on top of that on the photos. Compared to directly training a realistic model on just the photos, this has much better conceptual understanding, but being based on pony it's not 100% photoreal.

So now I'm thinking that if there's a model flexible enough to do both anime and realistic, I can jointly train on the anime images and photos. I've tried a few top SDXL models. Of the 6 or so I tried literally only leosam's HelloWorld can be prompted for anime, but it's kinda bad at anatomy and especially NSFW overall (can't make a pussy to save its life).

Also has anyone done this type of joint realism / anime training to better learn a concept or am I a schizo for even trying.
>>
>>101330101

Your approach of combining anime and photorealistic images to enhance the training of a realistic NSFW concept model is quite innovative, and it makes sense given your constraints. While there isn't a perfect off-the-shelf SDXL model that excels at both anime and photorealism, there are a few strategies and models that might work for your needs:

Recommended Models and Techniques

1. Universal SDXL Models:
- Stable Diffusion XL (SDXL): SDXL models are generally more capable of handling a variety of styles, including anime and photorealism. They can be fine-tuned further with specific datasets to improve performance in desired areas. The latest SDXL 1.0 model is versatile and might be a good starting point.
- Waifu Diffusion: Specifically tailored for anime-style images, this model can be merged with photorealistic models to create a hybrid capable of handling both styles.

2. Merging Models:
- Model Merging Tools: Tools like the "Block Merger" allow you to combine different models at specific layers, potentially giving you a balanced model that can handle both anime and photorealism. Merging Waifu Diffusion with a photorealistic SDXL model could be a viable approach.

3. Joint Training:
- Combined Dataset Training: Jointly training on both anime and photorealistic images can help the model learn from both styles. Using a model like SDXL as a base, you can fine-tune it with a combined dataset. Ensure that your dataset is balanced to prevent the model from overfitting to one style.
- Multi-Style Fine-Tuning: Fine-tuning a base model with different subsets of your data (anime and photorealistic) in separate stages might help the model learn distinct features from both styles effectively.

Practical Steps

1. Prepare Your Dataset:
- Organize your images into distinct categories (anime and photorealistic).
- Ensure that the NSFW images are properly labeled and segregated to maintain ethical standards during training.

2. Model Merging and Fine-Tuning:
>>
>>101330121
- Start with a strong SDXL base model.
- Merge it with an anime-focused model like Waifu Diffusion using a model merging tool.
- Fine-tune the merged model with your combined dataset, starting with a few epochs to observe the results.

3. Experiment and Evaluate:
- Conduct multiple experiments with different merging ratios and fine-tuning techniques.
- Evaluate the model's performance on both anime and photorealistic images, especially focusing on NSFW capabilities.

Community Insights
- Forums and Communities: Platforms like /sdg/, Reddit's r/StableDiffusion, and Hugging Face forums can provide valuable insights and feedback from others who might have attempted similar experiments.
- Joint Style Training: While not extremely common, the approach of training on both anime and photorealistic datasets to enhance conceptual understanding is gaining interest. Sharing your findings in these communities might also help you get more specific advice.

By carefully selecting your models, using the right merging techniques, and methodically fine-tuning, you can create a model that meets your specific needs.
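For what it's worth, the "weighted sum" merge those tools perform is just per-tensor linear interpolation. A minimal sketch of the idea, with plain floats standing in for the torch tensors a real merge would operate on (`alpha` is the weight of the second model):

```python
def merge_state_dicts(a, b, alpha=0.5):
    # Weighted-sum merge: merged = (1 - alpha) * a + alpha * b, key by key.
    # Real tools do this over torch tensors per layer; block merging just
    # picks a different alpha per group of keys.
    assert a.keys() == b.keys(), "models must share an architecture"
    return {k: (1 - alpha) * a[k] + alpha * b[k] for k in a}

# alpha=0.25 keeps the merge 75% model A, 25% model B
merged = merge_state_dicts({"w": 0.0}, {"w": 1.0}, alpha=0.25)
```

Block merging is the same operation with a per-layer-group alpha instead of a single global one.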
>>
>>101330130
Claude is that you
>>
>Translating the prompt to chinese then using it for Kolors turns the prompt adherence from meh, to near perfect.
>>
>>101330159
westlake
>>
>>101330163
But how will I know if the Chinese translation is accurate and all encompassing?
>>
>>101330299
>>101330163
Try this prompt: Jackie Chan punching Xi Jinping in the face
>>
>>101330121
>Your approach of combining anime and photorealistic images to enhance the training of a realistic NSFW concept model is quite innovative
That's not innovative, that's literally the first thing everyone thought of.
>>
File: 462294291.jpg (134 KB, 768x768)
>>
>>101330101
XL? no.
Pixart? yes.
>>
File: file.jpg (542 KB, 1920x2048)
>>
File: 00242-674167712.jpg (100 KB, 1024x1376)
yo
>>
>>101329704
Fresh bread
>>
File: PA_0003.jpg (1.07 MB, 2560x1536)
>>
File: PA_0005.jpg (1.12 MB, 2560x1536)
>>
File: PA_0007.jpg (580 KB, 2560x1536)
>>
>>101329512
upload to catbox and post the link here
>>
>>101330585
Hello Mayne
>>
I'm retarded, can someone explain to me what the
>6GB GPU VRAM Inference scripts are released.
line on Hunyuan actually is/does? Does this help performance for vramlets, and if yes, how do I actually download/use it? - t. 12gb takes 6 years to gen
>>
File: PA_0011.jpg (829 KB, 2560x1536)
>>101330637
>>
File: file.png (2.61 MB, 1920x2048)
>>
File: Capture.png (4 KB, 709x51)
>>101330637
Can't, catbox's limit is 200mb
>>
File: PA_0012.jpg (849 KB, 2560x1536)
>>
>>101330718
https://litterbox.catbox.moe 1gb limit
>>
>>101330718
>8373 x 28333px
Holy
>>
>>101330726
>>101330637
Just use any of these, I think it was 20/40 steps on CFG 5
euler_a
dpm_2_a
dpmpp_2s_a
dpmpp_sde
dpmpp_sde_gpu
dpmpp_2m_sde_gpu
dpmpp_3m_sde_gpu (some)
lcm (if you can fix the lines blur)
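If you want to sweep those combos yourself, building the XY grid of settings is a one-liner. A trivial sketch (sampler names as listed above, 20/40 steps at CFG 5 per the post):

```python
from itertools import product

samplers = ["euler_a", "dpm_2_a", "dpmpp_2s_a", "dpmpp_sde",
            "dpmpp_sde_gpu", "dpmpp_2m_sde_gpu", "dpmpp_3m_sde_gpu", "lcm"]
steps = [20, 40]

# One grid cell per (sampler, steps) pair, all at CFG 5.
grid = [{"sampler": s, "steps": n, "cfg": 5.0}
        for s, n in product(samplers, steps)]
```

Feed each cell into your queue of choice; 8 samplers x 2 step counts = 16 gens per seed.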
>>
>>101330526
>pixart
Elaborate. Base model or a finetune? Can pixart do even softcore NSFW?
>>
File: PA_0013.jpg (871 KB, 2560x1536)
>>
>>101330788
https://civitai.com/models/435669/bunline-2k1024512-pixart-sigma
>Can pixart do even softcore NSFW?
yes
>>
File: PA_0016.jpg (811 KB, 2560x1536)
>>
File: file.png (3.32 MB, 1920x2048)
>>
File: PA_0017.jpg (797 KB, 2560x1536)
>>
File: Capture.png (12 KB, 557x103)
>>101330787
>>
File: PA_0018.jpg (800 KB, 2560x1536)
>>
>>101330787
If anyone's going to use that model: do not use more than 30 tokens, and do not use the bokeh effect (or any effects, for that matter). You can expand on tokens, but the image will turn to shit. Good luck
>>
File: PA_0019.jpg (808 KB, 2560x1536)
>>
>>101330899
imgur? kek
>>
>>101330958
That's the most useful data out of it.
>>101330787
>>
File: PA_0024.jpg (880 KB, 2560x1536)
>>
File: PA_0025.jpg (656 KB, 2560x1536)
>>
File: PA_0026.jpg (424 KB, 2560x1536)
>>
File: PA_0028.jpg (1.18 MB, 2560x1536)
>>
File: PA_0029.jpg (1.15 MB, 2560x1536)
>>
File: PA_0030.jpg (1.2 MB, 2560x1536)
>>
File: PA_0031.jpg (740 KB, 2560x1536)
>>
File: 00299-3945391833.jpg (250 KB, 1024x1376)
>>
File: file.png (3.65 MB, 1920x2048)
>>
File: Untitled.png (79 KB, 1784x960)
Can someone here help me figure out if Deep Cache is worth using?
At first I thought it was incredibly overlooked and underrated, as it grants you a huge speed boost in exchange for some reduced quality.
But then I tried increasing the sampler steps (from 25 to 40) to compensate for the loss of quality and I'm not sure if it's any better than just not using Deep Cache at all.
https://github.com/styler00dollar/ComfyUI-deepcache

Useful tip: start_step and end_step seem to work as fractions of the full 0-1000 timestep range, so setting them to 500 and 1000 means it'll speed up the latter half of your sampler steps.
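Assuming that reading is right, mapping the 0-1000 values onto actual sampler steps would look something like this (hypothetical helper for illustration, not part of the node):

```python
def deepcache_window(start_step, end_step, sampler_steps, total=1000):
    # Treat DeepCache's start/end values as fractions of the 0-1000
    # schedule and map them onto concrete sampler step indices.
    first = round(start_step / total * sampler_steps)
    last = round(end_step / total * sampler_steps)
    return first, last
```

So with 40 sampler steps and start/end of 500/1000, caching would kick in for roughly the last 20 steps.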
>>
>>101331139
How much time does it save you between using it and not using it for same seed prompt?
>>
>>101331139
Show some examples
>>
File: PA_0036.jpg (712 KB, 2560x1536)
>>
>>101331155
It changes the output, so it's like using a different seed. But if you use 500, 1000 then it changes the output less and looks closer to the original seed.
>>
File: PA_0039.jpg (711 KB, 2560x1536)
Good night everybody!
>>
File: 8wc01a.jpg (97 KB, 738x499)
>>
>>101331087
Neat
>>
>>101331155
>>101331177
oh and it's a 25% to 40% reduction in gen time (from 20 seconds down to 15 or 12) if you want to keep the image similar.
I was hoping to get a fresh pair of eyes on this because I've gone back and forth with so many settings that I'm really not sure of anything anymore.
>>
File: BWT.jpg (1.54 MB, 4096x4096)
>>
Are the days of SD 1.5 returning? Feels like Pixart's release is giving me the same vibes. Are the Chinks really this based?
>>
>>101331445
>Are the Chinks really this based?
是的,我们是 (Yes, we are)
>>
>>101330121
thanks llama
>>
File: 1720500444132.jpg (1022 KB, 1536x1536)
>20W0SM
>test passed
>>
>>
File: file.png (3.91 MB, 2432x1664)
>>
File: ao_00284_.png (922 KB, 1024x1024)
Are there any alternatives to layer diffusion that are model-agnostic? I feel like I'm stuck using XL because I'm totally dependent on having easy background transparency now.
>>
File: file.jpg (1.39 MB, 2432x1664)
>>
File: file.jpg (801 KB, 2432x1664)
>>
>>101331993
didn't know that was a thing
I just use masks from depth maps
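The depth-map trick above can be as simple as thresholding. A sketch, assuming the depth map is already normalized to 0..1 with 1 = near (the helper name is made up):

```python
import numpy as np

def depth_to_mask(depth, threshold=0.5):
    # Keep everything closer than `threshold` as foreground (255),
    # everything else as background (0).
    depth = np.asarray(depth, dtype=np.float32)
    return np.where(depth > threshold, 255, 0).astype(np.uint8)
```

The resulting array can be saved as a grayscale image and used as an inpaint/compositing mask; tweaking `threshold` is the fiddly part.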
>>
>>101330422
Bird
>>
>>101331206
>smol but easily tunable model
>big boi barely viable for local
don't compare spoons to forks, especially when kolors ain't even close to being a spork
>>
a1111 question, just started using it yesterday. Is it possible to either
1. lock certain settings/sliders in img2img tab
2. prevent certain settings/sliders from moving to img2img when pressing "send generation parameters and image to img2img"
I want to prevent the step count from being modified and the "resize to/by" resetting in the img2img tab, a quick google search yielded no results. There are only settings for "send seed/size when sending prompt or image to other interface", not steps
>>
File: Grid.jpg (2.39 MB, 2816x2816)
>>
File: aefsgrthyjtu.png (9 KB, 421x65)
>>101332551
>1. lock certain settings/sliders in img2img tab
No, but you can change your defaults. On a fresh launch of the ui, change whatever settings you want to begin with, go to settings -> Defaults -> View Changes -> Apply if all's good
Img2img and txt2img can have separate defaults.
>2. prevent certain settings/sliders from moving to img2img when pressing "send generation parameters and image to img2img"
Picrel should help.
>>
>>101332551
also, if you find auto1111 too slow, or that you can't run the bigger models, you can hop on its more efficient fork: https://github.com/lllyasviel/stable-diffusion-webui-forge
>>
>>101331556
Nice
>>
>>101332712
I don't care about the defaults, I just don't want the sampling steps to get carried over to img2img when pressing the "transfer prompt and image to img2img"

>>101332721
Wasn't forge merged into a1111 a month ago and there's now feature parity? I'm using a 4 GB VRAM GPU on a1111 and it's fine so far, handles big images well too, just not XL models
>>
File: flat_00008_ copy.png (1.76 MB, 1024x1024)
>>101332320
Thanks for the suggestion
I tried it now but it seems like it's a pain to get consistent results and will need a lot of tweaking. What depth model do you use?
>>
>>101332919
>Wasn't forge merged into a1111 a month ago and there's now feature parity?
No. I tested the dev branch yesterday and it's nowhere close to Forge performance, apparently because it lacks the low/medium vram management utilized in Forge. I'm not a code magician, so I barely understand what I'm saying, but apparently they have way too different backends to merge outright. Whatever solutions found their way into the a1111 dev branch aren't enough to match its performance, especially for the likes of you and me with 4 or 8 vram.

You might want to give it a go. Not sure if the same will apply to you, but on 8vram Forge gave me the boost needed to handle XL models, and even the smaller ones were handled WAY faster than what vanilla a1111 can currently offer, but hopefully things change by the time of the next official release or two. As for steps carrying over in img2img, tough luck I'm afraid.
>>
>>101332320
Why don't you just use a background segmentation model?
>>
>>101333080
what do you recommend?
>>
>>101333147
https://github.com/danielgatis/rembg?tab=readme-ov-file#models

I used isnet-anime with pretty good results. I'd recommend trying them all if anime isn't what you need it for. I think you can get it as a node in Comfy with the Inspire or Impact pack (or maybe it's part of the base nodes by now?), can't remember which, if you use Comfy.
>>
File: ComfyUI_Kolors_00701_.png (1.79 MB, 1216x832)
Good morning, sirs.
>>
File: anime-girl-1.out.png (215 KB, 850x601)
>>101333280
thanks, worth a try, although their showcase examples are not encouraging with all the fuzz they leave
>>
Official PixArt Bigma and Lumina 2 waiting room, now adding Kolors.
>>
>>101333035
I think I've used the medium version for SDXL, mostly at 1024 res. But I've also used color correction nodes (for contrast, exposure, and saturation) to modify both the input image and the generated depth map. But there's probably better things for it.

>>101333080
Didn't really look around; haven't used segmentation models until quite recently. will look into that too, thanks
>>
I'm a beyond fucking retard. Is there controlnet for PDXL? If so how do I use it?
>>
>>101333565
can just use ordinary XL CNs
>>
>>101333581
Are there cn for xl?
>>
>>101333631
https://huggingface.co/xinsir/controlnet-union-sdxl-1.0
>>
>>101333044
Thanks, after switching to forge I noticed my gen speed increased by about 25-50% depending on what I do. I can actually feasibly generate xyz plots without waiting half an hour now
>>
>>101334024
If at some point you have to look for alternatives to Forge, SwarmUI and Metastable should perform similarly well, since their comfy backend is more optimized than whatever a1111 uses.
>>
>>101334024
I guess there's also Fooocus, which should be supported further and might be optimized similarly to Forge, since it seems to have been started by the same guy. Avoid EasyDiffusion though, since its performance is roughly that of current a1111.
>>
File: gupsam_00006_.png (524 KB, 968x1440)
Testing segment anything for background removal and it's promising so far. Still seems sort of weak compared to layer diffusion, more like a smarter photoshop magic selection tool.
>>
>>101334161
>more like a smarter photoshop magic selection tool
looks on par, rather than smarter
>>
File: Sigma_04424_.png (975 KB, 1024x1024)
>>101331197
Good night

>>101329150
Good morning
>>
>>101334263
Good afternoon
>>
>>101334117
funny you say that, I started with EasyDiffusion because it was the only program that could run on Windows 7 (CPU mode). Its performance seems to be worse than a1111's though, especially when doing batches
>>
File: kolors_00374_.png (1.6 MB, 1024x1024)
>>101333488
>>
>>101334182
It really is worse, but instructing it what to isolate with a prompt is neat
>>
File: kolors_00379_.png (1.45 MB, 1024x1024)
>>
File: 1kvBkc.png (591 KB, 1206x735)
>>101334436
Back in the day I started with NMKD, picrel
Those were the times.. last update was Jul 10, 2023
>>
File: kolors_00390_.png (2.53 MB, 2048x768)
photograph of Jesus Christ holding in one hand Mjolnir while smoking a cigar and holding a machine gun in the other hand in a burning post apocalyptic city hellscape |
这张照片展示了耶稣基督在燃烧的末日城市地狱景象中,一手拿着雷神之锤,一手拿着机枪,还抽着雪茄。

English left and Chinese (as translated by chat gpt) right
>>
>>101334717
>as translated by chat gpt
how viable is it, compared to something like: https://www.deepl.com/pl/translator
>>
>>101334717
Do this one. Jackie Chan punching Xi Jinping in the face
>>
>>101334754
I don't know. I don't know enough about Chinese to judge what's better.
>>
File: kolors_00393_.png (2.5 MB, 2048x768)
a 1990 screenshot of an anime in a style reminiscent of "Neon Genesis Evangelion" depicting a woman leaning against a tank |
1990 年的一个动画截图,风格类似于 "霓虹创世纪",截图中一名妇女靠在坦克上

English left, Chinese right
>>
File: kolors_00395_.png (2.3 MB, 2048x768)
>>101334772

I'm not even sure this thing knows who Jackie Chan is.
>>
File: kolors_00398_.png (2.19 MB, 2048x768)
Xi jinping punching xi jinping
>>
>>101334779
I'd test it myself, but I've no access to chat gpt. Care to give me two samples? Have it translate the following into Polish, Russian and German:
"I was on my way to buy smokes, but suddenly a plane fell out of the sky and dropped on my head. That's when I realised fully automated gay space communism was the only viable socioeconomic system."
>>
>>101334802
>>101334826
>>101295804
Oh well. We reached the limit of it.
>>
File: PA_0044.jpg (822 KB, 2560x1536)
>>
File: kolors_00402_.png (2.36 MB, 2048x768)
>>101334850
Here you are.

Polish:
"Byłem w drodze po papierosy, ale nagle samolot spadł z nieba i wylądował na mojej głowie. To wtedy zdałem sobie sprawę, że w pełni zautomatyzowany gejowski komunizm kosmiczny jest jedynym realnym systemem społeczno-ekonomicznym."

Russian:
"Я шел за сигаретами, но вдруг самолет упал с неба и приземлился на мою голову. Именно тогда я понял, что полностью автоматизированный гей-коммунизм в космосе является единственной жизнеспособной социально-экономической системой."

German:
"Ich war auf dem Weg, Zigaretten zu kaufen, aber plötzlich fiel ein Flugzeug vom Himmel und landete auf meinem Kopf. Da wurde mir klar, dass vollautomatisierter schwuler Weltraumkommunismus das einzig praktikable sozioökonomische System ist."
>>
File: kolors_00406_.png (2.48 MB, 1536x1024)
2020 anime style magazine cover
>>
File: kolors_00407_.png (2.03 MB, 1536x1024)
a red ball on top of a blue cube with a green triangle behind them. On the left is a dog and on the right is a cat |
一个红色的球放在一个蓝色的立方体上面,后面有一个绿色的三角形。左边是一只狗,右边是一只猫
>>
File: PA_0047.jpg (634 KB, 1664x2432)
>>
File: kolors_00408_.png (2.04 MB, 1536x1024)
And again, just to confirm.

I definitely think the DeepL translations do a better job than GPT's
>>
File: kolors_00410_.png (2.07 MB, 1536x1024)
>>
File: PA_0048.jpg (689 KB, 1664x2432)
>>
File: kolors_00411_.png (2.39 MB, 1536x1024)
anime style image of two girls, the girl on the left has red hair in a ponytail and the girl on the right has black long black hair and blunt fringe. The girl on the left has a jacket with a "1" on it and the girl on the right has a tank top with the number "2" on it. |
两个女孩的动漫风格图片,左边的女孩扎着红色马尾辫,右边的女孩留着黑色长发和钝头流苏。左边的女孩穿着印有 "1 "的外套,右边的女孩穿着印有数字 "2 "的背心。
>>
File: kolors_00417_.png (2.26 MB, 2048x768)
>>
File: PA_0053.jpg (610 KB, 1664x2432)
>>
File: kolors_00421_.png (2.27 MB, 2048x768)
spiderman laying in a hospital bed while the hulk and ironman look over him with a sad expression |
蜘蛛侠躺在病床上,绿巨人和钢铁侠一脸悲伤地看着他

I'm declaring no winners on this one.
>>
File: PA_0054.jpg (459 KB, 1664x2432)
>>
File: kolors_00426_.png (2.55 MB, 2048x1024)
eerie image of homer simpson standing in a dark alleyway illuminated by a single light on the wall to the left |
霍默-辛普森(Homer Simpson)站在黑暗的小巷中,左侧墙壁上的一盏灯照亮了小巷,画面阴森恐怖
>>
File: PA_0056.jpg (745 KB, 3328x1152)
>>
what the hell happened to that TerDiT or whatever bitnet model? do any of the backends even support it? why did nobody talk about it?
>>
>>101334906
Much appreciated. No major mistakes. I'm only less confident about the German, since I'm not fluent in it.

DeepL seems to have a more nuanced understanding of translation, being able to generate more stylistically appropriate phrasing, or simply a more natural rendering by finding synonymous equivalents. For example, you have to take creative liberty with something like the German Schadenfreude or Weltschmerz.

ChatGPT feels a bit more like machine translation, meaning it leans into 1:1 word translation rather than trying to best convey the idea and meaning behind a given phrase.

All in all, I'd say it's good enough.
>>
File: kolors_00430_.png (2.26 MB, 2048x768)
in game world of warcraft screenshot |
游戏中的魔兽世界截图

English won here.
>>
File: 0.jpg (359 KB, 1024x1024)
>>
>>101335313
>English won here.
Makes me wonder... I guess their dataset has a bias for concepts more likely to come from either the anglosphere or the sinosphere. WoW probably ain't as popular on the mainland, so that would make sense.

How about: character concept art from a gacha game
>>
File: kolors_00448_.png (3.03 MB, 2048x1024)
Barack Obama holding a banana in his left hand and a luminous red sphere emanating foreboding energy in his right hand. He is looking at the camera with a hint of malice in his expressing giving a sense of imminent danger. |
奥巴马左手拿着一根香蕉,右手拿着一个散发着不祥能量的红色发光球体。 他看着镜头,眼神中流露出一丝恶意,给人一种危险迫在眉睫的感觉。
>>
>>101335439
Right is english?
>>
File: kolors_00450_.png (2.72 MB, 2048x1024)
>>101335456
Left is always English

>>101335418
character concept art from a gacha game|
游戏中的角色概念图

Both were pretty garb desu
>>
>>101335466
>Both were pretty garb desu
Maybe try something specific like Genshin Impacat instead of gacha game
>>
>>101335466
>Left is always English
Damn, English is very handicapped then. Sad! This is why I'll never be optimistic about chink models. The language barrier is too great.
>>
File: kolors_00451_.png (3.16 MB, 2048x1024)
>>101335505
It's sad watching the English-speaking AI space basically be outpaced by China so quickly, but then I remember that this is entirely a self-inflicted wound. I'm hoping there's an easy workaround with the prompting and it's just a ChatGLM issue.

still image of a 1980s sitcom television show featuring two women in a kitchen. The woman on the left has blonde curly hair while the woman on the right has black hair cut into a short bob. There are looking at a package on the kitchen table|
20 世纪 80 年代情景喜剧电视节目的剧照,画面中两位女性在厨房中忙碌。左边的女人是金色卷发,右边的女人是黑色短发。她们正在看厨房桌上的一个包裹
>>
>>101334013
Not really usable yet unless you want to make a node yourself
>>
>>101335480
I tried to give it more to work with.

Genshin impact character art featuring a woman wearing traditional Japanese shrine maiden outfit in a lively and dynamic pose
>>
File: kolors_00453_.png (2.07 MB, 1536x1024)
>>101335575
Forgot my image.
>>
File: kolors_00454_.png (2.82 MB, 1536x1024)
classical Japanese wall scroll featuring a crane and a cherry blossom tree with a frog on a lily pad|
日本古典壁画,描绘仙鹤和樱花树,以及荷叶上的青蛙。
>>
>>101335526
>English speaking AI space basically be outpaced by China so quickly
Arguable. SD and its derivatives like Pony are still king in the local space. Even the English SaaS alternatives like NAI and Midjourney are still miles ahead in terms of aesthetics. Hell, even Dall-e 3 is impressive with how up to date it is with a variety of concepts, especially the likes of franchise characters.

For now, the only thing going for chink models is maybe a bit more prompt coherence/complexity, but I'd put even that into question. A lot can change though. Kolors might be salvaged if it's distilled down in spec requirements, with proper finetuning support to follow. Same for whatever Pixart is cooking, but for now both are HUGE maybes. I'd sooner expect the mess around SD3 to sort itself out, and then maybe we can see if it can be salvaged. I was never optimistic about Hunyuan to begin with, but Kolors does look more promising, so at least that's that.
>>
>>101335590
Well, at least the pose is more lively and dynamic in english.
>>
>>101335526
It's because American tech is infested with a nu-religion of anti-thinkers and moralists. When they see AI they see 1984 mind control.
>>
stupid civitai
>>
>>101335671
I think SD and Pony are king because they have such an extensive ecosystem of tools and UIs that make it viable. The models themselves are starting to show their age. The issue with the current western models is that the cutting edge is hidden behind a paywall and heavily censored while the newest local stuff we have, namely SD3 was so badly botched in the name of safety that at the current time it's functionally useless.
If SD3 wasn't so badly bungled, I'd be playing with it now instead of kolors. Another thing kolors is extremely good at is anatomy. Look at the hands. They're almost all perfect. Almost.
As for requirements, I'm using 10gb of vram during inference and around 30 normal ram, I don't think the requirements are that steep.

My biggest concern with it is that I need to have a semi functional knowledge of Chinese to make it work as well as possible.
>>
>>101332919
>I just don't want the sampling steps to get carried over to img2img
Wait, what sampler and how many steps do you use in txt2img and img2img? Generally you want around 25-45, so it shouldn't be a problem when it carries over to img2img. I tend to increase steps in img2img, since it does seem to help a bit with blending into the original content, but even then I'd go no higher than 45 which is also fine for txt2img.
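For context, the reason carried-over step counts matter less than you'd expect is that a1111 by default scales the actual number of img2img steps by denoising strength (unless the "do exactly the amount of steps the slider specifies" setting is on). Roughly, as a sketch of the behavior rather than the exact webui code:

```python
def effective_img2img_steps(steps, denoising_strength):
    # a1111's default img2img behavior: only about
    # steps * denoising_strength sampling steps are actually run.
    return max(1, round(steps * denoising_strength))
```

So 40 slider steps at 0.5 denoise runs about 20 real steps, which is why carried-over txt2img step counts rarely hurt in practice.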
>>
>>101335671
Pony is an outlier and was only achieved with brute force on a frankly shitty architecture. The core problem is that American AI models have shitty architectures, and very little has been pushed to make a good architecture with a good base model. And without Pixart, there would be no image models actually trainable on consumer hardware. Remember, Pony has like 4xA100s.
>>
File: ComfyUI_Kolors_0721.jpg (228 KB, 832x1216)
>>101335526
I'll have to include a llm into my Kolors workflow to auto-translate my prompts into chinese
>>
>>101335778
SD3 is not that bad. It's better than SDXL was at release, and no one seems to remember how 1.4/1.5 base were. I think most of the hate is from wannabe saas founders butthurt that the original license ruined their ez startup dreams.
>>
File: kolors_00480_.png (2.55 MB, 2048x768)
extremely dark distant photograph of a tent at night in a dark forest, a floodlight is illuminating the area around the tent, to the left a gaunt and humanoid creature can be seen emerging from the darkness around the camp|
一张极暗的远景照片,拍摄的是夜晚黑暗森林中的一顶帐篷,一盏泛光灯照亮了帐篷周围的区域,左侧可以看到一个憔悴的人形生物从营地周围的黑暗中走出来
>>
>>101335785
if people continue to fixate on basic consumer hardware we will never have any hope of catching midjourney or dall-e. it's ok if billy can't train loras on his gtx 1060. llama proves that a hobbyist ecosystem can develop even with vastly larger models.
>>
>>101335778
Fair points, though do wake me up on Kolors when/if it drops to 8vram and 16ram. As for SD3, let's hope it picks up the pace with the change in licensing.
>>101335785
>The core problem is American AI models have shitty architectures and very little has been pushed to make a good architecture with a good base model.
SD3 did change into DiT right? Or was it something else. Either way >>101335825 has a point. They budged on the license, so I have some hope for it being salvaged in some shape or form, as was SDXL with Pony. Either that, or there's ground to gain with Pixart as you mentioned.
>>
>>101335825
You say SD3 is not that bad, but man... it absolutely fucks human anatomy. I don't even get what the point of it is if it can't produce human subjects with any degree of reliability.
>>
>>101335850
What a dumb argument, you'd be shitting and pissing everywhere even with a model that required 24GB of VRAM to run.
>>
>>101335876
He wants you to switch to SaaS model and be happy with censorship.
>>
File: kolors_00486_.png (3.13 MB, 2048x1024)
woman wearing a bikini while laying on the floor and holding her feet up to the camera so the soles of her feet are visible|
身着比基尼的女子躺在地板上,将双脚举向镜头,脚底清晰可见
>>
>>101335876
i have a 4090 so your words are nothing to me!
if someone released a higher parameter model needing more i'd happily buy another card, too!
>>
>>101335850
Dumb take. I understand where you're coming from, there is a sweet spot in terms of a hardware minimum, but it's the same argument we had with that one Hunyuan anon. You HAVE to "fixate" on basic consumer hardware, because that's what local is for in the first place. Even most high-end users don't go beyond a 4090, most of us have something like a 30-something card, I have a mobile 2080, and I'm seeing anons with even less. I'm not going to make collages for a dead general, and that's what would happen if you fixated on high end instead.

Take a look at the current state of gaming as an example. Theoretically big budget games are made for consumer grade GPUs, and yet most of them are too unoptimized even for 4090s, which barely anyone can afford. Local is all about efficiency, because we can never hope to outgun SaaS, so stop holding us to its standards.
>>
>>101335929
lmao
>>
>>101335929
>if someone released a higher parameter model needing more i'd happily buy another card, too!
Not many such cases, so GL in a dead ecosystem. As another anon mentioned, SD did well because of accessibility and all the support around it. If you want a thriving local ecosystem, you need a thriving community around it, and for that you need more accessibility.
>>
>>101335954
yet local llama blazes forward with 70b+ models. people will find ways to move forward if the output is worth it.
>>
>>101336015
>blazes forward with 70b+ models.

Most people are running those at retarded quants. Nobody is "happy" with the state of that space except for the 10 or so people with a fire hazard rig.
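Back-of-envelope memory math (a sketch of my own, not anyone's official numbers; the function name and the 1.2x overhead fudge factor are assumptions) shows why those quants are unavoidable:

```python
def model_vram_gb(params_billion: float, bits_per_weight: float,
                  overhead: float = 1.2) -> float:
    """Rough memory estimate for holding a model's weights; `overhead`
    is a fudge factor for KV cache, activations and runtime buffers."""
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

# 70B at fp16 lands around 168 GB, a 4-bit quant around 42 GB --
# still multiple 24 GB cards plus offloading, i.e. fire hazard rig territory.
print(model_vram_gb(70, 16), model_vram_gb(70, 4))
```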
>>
File: kolors_00496_.png (3 MB, 2048x1024)
3 MB
3 MB PNG
model shot of model wearing latest oversized Balenciaga puffer jacket and baggy pants creating an avant garde and impractical display of fashion. In the background there are marble statues and checkerboard tiles creating a sense of class|
模特身穿 Balenciaga 最新款超大号夹克衫和宽松长裤,展现前卫而不实用的时尚。背景是大理石雕像和棋盘格瓷砖,营造出一种高级感。
>>
>>101336015
>if the output is worth it
the hardware requirements for 70b models are hardly worth it, >>101336026 also has a point
>>
>>101335954
>Local is all about efficiency, because we can never hope to outgun SaaS, so stop putting us to it's standards.
well said, your $5000 rig cannot hope to compete with the million dollar gpu farms the likes of midjourney, openai, etc have. the strength of local models is privacy and finetunability. the moment local models get too big for communities to finetune is the day it's joever. you'll be paying thousands to run an inferior product.
>>
File: oeoeoo_00007_.png (1.69 MB, 1024x1280)
1.69 MB
1.69 MB PNG
>>101335863
It's ok a lot of the time, base 1.5 was unusable and SDXL was also RNG with lower image quality
>>
File: kolors_00501_.png (3.35 MB, 1664x1280)
3.35 MB
3.35 MB PNG
>>
File: kolors_00504_.png (3.39 MB, 2560x896)
3.39 MB
3.39 MB PNG
I think I'm having fun with the fashion stuff now lol.
>>
>>101335863
But think of the children, anonymous. That's why you can't have AI that can draw fingers, toes or eyeballs, let alone vaginas!

Praise jesus, hallelujah, pass the collection plate
>>
File: SD3.png (862 KB, 1024x1024)
862 KB
862 KB PNG
>>101335439
>Barack Obama holding a banana in his left hand and a luminous red sphere emanating foreboding energy in his right hand. He is looking at the camera with a hint of malice in his expression, giving a sense of imminent danger.
Needs some elbow grease, but if training becomes an option, it's salvageable. I'll be posting some more SD3 comparisons.
>>
>>101336147
That orb is not very foreboding. Just saying.
>>
>>101336055
you talk like any serious fine tuning of xl was done on consumer hardware. someone serious about it can rent. with this attitude local models will be genning 1mp base resolution images with mutated hands for another 10 years
>>
File: SD3.png (1.77 MB, 1024x1024)
1.77 MB
1.77 MB PNG
>Genshin impact character art featuring a woman wearing traditional Japanese shrine maiden outfit in a lively and dynamic pose
>>101336160
As said, it will need elbow grease if it's to be salvaged. We're talking about a base model that wasn't even supposed to be released. The important thing is, it shares the capability for more complex prompts, which is the only thing going for chink edition. Getting it biased on artistic touches like backgrounds and composition is easy. I'm more worried about stuff like hand quality in picrel, but if Pony could do it, so can SD3 mehdium.
>>
How to make female faces that aren't pretty? They're always too generically attractive
>>
>>101336178
The only trainable model for local hardware that we have is PixArt. Everything else is out of reach.
>>
>>101336178
>you talk like any serious fine tuning of xl was done on consumer hardware
no, i used the word communities for a reason

>with this attitude local models will be genning 1mp base resolution images with mutated hands for another 10 years
i think it's still too early to tell, models like pixart sigma have shown that there's more to improving and optimizing models besides stuffing in more parameters. we'll have to wait for pixart bigma and see how well that performs.
>>
>>101336214
I appreciate your optimism, but I feel SD3's issues might go deeper than what anything but the most aggressive of finetunes can fix.
>>
>>101336221
Specify they're British
>>
File: file.png (122 KB, 1817x573)
122 KB
122 KB PNG
>>101335818
hell yeah
>>
File: SD3.png (888 KB, 1024x1024)
888 KB
888 KB PNG
>>101336037
>model shot of model wearing latest oversized Balenciaga puffer jacket and baggy pants creating an avant garde and impractical display of fashion. In the background there are marble statues and checkerboard tiles creating a sense of class
>>101336299
Then my only point is that it retains the one distinct feature of chink models, namely more complex prompt adherence. It's something we're unlikely to finetune into a model, and it's THE milestone for any alternative model. Otherwise they're all aesthetically inferior to current standards. As said, getting an artistic bias finetuned into them is unlikely to be an obstacle. Same for styles or niche concepts like fetishes. Pixart has a great aesthetic sense in spite of the lower params and being undercooked.
>>
File: SD3.png (1.67 MB, 1024x1024)
1.67 MB
1.67 MB PNG
>>101335526
>still image of a 1980s sitcom television show featuring two women in a kitchen. The woman on the left has blonde curly hair while the woman on the right has black hair cut into a short bob. They are looking at a package on the kitchen table
>>
>>101336455
Nice.
>>
File: image.png (1.53 MB, 1024x1024)
1.53 MB
1.53 MB PNG
>>101335626
>classical Japanese wall scroll featuring a crane and a cherry blossom tree with a frog on a lily pad
>>101336462
also glad to see it behave decently in spite of typos
>>
{"type":"embed","modelVersionId":106916,"modelName":"Civitai Safe Helper","modelVersionName":"v1.0"},{"type":"embed","modelVersionId":250708,"modelName":"Civitai Safe Helper (Minor)","modelVersionName":"safe_pos"},{"type":"embed","modelVersionId":250712,"modelName":"Civitai Safe Helper (Minor)","modelVersionName":"safe_neg"}

lol
lmao
>>
>>101336417
Another issue I have with SD3, and PAΣ to a similar degree, is evident in the tiles behind the model. If you look at anything those smaller parameter models produce, you'll notice they have more difficulty maintaining straight lines and patterns than larger models do. Hopefully that will be fixed with SD3 large if it ever gets released, and I understand they do plan to.
>>
File: SD3.png (1.58 MB, 1024x1024)
1.58 MB
1.58 MB PNG
>>101336479
>Civitai Safe Helper
What am I looking at?
>>101334948
>2020 anime style magazine cover
welp, at least it got the magazine part right
>>
I've gone back to 1.5 desu it's comfy
>>
File: oeoeoo_00056_.png (1.34 MB, 1280x960)
1.34 MB
1.34 MB PNG
SD3 is a struggle to get more than a few different poses out of. I think it's true that it has too much "dreamshaper look" baked in.
>>
File: SD3.png (1.68 MB, 1024x1024)
1.68 MB
1.68 MB PNG
>still image of a 1980s comedy featuring two women in a kitchen, they are looking at each other in confusion over a small box of nails next to a hammer covered in jelly
>>
File: SD3.png (1.79 MB, 1024x1024)
1.79 MB
1.79 MB PNG
>world war 2 pinup propaganda artwork featuring an attractive woman in military uniform, her hands rest on her chest, perspective from above, she is looking at the viewer with a soft smile on her face
>>
File: file.png (678 KB, 2053x1205)
678 KB
678 KB PNG
k chinese prompting workflow with ollama done
>>
File: SD3.png (2 MB, 1024x1024)
2 MB
2 MB PNG
>watercolour artwork, closeup of a plant in shape of a pretty woman, morning dew on top of her leaves, dutch angle perspective
>>101336712
impressive
>>
File: SD3.png (1.05 MB, 1024x1024)
1.05 MB
1.05 MB PNG
>digital artwork of an otherwordly landscape, chunks of earth floating in air, in style of Zdzisław Beksiński
>>
>>101336712
Can haz in catbox?
>>
File: SD3.png (1.01 MB, 1024x1024)
1.01 MB
1.01 MB PNG
>>101336758
>same prompt and seed without the otherworldly typo
>>
>>101333427
Prompting for white background, maybe with detailed background in the negatives helps.
>>
>>101336712
>clip scale
Is that like clip skip? I've never used that node before.
>>
>>101333044
>>101334024
I lied, I didn't download forge correctly so I was still using a1111 the whole time; it was placebo (probably just the fresh PC restart)

After correctly installing forge it's actually SLOWER, because it has to move the model back and forth between RAM and VRAM after every gen, which takes 2-3 seconds
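For scale, here's a rough estimate of what that per-gen shuffle costs (my own numbers and function name; ~12 GB/s of effective PCIe 4.0 x16 bandwidth and the ~6.5 GB checkpoint size are assumptions):

```python
def transfer_seconds(model_gb: float, bus_gb_per_s: float = 12.0) -> float:
    # Lower bound for one direction; real swaps also pay allocator and
    # paging overhead, and happen twice (out of VRAM, then back in).
    return model_gb / bus_gb_per_s

# A ~6.5 GB fp16 SDXL checkpoint: roughly half a second each way,
# so around a second round trip before overhead -- in the right
# ballpark for the 2-3 s observed per gen.
print(transfer_seconds(6.5))
```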
>>
>>101336781
sure: https://files.catbox.moe/jl4zqr.json

you will need the ollama custom nodes and ollama installed and running on your system
https://ollama.com/

>>101336837
node parameters are explained here
https://docs.getsalt.ai/md/rgthree-comfy/Nodes/SDXL%20Empty%20Latent%20Image%20%28rgthree%29/#required
idk if im using it right
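The same translate-then-gen idea works outside Comfy too. A minimal sketch against ollama's REST API (the endpoint and default port are ollama's; the model name, function names and instruction wording are my own assumptions):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default ollama endpoint

def build_translation_request(english_prompt: str, model: str = "qwen2") -> dict:
    # Wrap the diffusion prompt in a translation instruction for the LLM.
    instruction = ("Translate the following image-generation prompt into "
                   "natural Chinese. Reply with the translation only:\n"
                   + english_prompt)
    return {"model": model, "prompt": instruction, "stream": False}

def translate(english_prompt: str, model: str = "qwen2") -> str:
    # Needs `ollama serve` running locally with the model already pulled.
    payload = json.dumps(build_translation_request(english_prompt, model)).encode()
    req = urllib.request.Request(OLLAMA_URL, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"].strip()
```

Feed the returned Chinese string to the Kolors prompt input in place of the English one.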
>>
>>101334263
Cool img.
>>
>>101336712
I don't get the whole "visual mode" thing. What do people get out of all these panels and wires and stuff?
>>
>>101337288
some people like it. some hate it.
>>
>>101337288
https://en.wikipedia.org/wiki/Node_graph_architecture
>>
>>101336712
Can you only run it on those nodes?
I can't use SIGMA on the workflow?
>>
>>101337312
Horrifying.
>>
File: vodka.jpg (163 KB, 1024x1024)
163 KB
163 KB JPG
>>
>>101337321
Kolors is currently only usable with the diffusers wrapper nodes specifically made for it. you'll have to replace the Kolors nodes if you want to use another model
>>
>>101337331
some people think lots of clicks mean that they are working hard.
>>
>>101334967
are you testing the translations on the same seed?
>>
>>101336536
>think it's true that it has too much "dreamshaper look" baked in.
Simply prompting "dreamshaper" gives you its look. They put, in my opinion, too much of that kind of data in the training.
>>
>>101337510
*which is ironic considering its license forbids using SD3 outputs to train other models.
>>
File: oeoeoo_00102_.png (1.57 MB, 960x1280)
1.57 MB
1.57 MB PNG
>>101337510
I want synthetic data to leave
>>
>>101337617
Unfortunately some enjoy the look it gives even if it's objectively ass.
>>
>>101329150
So which is the best model to use?
>>
>>101337842
None of them.

Save yourself the trouble and don't gen at all. The image you gen is never going to be the way you imagined it in your head. Always off by 25%
>>
>>101337288
more customizability
>>
>>101338018
okay and which one is best?
>>
>>101337842
what kinds of images are you trying to make?
>>
>>101338214
you are like the troll asking in /diy/ which tool is the best, maybe start with a hammer and see if you can use it to dig a hole
>>
>>101338214
Unfortunately none of them are universally good. They're too small to be comprehensive. What types of images are you aiming for?
>>
>>101338264
You can totally dig holes with hammers, and if you have hard, rocky terrain a sledgehammer is often necessary.
>>
>>101338271
then i guess realistic stuff rather than anime, i don't care about porn or humans though
>>
Good Afternoon
>>
>>101333631
learn to look shit up, jesus
>>
>>101337384
I click less using it
>>
>>101338309
in that case try a (photo)realistic or all-purpose model
>>
>>101338478
yes but which one?
>>
>>101338489
check out images on civitai and use the model associated with an image you like
>>
>>101330663
not same anon but bumping for interest
>>
>>101330121
is this 4o
>>
>>101338309
If you don't care about humans SD3 might be the way to go
>>
>>101338489
for starters maybe something like good old absolute reality
https://civitai.com/models/81458?modelVersionId=108576
>>
>>101338593
>>101338634
thanks
>>
Albedobase is a good XL-based generalist model, it can do realism and also art styles quite well
https://civitai.com/models/140737?modelVersionId=329420
>>
File: grid-0019.png (3.33 MB, 2048x1280)
3.33 MB
3.33 MB PNG
I want to train a photorealistic lora on hot female statues

LoRA guides give different advice based on the type of thing you're training, and I don't know whether statues count as an object, concept, race, or character. I also suppose the statues would be more diverse and handle prompts better if they incorporated knowledge of what non-statue humans look like. But I'm also worried about it generating statues that look too much like a living human, or statues that don't fall into the ideal body types of antiquity.

I was just going to read all the lora guides but if someone has a favorite to suggest that would be neat.

I also have no idea what model I should train on. To my knowledge there's no large undertaking like Pony for photorealism, but since I would be making statues I wonder if SDXL censorship would give me a bad time, the way it does with naked women.
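Whichever guide you follow, most kohya-style trainers reduce to the same step arithmetic when sizing a run (the function name and example numbers below are mine, not from any guide):

```python
import math

def lora_total_steps(num_images: int, repeats: int,
                     epochs: int, batch_size: int) -> int:
    # Each image is seen `repeats` times per epoch; per-epoch steps
    # are the ceil-division of that total by the batch size.
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs

# e.g. 40 statue photos, 10 repeats, 10 epochs, batch size 2 -> 2000 steps
print(lora_total_steps(40, 10, 10, 2))
```

Guides differ mostly in which total they recommend aiming for, not in the arithmetic.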
>>
>>101338773
you don't need a lora for this, most models can do this fine out of the box, just ask for a sculpture
>>
not just classical marble or metal art sculptures, even anime figurines come out nicely
>>
>>101334948
>2020 anime style magazine cover
the magazine part broke because of the extra stuff I added on the prompt
>>
>>101338773
Shouldn't you only train on base models anyway to allow for maximum compatibility?
>>
>>
AI winter.
>>
>>101338959
SDXL was able to produce likeable statues but the detail is terrible

I tried a photorealistic finetune and it really wants to give me normal photos of nude girls then petrify them after the fact. I probably need a good SFW model
>>
File: teaser_small.png (1.5 MB, 1320x1184)
1.5 MB
1.5 MB PNG
https://buaacyw.github.io/mesh-anything/

>MeshAnything mimics human artist in extracting meshes from any 3D representations. It can be combined with various 3D asset production pipelines, such as 3D reconstruction and generation, to convert their results into Artist-Created Meshes that can be seamlessly applied in 3D industry.

Just imagine the possibilities
>>
>>101339610
The next few decades will be wild.
>>
>>101339417
if you just ask for a statue and material this can happen, being more specific about art mediums and artist styles is a lot better
>>
>>101339610
nice for computer games
>>
>>101339186
>>101339760
cute gens
>>
>post images to a persons character lora on civit
>they :( emote my shit
i guess im bbc posting your waifu now
>>
>>101340124
tks, been testing different artists on the prompts to see how good sdxl is with them
>>
>>101340209
maybe they're sad you didn't post more
>>
>>101340295
there's a hundred different ways you can express that than to basically downvote a photo
>>
File: tmp_store~11.jpg (278 KB, 1552x1200)
278 KB
278 KB JPG
Good afternoon
>>
>>101338214
https://imgsys.org/rankings
>>
File: ComfyUI_Kolors_00754_.png (1.54 MB, 1216x832)
1.54 MB
1.54 MB PNG
>>
>>101340560
I like that style
>>
>>101340321
maybe they're shy
>>
>>101332368
Bird is the word
>>
>>101339133
That's disinfo. If you train on the base model, it'll look like shit on every model except the base model.
>>
File: 4etrtghj.png (1.41 MB, 1344x768)
1.41 MB
1.41 MB PNG
>>
>>101340525
Love it, gives chill vibes.
>>
File: Sigma_04436_.png (1.3 MB, 1280x768)
1.3 MB
1.3 MB PNG
>>101334276
>>101338376
>>101340525
Good afternoon anon
>>
>>101341315
very nice
>>
File: ComfyUI_Kolors_00810_.png (1.66 MB, 1216x832)
1.66 MB
1.66 MB PNG
>>101340875
I like your style.
>>
File: ComfyUI_Kolors_00831_.png (1.82 MB, 1216x832)
1.82 MB
1.82 MB PNG
>>
File: ComfyUI_Kolors_00835_.png (1.39 MB, 1216x832)
1.39 MB
1.39 MB PNG
>>
File: tmpn5cn4_7g.png (1.1 MB, 1344x768)
1.1 MB
1.1 MB PNG
>>101341315
Good evening
>>
File: ComfyUI_Kolors_00848_.png (1.61 MB, 1216x832)
1.61 MB
1.61 MB PNG
Trying to get Kolors to do a Dwemer tower from Morrowind.
>>
File: Adventurers XL.png (1.83 MB, 1152x1408)
1.83 MB
1.83 MB PNG
Mud.
Love me some MVDCORE.
Low detail and washed out colors.
I finally moved from the old Dreamshaper to ZavyChromaXL.
...
Anyway, what's the next big thing?
Is there any hope for SD?
>>
>>101340525
Hullo
>>
File: Sigma_04440_.png (1.34 MB, 1472x704)
1.34 MB
1.34 MB PNG
>>101336037
>model shot of model wearing latest oversized Balenciaga puffer jacket and baggy pants creating an avant garde and impractical display of fashion. In the background there are marble statues and checkerboard tiles creating a sense of class
>>
>>101341728
trying to get anything specific out of any model is extra hard.
>>
New sigma checkpoint (as of Saturday)
https://civitai.com/models/533100/ladies
Boobies
>>
>>
>>101342347
alright...I'll bite.
>>
>>101342389
and I'm out.
>>
File: ComfyUI_Kolors_00885_.png (1.75 MB, 1216x832)
1.75 MB
1.75 MB PNG
>>
File: 00111-2055578356.jpg (242 KB, 1024x1376)
242 KB
242 KB JPG
>>
>>
>>
>>101342424
Not enough boob?
>>
>>101342855
1) I hate that workflow
2) I really hate that workflow
3) Fuck that workflow
4) Under baked model =(
>>
>>101337506

Good morning sir, they were all tested on the same seed.
The ching chong prompt adherence is frankly a LOT better than English.
>>
>>101342389
>>101342424
>>101342855
>>101342905
Going to try UNO reverse card on this.
>>
>>101343105
My Disappointment Is Immeasurable And My Day Is Ruined
>>
File: kolors_00521_.png (3.8 MB, 1536x1536)
3.8 MB
3.8 MB PNG
>>
File: SDXLtoPixArt_0002.jpg (214 KB, 832x1216)
214 KB
214 KB JPG
Starting Item
>>
File: SDXLtoPixArt_0003.jpg (245 KB, 1040x1520)
245 KB
245 KB JPG
>>101343105
>>101343159
>>101343280
Finished with PixArt
>>
>>101343280
>>101343291
If you direct your attention to her hand. That's what happens to her chest if shown any signs of juice. Once again >>101343159
>>
>>101343280
>>101343291
>>101343159
If anyone is interested.

https://files.catbox.moe/utrhz2.json
>>
I just want to be able to do toes reliably, is that really too much to ask?

Whargarble.
>>
File: notcomfortableUI_00001_.png (1.27 MB, 832x1216)
1.27 MB
1.27 MB PNG
>>101343367
>>101343367
https://civitai.com/models/520661
>>>/fit/
>>
>>
File: 12.png (1.15 MB, 1152x1152)
1.15 MB
1.15 MB PNG
>>
>>101336875
excellent work anon
>>
File: ComfyUI_KolorsXL_0819.jpg (804 KB, 1792x2304)
804 KB
804 KB JPG
>>
>>101342663
Interesting feline
>>
>>101343448
Substantial improvement over what I was using.


Thanks.
>>
ded
>>
>>101344403
In that case, have some fresh bread...

>>101344420
>>101344420
>>101344420
>>
File: PA_0022.jpg (652 KB, 2560x1536)
652 KB
652 KB JPG
>>
File: PA_0023.jpg (699 KB, 2560x1536)
699 KB
699 KB JPG
>>
File: PA_0050.jpg (745 KB, 1664x2432)
745 KB
745 KB JPG
>>
File: SDXL_0003.jpg (616 KB, 1664x2432)
616 KB
616 KB JPG
>>
File: PA_0035.jpg (549 KB, 2560x1536)
549 KB
549 KB JPG
>>
File: PA_0038.jpg (933 KB, 2560x1536)
933 KB
933 KB JPG
>>
File: PA_0051.jpg (661 KB, 1664x2432)
661 KB
661 KB JPG
>>
File: PA_0045.jpg (644 KB, 1664x2432)
644 KB
644 KB JPG
>>
File: PA_0064.jpg (790 KB, 2560x1536)
790 KB
790 KB JPG
>>
File: PA_0057.jpg (535 KB, 3328x1152)
535 KB
535 KB JPG
>>
File: PA_0058.jpg (444 KB, 3328x1152)
444 KB
444 KB JPG
>>101344501
>>101344526
>>101344548
>>101344559
>>101344564
>>101344571
>>101344580
>>101344589
>>101344607
>>101344614
These are throwaways. Just filling the board.
>>
So is there any consensus on what's happening after Stable Diffusion 3 struck an iceberg?

Is the plan now to move to pixart?
>>
File: PA_0066.jpg (524 KB, 2560x1536)
524 KB
524 KB JPG
>>101344618
>>
File: PA_0068.jpg (578 KB, 2560x1536)
578 KB
578 KB JPG
>>
>>101344629
>>101344629
There was never a plan. Besides PixArt, everything else requires serious $$$ to train. PixArt is not better, but it can actually be trained on 2x4090s in a few months' time, and if you are willing to drop serious $$$ you could have a god tier model.
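The 4090 claim is plausible on paper. A rough mixed-precision AdamW footprint per parameter (my own rule of thumb, not anyone's benchmark: fp16 weights and grads, an fp32 master copy, two fp32 optimizer moments; activations and batch size not counted):

```python
def training_vram_gb(params_billion: float, bytes_weights: int = 2,
                     bytes_grads: int = 2, bytes_master: int = 4,
                     bytes_optim: int = 8) -> float:
    # Bytes per parameter across fp16 weights, fp16 grads, the fp32
    # master copy, and AdamW's two fp32 moment buffers.
    per_param = bytes_weights + bytes_grads + bytes_master + bytes_optim
    return params_billion * per_param  # 1e9 params * bytes / 1e9 = GB

# A PixArt-class model (~0.6B params) needs ~9.6 GB before activations,
# leaving headroom on a 24 GB card; multi-billion-param models don't fit.
print(training_vram_gb(0.6))
```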
>>
What's the pony stuff? Just MLP nonsense?

What happened with SD3? 1.5 is still okay and XL variants seem "fine."
>>
>>101344629
There are paths forward - Pixart, Lumina, Kolors or perhaps for some HunyuanDit.

No "consensus" i think, people will just train whatever can be trained for months and it'll either work out or not. The amount of training hardware the respective research teams and finetuners have varies.

>>101345459
sdxl finetune that got better anatomy / more poses / more nsfw tuned in for illustrations, but also changed the weights so much that it's essentially its own base model with regard to loras
>>
>>101345608
Thanks!



All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.