/g/ - Technology


Thread archived.
You cannot reply anymore.




File: tmp.jpg (1.28 MB, 3264x3264)
1.28 MB
1.28 MB JPG
General dedicated to creative use of free and open source text-to-image models

Previous /ldg/ bread : >>101301739

>Beginner UI
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
ComfyUI: https://github.com/comfyanonymous/ComfyUI

>Auto1111 forks
SD.Next: https://github.com/vladmandic/automatic
Anapnoe UX: https://github.com/anapnoe/stable-diffusion-webui-ux

>Kolors
https://gokaygokay-kolors.hf.space
Nodes: https://github.com/kijai/ComfyUI-KwaiKolorsWrapper

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Animation
https://rentry.org/AnimAnon
https://rentry.org/AnimAnon-AnimDiff
https://rentry.org/AnimAnon-Deforum

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>View and submit GPU performance data
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Share image prompt info
https://rentry.org/hdgcb
https://catbox.moe

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>
Blessed thread
>>
official pixart bigma and lumina 2 waiting room, now with kolors
>>
File: 1690080318567.jpg (823 KB, 2048x1024)
823 KB
823 KB JPG
Pretty damn accurate to prompt, and as far as character traits go honestly exactly how I would have rendered this as a "photorealistic", bing's text-to-style advantage still hard to comp though
>>
oh no i typoed that image text haha
me so chinee
>>
>>101312230
Damn, dall-e 3 got sovl
>>
I tried that infamous prompt with kolors, it's cheating a bit by hiding hands and only showing a portrait, but it's not cursed, best result so far of any base model
>>
File: 1693783440149.jpg (224 KB, 1024x1024)
224 KB
224 KB JPG
>>101312254
yeah it does
i'm pretty much just longing for the day I can recreate all the things it's given me locally
>>
How long ago did the Kolors team start their project? Feels odd to not release a DiT model desu.
>>
>>101312230
adding the prompt as text would be nice
>>
>>101312264
>i'm pretty much just longing for the day I can recreate all the things it's given me locally
I mean, you could train a lora on its output. I guess someone might've even done that for you already.
>>
File: PA_0025.jpg (642 KB, 2560x1536)
642 KB
642 KB JPG
>>
>>101312261
yeah, I'm really impressed by the anatomy, it's not perfect but for a base model it destroys everything that existed so far
>>
>>101312287
that's awesome. could i have the prompt if you don't mind?
>>
>>101312283
I've been thinking I should get into experimenting with that.

>>101312282
pastoral fantasy cottage,Fierce woman napping getting a tan,pale gold hair,long straight hair,blunt bangs,straight bangs,hime haircut,straight fringe,gold eyes,fierce eyes,leonine eyes,narrow face,toned triceps,NOT nude athletic female,fullfigured,curvy,great figure,body positive,ivory skin,bare arms,bare back,mature flat cartoon illustration,imperious,soft fields,rolling hills,tall grass,hedges,wildflowers,white walls,hazy mountains,watercolor painting muted wispy colors
>>
>>101312266
it's the same team that made Kling AI
>>
>>101312230
How did that Bing image not get dogged?
>>
File: file.png (2.1 MB, 1024x1024)
2.1 MB
2.1 MB PNG
the model even added a bit of cleavage in by itself, how immoral!
>>
>>101312351
It can do literally full nudity, unlike the prude SAI christians!
https://files.catbox.moe/vmjuus.webp
https://files.catbox.moe/d5iigy.webp
>>
>>101312297
indeed, second try gives even better insight
>>
File: PA_0031.jpg (820 KB, 2560x1536)
820 KB
820 KB JPG
>>101312300

A harbinger resembles a whirling mechanical model of a planetary system. Its sun, moons, and planets are crafted of tiny glowing jewels whose color shifts with the harbinger’s mood. Electric arcs and sparks of magical essence dance between its whirling pieces. The harbinger can dismantle itself into its component parts to squeeze through tiny gaps, its rings dissolving into glowing vapor while its star and planets remain. Harbingers shed light as brightly as a torch under normal conditions, but can dim to a faint glow if desired, and frequently spin off bits of their essence into dancing lights. A harbinger archon’s voice resembles the ringing of a dozen tiny bells, and regardless of the language the archon speaks, it retains that tinkling musical quality. Most harbinger archons form from souls that led simple lives of wisdom and piety before their deaths. They remember nothing of their mortal existences, but retain a decidedly practical, if parochial, view of worldly matters. Only rarely does a harbinger want for a parable or proverb apropos to the task at hand.


>>
File: long dick general.jpg (3.41 MB, 3264x3264)
3.41 MB
3.41 MB JPG
>>101312179
>>
>>101312357
Doesn't it feel good when a company isn't terrified of making a fully uncensored model and ends up making a great base model at the end? The chinks have way more balls than the fucking western cucks that's for sure
>>
>>101312179
>>101312370
never lucky
>>
>>101312351
since this worked so well I tried squatting and the result is fortunately covered in the right places
>>
https://www.youtube.com/watch?v=sjJST8jpqo4

i always laugh at this guy's funny ass voice
>>
>>101312384
Because their idea is to make it to the market first and stay on top defeating the competition. China must be #1
>>
>>101312402
>sd3
superseded by chang.
>>
>>101312402
oh wow, long time no see/hear from em'
god I was such a noob back when I watched his stuff
>>
File: 1695843471467.jpg (191 KB, 1024x1024)
191 KB
191 KB JPG
>>101312326
It's a First Month image, I assume that same prompt would get blocked now

I haven't touched bing at all in months, I had set up scripts to hammer it and just autogen images all day long and retry dogs. With enough brute force and delicate promptsmithing you could get it to pop some tiddies out
>>
File: PA_0034.jpg (697 KB, 2560x1536)
697 KB
697 KB JPG
>>
>>101312412
It does and I always said it, if the western cucks don't want to do the job proprely, the chinks will do it at their place and take all the praise, welp, fuck them, now we got a great base model and I hope great finetunes will come out from it
>>
its kooler timeeeee
>>
File: fsedrgfth.png (1.97 MB, 1691x646)
1.97 MB
1.97 MB PNG
>>101312402
i'm sorry pixart sexuals, but I unironically like the red sd3 cow
>>
>>101312459
>it can do a ballpoint sketch of a cow
>great, can it do anything else?
>...no
>>
>>101312459
pixart bigma will win you back
>>
File: 351tt50563bd1.png (1.51 MB, 1344x768)
1.51 MB
1.51 MB PNG
https://gokaygokay-kolors.hf.space/
Really nice
>>
>>101312447
They can release ChatGPT-4o with Scarlett Johansson voice and name it China#1FUSA as a model and app name and people would download it.

It will take ages for Scarlett to sue China.
>>
>>101312501
Exactly, they don't give a fuck about celebrities fee fees, they just want to make a great product, that's how it should be in the first place, the west definitely has fallen
>>
File: PA_0037.jpg (1.06 MB, 3328x1152)
1.06 MB
1.06 MB JPG
>>
>>101312519
They can't do the opposite where it will shit on China in the same model. Just different laws
>>
>>101312546
you can clown the China's president on the model though, and still, it's way less censored than SAI's model
>>
>>101312534
fuck off poseidon i'm not sacrificing any more bulls to you!!
>>
File: PA_0040.jpg (905 KB, 2560x1536)
905 KB
905 KB JPG
>>
>>101312370
LONG DICK
>>
>>
>>101312287
nice
>>
File: kolors1.jpg (388 KB, 1024x1024)
388 KB
388 KB JPG
>>
File: kolors2.jpg (397 KB, 1024x1024)
397 KB
397 KB JPG
>>
File: ComfyUI_KolorsXL_0045.jpg (745 KB, 1792x2304)
745 KB
745 KB JPG
>>
>>101313041
very good, damn
>>
>>101313041
I never thought we would finally find an alternative to SAI but here we go, thank you chinks ;-;
>>
>>101313041
how about that, huh
16ch VAE when?
>>
>>101313041
Okay, this one is impressive
>>
>>101313114
we already have a MiT 16ch VAE, it needs a bit of training to be adapted to Kolors though
>>
>>101313160
i know im just surprised that it seems like only one dude is working on that
bless him
>>
>>101313179
>bless him
yeah, we need more heroes like him, 16ch VAE anon if you read this, thanks a lot for your work
>>
what is a 16ch vae
>>
>>
>>101313199
16 channel VAE
>>
>>101313281
ok what does it do? every time I googled "vae" it says "makes your colors better" but I haven't noticed any difference between using one or not
i just don't get it
>>
>>101313199
its why sd3 (outside everything else) is so detailed and "crisp"
more channels = higher quality output
>>
File: file.png (1.71 MB, 967x1511)
1.71 MB
1.71 MB PNG
>>101313199
>picrel
>>
>>
>>101313304
I'm a simple layman, but from my understanding the VAE is responsible for decoding whatever is being generated. I guess you could say it's kind of like decompressing a file, except here it's decompressing an image's visual detail? More channels could imply it has more neuron layers, meaning a more complex decompression/decoding process, or being able to translate more visual detail from a smaller amount of data. See >>101313339
>>
>>101313304
it's the thing that turns the latent image (math and shit) into pixels you can see. i think the vae compresses the latent image and more channels mean it's less compressed and therefore more fine details are retained? someone correct me if im wrong
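
A rough back-of-the-envelope sketch of the compression being described, assuming the usual 8x spatial downsampling of SD-family VAEs, 4 latent channels for an SDXL-style VAE and 16 for an SD3-style one. The function name is made up for illustration and nothing here is taken from any model's actual code:

```python
def latent_stats(width, height, channels, downsample=8):
    """Return (latent_shape, compression_ratio) for an RGB image.

    Assumes the VAE shrinks each side by `downsample` and stores
    `channels` values per latent position.
    """
    lat_w, lat_h = width // downsample, height // downsample
    pixel_values = width * height * 3          # RGB values in the image
    latent_values = lat_w * lat_h * channels   # values the latent holds
    return (channels, lat_h, lat_w), pixel_values / latent_values


for name, ch in [("4ch (SDXL-style)", 4), ("16ch (SD3-style)", 16)]:
    shape, ratio = latent_stats(1024, 1024, ch)
    print(f"{name}: latent {shape}, {ratio:.0f}x fewer values than the pixels")
```

Under those assumptions a 16 channel latent keeps four times as many values per image as a 4 channel one, which fits the "less compressed, more fine detail" intuition above.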
>>
Wake me up when Kolors can run on my system desu
>>
>>101313458
*suffocates you with pillow*
>>
>>101313421
I think latents are the compressed state, and VAE is the decompression process. Picrel is supposedly an example visualization of how SDXL works.
>>101313472
Me too please.
>>
>>101313479
>VAE is the decompression process
oh i see, thank you anon. *suffocates you with a pillow*
>>
suffocate everyone who trains style loras with a trigger word
>>
>>101313339
>>101313396
>>101313421
intredasting, thanks. I've been using sdxl-vae-fp16-fix.safetensors, I'm thinking of ways to finish this sentence but I can't
>>
>>101313592
Protip, finish with a full stop. Simple as.
>>
>>101312179
>https://gokaygokay-kolors.hf.space
Is this censored or is it indicative of what I'd get with a local install of kolors? (it's not giving me booba)
>>
>>101313638
it's kinda censored, but easy to break. sometimes it gives u fucked up looking nips though.
>>
I'm using fixFP16ErrorsSDXLLowerMemoryUse_v10, and I don't even remember where I got it from.
>>
File: ComfyUI_KolorsXL_0093.jpg (720 KB, 1792x2304)
720 KB
720 KB JPG
with Kolors upscale pass instead of XL. I prefer the XL upscale of a Kolors base gen
>>
>>101313704
Can you give an example of something non-realistic? Anime, cartoon, anything stylized. Whatever you're doing with realism seems to work well, so I wonder if your approach improves on other styles.
>>
>>101313382
>pixart_every person has two lives
but I've not a single one
>>
File: ComfyUI_KolorsXL_0107.jpg (945 KB, 1792x2304)
945 KB
945 KB JPG
>>101313719
there is nothing special im doing though

havent tested too much anime on it but it can do backgrounds it seems?
>>
>>101313819
pretty good
>>
File: file.png (3.48 MB, 1920x2176)
3.48 MB
3.48 MB PNG
>>
>>101313867
looks very NFT pilled and pretentious enough for a cybermonkey collector to buy it
>>
File: ComfyUI_KolorsXL_0113.jpg (602 KB, 1792x2304)
602 KB
602 KB JPG
>an illustration of Son Goku
>>
File: ComfyUI_KolorsXL_0115.jpg (770 KB, 1792x2304)
770 KB
770 KB JPG
>an illustration of Mnokey D Luffy
>>
File: tmpdiec36pt.png (2.93 MB, 4032x932)
2.93 MB
2.93 MB PNG
>>
File: file.jpg (1.03 MB, 1920x2176)
1.03 MB
1.03 MB JPG
>>101313893
precisely what im going for, nice
will likely switch up styles soon tho
>>
File: tmpdiec36pt.png (2.8 MB, 2352x2304)
2.8 MB
2.8 MB PNG
>>
File: tmpdiec36pt.png (2.54 MB, 2352x2304)
2.54 MB
2.54 MB PNG
>1girl and a girl are retarded
>girl is fine
this amuses me
>>
File: tmpdiec36pt.png (3.17 MB, 2352x2304)
3.17 MB
3.17 MB PNG
>>
File: file.png (3.2 MB, 1920x2176)
3.2 MB
3.2 MB PNG
a guy propositioned me once to mint a gen but it seemed like a scam
not minting as a whole but his specific thing
>>101314103
interesting find i would say even the lens flare looks better as well
>>
File: ComfyUI_temp_auxcf_00006_.png (2.23 MB, 1120x1472)
2.23 MB
2.23 MB PNG
testing kolors
>>
File: ComfyUI_temp_auxcf_00009_.png (1.95 MB, 1120x1472)
1.95 MB
1.95 MB PNG
>>
>>101313927
What a weird mix of Kid Goku and Goku.
>>
File: ComfyUI_temp_auxcf_00020_.png (2.08 MB, 1120x1472)
2.08 MB
2.08 MB PNG
>>
File: file.jpg (629 KB, 1920x2176)
629 KB
629 KB JPG
>>
qrd on kolors? it works in comfy-ui only, right?
>>
File: ComfyUI_temp_islql_00001_.png (2.71 MB, 1120x1472)
2.71 MB
2.71 MB PNG
>>
File: ComfyUI_temp_islql_00006_.png (2.38 MB, 1120x1408)
2.38 MB
2.38 MB PNG
>>
File: ComfyUI_temp_islql_00018_.png (3.64 MB, 1472x1472)
3.64 MB
3.64 MB PNG
>>
File: ComfyUI_temp_islql_00021_.png (3.58 MB, 1472x1472)
3.58 MB
3.58 MB PNG
>>
File: ComfyUI_temp_islql_00023_.png (3.68 MB, 1472x1472)
3.68 MB
3.68 MB PNG
>>
File: image.jpg (292 KB, 1344x768)
292 KB
292 KB JPG
>ancient greek femboy in style of modern anime
kolor does indeed have a closeup bias
>>
File: ComfyUI_temp_islql_00030_.png (3.15 MB, 1280x1536)
3.15 MB
3.15 MB PNG
>>
>>101314478
yeah, A1111 is always 2 years behind
>>
>>101314813
I'm a simple anon, if nta. I'll be satisfied once it runs like forge.
>>
>>101314780
https://youtu.be/yDhjCOFan5E?t=2
>>
>>101314834
forge is dead no? the guy making the repo said he's focusing on gradio 4 and some shit and that every extension will be broken in consequence
>>
File: file.png (3.34 MB, 1920x2176)
3.34 MB
3.34 MB PNG
>>
How much vram do I need to run kolors, more than SD3?
>>
>>101314948
20gb of vram, that's because it's using the LLM as well, but that can probably be optimised (put the LLM on the cpu + quantize)
>>
File: kek.jpg (307 KB, 1363x935)
307 KB
307 KB JPG
https://xcancel.com/cloneofsimo/status/1809996950420480126#m
>AuraDiffusion
looks like this guy is trying to fix SD3M or something?
>>
>>101314834
you can try the demo to see how good it is though
https://gokaygokay-kolors.hf.space/
>>
>>101314967
>20gb of vram
welp, It's painful being a vramlet ;_;
>>
>>101315017
like I said, it'll be optimised, if you make a 4bit LLM, put that on the cpu and use the 8bit unet, it'll ask probably for 6gb of vram, that would work for you but you need to be patient kek
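
A rough sketch of the arithmetic behind those numbers, counting weights only (activations, the VAE and framework overhead come on top). The parameter counts are assumptions, not figures from this thread: roughly 6.2B for a ChatGLM3-6B-class text encoder and roughly 2.6B for an SDXL-sized U-Net.

```python
def weight_gb(params_billion, bits):
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits / 8


# Assumed parameter counts (in billions), for illustration only.
LLM_B = 6.2    # ChatGLM3-6B-class text encoder
UNET_B = 2.6   # SDXL-sized U-Net

fp16_total = weight_gb(LLM_B, 16) + weight_gb(UNET_B, 16)
unet_int8 = weight_gb(UNET_B, 8)   # what stays on the GPU
llm_int4 = weight_gb(LLM_B, 4)     # offloaded to system RAM

print(f"everything in fp16 on GPU: ~{fp16_total:.1f} GB of weights")
print(f"8bit unet on GPU:          ~{unet_int8:.1f} GB of weights")
print(f"4bit LLM in system RAM:    ~{llm_int4:.1f} GB of weights")
```

With these assumed sizes the fp16 weights alone land close to the 20gb figure, and an 8bit unet leaves a few GB of headroom inside a 6gb budget.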
>>
File: image (2).png (1.99 MB, 1152x1024)
1.99 MB
1.99 MB PNG
>>101314990
doesnt give you many attempts unfortunately
>>
>>101315079
yeah I know, still better than learning how to use spaghetti Ui shit kek
>>
>>101314913
Rumors say the dev branch of auto is nowhere close to forge performance, so I'm about to check it myself. If I won't be able to comfortably use Pony/SDXL, I might be forced to stick with it anyway, or switch to something else, even though none of the alternatives suit me.

Why is it so fucking hard for ui devs to get inpainting right.
>>
>>101315124
I don't know man... it's so frustrating as well, if only there was a mix between ComfyUi backend with A1111 frontend, that would be the best of both worlds
>>
>>101314782
This one is pretty good. What was the prompt?
>>
>>101315137
>>101315137
>ComfyUi backend with A1111 frontend, that would be the best of both worlds
Currently either Metastable or StableSwarm are our best alternatives in that regard.
>>
>>101315147
yeah I know those ones but that's not exactly A1111 in the frontend, I really don't want to change, the first guy willing to make a real A1111 + ComfyUI will make bank that's for sure
>>
File: tmp27s6obby.png (349 KB, 512x512)
349 KB
349 KB PNG
>>101315179
ikr
>>101315124
>Rumors say the dev branch of auto is nowhere close to forge performance
1024x1024 took me ~13s with Forge on 8vram
dev branch A1111 took ~1 minute.. for a 512x512

mfw
>>
>>101313396
The VAE is like zipping for latent spaces. You take an image and zip it into latent space and pair it with a caption during training. During inference take a prompt, generate a latent image for that prompt and then unzip it with the VAE.
>>
Adam Mini is my new favorite optimizer, at bfloat16 it's lighter weight compared to 8bit AdamW and it seems to have a more pleasant result.
>>
>>101315388
but is it better than when anon CAME?
>>
>>101315393
CAME is way more heavy especially with the 1.3B Pixart model, it also seems to have similar quality to CAME.
>>
>>101315405
>>101315388
have you compared the loss function for the both of them to be sure CAME and adam mini are equivalent?
>>
>>101315416
I like to eye ball it and use my emotions. I like how Mini trains although I think Sophia is the best but it's very unstable.
>>
>>101315427
And CAME is simply unusable if you talk strictly about performance, Mini can do more than double CAME's batch size.
>>
File: image (3).png (1.74 MB, 1152x1024)
1.74 MB
1.74 MB PNG
>>
>>101315527
yes
>>
File: 00079-16598489.jpg (721 KB, 2432x1664)
721 KB
721 KB JPG
>>
File: PA_0443.jpg (732 KB, 2560x1536)
732 KB
732 KB JPG
>>
File: 0.jpg (257 KB, 1024x1024)
257 KB
257 KB JPG
>>
File: file.png (2.3 MB, 1920x2176)
2.3 MB
2.3 MB PNG
>>
>>101315179
What about Fooocus? It's not abandoned, is it?
>>
File: kolors_00177_.png (1.26 MB, 1024x1024)
1.26 MB
1.26 MB PNG
How are we feeling about kolors? I'm genuinely undecided.
>>
File: PA_0448.jpg (832 KB, 2560x1536)
832 KB
832 KB JPG
>>
File: kolors_00181_.png (1.26 MB, 1024x1024)
1.26 MB
1.26 MB PNG
>>
>>101316034
love it
>>
>>101316023
quantization_code = "RANDOMSHITGO!"

kernels = Kernel(
    bz2.decompress(base64.b64decode(quantization_code)),
    [
        "int4WeightCompression",
        "int4WeightExtractionFloat",
        "int4WeightExtractionHalf",
        "int8WeightExtractionFloat",
        "int8WeightExtractionHalf",
    ],
)



yep looks great to me can't wait to use it
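
For anyone puzzled by the snippet: the pattern is a long base64 string wrapping a bz2-compressed binary blob, which gets decoded at import time and handed to the kernel loader. A minimal stdlib sketch of that round trip follows; `payload` is a made-up stand-in, not the contents of the real string:

```python
import base64
import bz2

# Stand-in for the real blob; the actual contents are not shown in the thread.
payload = b"__global__ void example() {}"

# How such a string could be produced: compress, then base64-encode to text.
quantization_code = base64.b64encode(bz2.compress(payload)).decode("ascii")

# What the quoted code does with it: base64-decode, then decompress.
decoded = bz2.decompress(base64.b64decode(quantization_code))

print(len(quantization_code), "chars of text ->", len(decoded), "bytes")
```

The point is only that the string is opaque until decoded, so you can't review what it contains by reading the source file.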
>>
>>101316023
It's okay, but it's also kind of whatever for as long as it doesn't have proper UI support, finetuning infrastructure and lower spec requirements.
>>
File: PA_0449.jpg (670 KB, 2560x1536)
670 KB
670 KB JPG
>>
>>101316063
Can you elaborate for the retards
>>
>tfw anon mentioned me
>>
File: image7.jpg (222 KB, 1024x1024)
222 KB
222 KB JPG
>>101316023
quite good, it does really good fireballs.
>>
File: kolors_00187_.png (1.23 MB, 1024x1024)
1.23 MB
1.23 MB PNG
>>
>>101316086
Y'know I'm somewhat of a retard myself. Some yahoo on the internet called it out and I'm his echo.
>>
File: image.jpg (181 KB, 1344x768)
181 KB
181 KB JPG
>glowing wraith made of goo, attractive, pretty, feminine, cute, surrounded by darkness
>>
File: PA_0451.jpg (788 KB, 2560x1536)
788 KB
788 KB JPG
>>101316058
is this one better?
>>
File: file.png (2.07 MB, 1920x2176)
2.07 MB
2.07 MB PNG
>>101316023
so/so but it can make nice images like >>101304716 >>101307281 >>101307693 >>101309054 >>101309754 >>101313041 >>101314782 >>101316113
we'll see what happens
>>101316156
both are cool but im an abstract anon myself
>>
>>101316023
>no training code
>no training examples
>no info about dataset
>bullshit china-license
>bad at following prompts
>u-net
>bloated with china LLM

Idk bros. I'm not that excited.
>>
File: image (1).jpg (168 KB, 1344x768)
168 KB
168 KB JPG
>female wraith made of glowing goo, fullbody artwork, stylized cartoon, dutch angle, from below, standing
Eh. Think I'll just wait for the next Pony release. Then again, it's literally:
>Effective Training of Diffusion Model for PHOTOREALISTIC Text-to-Image Synthesis
>>101316205
Feel you.
>>
>>101316218
You're shooting yourself in the foot by prompting "full body" with a landscape aspect ratio
>>
File: PA_0453.jpg (953 KB, 2560x1536)
953 KB
953 KB JPG
>>101316178
get me out of monster gens. drop me in to abstract ones
>>
>>101316239
But horizontals have some of the most interesting full body compositions. Pony gets it.
>>
File: kolors_00215_.png (1.13 MB, 1216x768)
1.13 MB
1.13 MB PNG
>>
>>101316250
Do you have to include something like "laying" or does pony do it
>>
>>101316205
>u-net
Why exactly is u-net bad?
>>
>>101316275
DiT is newer therefore better
>>
File: image9.jpg (276 KB, 1024x1024)
276 KB
276 KB JPG
>>101316256
at least it knows some copyrighted characters, i wonder if it knows any popular artists
>>
>>101316311
I think you need to be mega famous to be recognized by the model. I'm pretty sure all of the tagging is done by LLM, so if the LLM doesn't know what megumin is, it won't produce megumin even if it's in the dataset.
It can probably produce old public domain artists like Picasso, but don't expect your favorite furry inflation artist to be in there.
>>
File: file.png (1.66 MB, 1920x2176)
1.66 MB
1.66 MB PNG
>>
>>101316311
Jackie Chan punching Xi Jinping
>>
File: kolors_00221_.png (1.15 MB, 1216x768)
1.15 MB
1.15 MB PNG
>>
File: PA_0457.jpg (868 KB, 2560x1536)
868 KB
868 KB JPG
>>
File: tmpwo3zfng6.png (1.1 MB, 1344x768)
1.1 MB
1.1 MB PNG
>>101316262
>Do you have to include something like "laying" or does pony do it
>>
File: file.png (1.5 MB, 1920x2176)
1.5 MB
1.5 MB PNG
 surfeit, drawing sketch lithograph of a young woman standing in an empty field, 
>>
File: PA_0460.jpg (955 KB, 2560x1536)
955 KB
955 KB JPG
>>
File: PA_0462.jpg (852 KB, 2560x1536)
852 KB
852 KB JPG
>>101316512
young grill in an empty field
>>
>>101316546
heheh nice
>>
>>101316141
No, really. I don't understand what that specific code means.
>>
>>101316583
A code to run "randomshitgo!" At kernel level
>>
File: PA_0467.jpg (524 KB, 2560x1536)
524 KB
524 KB JPG
>>
File: PA_0468.jpg (551 KB, 2560x1536)
551 KB
551 KB JPG
>>
File: PA_0469.jpg (575 KB, 2560x1536)
575 KB
575 KB JPG
>>
File: 0.jpg (358 KB, 1024x1024)
358 KB
358 KB JPG
>>
File: image.png (3.27 MB, 2582x794)
3.27 MB
3.27 MB PNG
>>
File: PA_0536.jpg (843 KB, 2560x1536)
843 KB
843 KB JPG
>>
File: PA_0537.jpg (1.06 MB, 2560x1536)
1.06 MB
1.06 MB JPG
>>
File: PA_0538.jpg (1000 KB, 2560x1536)
1000 KB
1000 KB JPG
>SIX-FINGERED HAND, THE
>This section presents a cult with a simple if ominous motto:
>“The end of the world isn’t merely at hand—it’s at our fingertips.”
>Agenda: Destroy the world.
>Structure: A loosely organized cult of lunatics.
>>
File: PA_0539.jpg (939 KB, 2560x1536)
939 KB
939 KB JPG
>>101316931
Oops, ignore that one please...
>>
File: file.png (1.24 MB, 2304x1792)
1.24 MB
1.24 MB PNG
>>
>>101317133
I'm guessing you are a fan of one line drawings
>>
>>101317212
yeah
>>
so artistic
look at the line
wow
>>
im artist
>>
File: file.png (515 KB, 1152x896)
515 KB
515 KB PNG
>as you can see from the chart behind me, it says i am real artist
>>
File: kolors_00243_.png (1.29 MB, 1216x768)
1.29 MB
1.29 MB PNG
>>
File: kolors_00262_.png (1.36 MB, 1216x768)
1.36 MB
1.36 MB PNG
>>
>>101316786
Do I want to know what it would look like if you went beyond 1?
>>
>>101316668
Thought that was the One Ring for a moment
>>
I've been working on training a niche realistic fetish lora, on a variety of models. I'm just here to say that bigASP has fucking insane variety and "authenticity", for lack of a better term. A basic prompt makes each image look completely different. Different ethnicities, faces, hair, clothing. Sometime it fucks up with anatomy or other things, the model seems a bit undertrained. But when it hits, it looks like an actual amateur photo.

Whatever this guy did, just improve it: even more training images, optimized hyperparameters, longer training. Someone get him some funding.
>>
File: tmpjr3bqlng.png (1.32 MB, 896x1152)
1.32 MB
1.32 MB PNG
Fellas, is it unhealthy to share your drink with your friend?
>>
>>101317815
bigASP the what now? Sdxl?
>>
File: PA_0543.jpg (893 KB, 3328x1152)
893 KB
893 KB JPG
>>
File: PA_0545.jpg (932 KB, 3328x1152)
932 KB
932 KB JPG
>>
File: Kolors_00011.png (1.23 MB, 832x1216)
1.23 MB
1.23 MB PNG
>>
File: file.jpg (848 KB, 1920x2048)
848 KB
848 KB JPG
>>
File: PA_0548.jpg (663 KB, 2560x1536)
663 KB
663 KB JPG
>>
>>101317815
>>101318151
https://civitai.com/models/502468/bigasp-v1 ?
>>
File: 3974805805.jpg (68 KB, 768x768)
68 KB
68 KB JPG
>>
>>101318340
Since we are somewhat on the topic.

How do you test new models?

Basic prompt
Ksampler Efficient
CFG, STEPS, Scheduler
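
One way to make that systematic is to fix the prompt and seed and sweep the rest as a grid. A minimal sketch; every value below is a hypothetical placeholder (the scheduler names are just examples), so swap in whatever the model card recommends:

```python
from itertools import product

# Hypothetical sweep values, chosen only for illustration.
cfgs = [3.0, 5.0, 7.0]
steps = [20, 30]
schedulers = ["normal", "karras"]

# Same prompt and seed for every run so only the settings differ.
grid = [
    {"seed": 1234, "cfg": cfg, "steps": n, "scheduler": sched}
    for cfg, n, sched in product(cfgs, steps, schedulers)
]

print(len(grid), "runs")
for run in grid[:3]:
    print(run)
```

Each dict can then be fed to whatever sampler node or script you use, and the outputs laid out as a comparison grid.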
>>
File: Grid.jpg (879 KB, 5760x2048)
879 KB
879 KB JPG
>>
>>101315079
just restart your modem, you should have a dynamic ip address so you'll get a new one everytime you turn it on
>>
>>101318679
Cool little guy
>>
i desire a local model with the anatomy of kolors and the sovl plus architecture of pixart
>>
File: montage.jpg (2.24 MB, 8294x958)
2.24 MB
2.24 MB JPG
>>101317793

oh god
>>
>>101319289
So basically Pony
>>
>>101319608
>When she's on her period
>>
>>101319608
Let's go even higher and further beyond
>>
>>101319609
>Pony
>sovl
no
>>
File: montage.jpg (1.59 MB, 7680x591)
1.59 MB
1.59 MB JPG
>>101319621
this one is even funnier

> i forgive you <3 !
>was what you thought i was going to say !!!
>>
>>101319628
See >>101319608 up from 0.85
That, dear anon, screams sovl to me.

Lack of sovl in Pony gens is nothing but a skill issue.
>>
>>101319735
no, sorry, i like pony but that's not sovl. she has that same plastic yucky look in her eyes that's in every 2.5d pony slop gen.
>Lack of sovl in Pony gens is nothing but a skill issue.
maybe, but i'd say most of them look soulless and it's not the prompter's fault.
>>
File: tmpg2gj3jjx3.png (565 KB, 900x1024)
565 KB
565 KB PNG
>>
File: tall2.jpg (230 KB, 1624x1120)
230 KB
230 KB JPG
Gen some surrealism
>>
File: Kolors_00006.png (1.05 MB, 832x1216)
1.05 MB
1.05 MB PNG
>>
>>101319735
>>101319950

you can use fucking sketches in img2img, what the f are you talking about
>>
>>101316205
>no info about dataset
who cares about that? no one shares the dataset for the pretraining because they do it on copyrighted images
>>
>>101316275
if they did the same training on DiT it would be even better, which is a shame because it's the best base model we got so far
>>
File: Kolors_00010.png (1.23 MB, 832x1216)
1.23 MB
1.23 MB PNG
>>
>>101320934
what does that have to do with the sovl factor of a model? i'm talking about pure prompting text to image + refinement with inpainting. pony sloppa still lose
>>
File: Kolors.jpg (1.58 MB, 4924x1728)
1.58 MB
1.58 MB JPG
For a base model, Kolors is blowing everything out of the water, too bad it's still a unet model and the prompt understanding is not great, it's like we're having a base SDXL model if SAI were actually not cucked and more competent
>>
>>101321304
>it's like we're having a base SDXL model if SAI were actually not cucked and more competent
pretty much it. there's another stable cascade based model that hasn't been released yet, so keep an eye out for that too.
>https://jingjingrenabc.github.io/ultrapixel/
>>
File: Kolors_00009.png (1.33 MB, 832x1216)
1.33 MB
1.33 MB PNG
>>
>>101321341
Oh wow, finally we got models that can render high resolution pictures, that's what it actually needs, more pixels to get good details everywhere, and we can see it works well, the hands look good on that one. Do you know if they
>>
>>101321341
>>101321369
... Do you know if they plan on releasing the model or not?
>>
>>101321378
...no i don't actually
>>
Does anyone know what model is this person using? specifically for the faces (NSFW) https://x.com/Doreiko_Ai
>>
>>101321593
novelai, maybe? i think you'll have better luck asking this in the >>>/h/ and >>>/d/ generals
>>
File: 0.jpg (611 KB, 2048x1024)
611 KB
611 KB JPG
>>
>>
File: SDXL_0001.jpg (860 KB, 2048x2048)
860 KB
860 KB JPG
>>101318340
The only decent image I got out of this thing
>>
>>101321700
NTA but /h/ and /d/ are weird... not like the things they are into, but the way they treat diffusion models is weird as fuck. They're all about muh refiners and muh known characters
>>
File: SDXL_0002.jpg (881 KB, 2048x2048)
881 KB
881 KB JPG
>>101322249
>>
>>101322257
>They're all about muh refiners and muh known characters
to be fair, both are pretty important for porn pictures
>>
what vae should i use for pony?
>>
File: 0.jpg (337 KB, 1024x1024)
337 KB
337 KB JPG
>>
File: SDXL_0006.jpg (768 KB, 2048x2048)
768 KB
768 KB JPG
>>101322249
upscaler fuck up
>>
>>101322307
sdxl vae?
>>
File: orb1i.jpg (203 KB, 1304x1304)
203 KB
203 KB JPG
>>
dead worthless general no one ever wanted or asked for
>>
File: 0.jpg (310 KB, 1024x1024)
310 KB
310 KB JPG
>>
>>101322679
zzzzzz.... mimimimi... zzzzzz.... mimimimi
>>
>>101322679
Go back to your hugbox general where you can spam 1girls with a bunch of 14 year olds :)
>>
>>101321948
Neat
>>
>>101316063
What the fuck is this?
>>
this thread has better images
>>
>>101323287
What do you think it is?
>>
>>101323415
4bit quantization for Kolors
>>
>>101312179
How far along is your model, anon?
>>
>>101323427
quantizing a unet architecture to 4bit is a really bad idea, it has been tried before and it was a disaster
>>
File: 00330-3467391478.jpg (920 KB, 1260x1680)
920 KB
920 KB JPG
>>
>>101323524
Good job it only applies to the large language model then
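
For the curious, a toy sketch of what 4bit weight quantization means in general: scale the weights so they fit in a small integer range, then pack two 4-bit values into each byte. This is a generic absmax scheme written for illustration, not the scheme the Kolors or ChatGLM code actually uses, and all function names are hypothetical:

```python
def quantize_int4(weights):
    """Symmetric absmax quantization to integers in [-7, 7]."""
    scale = max(abs(w) for w in weights) / 7 or 1.0  # avoid a zero scale
    q = [max(-7, min(7, round(w / scale))) for w in weights]
    return q, scale


def pack_nibbles(q):
    """Pack pairs of 4-bit values (offset into 0..15) into single bytes."""
    if len(q) % 2:
        q = q + [0]  # pad to an even count
    return bytes(((a + 8) << 4) | (b + 8) for a, b in zip(q[::2], q[1::2]))


def unpack_nibbles(data, n):
    """Inverse of pack_nibbles, returning the first n values."""
    out = []
    for byte in data:
        out += [(byte >> 4) - 8, (byte & 0x0F) - 8]
    return out[:n]


weights = [0.7, -0.3, 0.1, 0.0, -0.7]
q, scale = quantize_int4(weights)
packed = pack_nibbles(q)
restored = [v * scale for v in unpack_nibbles(packed, len(weights))]

print("quantized:", q)
print("packed bytes:", len(packed), "for", len(weights), "weights")
print("restored:", [round(r, 3) for r in restored])
```

The restored values are only approximations of the originals, which is the precision loss being argued about above.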
>>
File: 01582-710939302.jpg (840 KB, 1814x1210)
840 KB
840 KB JPG
>>
File: 00325-3467391480.jpg (501 KB, 1260x1680)
501 KB
501 KB JPG
>>
>>101323611
cool
>>
>>101323842
I like this one better
>>101323658
>>
>>101323842
>>101323949
ty dudes
>>
>>101319608
Interesting how .85 looks odd compared to the numbers around it
>>
File: 0.jpg (503 KB, 1024x1024)
503 KB
503 KB JPG
>>
>>101319608
based reddit gen
>>
>>101322384
>>101322644
>>101323629
>>101324241
pondering
>>
File: orb6i.jpg (267 KB, 1304x1304)
267 KB
267 KB JPG
>>101324477
>>
>>101322679
anon sleeping
>>
>>101324977
working
>>
>>101313199
VAE designed by members of 16chan
>>
>>101313339
Why aren't we doing 32 channels already, since it clearly looks better than 16? VRAM requirement issues?
>>
>>101325133
=(

Hope you had your normal amount of coffee today
>>
>>101325401
You can do it on T5 in PixArt.
>>
>>101325484
wdym?
>>
>>101325401
People are already saying 16 channels is harder to train and converge with.
>>
>>101325511
Forget what my retarded ass said. That's why I don't like to get up from my chair.
>>
>>101325401
higher vae channels makes it harder to train the model i believe
>>
>image.webp
could you not
>>
>>101325719
try right click copy image to clipboard and then paste it onto the reply box
>>
File: ComfyUI_Kolors_00645_.png (1.31 MB, 1216x832)
1.31 MB
1.31 MB PNG
What are you fellows drinking today?
>>
File: file.png (1.6 MB, 1024x1024)
1.6 MB
1.6 MB PNG
>>101325759
water (no orange slices)
>>
Hibernation mode
>>
>>101326483
Nothing to talk about and we are all out of fresh prompts
>>
File: ComfyUI_Kolors_00679_.png (1.52 MB, 1216x832)
1.52 MB
1.52 MB PNG
half-assed prompt for kolors trying to get it to make samus aran since i'm playing super metroid.
>>
>>101326840
Looks good, got any more?
>>
File: file.png (1.42 MB, 1920x2048)
1.42 MB
1.42 MB PNG
>>
File: ComfyUI_Kolors_00688_.png (1.62 MB, 1216x832)
1.62 MB
1.62 MB PNG
>>101326915
A few but they're basically all the same sort of thing.
>>
File: file.png (2.41 MB, 1920x2048)
2.41 MB
2.41 MB PNG
>>
do you guys think you'll be able to run the new lumina model when it releases?
>>
>>101327048
Biden's signature after his stroke.
>>
>>101327070
i imagine not
>>
File: file.png (1.45 MB, 1920x2048)
1.45 MB
1.45 MB PNG
>>
>>101327070
Don't think so. Damn these 4090 prices
>>
>>101327612
Get AMD code up to speed and we can live rich with two 7900xtx
>>
>>101327627
Need a miracle for it.
>>
>>101327651
ran out of those in 2020
>>
File: CLD.20241.jpg (181 KB, 1056x1400)
181 KB
181 KB JPG
>>
File: ComfyUI_00004_.png (325 KB, 512x512)
325 KB
325 KB PNG
>>
File: PA_0001.jpg (499 KB, 2560x1536)
499 KB
499 KB JPG
>>
File: Kolors_00012.png (1.36 MB, 1024x1024)
1.36 MB
1.36 MB PNG
>>
File: Kolors_00014.png (1.34 MB, 1024x1024)
1.34 MB
1.34 MB PNG
>>
>>101312179
Is there anything I can run in a Docker Container that uses CPU?
Also, does anyone know if using OpenCL requires any special kind of Docker configuration to access that hardware?
I've got an RK3588 that I'd like to try running Diffusion on.
Yes, I know it'll be slow as balls, but if there's software that incorporates upscaling low-res images, it'd be nice to be able to host that on my RK3588 homeserver as opposed to getting queued on online services (which generally don't provide you with a lot of control).
>>
File: Kolors_00018.png (1.34 MB, 1024x1024)
1.34 MB
1.34 MB PNG
>>
>>101329048
SD models will be around 1~2 min for 512x512
Pixart 3~5 min for 1024x1024
>>
Straight from the oven...
>>101329150
>>101329150
>>101329150
>>
So I'm using easy diffusion, my stable diffusion install took a shit when KDE took a shit and I tried to update everything. No other GUI will work (I got used to Automatic but it was six months ago). Now with Easy it's...easy, and does what I want, but I don't know how to integrate something like adetailer to fix faces, and its built-in face "fix" is a literal horror show


wat do

I'm dumb, so...
>>
File: prompt.png (2.39 MB, 1377x688)
2.39 MB
2.39 MB PNG
Time to start learning Chinese. The meh prompt adherence seems to stem from the model's understanding of English being inferior to Chinese.

Left is English prompt, right is chinese.

A dog next to a cybernetic cat, the dog has a coat with the number "1" written on it |
一只狗旁边有一只赛博猫,狗身上穿着一件写有数字“1”的外套
>>
File: pro.png (1.68 MB, 1139x574)
1.68 MB
1.68 MB PNG
Jesus Christ riding a stegosaurus while smoking a cigar and wielding a thor's hammer. He is calling down lighting in the chaotic scene around him |
耶稣基督骑着一只剑龙,一边抽着雪茄,一边挥舞雷神之锤。他在混乱的场景中召唤闪电
>>
>>101330324
Can I just feed it to Google translate?
>>
File: pro.png (1.44 MB, 1031x515)
1.44 MB
1.44 MB PNG
>>101330378
I'm afraid not. You must now learn Mandarin chinese.

sonic the hedgehog riding in a new york taxi while giving a peace sign |
索尼克坐在纽约出租车里,做着胜利手势
>>
>>101330398
heh heh time to see if these yellow devils have a word for vagina
>>
>>101330324
>>101330359
>>101330398
why are they downscaled?
>>
>>101330507
I just screen capped the output because I couldn't be bothered to save and stitch the images together and I didn't have an output grid.
>>
>>101313819
Shadows make no sense
>>
>>101331743
Welcome to AI


