/g/ - Technology


Thread archived.




File: tmp.jpg (1.07 MB, 3264x3264)
General dedicated to creative use of free and open source text-to-image models

Previous /ldg/ bread : >>101282848

>Beginner UI
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
SwarmUI: https://github.com/mcmonkeyprojects/SwarmUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
ComfyUI: https://github.com/comfyanonymous/ComfyUI

>Auto1111 forks
SD.Next: https://github.com/vladmandic/automatic
Anapnoe UX: https://github.com/anapnoe/stable-diffusion-webui-ux

>Pixart Sigma & Hunyuan DIT
https://huggingface.co/spaces/PixArt-alpha/PixArt-Sigma
https://huggingface.co/spaces/Tencent-Hunyuan/HunyuanDiT
Comfy Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Use a VAE if your images look washed out
https://rentry.org/sdvae

>Models, LoRAs & training
https://civitai.com
https://huggingface.co
https://aitracker.art
https://github.com/Nerogar/OneTrainer
https://github.com/derrian-distro/LoRA_Easy_Training_Scripts

>Animation
https://rentry.org/AnimAnon
https://rentry.org/AnimAnon-AnimDiff
https://rentry.org/AnimAnon-Deforum

>Index of guides and other tools
https://rentry.org/sdg-link
https://rentry.org/rentrysd

>View and submit GPU performance data
https://vladmandic.github.io/sd-extension-system-info/pages/benchmark.html
https://docs.getgrist.com/3mjouqRSdkBY/sdperformance

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Share image prompt info
https://rentry.org/hdgcb
https://catbox.moe

>Related boards
>>>/h/hdg
>>>/e/edg
>>>/d/ddg
>>>/b/degen
>>>/vt/vtai
>>>/aco/sdg
>>>/trash/sdg
>>
File: ComfyUI_00005_.png (2.77 MB, 2304x960)
I think XL can do decent backgrounds; it heavily depends on the checkpoint, though
>>
Blessed thread of frenship
>>
>>101292106
Post the Asuka.
>>
File: file.jpg (773 KB, 1664x2304)
>>
File: 00009-689655149.jpg (420 KB, 1576x1120)
>>
File: ComfyUI_00007_.png (2.83 MB, 2016x1152)
>>
File: 00033-792981613.jpg (1.17 MB, 2016x2592)
>>101292206
no, but I give you a different one
>>
nice, I got in the OP
>>
>>101292365
nice
>>
File: file.png (1.41 MB, 832x1152)
>faglooo
>>
File: 00080-2179116166.png (838 KB, 1024x1024)
I finally made an image look like it's actually at night. Some things that helped were adding backlighting and highlights to the negative and cranking them up a bit, plus the vectorscope.
>>
>>101292856
when I messed around with the low light lora, the adetailer would then turn a meh face into a derp face. the lora worked best as a mix of shadow and light vs just dark
>>
>>101292900
Which lora was that?
https://civitai.com/models/375839?modelVersionId=419709
This one?
>>
File: file.png (1.12 MB, 832x1152)
>>
>>101292939
https://civitai.com/models/432511
>>
>>101292365
I made a pic from the op a few months ago, I didn't expect anyone to like it.
>>
>>101293010
here is a fun one. 1.5 contrast loras to da max
>>
File: kolors.jpg (180 KB, 766x767)
does anyone know how to run this on comfy?
https://huggingface.co/Kwai-Kolors/Kolors
it released today, on the paper one of the prompts they tried was the same one from the dalle3 paper (picrel)
https://github.com/Kwai-Kolors/Kolors/blob/master/imgs/Kolors_paper.pdf
>>
>>101292129
Oh yeah it can do them and dark tones, shadows too. You just have to use half of the prompt to describe them.
>>
>>101292770
Liberal mating call
>>
>>
File: 00032-2251818790.jpg (932 KB, 1260x1680)
>>
>>
File: SD_0001.jpg (242 KB, 1152x1920)
>>
File: SD_0008.jpg (344 KB, 1152x1920)
>>
File: SD_0010.jpg (329 KB, 1152x1920)
>>
What the fuck is BREAK and how do I get gud at it?
>>
File: SD_0012.jpg (308 KB, 1152x1920)
>>
>>101291557
>>101291571
>>101291580
>>101291581
catbox pls???
>>
>>101293993
I'm still working on it but can share what I got.

Image is 1024x1024

Break it down to 512x512 tiles

Fill them with prompt. Let's say

Top left corner night stand with cellphone on top, cracked screen, broken screen, BREAK

top right corner person laying on the bed, hugging pillow, BREAK

35mm photograph, film, bokeh, professional, 4k, highly detailed
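The tiling recipe above can be sketched mechanically. This is just an illustrative helper: the tile texts are the example values from this post, and BREAK is assumed to be the A1111-style keyword that splits a prompt into separate chunks, one per region.

```python
# Sketch: assembling a BREAK-separated tiled prompt as described above.
# Tile descriptions and the style tail are the example values from the post;
# "BREAK" is the literal keyword A1111-style UIs use to split the prompt.
tiles = [
    "Top left corner night stand with cellphone on top, cracked screen, broken screen",
    "top right corner person laying on the bed, hugging pillow",
]
style = "35mm photograph, film, bokeh, professional, 4k, highly detailed"

# Join every tile plus the global style tail with the BREAK keyword.
prompt = " BREAK\n".join(tiles + [style])
print(prompt)
```

With two tiles plus a style tail this yields two BREAKs, matching the layout described above.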
>>
File: SD_0014.jpg (269 KB, 1152x1920)
>>
File: SD_0015.jpg (297 KB, 1152x1920)
>>
File: SD_0016.jpg (280 KB, 1152x1920)
>>
File: SD_0017.jpg (242 KB, 1152x1920)
>>
>>
File: head_final3.png (3.85 MB, 2000x1456)
Babe wake up, a new imagegen base model got released
https://huggingface.co/Kwai-Kolors/Kolors
>>
File: pobrane (1).png (198 KB, 395x512)
>>101294169
>mfw
>>
>>101294365
lmao, where did you find this?
>>
>>101294378
>lmao, where did you find this?
https://github.com/Kwai-Kolors/Kolors/blob/master/imgs/Kolors_paper.pdf
that's the link from "technical report" in their huggingface introduction
>>
>>101292856
Why don't you img2img a completely black input image?
>>
File: 00145-590221378.jpg (607 KB, 1260x1680)
>>
File: 00150-590221375.jpg (528 KB, 1260x1680)
>>
https://stability.ai/news/license-update
>We acknowledge that our latest release, SD3 Medium, didn’t meet our community’s high expectations.
It's always our fault to them, isn't it? Now that the license has changed for the "better", will civitai integrate their models now, or is it still not enough?
>>
>>101294783
It's our fault that it failed. It's our way to tell them "I don't want it"
>>
>>101294783
Took them one fucking month to do this? And when will they release the big gun? No one cares about this shitty ass 2b, gtfo SAI
>>
>>101294169
>unet
DOA, the new paradigm is DiT now
>>
>>101294783
https://xcancel.com/StabilityAI/status/1809274908641489160
So many cocksuckers on the comment sections, looks like SAI is banning a lot of people on twitter kek
>>
File: kek.jpg (125 KB, 1806x1172)
>>101294924
That one is good kek
>>
>>101294032
I haven't seen BREAK used with location before, why use this over regional prompting?
>>
>>101294999
That's someone trying to figure out what BREAK does lol
>>
File: ok chang.png (58 KB, 893x554)
>>101293265
>>101294169
>For the human evaluation, we invited 50 imagery experts to conduct comparative evaluations of the results generated by different models. The experts rated the generated images based on three criteria: visual appeal, text faithfulness, and overall satisfaction. In the evaluation, Kolors achieved the highest overall satisfaction score and significantly led in visual appeal compared to other models.
Hm...
>>
>>101295200
>Chinks lying
color me shocked
>>
>>101295200
>>101295217

Have either of you run it?
>>
File: sample_test.png (1.61 MB, 1024x1024)
From Kolors

>Jesus Christ holding Mjolnir and casting thunder while riding a green T-Rex on Mars. The Martian landscape features red rocks and a dusty sky, with the Earth visible in the background. Jesus is depicted in traditional robes with a glowing halo, and Mjolnir is surrounded by lightning. The T-Rex is roaring, with sharp teeth and muscular build, its green scales contrasting against the red Martian terrain. The scene is dramatic and surreal, blending biblical and mythical elements in an extraterrestrial setting.
>>
>>101295225
Nope, but the simple fact they decided to show this picture on their paper means a lot >>101294365
>>
File: 00198-590221381.jpg (484 KB, 1260x1680)
masamune shirow
>>
>>101295225
Unless inference is quicker than XL I don't really care. Plus what >>101294870 said.
>>
>>101295259
I think it's the same architecture as SDXL, Lykon said that
https://x.com/Lykon4072/status/1809535806496772325
>>
>>101293394
>bunline_v7
tried v8 yet?
>>
File: 00223-590221379.jpg (814 KB, 1260x1680)
>>
>>101295233
how did you use that model?
>>
>>101295296
downloading nau
>>
>>101295233
Interesting it has similar pose and stuff on first try
>>
>>101295259
This place really is just /sdg/ 2.0
>>
>>101295233
someone please try
"woman lying on the grass"
as prompt and we know if this model is worth anything
>>
>>101294445
>>101294482
I miss this fantasy aesthetic
>>
>>101295379
>improves on the original release
I agree
>>
File: SD_0018.jpg (212 KB, 1152x1920)
>>
>>101295427
2.0 doesn't always mean "improvement on the previous version", look at SD2.0 for example kek
>>
>>101295379
Elaborate?
>>101295455
Fair point kekd
>>
File: 00315-590221379.jpg (662 KB, 1260x1680)
>>101295419
yeah me too, luckily there's SD
>>
>>101295481
Very nice
>>
>>101295455
exceptions confirm the rule, and here we proceed by the rule of cool
>>
>>101295463
Brainless 1girl posting with no care to ever try experimental models beyond what shitty comfy workflow you made months ago and now use to shit out 5000 gens every thread.
That's the essence of sdg.
>>
Ok hear me out. IP face adapter is amazing but the XL version sucks, as does photomaker. Also it's very limited in customization (I can't even get it to add makeup on the subject's faces).
I think that makes face IP with a good SD1.5 model a perfect candidate for data augmentation when you only have a handful of pictures.
>>
File: file.jpg (1.19 MB, 1664x2304)
>>101293154
just ipadapter
>>101294021
https://files.catbox.moe/slcqcv.png
>>101295531
>Brainless 1girl posting
>no care to ever try experimental models
you realize this is the abstract pixart thread, right? nothing wrong with 1girls tho
>>
>>101295531
>experimental
What's experimental about Kolors?
>>
File: 00236-590221378.jpg (728 KB, 1260x1680)
>>101295545
>I think that makes face IP with a good SD1.5 model a perfect candidate for data augmentation when you only have a handful of pictures.
I did this when I had just one picture to begin with, but it wasn't easy to get actually decent matching pics.
>>
>>101295548
come on man
>Local Diffusion
>>
be the change you want to see
>>
File: 3girls-band.jpg (193 KB, 1024x1536)
>>101295548
>nothing wrong with 1girls
what about 3girls?
>>
>>101295579
It's hot out of the scientists research lab, how could it possibly get more experimental?
>>
>>101295579
It uses English-to-Chinese translation.
Then a random bot will use that translation to gen your image.
If you didn't get what you wanted, it's still beta/experimental due to poor translation; the bot is only as good as the vocab you download.

Here is prompt to try
>1989 Tiananmen Square parade
>>
File: SD_0020.jpg (265 KB, 1152x1920)
>>
>>101295681
>It uses English to Chinese translation.
>Then random bot will use that translation to gen your image.
Doesn't Hunyuan have the same ability?
>>
>>101293010
I forget what these fuckers were called but finally I get to see the infamous pillar monks of early christianity
>>
File: file.jpg (878 KB, 1664x2304)
>>101295590
depends on the input image but yeah more is always better
>>101295624
more is always better heh
>>
>>101295681
can it do
"Free Tibet!!
or
Xi Jinping as Winnie-the-Pooh
>>
File: 0.jpg (429 KB, 2048x1024)
>>
>>101295746
No clue, says "credit low" or something like that
>>
File: SD_0029.jpg (347 KB, 1152x1920)
>>
>>
Why do people always scream about censorship in chinese models when literally none of them ever do and often let you get away with more shit than western models do?
Like the stereotype does not match with reality.
>>
>>101295804
kek
>>101295818
"chink" without any further punchline is funny to anon because he still has the humor of a young lad
>>
File: 00010-2446086210.jpg (205 KB, 1288x1288)
>>101295804
>>
>>101295804
excellent
>>
File: file.jpg (1.28 MB, 1664x2304)
>>
Let me ask just this one little thing.
So every single image generation model uses sampling to speed the generation process up. Instead of 1000/whatever steps we're able to achieve decent results with just 20-30 steps.
Why is this not applicable to training? Why can't we use sampling during training to possibly speed it up?
>>
>>
>>101296015
Not bad anime for a non-anime model. The example images looked better than any other pixart checkpoint.
>>
>>101295242
Really nice. Do you post ecchi stuff somewhere?
>>
>>101296014
Training is typically done at 20 steps.
>>
>>101296098
>Training is typically done at 20 steps.
Does this statement represent the entirety of /ldg/ opinions?
>>
File: file.jpg (1.43 MB, 1664x2304)
shit, my vaes starting to bruise
>>
>>101296073
Nope. Are people interested in seeing that sort of stuff?
>>
File: Not AI.jpg (18 KB, 254x393)
>>101296014
Because during training, every image looks like this to the machine.

The only way to speed it up would be through the image's tags. Like in the example: car from side, driver side, two tires, custom rims, rear door is a suicide door

That would come in at later stages but helps speed things up. But then we're lacking a whole lot of data, because now all cars will have windows and they are never broken, or tail lights or custom tires if none of the images have them.

Simply put.

Alphabet -> Words -> Dictionary of said words -> Dictionary with description of said words -> Dictionary with examples in the sentence -> Books -> Books of specific genre -> Published papers.

and after all that training is done we use it as

pos:
Hot piece of ass, ooga ooga, blonde, make it look good
neg:
ugly, fugly,
>>
>>101296155
>pos:
>Hot piece of ass, ooga ooga, blonde, make it look good
>neg:
>ugly, fugly,
Sounds good. Post output
>>
File: 00154-841192235.png (3.69 MB, 2688x1536)
Still using Forge to do my gens, is it a good idea to change?
I have a 3060, VRAM was never that much of an issue, but Forge let me free up enough so that other stuff could run fine and I could just gen and gen while doing other stuff on the puter
>>
>>101296155
Are you an unironical schizo or is your English that weak?
>>
>>101296209
if it aint broke dont fix it however there's no harm in trying comfyui
>>
>>101296232
I actually did back in SD1.5 era but it alongside OG A1111 broke apart and just made garbage gens despite me not changing the prompt
._.
>>
>>101296217
Which part didn't reach your low IQ?
>>
File: cqe3oh0yys6d1.jpg (259 KB, 1750x984)
Any of you bored and willing to fuck around with something, frens? How good are these models with Minecraft pictures? I would like a hand giving img2img a try on these old screenshots to see if they can be salvaged.
>>
>>101296252
Let's start with
>Because every image while training to a machine looks like this to it.
>>
>>101296037
I forgot it's not animu model lol.
>>
>>101296217
>>101296280
its not that hard to extrapolate what anon is saying kek
>>
>>101296280
>asks dumb questions
>gets mad that someone didn't proofread for 5 minutes
Your behavior is the reason why half of the time your dumb questions get ignored.
>>
>>101296280
and now you understand how machine learning works.
>>
>>101296209
You might want to switch to vanilla auto once their dev branch merges with main, since it should include forge-like optimizations. For now it still makes sense to stick with forge.
>>
>>101296274
>if they can be salvaged.
How do you mean?
>>
>>101296368
I don't see how your answer is related to my question at all. If you want to explain something, you must provide all related details and why they are related. You just threw some arbitrary statements with no connection whatsoever and expect me to read your Indian mind.
>>
>>101296465
>>>https://letmegooglethat.com/?q=am+I+retarded%3F&l=1
>>
>>101296465
It's obvious that you just want to be pissy and waste everyone's time rather than get an answer. Zoomers are so fucking doomed because the Low T killed their initiative so they need to be hand held and spoon fed for everything and when they get any push back they turn into pissy passive aggressive women. Fix your life, anon.
>>
File: 1700182661292.png (1.43 MB, 1280x1760)
>>101294021
my comfyui folder got wiped and i had to redownload these gens from the archive.
there's no embedded data so no point in posting it through catbox.
use the noiseoffset lora by stableyogi and one of his checkpoints and you'll get the same effect
also kohya hires fix and freeuv2
godspeed
>>
>>101292305
yes
>>
>anon activated the spastic card
[monkeypaw autism intensifies]
>>
>>101296504
>resorted to insults
A pathetic sight, truly. You can't even link a source or two for your schizo claims for me to read further if you can't bother to explain yourself, yet you are bold enough to declare that you did, in fact, answer my question. How is that supposed to be an answer?

Let's try this again: Why aren't samplers used to train models?
>>
File: 17196337901196.png (2.76 MB, 3840x2160)
>>101296464
They're completely artifact-ridden, to the point it's super hard to get a satisfactory upscale from them, and I wondered what an img2img using them as a base would look like. It's hard to explain. Minecraft screenshots have a certain "noise" that gives them charm; imagine using a block filter for downsampling, but slightly different (compare pic related with >>101296274). Well, cheap jpeg compression kills all of it, but these shitty screenshots are all that's left from an old server I used to be in.
>>
>>101296649
If you're so smart you can find the answer yourself. Oh wait, you're stupid :)
>>
File: Sigma_04213_.png (2.43 MB, 1024x1024)
Breathe in, gen out
>>
File: Sigma_04147_.png (856 KB, 768x1280)
>>
>>101296674
Don't you feel a little tickle on your balls when replying like this? I just asked a little question, why have you started clowning? Is this how /ldg/ usually answers questions?
>>
File: Sigma_04258_.png (1.62 MB, 832x1216)
>>
>>101296755
It's obvious what you really want is to be pissy. I'm actually confident if someone engaged you in good faith you'd manage to make it pissy. If you want a friend to ask stupid questions too who is infinitely patient, talk to ChatGPT. I heard it will take as many stupid questions as you can give it.
>>
>>101296755
i decided to follow the replies to see whats up.
he gave you a comprehensive answer and you replied by insulting him.
also the question you asked gives me the vibe that you're a midwit
you must be a chinese agent or something
>>
File: cui_00126_.jpg (1.04 MB, 1664x2304)
>>101296687
indeed
>>
>>101296775
>>101296794
Guys, it's the resident professional idiot. Easy to spot from passive aggressive seething
>>
>>101296830
yeah i know. he's been doing this for years now
has to be asian
>>
File: PA_0019.jpg (1.03 MB, 3328x1152)
>>
File: A TROON.png (172 KB, 254x393)
>>101296794
>why the sky is blue?
>Because you see this picture right? It explains everything.
That's how a COMPREHENSIVE answer I was looking for, thank you very much!
>>101296830
Legit mindbroken retard.
>>
>>101296856
It's really easy to ask time wasting questions. It's always funny when you come around because you always ask a bunch of technical stupid questions, get pissed when people give as much effort replying as you do asking them, and then you spend the rest of the day seething about it and making passive aggressive womanly comments.
>>
File: wtf.png (238 KB, 567x543)
>>101296663
im trying to think about what kind of post processing they did with this but i have never seen this before
very odd
>>
File: PA_0020.jpg (649 KB, 3328x1152)
>>
File: square.jpg (16 KB, 600x600)
>>101296856
Is this a triangle?
>>
File: PA_0021.jpg (705 KB, 3328x1152)
>>
>>101296562
thanks anon, Ill try them out
>>
File: 00497-590221380.jpg (376 KB, 1260x1680)
>>
>>101296870
>It's always funny when you come around because you always ask a bunch of technical stupid questions, get pissed
Wait, is this real real? I just assumed to strike some interesting statements out of you since you looked like you just lost it, but looks like I was right. So /ldg/ really is actively trying to confuse and derail by making up the whole "resident professional idiot" thing and mocking anyone who tries to figure things out and pokes around, instead of at least setting them on the right path, huh?
>>101296896
Looks like a straight line in a positively-curvatured space projected onto a flat 2D surface to me.
>>
>being this purposely obtuse
>>
>>101297062
There is no right path with you, you ask stupid time wasting questions, you always find a reason to get pissed, and then you spend hours talking about how everyone is le mean to you and you're actually here in good faith. Anon, go use ChatGPT if you actually want answers and here's what's crazy: it will even answer follow up questions! You can just ask stupid questions all day and do nothing productive just like you do here. But it's really obvious now that what you want is to be pissy and you can't have that with ChatGPT.
>>
File: PA_0023.jpg (855 KB, 3328x1152)
>>101296912
fixed version
>>
File: 1718653296862157.png (520 KB, 1024x1024)
Very late sorry anon
>>
File: tmpwzb50spx.png (1.73 MB, 1744x984)
>>101296274
>>
>>101297084
Have you even tried using ChatGPT for this stuff before sperging out this ""advice""?
>>
>>101297084
I think if you tell ChatGPT to be arrogant in all its replies it will accomplish that goal.
>>
>>101297118
Yes I was able to get an answer from ChatGPT for your stupid question. I was going to distill it for you earlier but then you became a pissy woman at the other anon.
>>
>>101297099
You're like 3 hours late!
>>
File: PA_0024.jpg (1.07 MB, 3328x1152)
>>
>>101297136
>Yes I was able to get an answer from ChatGPT for your stupid question
Ah, I see, that's probably the reason it looked so schizophrenic.
>>
File: 1718646347616061.png (597 KB, 1296x760)
>>101297138
The damage has been done :o
>>
>>101297150
See, this is what you do all the time. You just want to be pissy and troll and then say "haha I was just pretending to be retarded :^)"
>>
File: PA_0025.jpg (710 KB, 3328x1152)
>>
File: PA_0026.jpg (784 KB, 3328x1152)
>>
>>101297114
nice
>>101297169
have you tried the new bunline?
>>
>>101297160
>pretending ChatGPT gives accurate, totally non-schizophrenic and on-topic answers
Ah, I see now how you attempted to troll me.
>>
File: 1718653780188478.png (349 KB, 1024x1024)
>>
>>101297184
No, I'm still on PixArt only. I'll download it.

Any requests for first run?
>>
>>101297228
>he lost it
>>
>>101297220
Correction, PixArt with Booru69 mix
>>
>>101297220
>requests for first run?
The prompt from >>101279231 image
>>
File: PA_0029.jpg (634 KB, 3328x1152)
>>101297266
>>101297266
pos:
a colony of humans living on a distant planet,  luminous, black, unusual, by Leonid Afremov, Carl Spitzweg, by Takeshi Obata, Amber Brown hue, highly dramatic lighting

neg:
2girls1cup
>>
>>101297282
Very pretty
>neg:
>2girls1cup
Kek
>>
>>101297251
>Booru69 mix
What's that?
>>
File: ComfyUl_0001.jpg (713 KB, 2048x2048)
>>101297266
>>
File: ComfyUl_0002.jpg (1.14 MB, 2048x2048)
>>101297338
https://civitai.com/models/490203/booru-madness

For me it just allows PixArt to understand
pos:
cyberpunk [foggy cityscape in the distance : very pretty girl, city:0.43] immersed in a giant flow of wind, 

neg:
not a fart
>>
>>101297282
>>101297377
Oh god, I think reading this whole thread impacted my brain for prompting
>>
I see my question got completely derailed, so let me ask it for the third fucking time.

Given how samplers dramatically reduce the time needed to generate a picture by "rescaling" the model's timestep count while retaining most of the quality, why is it not possible to apply a similar but reversed approach to training diffusion models? Or is it possible if you solve some problems that may come up, but no one has actually implemented it? What's the state of the matter currently?
>>
File: PA_0032.jpg (920 KB, 3328x1152)
>>101297266
>>
>>101297419
Since my last reply was not sufficient for you.

https://en.wikipedia.org/wiki/Lossless_compression

https://arxiv.org/pdf/2403.04692
under 3. Framework
all the way until 4.

https://www.itl.nist.gov/div898/handbook/pri/section5/pri521.htm


https://en.wikipedia.org/wiki/Efficiency_(statistics)


https://www.semanticscholar.org/paper/Memory-efficient-DIT-based-SDF-IFFT-for-OFDM-Lee-Kim/082b9a24875e47210eec7e48ff288693d9319ae5
>>
File: PA_0001.jpg (580 KB, 2560x1536)
>>101297266
Sorry, that was in the mixer. Here it is by it lonesome
>>
File: PA_0002.jpg (594 KB, 3328x1152)
>>101297536
after removing extra steps in the workflow true first image.
>>
File: PA_0004.jpg (893 KB, 3328x1152)
>>101269414
>>
>>101297114
This actually looks neat, thank you.
>>
>>101297486
Sorry but I was not asking *you* specifically, PA-schizo. But I skimmed through all the links you provided anyway and not a single one of them mentions the word "timestep", except PixArt one under the Appendix. Please don't waste my or anyone's time anymore and jump off the bridge already.
>>
File: okay google.png (49 KB, 207x205)
>>
>>101296871
My guess is some ordered dithering plus downsampling. For what is worth, this screenshot was from Beta and has many weird glitchy artifacts if you look at it closely.
>>
>>101297631
Sure thing bud, come with me to the nearest one.
>>
File: PA_0007.jpg (525 KB, 3328x1152)
>>
>>101297419
Because training the model to understand how to use noise is also part of the process; if you try to shortcut it, you make the model less robust and lower quality, in the same way that lower-bit training degrades quality.
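A rough sketch of the contrast being described, in plain Python (illustrative only; the 1000-timestep schedule and 20-step sampler are just common defaults, not anyone's exact implementation):

```python
import random

T = 1000  # total diffusion timesteps the model is trained on

def training_timestep():
    # DDPM-style training draws ONE random timestep per image per optimizer
    # step, so over many steps the model sees (and must learn to denoise at)
    # all 1000 noise levels -- there is no sequential chain to skip here.
    return random.randrange(T)

def sampler_timesteps(num_steps=20):
    # A fast sampler (DDIM-like) only *visits* a strided subset of those
    # 1000 levels at inference; it can do that because the trained model
    # already predicts the denoised image from any noise level.
    stride = T // num_steps
    return list(range(T - 1, -1, -stride))

steps = sampler_timesteps(20)
print(len(steps), steps[:3])
```

So the "speedup" at inference is a property of how the trained model is queried, not something removable from the per-level training signal itself.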
>>
File: PA_0008.jpg (801 KB, 3328x1152)
>>
File: PA_0012.jpg (553 KB, 3328x1152)
>>
>AttributeError: 'NoneType' object has no attribute 'lowvram
Why am I getting this with every model I try switching to now?
>>
File: 116775249263548927-SD.png (2.51 MB, 1128x1400)
hello anons
>>
>>101297832
Your config file is null.
>>
File: 116775249263548929-SD.png (1.88 MB, 776x1400)
>>101297842
>>
>>101297860
dark
>>
File: 116775249263548935-SD.png (1.89 MB, 944x1400)
>>101297860
looks like a slow day
>>
>>101297693
>less robust and have less quality in the same way that lower bit training degrades the quality
I didn't mean just lowering the timestep count. Or kinda did, but not in the way you thought. Is there really no way to make reducing it both robust and approximately quality-neutral? It just seems unnatural; models learn details very slowly, I think.
For simplicity, let's assume we use a uniform sigma schedule and say a single sampler step corresponds to 50 timesteps out of 1000. In this case, we rely on the sampler to approximate what we would get from taking 50 individual steps, then apply it to the noise. Why is it not possible to somehow reverse this process during training, by taking a loss between the 500/1000 and 550/1000 noise levels and approximating the loss for timesteps 500, 501, ..., 549?
>>
File: 116775249263548938-SD.png (2.64 MB, 1144x1400)
>>101297948
dark indeed
>>101297950
AI doesn't really know what a fucking ribcage is
>>
>>101297998
Because approximations would be a loss of quality. And the goal here is to give the AI more decisions, each timestep is a potential decision the AI can learn from which is what gives it diversity and robustness as far as I understand. The whole process is unnatural, we're literally just brute forcing noise into images.
>>
File: 116775249263548940-SD.png (2.82 MB, 1416x1400)
>>101298022
>>
>>101297998
How many pixels of rotation of this photo >>101296896
until you know it is a cylinder. Count it in pixels that will be your steps.
>>
>>101298057
>each timestep is a potential decision the AI can learn from
Interesting. I think there probably should exist ways to approximate/predict it reliably? Maybe predicting in general will result in some kinds of seed "biases" at inference time if you try to adjust the sampler total steps from 20 to 30 or any algorithmic approach won't yield results better than just squishing the number of timesteps or something like that. Seems like this area is kinda unexplored, or am I missing something?
>>
File: ComfyUI_00015_.png (2.76 MB, 1152x2016)
>>
File: ComfyUI_00050_.png (932 KB, 768x1344)
>>
File: 116775249263548957-SD.png (2.04 MB, 824x1456)
>>101298078
>>
>>101292106
I have a 6650XT, can I start dabbling in this stuff? To where it doesn’t take 10 years to generate an image like on these sites?
>>
>>101298475
https://rentry.org/rentrysd
>AMD section

Rough start but I'm sure you'll succeed
>>
>>101298243
You're kind of describing distillation but you need a full model to do distillation, there is no practical way to shortcut the steps during the training without negatively impacting the final model.
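A toy illustration of the distillation idea mentioned here, with scalars standing in for a real model. Everything below is hypothetical: real step distillation trains the student by gradient descent to match the teacher's multi-step output, not by solving for a scale factor.

```python
# Toy sketch of step distillation: a "teacher" takes two small denoising
# steps, and a "student" is fit to reproduce the result in a single step.
# Scalars stand in for images/networks; purely illustrative.

def teacher_two_steps(x):
    # pretend each step removes half of the remaining "noise"
    x = x * 0.5
    x = x * 0.5
    return x

def student_one_step(x, scale):
    return x * scale

# "training": find the scale matching the teacher on sample inputs.
# With this linear toy the exact answer is 0.25; a real model would
# minimize the teacher/student difference with gradient descent.
scale = 0.25
for x in [1.0, 2.0, -3.0]:
    assert abs(student_one_step(x, scale) - teacher_two_steps(x)) < 1e-9
print("student matches teacher in one step")
```

The key point the anon makes survives the toy version: you need the finished multi-step teacher first, so this doesn't shortcut the original training.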
>>
File: Sigma_04299_.png (1.61 MB, 832x1216)
>>
>>101298336
Checkpoint name?
>>
>>101298760
PonyRealism probably
>>
File: ComfyUI_00062_.png (1.04 MB, 832x1216)
>>
>>101298582
I've yet to see how distillation would be applicable here, but
>there is no practical way to shortcut the steps during the training without negatively impacting the final model
It might make models learn all the features faster. Yeah, some quality loss is expected, but it may well be negligible as long as you can train a full model in 10 epochs instead of 25. Same thing with the VAE: it dramatically reduces the computation cost at a relatively small quality reduction. And if you train for 25 epochs, you will probably reach higher quality than if you had trained with the unmodified process, but that's not a guarantee obviously.

Also it should not be strictly necessary to pick the noise levels at exact timestep intervals (500->550, 550->600, ...); maybe it can be beneficial to select the left and right sides with a degree of randomness (500->550, 346->362, 831->956, ...), thus varying the approximation range.
>>
>>101298839
I'm excited to see your practical experiments
>>
>>101298839
why dont u go try it, sd-scripts is your playground
>>
File: 1718912357582016.png (173 KB, 545x565)
does anyone know how to make animated thumbnail pictures on Civitai? is it a site function or do i just make my own gif?
>>
>>101298856
>>101298889
I'd like to see papers on the topic honestly. This idea is so fucking obvious, this must have come up sooner.
>>
>>101298956
Let me guess, you're not going to do any work.
>>
>>101298966
I'd rather not do the work if it has already been done by actual researchers, anon.
>>
>>101298983
Yeah it was established you're all talk.
>>
>>101298988
Are you a retard?
>>
>>101299006
No, I am the retard.
>>
>>101297536
>>101297582
NICE
>>101297860
>>101297842
NICE
>>
File: ComfyUI_00020_.png (2.6 MB, 1248x1824)
2.6 MB
2.6 MB PNG
>>
>>101299006
I've never seen someone waste so much time as you; you're the textbook definition of an armchair researcher.
>>
>>101299042
I see, you're a troll then. Did I claim somewhere that I am a researcher?
>>
>>101299072
You're clearly talking like one. You have an idea for a superior training method, but all you do is talk, talk, talk. And now you want someone to do your homework for you to see if someone has already tried your obviously superior way to train models. As I said before, you're just here to waste everyone's time, and as always this devolves into you being a pissy woman, because it's obvious all you want to do is sit back in your armchair and make other people work for you. Anon, how about you shut the fuck up, get off your ass, and do some work. All you do is talk, and it's irritating because it's clear you have no intention of doing anything. You're an information leech.
>>
File: 1718648531949146.png (250 KB, 732x768)
250 KB
250 KB PNG
Late again :/
>>
File: sisyphus.png (6 KB, 1797x41)
6 KB
6 KB PNG
>>101299042
>I've never seen someone waste so much time as you
Sounds like a challenge
>>
File: ComfyUI_PixArt_00155_.png (1.72 MB, 1216x832)
1.72 MB
1.72 MB PNG
Good afternoon, gentlemen. I wish you all a very happy evening.
>>
>>101299190
Except you've already done more work than him. He literally does zero work.
>>
>>101299116
I thought you jumped already.
>>
>>101299197
Hullo
>>
File: ComfyUI_PixArt_00164_.png (1.98 MB, 1216x832)
1.98 MB
1.98 MB PNG
Dare you enter the caves of dark secluded shrouded sinisterity?
>>
does having two loras both at 1.0 fuck things up? specifically the fact that there's more than one lora - does that warrant lowering the values?
>>
>>101299540
LoRAs rape the weights, and having too many of them will fuck up your generations. Generally you want your LoRA weights to add up to roughly 1 in total, but it all depends on how hard each of them is raping. Some LoRAs play nice and some do not.
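What "adding up" means mechanically: each LoRA injects a scaled low-rank delta into the base weights, and stacking several just sums the deltas, which is why strong overlapping LoRAs can push the weights far from the base model. A dependency-free sketch (plain lists instead of tensors, purely illustrative):

```python
def apply_loras(w, loras):
    """Merge LoRA deltas into a base weight matrix.

    w      -- base weights as a list of rows
    loras  -- list of (alpha, B, A) where each LoRA's delta is alpha * (B @ A)
    Returns a new matrix; the base weights are left untouched.
    """
    rows, cols = len(w), len(w[0])
    out = [row[:] for row in w]
    for alpha, b, a in loras:
        rank = len(a)  # LoRA rank = inner dimension of B @ A
        for i in range(rows):
            for j in range(cols):
                out[i][j] += alpha * sum(b[i][k] * a[k][j] for k in range(rank))
    return out
```

With two LoRAs at 1.0 each you're adding both full-strength deltas on top of each other; lowering the alphas is just scaling each delta down before the sum.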
>>
>>101295531
both threads are controlled by the same anons; most who realized this left, and since there isn't any actual discussion, there is no reason to stay.
>>
>>101299559
thanks that's good information.
though I do worry a little that if I've got a character lora at 0.75 and a concept lora at 0.25, I'd start to lose the character's appearance...
>>
>>101299590
he's trolling you anon, you should adjust the weight per lora as much as you want it to influence the model
>>
>>101299560
Proof?
>>
>>101299635
>bro if you just turn up all the Loras you'll get the best results :^)
>>
>>101292106
how do i get something like bottom left? i want create cool wallpapers for my pc and phone
>>
>>101299660
That's PixArt

https://github.com/PixArt-alpha/PixArt-sigma?tab=readme-ov-file#-available-models
>>
File: long ick general.png (820 KB, 635x1024)
820 KB
820 KB PNG
>>
File: What.png (1002 KB, 1080x1739)
1002 KB
1002 KB PNG
Release it nigga
>>
>>101299701
thanks, but i'm looking for the proompt
>>
>>101299823
have you tried using your command of the English language?
>>
>>101299212
I mean >>101297647
and show me how to do it.
>>
>>101299823
Use this one for now.
>>101297282
>>
>>101299844
>talking
>no work
>>
File: Here.jpg (6 KB, 251x45)
6 KB
6 KB JPG
>>101299869
>>
>>101299914
>passive aggressive woman who talks about things and never does anything, gets mad when people notice
>>
File: tmpad1bzmcl.png (200 KB, 582x512)
200 KB
200 KB PNG
>>
>>101299212
>>101299844
>>101299869
>>101299914
>>101299924
perfectly laid out uno reverse card holy shit
>>
File: tmpxgprp5wk.png (1.04 MB, 1344x768)
1.04 MB
1.04 MB PNG
>>
>anon working on 16ch 1.5, XL, and sigma VAE
we eatin good

>>101299981
what if i want to tho
>>
>>101300025
>what if i want to tho
at your own risk
>>
>>101300025
>16ch 1.5
is it the one from civitai? it was some french dude

>>101299981
lul
>>
File: tmpgzomjkzw.png (44 KB, 224x225)
44 KB
44 KB PNG
>>
>>101300051
>french dude
sounds french to me https://ostris.com/

>>101300242
heheh
>>
this is my favorite thread to goon in
>>
>>101299434
yes from a synthographic perspective
>>
>>101298336
>>101298078
>>101297950
these are great, share a catbox?
>>
File: Sigma_04356_.png (1.4 MB, 1024x1024)
1.4 MB
1.4 MB PNG
>>
>>101299981
you're not the boss of me
>>
if im not in the next collage im going to kill myself
>>
>>101300783
I think you are in it.
>>
So where should a retard start to learn how to make pretty stuff like you guys do? Also what's the difference between stable diffusion and local diffusion? My GPU is an AMD 6750 XT, would it be worth it to try and make my stuff on this machine?
>>
Loli Diffusion General
>>
Guys, I just installed automatic1111 with Pony and some loras, and for some reason the webui only uses my CPU while my 4070 does nothing. How can I fix it?
>>
>>101300859
>AMD
Not sure how far along it's come, but you're probably going to have a really rough time
>>
>>101300859
>>101300914
works just fine on linux, although may be a little slower
>>
File: ComfyUI_00029_.png (3.3 MB, 1536x1920)
3.3 MB
3.3 MB PNG
>>101300859
jump straight into comfyui and develop an addiction.
your prompts will improve naturally over time
>>
>>101300900
What guide did you follow?
>>
>>101300313
Facts
>>
>>101301071
I used just from automatic's github, any good one?
>>
>>101294169
After playing with it on the Hugging Face page, I am very impressed with its understanding of anatomy. Just like Dalle-3, it can do complicated poses with anatomically correct limbs without needing ControlNet etc.
Brilliant model.
Does not always follow the prompt as well as Dalle-3 does, but image quality is very high.
I have to test more, but I like what I see.
>>
>>101301133
can you show us some examples anon? I'm interested
>>
>>101301113
No, that's the one you should be using. In the Task Manager where you got those charts you can view CPU usage, GPU usage, etc.; you need to click the downward arrow on one of the GPU graphs and select CUDA for it to show actual usage when generating images. MSI Afterburner and GPU-Z don't show CUDA load either.
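You can also check from Python whether the webui's venv got a CUDA-enabled torch at all; run this with the venv's python (just a quick diagnostic, not webui-specific):

```python
def cuda_status():
    """Report whether PyTorch can see a CUDA device.

    If this says "CPU-only torch build", pip pulled a torch wheel without
    CUDA support, which would explain multi-minute gen times on a 4070.
    """
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if torch.cuda.is_available():
        return "CUDA available: " + torch.cuda.get_device_name(0)
    return "CPU-only torch build"

print(cuda_status())
```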
>>
File: file.png (4 KB, 180x215)
4 KB
4 KB PNG
>>101301168
I've googled and seen that, but I have no cuda option
>>
>>101301180
So when you click the orange generate button at the default settings, you get an image about a minute later? Because with a 4070 it would finish in about 5 to 10 seconds.
>>
File: tmp9g2qggvm.png (90 KB, 400x225)
90 KB
90 KB PNG
>>101301000
>comfyui
Please don't recommend it to newcomers. There are at least two better alternatives that use it as a backend.
>>
>>101301215
I get picture like in 5 minutes
>>
>>101301220
Comfy is fine, it just takes... an extra week to figure out why the gens you create are crap. The default workflow needs some tuning
>>
>>101301225
Ok yeah that's definitely using your cpu. Are you sure you didn't accidentally miss a step in the guide?
>>
>>101299789
Interesting
>>
>>101301215
If nothing works, you could try using this branch of auto: https://github.com/lllyasviel/stable-diffusion-webui-forge
It's more efficient, and maybe it'll set up the right settings for you automatically.

Please refrain from removing Forge from OP :(

>>101301253
It's fine for experimenting, but you really shouldn't need to bother with spaghetti for very basic tasks. Most of the nodes could easily disappear since you barely ever modify them.
>>
>>101301220
Nta but i know Swarm is one of them, what's the other?
>>
File: ezgif-5-97ce99d25a.jpg (176 KB, 768x1344)
176 KB
176 KB JPG
>>101301146
You can use it on the huggingface website and test it yourself in any way you like.
And if you have a GPU with at least 20GB of VRAM, you can use it locally.

Hopefully the memory requirements come down to some reasonable levels with further optimization, like 16 or even 12.

Here is one example of "complex pose" which I just tried:
>Detailed painting of a man striking a high kick pose. He is wearing white karate outfit.

I took the first image out. You don't get that kind of high quality poses from many models out of the box. I am very impressed.
>>
>>101301316
Metastable, though I don't think it lets you use and modify nodes, unless they've added that feature since.
>>
>>101301336
Thanks for the info, might give it a try tomorrow
>>
>>101299197
Good afternoon to you too anon!
>>
>>101301259
Was there any step about doing something for it to use gpu?
>>
>>101301328
that's really impressive indeed; I don't think any base model can get anything close to that, it looks very good
>>
>>101301377
Can you link the exact page you followed. He has guides to install for cpu only and I want to double check that wasn't the page you followed.
>>
>>101301328
That's really cool, can you go for something more complex like a woman doing a handstand or a backflip?
>>
>>101301467
https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Install-and-Run-on-NVidia-GPUs
https://github.com/AUTOMATIC1111/stable-diffusion-webui
>>
>>101301479
Ok that's the right page. I'm guessing you chose Method 1. Since they designed that to be noob friendly it's supposed to use your GPU out of the box, no extra steps beyond what you followed there. Maybe give your pc a restart and see if that does anything.
>>
>>101301000
>comfyui
In the install guide there's only nvidia+windows or amd+linux; what if I have amd+windows?
>>
>>101301328
>>101301377
This has to be bait
No wonder nobody is trying out new models if that's the "progress"
Oh and only 20GB VRAM!!!
>>
>>101301180
Reinstall Nvidia drivers/install CUDA toolkit
>>
>>101301553
I'll try restarting pc, thanks
>>
>>101301628
>I'll try restarting pc
that might unironically work, think I might've had the same issue back in the day
>>
>>101301628
I can't guarantee it will fix it. I installed Auto via the manual method over a year ago.
>>
File: d8ef361thnx01.jpg (159 KB, 1534x597)
159 KB
159 KB JPG
>mix my 5 favorite girls
>feed it to face ID portrait
>it produces extremely consistent pictures of a composite girl
I... I think I have synthesized a waifu. Goodbye external world.
>>
>>101301627
CUDA toolkit? Is it included with the drivers?
>>
>>101295709
Stylites
>>
Fresh out of the oven:
>>101301739
>>101301739
>>101301739
>>
File: ComfyUI_00918_.png (273 KB, 512x512)
273 KB
273 KB PNG
ayo
does comfyUI store all its metadata in the image, as far as you know?
basically i'm looking to recreate an image that was really abstract
i dont remember my ksampler settings or checkpoint used
i know that the image contains some metadata like the prompt
how would I see all the metadata?
i tried using exiftool but nada
>>
>>101301854
Drag the image into the workspace and the workflow will populate
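If you'd rather pull the metadata outside the UI: ComfyUI writes the prompt and the full workflow JSON into PNG tEXt chunks (usually under the "prompt" and "workflow" keywords — that keyword naming is an assumption about current ComfyUI, check your own files). A stdlib-only dump:

```python
import struct

def read_png_text_chunks(path):
    """Return a dict of tEXt keyword -> value pairs from a PNG file.

    Walks the PNG chunk stream directly (8-byte signature, then
    length/type/data/CRC per chunk), so no Pillow/exiftool needed.
    """
    out = {}
    with open(path, "rb") as f:
        assert f.read(8) == b"\x89PNG\r\n\x1a\n", "not a PNG"
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            length, ctype = struct.unpack(">I4s", header)
            data = f.read(length)
            f.read(4)  # skip CRC
            if ctype == b"tEXt":
                key, _, val = data.partition(b"\x00")
                out[key.decode("latin-1")] = val.decode("latin-1")
            if ctype == b"IEND":
                break
    return out
```

The "workflow" value (if present) is the same JSON the drag-and-drop trick loads, so it includes ksampler settings and the checkpoint name.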
>>
>>101301901
nice!


