/g/ - Technology

Thread archived.
You cannot reply anymore.


File: tmp.jpg (2.19 MB, 3072x3072)
General dedicated to the discussion and development of local text-to-image models.

Previous /ldg/ bread : >>101061929

SD3 Edition

>Beginner UI
Fooocus: https://github.com/lllyasviel/fooocus
EasyDiffusion: https://easydiffusion.github.io
Metastable: https://metastable.studio

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
ComfyUI: https://github.com/comfyanonymous/ComfyUI
StableSwarmUI: https://github.com/Stability-AI/StableSwarmUI
InvokeAI: https://github.com/invoke-ai/InvokeAI

>Auto1111 forks
SD.Next: https://github.com/vladmandic/automatic
Anapnoe UX: https://github.com/anapnoe/stable-diffusion-webui-ux

>Pixart Sigma & Hunyuan DIT
Comfy Nodes: https://github.com/city96/ComfyUI_ExtraModels

>Use a VAE if your images look washed out

>Models, LoRAs & training


>Index of guides and other tools

>View and submit GPU performance data

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Share image prompt info

>Related boards
File: 0.jpg (285 KB, 1024x1024)
Blessed thread
File: 1714075415716015.png (1.29 MB, 1024x1024)
thanks for the big smile and this collage. now I go to bed for real
Finetune status?

Goodnight, /ldg/gers.
File: 0.jpg (254 KB, 1024x1024)
File: gyate.png (660 KB, 1024x1024)
File: 00027.png (1.14 MB, 1088x896)
File: Fix_my_vae_0001.jpg (871 KB, 1248x1824)
File: Widow_0001.jpg (1.15 MB, 1536x1536)
What killed this general? Is pixshart dead?
Only SD3 died. Go back to sleep.
File: Widow_0003.jpg (1.06 MB, 1248x1824)
>Is pixshart dead?
the pixart guys said on discord that they partnered with Nvidia and that they will make a more powerful base model, so I guess not?
File: angery_00060.jpg (469 KB, 1119x1119)
File: angery_00092.jpg (291 KB, 1119x1119)
File: looong dick general.jpg (1.77 MB, 3264x1611)
File: angery_00045.jpg (306 KB, 1119x1119)
>Launch of comfy.org
>The Team:
>HaoHao is passionate about connecting w/ creatives and developers in Comfy ecosystem. Prior to Comfy, he ran software conferences and founded an ed-tech startup. He loves dressing up as a "banana man".
Tell me more, anon.
Aside from a bit of cringe, everything mentioned there sounds reasonable.
>holds up (spork)
File: angery_00015.jpg (326 KB, 1119x1119)
File: tmpu_i4shm2.png (1.16 MB, 1024x1024)
File: 1687913247677084.jpg (1014 KB, 3024x1728)
Cannot help but wonder on what more can be done.
File: angery_00115.jpg (310 KB, 1119x1119)
File: 1698288228261187.jpg (1.71 MB, 3024x1728)
File: 1693865696015.png (1.66 MB, 1024x1024)
File: angery_00097.jpg (303 KB, 1119x1119)
File: 1710501372951010.jpg (1.2 MB, 3024x1728)
File: 1695604851689.png (1.59 MB, 832x1216)
Seems like I'm getting better results off pony without including the score_4_up part that was recommended in the full string. Any less however, and it does start losing on quality.
File: 1687607299511.jpg (1.51 MB, 2048x2048)
yo anon, I can dig it
So I transiently use image generation and don't keep up with current events

Is this just a ground floor general for when stable diffusion is dead and a more future proof name for the general is wanted?

Not sure it works that way. I remember how long the term "SFM" lasted long after artists had moved on from source filmmaker. In fact people still might call it that
these ain't too shabby either, like where you're going
Show me your best fucking HunYuan gen right now
I want to believe because fuck SD3 and fuck the cucked lgbtq+++ west but what I'm seeing as output is not pleasing me
>Is this just a ground floor general for when stable diffusion is dead and a more future proof name for the general is wanted?
More or less. The other general also has its own issues with avatarfag circlejerks and related drama, or being an echochamber with more off-topic shitposting than actual discussion, so that's also a factor.
literally all of the avatarfags except debo and fran only post here now
Personally I'm not impressed with HunYuan yet, but I really like the artsy side of current PixArt-Sigma, and now that its team is apparently going to cooperate with Nvidia on a bigger model, we might be eating even better SoonTM.
File: tmp3lxb2rnt.png (650 KB, 773x1000)
>try unsuccessfully to train a LoRA for a woman with a very particular face
>looks like shit
>bump dimension from 8 to 32
>suddenly works perfectly
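For a rough sense of what bumping the dimension buys: a LoRA of rank r on a linear layer with d_in inputs and d_out outputs adds r * (d_in + d_out) trainable parameters, so going from 8 to 32 is 4x the capacity per adapted layer. A minimal sketch, assuming a hypothetical 768x768 projection (real layer sizes depend on the base model, and `lora_params` is just an illustrative helper name):

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by one LoRA pair (A: rank x d_in, B: d_out x rank)."""
    return rank * (d_in + d_out)


# Hypothetical layer size, picked only for illustration.
D_IN, D_OUT = 768, 768

for rank in (8, 32):
    print(f"rank {rank}: {lora_params(D_IN, D_OUT, rank)} extra parameters")
```

More capacity is not automatically better, it just gives the LoRA more room to hold fine facial detail, which would be consistent with what happened here.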
File: Sigma_02351_.jpg (2.1 MB, 2048x2048)
It's incredibly difficult to compete with round the clock autism
File: bestgen.jpg (123 KB, 1024x1024)
bros, I keep getting this error when i try to inpaint.
>Error: index 1 is out of bounds for dimension 0 with size 1
How do I solve it? also picrel is prob the best gen i made so far as a newfag, its nothing crazy but finally im able to gen cool stuff.
File: 1715921265792456.png (1.4 MB, 1024x1024)
do you want to post the whole traceback or share a screenshot of your workflow?
Care to post a screenshot of your inpaint settings?
>do you want to post the whole traceback or share a screenshot of your workflow?
>inpaint settings
How do i see this? im using EasyDiffusion.
Hmmm, I use comfy and have previously used a1111 (webui) and those had the command line interface available on my desktop to copy from.
File: Screenshot_3.png (79 KB, 1868x807)
this is what you're talking about?
I'm not that familiar with EasyDiffusion either, so it's hard to say what to look for. If you can't find anything to screenshot for us, you could try following this and see if you missed something: https://youtu.be/j5XTwW1bseM?si=f5JvMSXLr85r2mok&t=303
File: 1696365316480520.png (809 KB, 1024x1024)
yes, that's exactly it. I'm going to ponder this for a moment and see if I can help you
File: awesfrgdh.png (181 KB, 569x340)
My first guess would be they didn't switch inpainting on, so it's not seeing any mask to inpaint over.
Yeah I think that's worth a shot. Nothing else is jumping out at me
put it to negative
File: Sigma_02361_.png (3.21 MB, 2560x1536)
Good morning you beautiful thing, you. And I'm not talking about the gen (which is also good). sgm_uniform was in the trash bin until you came along and changed everything
File: Screenshot_2.png (1.25 MB, 798x777)
I have it switched on but still.. maybe im selecting an area that's too small for the prompt+loraembedding??
gm i'm glad you enjoyed everything. I have some more stuff coming up today after I get bored of coom and a few work meetings are done.
Tried that already, and it seemed to worsen input desu, but needs more testing.
Then I need you to show me something related either to inpaint settings, or generation settings. Look for something with denoise, as an example.
Has anyone tried the 5B lumina model? The online demo only shows the 2B model and I've only seen people talk about the 2B model
File: Screenshot_3.png (205 KB, 613x767)
its pretty basic shit desu. I even disabled the fix faces thing but still same error
That still tells me nothing. Anything lower than that?
File: download (4).jpg (182 KB, 1024x1024)
Thank you, I think I figured it out-- but don't know exactly why it's like this. I changed the sampler from DPM++SDE to something else and it's now inpainting albeit giving retarded results.
if you want good input you should just take screens of the whole interface
That's where you'd normally want to lower denoise, but from what little glimpses I found in your UI, it doesn't.. seem to have that?? It's really hard to help, because you're not giving me a lot to work with, as >>101083332 said. You might want to try and lower.. prompt strength maybe? Holyshit EasyDiffusion is weird.
File: Screenshot_4.png (67 KB, 478x802)
I was about to post a ss with the lower part desu, but then it started working! ty anyway for helping me figure it out! here's the lower part of the UI, nothing particularly about "denoise". I'm just assuming it's the sampler now.
File: Sigma_02362_.jpg (2.36 MB, 2560x1536)
Have not. Report back with results?
File: 00652-3605764484.jpg (259 KB, 1058x1411)
retro anime
I'm gonna guess "Prompt Strength" is your Denoise, since it clearly isn't CFG.. speaking of, you probably want to lower your CFG to somewhere between 5.5 and 7, also try increasing steps to 30-35 while you inpaint. If that Prompt Strength is indeed denoise, try something between 0.35 and 0.6
File: 1695173616734812.png (3.18 MB, 1432x1840)
"guidance scale" will behave better at 7.5. Also, that's a 1.5 model and they don't necessarily perform as well at 1024x1024. You should be rendering at like 512x768 and upscaling using the "Scale up" setting down below, and probably just enable the "Show only the corrected/upscaled image" button
Oh, by CFG I meant Guidance Scale.
File: download.jpg (181 KB, 1024x1024)
>picrel: cursed hands
Ty for all the tips. 1024x1024 turns out better for me at the cost of it being really slow sometimes (like 15+ minutes). I will try to render at lower resolutions and upscale it, and see the results.
I'm thinking of switching from EasyDiffusion to WebUI Forge... is there a relatively easy way to do so without having Forge installer download everything stable-diffusion related again?
i just came back after learning about SD3. but the other thread now has a discord link, and that is a massive red flag.
File: ComfyUI_00439_.png (1.91 MB, 2312x1792)
good morning!
File: 1692708158858881.png (2.74 MB, 1432x1840)
Lol. Lmao even.
The discord members most primarily Schizo Hijab anon pushed for this pixart/chang/shill spinoff. /ldg/ is only logical but you're kidding yourself if you think there isn't complete overlap between the two
gm ani
good morning
good morning
It is worrying indeed. Imagine going to a mongolian basket weaving forum, only to be redirected somewhere else. Sounds like a honeypot to me.
Different models were trained with different resolutions in mind. With models based on SD 1.5 you REALLY don't want to generate outside of 512x512 or 768x512/512x768, unless you hirezfix/upscale from it. 1024x1024 meanwhile is the default for SDXL/Pony models, and similarly you would upscale from it (if need be).
Devil is in the details, and small things can make a huge difference.
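The "generate near native size, then upscale" rule can be sketched as a lookup. This is a minimal sketch, assuming only the base sizes named in the post above; `BASE_SIZES` and `plan_render` are hypothetical names, not part of any UI:

```python
# Base sizes taken from the post above; a rule of thumb, not an official spec.
_SDXL_SIZES = {"square": (1024, 1024)}
BASE_SIZES = {
    "sd15": {"square": (512, 512), "landscape": (768, 512), "portrait": (512, 768)},
    "sdxl": _SDXL_SIZES,
    "pony": _SDXL_SIZES,  # Pony is SDXL-based, so it shares the same default
}


def plan_render(family: str, target_w: int, target_h: int):
    """Return (base_w, base_h, upscale): first-pass size plus the upscale factor needed."""
    sizes = BASE_SIZES[family]
    if target_w > target_h:
        shape = "landscape"
    elif target_h > target_w:
        shape = "portrait"
    else:
        shape = "square"
    # Fall back to the square size when no size is listed for this shape.
    base_w, base_h = sizes.get(shape, sizes["square"])
    upscale = max(target_w / base_w, target_h / base_h, 1.0)
    return base_w, base_h, upscale


print(plan_render("sd15", 1024, 1024))
print(plan_render("sdxl", 1024, 1024))
```

So a 1024x1024 target on a 1.5 model becomes a 512x512 first pass with a 2x upscale, while the same target on SDXL/Pony needs no upscale at all.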
thanks for the info, i am out. please post the discord link on this thread too, to help out filters.
The forge installer is about as easy as it gets. You always have to redownload stuff, except for loras, vaes and checkpoints, these you can transfer over.
>posts on 4chan
>is worried about doxxing
Brother you are on every list imaginable already. Thread anons are the least of your worries.
at this point, i will just assume thread sabotage in hopes of making more anons join the discord.
literal glowie behavior, textbook even.
The thing about Discord is the mentally ill 4chan users can doxx you. :)
File: 1692382148689108.png (3.01 MB, 1432x1840)
And then what? Please paint a picture of how I should be scared of some milquetoast schizo greasy bitch halfway across the planet. I'm an OpSec king and nothing you can do will change that
File: highmelanincontentmeme.png (365 KB, 680x680)
>i will just assume thread sabotage
We do get regular visitors who attempt to derail threads, if that's what you mean. I'm having to remind anons every now and then to try and not engage with off-topic bullshit, since that's precisely one of the reasons this general was made.
File: 0.jpg (502 KB, 1024x1024)
Show me what you got.
File: 1700291233979969.png (1.64 MB, 1024x1024)
this is me btw my range is incredible, and it can't be denied
moar weird stuff !
File: 1696923151601226.png (1.28 MB, 1024x1024)
Clarity of this gen so juicy I could lick my monitor
File: 1698610589411565.png (1.16 MB, 1024x1024)
The number of people they had to kill and compress their souls into the machine in order to produce SD3 is honestly distressing
File: 1703068600585175.png (1.56 MB, 1024x1024)
>souls into image machine
With a bit of luck, their mangled bodies can yet be salvaged.
soul2img diffusion
File: tmpjgrkv786.png (1.98 MB, 1280x1280)
File: 1691706888427706.png (1.44 MB, 1024x1024)
I think I understand why Stable Diffusion Medium 4b and 8b are under wraps. It's going to change the world. This is the inflection point. Goosebumps writing that...
I'd lean moreso that there's a sweet parameter spot for local txt2img. Pony/SDXL proved to be a point where you start getting good hands, and SD3 managed to get text going, so there's still plenty to chisel at within these parameter amounts.
SD3 is only 2b whereas SDXL and Pony are 6b
I know, and seeing capabilities of both, I'm optimistic about there being way more to chisel and get away with in that parameter range.
File: tmp51dbdmr3.png (3.13 MB, 1536x1536)
File: 1701504548447455.jpg (1.29 MB, 3024x1728)
One day, anons...
File: 0.jpg (458 KB, 1024x1024)
I use score_9, score_8_up, score_7_up in the positives, score_6, score_5, score_4, score_3 in the negatives.
pretty decent blood
sd3 is surprisingly good at gore
I haven't had my coffee yet boys
Oh right, I recall seeing these around. Guess I'll start comparisons.
>sd3 is surprisingly good at gore
which is very weird, given it's supposedly family-friendly lobotomy, then again didn't stop others from generating jailbait with it, so who knows what's in these weights
File: 1717282358374122.png (3.25 MB, 1432x1840)
my thread ender last night resulted in
"see a psychiatrist" lol.
As if this website isn't literally snuff and pedo adjacent. And there are like what 10 saw movies now? I can't even watch that stuff.
>sd3 is surprisingly good at gore
I love SAI's approach...
>Naked women: UNSAFE
>Fucking murder, blood, dismemberment: SAFE
File: tmppoddhkdp.png (3.11 MB, 3072x1298)
Excuse the confusing grid naming. Your method did worse, as compared to vanilla score_schizo, or having score_4_up in negatives.

I need to learn how to properly do prompt grids.
File: 1690287222470026.png (1.29 MB, 1024x1024)
Those gens didn't actually involve prompting for that. They're the result of zero-shot prompting for ideas that blend well to create that imagery.
But yeah, their safety team is verifiably retarded. One of the contributors to the paper didn't know what "denoising" is.

(a photograph of a hi-8 camcorder viewfinder, through which a disturbing scene is visible:1.95), (vhs hi-8 video footage:1.3) of a meat packing facility, slaughterhouse, red juice, red meat, slop, violence, disgusting content, i'm really upset of annie Leibovitz, taken using a vhs hi-8 camcorder, some characteristics that describe the quality of the image are fine film grain, rich color depth, wide dynamic range, rule of thirds,(messy red sloppy scene, man wearing butchers apron and hockey mask, cutting, chopping, splashes of red liquid, demonic figure, chainsaw:1.2) (static, low quality, grainy, found footage, 480p,:1.9), low quality, low resolution, disturbing, disconcerting, stressful, angry, dark, sinful, disobedient, without cause, violence, gore, grotesque, excrement, sadness ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
>new pixart tune drops
>dumbass forgets to paint away the watermarks
When will they learn?
how horrifying
the watermarks are the least of the problems but at least they gave it a shot
File: tmpmlwayzgv.png (3.82 MB, 3072x1298)
based grid poster
For pony I don't think you want to use the *_up tags in the negatives. From my limited understanding that means anything from that scoring rank upwards which includes score_9, score_8, etc.
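To keep the score strings consistent between grid runs, they can be generated instead of typed. A minimal sketch, assuming the reading above is right (that `score_N_up` covers rank N and everything above it, which is this anon's understanding rather than something confirmed here); the helper names are made up for illustration:

```python
def pony_score_tags(lowest_up: int, top: int = 9) -> str:
    """Build 'score_9, score_8_up, ..., score_<lowest_up>_up' for the positive prompt."""
    tags = [f"score_{top}"]
    tags += [f"score_{n}_up" for n in range(top - 1, lowest_up - 1, -1)]
    return ", ".join(tags)


def plain_score_tags(scores) -> str:
    """Plain tags without '_up', e.g. for the negative prompt."""
    return ", ".join(f"score_{n}" for n in scores)


positive = pony_score_tags(7)
negative = plain_score_tags([6, 5, 4, 3])
print("Positive:", positive)
print("Negative:", negative)
```

Calling `pony_score_tags(4)` reproduces the full recommended string that ends in `score_4_up`, so the same helper covers both the default and the trimmed variants being compared.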
this feels like a major meme tbdesu
File: tmpxff037qa.png (3.9 MB, 3072x1182)
>I don't think you want to use the *_up tags in the negatives.
Good point. Here goes a new method for Style: score_4 neg
>Positive: score_9, score_8_up, score_7_up, score_6_up, score_5_up
>Negative: score_4

Very marginal difference, but from this one example, eyes are less retarded in score_4, as compared to score_4_up. Further comparisons on the way.
File: tmpttqlkub1.png (3.42 MB, 3072x1182)
Here's a good example of the default recommended score_schizo doing worse.
As for score_4 vs score_4_up in negatives, close again. The non-up version is a bit more coherent, with slightly more vivid colours.
File: tmpc6ns6uow.png (3.83 MB, 3072x1182)
neg: score_5, score_4, score_3
pos: score_9, score_8_up, score_7_up, score_6_up
File: tmp5p3fykln.jpg (1.04 MB, 4096x1298)
>pos: score_9, score_8_up, score_7_up, score_6_up
>neg: score_5, score_4, score_3
Similar to just having score_4_up in negatives
File: tmpemhkwik3.jpg (1.03 MB, 4096x1298)
SD3 / pixart furfag model when
it will be a while anon
File: file.png (169 KB, 256x256)
SD3 - never. Their terms are egregious, they never released proper training code.

Pixart - what people actually want requires a model with more parameters. If my model training is successful you will have that as it's a base model with an extreme variety of images and concepts.
File: tmpyclmlqzh.jpg (797 KB, 4096x1298)
All of the currently competing methods are very close in quality, but to my very keen eye, so far your method might be the most coherent and detailed.
>unless you hirezfix/upscale from it.
what upscaler should I use? or any of them works?
Thanks, i installed it and copied the models and it seems to be working fine so far.. although its UI is confusing compared to EasyDiffusion, idk how to apply loras in it reliably like i did on ED.. but it seems to have much more options that werent in ED.. like Refiner. what does the Refiner do?
File: podracing.png (6 KB, 457x70)
>If my model training is successful you will have that as it's a base model with an extreme variety of images and concepts.
You rule. How many days, epochs remaining?
File: file.png (74 KB, 235x293)
It's impossible to know, I haven't even finished compiling and captioning a basic dataset. Dogs and cats don't even really render anything yet. The sports car is slowly coming together. I expect it will take months, and maybe by then Pixart will have the 16 channel VAE model done. I'm just hedging my bets and doing something interesting and fun.
Has anyone trained DyLoRA? I wonder if it's a meme

How many images in the dataset right now?

>I expect it will take months
>Their terms are egregious, they never released proper training code.
The latter point aside, you don't think they'll update their terms to allow for a goontune? They must realize how much push that gives models since, it seems, a ton of people are only in it for the goon and nothing else.
That, along with the current civi ban.
I'm training on 3-4 million images currently, I haven't counted everything. The currently training dataset has quite a bit of variety so the model should learn a lot of styles and concepts. I have 5 million images I scraped from Duckduckgo that I consider essential that I'm still downloading. In the end the dataset will be like 12 million or so images. Caching takes a ton of time so I basically cache up a million images every epoch and add them to the training, currently epochs are 2.5 days.
File: qewefwgredf.png (301 KB, 1894x973)
>what upscaler should I use?
I'd recommend you start with default latent, maybe Lanczos since it's fast and simple. Other upscalers are likely to start a download. Other anons might have good recommendations in that regard. I'd lower denoise to 0.6 or 0.55, if you want to stay closer to the original output. At a denoise of 0.7, hirez, inpainting and img2img all start to vary wildly from the original input.
>idk how to apply loras
Multiple ways to go about it. Default way to go about it is the Lora tab. You click a card, or you open up the side list, click either, and it's added to prompt, doesn't matter whether it's in the middle, beginning or end. You can change :1 to apply it more or less, for example :0.5 or :1.5

If your new loras don't show up, use the refresh button in top right.
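The `:1` mentioned above is the last field of the `<lora:name:weight>` tag that gets added to the prompt. If you end up tweaking weights a lot, the tag is easy to rewrite with a regex. A minimal sketch, assuming the plain `<lora:name:weight>` form with a non-negative number (`set_lora_weight` and `list_loras` are illustrative helpers, not part of the webui):

```python
import re

# Matches tags like <lora:add_detail:1> or <lora:add_detail:0.5>.
LORA_TAG = re.compile(r"<lora:([^:>]+):([0-9]*\.?[0-9]+)>")


def list_loras(prompt: str):
    """Return [(name, weight), ...] for every lora tag in the prompt."""
    return [(name, float(weight)) for name, weight in LORA_TAG.findall(prompt)]


def set_lora_weight(prompt: str, name: str, weight: float) -> str:
    """Rewrite the weight of <lora:name:...> wherever it appears; other tags are untouched."""
    def repl(match):
        if match.group(1) != name:
            return match.group(0)
        return f"<lora:{name}:{weight:g}>"
    return LORA_TAG.sub(repl, prompt)


prompt = "<lora:add_detail:1> 1girl, portrait"
print(list_loras(prompt))
print(set_lora_weight(prompt, "add_detail", 0.5))
```

Position in the prompt is left alone, which matches the point above that it doesn't matter where the tag sits.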
The terms are a reaction to Pony and other finetunes completely destroying the base model. They want to be a monopoly for core models so they can sell access via the API and let the peasants like us have dinky little loras which ultimately can never compete.
Also, with hirez fix steps, you can usually get away with half the amount you use in the initial gen. Leaving it at 0 does the same amount from your initial generation, which might be redundant. When it comes to steps in general, don't bother going above 35.
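The step rule above, written out. A minimal sketch of the two behaviours described in the post (0 means "reuse the first-pass count", and about half is usually enough); the function names are made up for illustration:

```python
def effective_hires_steps(sampling_steps: int, hires_steps: int) -> int:
    """Hires steps of 0 means the second pass reuses the first-pass step count."""
    return sampling_steps if hires_steps == 0 else hires_steps


def suggested_hires_steps(sampling_steps: int) -> int:
    """Rule of thumb from the post: roughly half the first-pass steps."""
    return max(1, sampling_steps // 2)


first_pass = 30
print(effective_hires_steps(first_pass, 0))   # second pass runs the full first-pass count
print(suggested_hires_steps(first_pass))      # the cheaper setting suggested above
```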
thanks a lot!
Happy genning!
cool training images btw i love seeing these
It's interesting to see the model slowly understand concepts, it's not like any of the captions it was trained on ever specifically matched that prompt. So it's figured out this image through inference from disparate captions.
Thank goodness for Florence by the way, probably sped up captioning 4x, but even with Florence it's ~11 days per million images.
This really is the general where things get done. Tutorials, comparisons, experiments, model training, artwork-level gens, collages, interesting discussions. We eating good.
>12 million or so images
is it still 1 terabyte per million images? that kind of dataset will be valuable in the future
Can you post a catbox of your pony workflow? I appreciate the work you're putting in
Pixart with a xl/1.5 refiner slapped on top is already mogging sd3 for me when it comes to softcore nsfw and creative/sovl gens. Seems like you currently have to combine the strengths of both to get good stuff.
Yeah it's probably about that, maybe 1.5 TB per million, my dataset hard drive is a little over 13 TB.
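Putting the numbers quoted in the thread together (about 12 million images planned, 1 to 1.5 TB per million, roughly 11 days of Florence captioning per million) gives a feel for the scale. A back-of-the-envelope sketch, assuming everything scales linearly and captioning runs on a single machine, neither of which is stated in the thread:

```python
def dataset_estimate(images_millions: float, tb_per_million: float, caption_days_per_million: float):
    """Return (storage_tb, captioning_days) under a simple linear-scaling assumption."""
    return images_millions * tb_per_million, images_millions * caption_days_per_million


PLANNED_MILLIONS = 12        # "like 12 million or so images"
CAPTION_DAYS_PER_M = 11      # "~11 days per million images"

for tb_per_m in (1.0, 1.5):  # the two per-million storage figures mentioned
    storage_tb, caption_days = dataset_estimate(PLANNED_MILLIONS, tb_per_m, CAPTION_DAYS_PER_M)
    print(f"{tb_per_m} TB/M -> {storage_tb:.0f} TB, about {caption_days:.0f} days of captioning")
```

That works out to somewhere between 12 and 18 TB and about 132 days of captioning for the full set, which lines up with the "it will take months" estimate.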
File: 00004-1010323004.png (2.44 MB, 1632x1344)
I suppose I'm just astounded at the idea that this is their game plan. Everyone knows they don't give a shit about the community but how do they not realize that porn is what the most fervent genners want? Sure there is the legal component, but someone will have to bite the bullet at some point.
File: tmp32mmxu9x.jpg (875 KB, 4096x1298)
Interesting point of note. The originally recommended score prefix tends to stay more true to whatever artstyle a given lora was trained to imitate. 2 of 3 examples were a clear case of that.
Do you have something particular in mind? Can't really catbox, because I save in an unorthodox manner that doesn't leave any metadata. I rely heavily on img2img and inpainting, which means I constantly change settings, which doesn't help either.
File: 235455_00001_.png (2.4 MB, 944x1408)
>Deanonymizing (to an extent) this general was necessary to get some folks to be more peaceable.
Ever tried finetuning 1.5? I wish we had more models trained with v-prediction
File: 2e3wretrth.jpg (2.94 MB, 4088x3896)
Fuck, I meant to post this one.
>The originally recommended score prefix tends to stay more true to whatever artstyle a given lora was trained to imitate. 2 of 3 examples were a clear case of that.
It's about the money, how can you compete against uncensored models? Even when not talking about porn, uncensored models have a broader dataset, more pop culture, people, etc. Because SAI is a bunch of handwringing cowards they don't have the balls to host these models so they ultimately get outcompeted on the API side. If Pony goes API with a 4B SD3 model, how can SAI compete?
Where'd you get this image of me?
I've never liked SD 1.5. There's something about Pixart that pleases me from a training perspective and it trains way faster than SD. SD 1.5 is a finicky model and it gets blown out way too easily and CLIP is absolutely ass.
I wonder if 1.5 wonky performance is related to caption dropout being used with base model training
caption dropout is needed for the uncond training
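For anyone wondering what that means in practice: classifier-free guidance needs the model to also predict without a prompt, and the usual way to teach that is to blank the caption for a random fraction of training samples. A minimal sketch of the data-side step only, assuming a 10% drop rate (a commonly used value, not something stated in this thread) and a made-up helper name:

```python
import random


def maybe_drop_caption(caption: str, drop_prob: float, rng: random.Random) -> str:
    """With probability drop_prob, return the empty caption so the sample trains the uncond branch."""
    return "" if rng.random() < drop_prob else caption


rng = random.Random(0)       # seeded only so the run is repeatable
DROP_PROB = 0.1              # assumed value for illustration

captions = ["a red sports car on a mountain road"] * 10_000
dropped = sum(1 for c in captions if maybe_drop_caption(c, DROP_PROB, rng) == "")
print(f"{dropped} of {len(captions)} captions blanked")
```

At sampling time the CFG / guidance scale then mixes the prompted and unprompted predictions, which is why the dropout can't simply be removed from training.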
File: tmpxa_0gf82.png (1.27 MB, 1024x1024)
Oh wow, I actually fucked up on score_6_up & score_3 neg style, though difference remains small.
In the examples I posted settings are very basic, if that helps. 25 steps of Euler A SGMUniform, 7 CFG, 1024 resolution. Usually simple 1girl prompt. Random loras. Nothing fancy.
Really well done, catbox? I imagine there was some inpainting
I think SD 1.5 is a legacy model with a legacy architecture that has aged poorly.
File: 1702139016188061.gif (2.75 MB, 512x512)
some people are afraid to let it go even if they have the hardware to run sdxl and others are tied to a certain aesthetic from a lora/model not present in sdxl
File: Sigma_02366_.jpg (2.06 MB, 2560x1536)
Improving daily hype!
centurii, typo, incase?
I'm hoping with my essentially raw captions from duckduckgo and flickr people will get the sense of adventure that LAION with SD 1.5 evoked.
something feels off with that plotting but it could just be me
File: 00017-3801861407.png (587 KB, 512x768)
I again, ask for help... how do i stop this funny thing from happening on ForgeUI? EasyDiffusion's simple inpaint options never made it so that the whole prompt is inpainted in the small area, and only inpainted to fix the helmet which i highlighted.. I'm tired of seeing full soldiers coming out of the neck of another soldier.
File: 1689070105536553.png (2.96 MB, 1432x1840)
I can't even imagine using 1.5 anymore
this looks like it's related to how your prompt is being fed to the adetailer/facedetailer/inpaint
1.5 styles are still king IMO (maybe that's nostalgia tho) but it is antiquated for other things
Is that gen XL?
File: 1704639952979954.png (3.05 MB, 1432x1840)
I think that's just nostalgia. This is ponyrealism with a dash of SDXXXL
that and there's also the option (unless it's not in forge) where you can inpaint only the mask or it treats the entire canvas as the area to inpaint
File: 1695758388911239.gif (2.84 MB, 512x512)
i genuinely like using 1.5. I like the jank. It's like old pretty Italian car that breaks all the time, but when it works it's just great.
File: 1701737471384182.gif (2.68 MB, 512x512)
>In the examples I posted settings are very basic, if that helps. 25 steps of Euler A SGMUniform, 7 CFG, 1024 resolution. Usually simple 1girl prompt. Random loras. Nothing fancy.
Thank you, anon!
and either:
File: 1692318551387419.png (3.03 MB, 1432x1840)
sd3 is an amazing example of new jank
I select "inpaint only the mask" and it still does this retarded shit lol
File: ComfyUI_SDXL_0055.jpg (1.97 MB, 1664x2432)
i'd shoot my ghost into her shell if you know what I mean
Multiple ways to go about inpainting. If the thing being generated is too detached from the original content try following:
Lower denoise, Lower CFG, Change prompt to better reflect what you inpaint, increase "Only masked padding, pixels".

It's all about what context you give the model. Generally you want to inpaint in the "only masked" mode, since it's like zooming in and regenerating with higher detail. By increasing masked padding, you give it more context in the form of the area around your inpaint mask. You can also inpaint in the "whole picture" mode, but then your resolution is applied to the entire image, so you want to keep it the same as the original. There's a button next to resolution that allows you to copy whatever the resolution of your current image is.

Denoise meanwhile, is kinda like "how clearly" does the AI see the original content that it's painting over and around. The lower it is, the less "blurry" it is for the AI, but also fewer changes will be made. Your prompt, CFG (how loud you shout at the AI to follow it), and the visual context it sees, whether in denoise or masked padding, is how you mainly balance out the thing.

Other than increasing masked padding, you can give it more context by painting the mask on more of the area, or other areas. That will make it "zoom out", but it will also come at the cost of detail. It's all a balancing act.
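The "zooming in" part of only-masked mode comes down to cropping a box around the mask, grown by the padding value, and regenerating just that crop at your chosen resolution. A minimal sketch of the box calculation, assuming a mask given as rows of 0/1 values; the real UI code may differ (for example by also adjusting the box to the target aspect ratio), and `masked_crop_box` is a hypothetical name:

```python
def masked_crop_box(mask, padding: int):
    """Bounding box of the masked pixels grown by `padding`, clamped to the image.

    mask: list of rows, each a list of 0/1. Returns (left, top, right, bottom)
    with right/bottom exclusive, or None when nothing is masked.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for x in range(width) if any(mask[y][x] for y in rows)]
    left = max(0, min(cols) - padding)
    top = max(0, min(rows) - padding)
    right = min(width, max(cols) + 1 + padding)
    bottom = min(height, max(rows) + 1 + padding)
    return left, top, right, bottom


# 10x10 image with a 2x2 mask in the middle.
mask = [[0] * 10 for _ in range(10)]
for y in (4, 5):
    for x in (4, 5):
        mask[y][x] = 1

print(masked_crop_box(mask, 0))   # tight box: almost no surrounding context
print(masked_crop_box(mask, 2))   # more padding: the model sees more of the neighbourhood
```

A tiny box with little padding is how you get a whole soldier growing out of a neck: the model only sees a small patch and tries to fit the entire prompt into it.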
outside the framerate, superb. doing anything special?
I don't know where I belong as a poster or as a genner. Should I stick with /sdg/ or move on to /ldg/?
We can't think for you anon. Lurk in both places, see for yourself what you like more.
Neither, both are compromised. keep searching, there are more AI threads than just the ones mentioned in the op.
File: 1697770598725896.gif (2.51 MB, 512x512)
<lora:add_detail:1> 1girl, African-American woman sleeping, straight hair, elegant (detailed vibrant shining pupils) Cinematic lighting, portrait, [Jean - Baptiste Monge], [josephine wall], [muted colors],
poorly drawn hands) deformed face deformed eyes (((bad anatomy))) easynegative negative_hand-neg
Was too lazy to cut the "detailed pupils" lol.
Ticked both the "Calculate the optimal GIF palette" and "Optimize GIFs with gifsicle" options in AnimateDiff extension settings.
The latter one does prevent you from saving PNGs though.
"Straight hair" refers to the hair of something like picrel when it's an African-American woman, but sometimes outputs Eurasian hair.
Kept framerate low for genning speed, but it's no issue at all.
Im a bit of a newfag but I'm quite taken by the poster "Debo" who seems consistent and very helpful, but /ldg/ has a more serious vibe that I also enjoy so it's really bugging me
File: tmp9jomifgz.png (637 KB, 770x1000)
Maybe I really should try and make that guide for inpainting.
please do, anon.
File: 1704574149299885.gif (3.66 MB, 512x512)
Josephine Wall keeps adding freckles.
great idea
I need to update my inpainting game
>One of the contributors to the paper didn't know what "denoising" is.
Sorry not handy atm, but it was stupidly simple.
>technical drawing of zendaya
i think.
One of those is normal.
>browsing new loras on civit
>中年失业在线乞讨 Middle-aged unemployed begging online
not what I expected when looking at it
File: z.jpg (1.01 MB, 2244x2244)
> zendaya
fucking loli central here lol
File: z1.jpg (975 KB, 2243x2243)
Starting to see loras captioned with florence. Interesting time we live in.
>The training data are images captioned by wd and florence, with minimal manual input. Overall, a quick and dirty LoRA with surprisingly good results!
>The training data are images captioned by wd and florence
how do you automate this? get both interrogations into a single txt file
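One way to automate it, as a rough sketch: run each tagger separately so every image gets two sidecar text files, then merge them. The layout below is an assumption (hypothetical `wd/` and `florence/` folders holding one `.txt` per image with matching names), and the "sentence first, tags after" order is only a guess at what the LoRA author did.

```python
from pathlib import Path

def merge_captions(wd_dir, florence_dir, out_dir):
    """Merge per-image captions from two folders into one .txt per image.

    Assumed layout (hypothetical): wd_dir/<name>.txt holds comma-separated
    tags, florence_dir/<name>.txt holds a natural-language caption.
    """
    wd_dir, florence_dir, out_dir = Path(wd_dir), Path(florence_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for wd_file in sorted(wd_dir.glob("*.txt")):
        florence_file = florence_dir / wd_file.name
        if not florence_file.exists():
            continue  # skip images that only one tagger processed
        sentence = florence_file.read_text(encoding="utf-8").strip().rstrip(".")
        tags = [t.strip() for t in wd_file.read_text(encoding="utf-8").split(",")]
        tags = [t for t in tags if t]
        # natural-language caption first, then de-duplicated tags in original order
        parts = [sentence] + list(dict.fromkeys(tags))
        out_file = out_dir / wd_file.name
        out_file.write_text(", ".join(p for p in parts if p), encoding="utf-8")
        written.append(out_file)
    return written
```

Point `out_dir` at the folder your trainer reads captions from, and eyeball a few merged files before training.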
very cool, if the minimal manual input is true that is pretty damn good

kind of gives me inspiration to finally get off my ass and make some loras I've been wanting to do
File: z2.jpg (999 KB, 2243x2243)
File: z3.jpg (1.03 MB, 1488x2240)
File: tmpi3ehfb6m.png (610 KB, 777x1006)
Might come in handy. I like minimalism, and I have barely any loras to fill in the gap. Hopefully minimalism isn't as hard to work with as realism was.
nice looking lora desu
Works better than the other kurzgesagt pony lora, since it doesn't require a triggerword.
>(score_9, score_8_up:1.1), score_7_up,
File: z4.jpg (1.03 MB, 2242x2242)
It's a pony ritual chant
yeah but doing extra weight on some of the scores is truly cursed
if it yields what you're looking for then I don't see a problem
I've done a bit of testing on score_schizo today, and my jimmies are rustled.
File: 00274-1648174558.jpg (627 KB, 1452x1936)
[score_9|score_8_up|score_7_up|score_6|score_5|score_4 : score_4|score_5|score_6|score_7_up|score_8_up|score_9 : 0.5]
Blessed thread of improvement
File: tmp444e9b6n.png (246 KB, 607x342)
File: 1696453381838442.jpg (1.04 MB, 3024x1728)
File: z5.jpg (1.04 MB, 1981x1981)
File: 01639-1648174563.jpg (413 KB, 1210x1613)
File: 01703-3691622107.jpg (271 KB, 1210x1613)
File: 01711-1175289786.jpg (275 KB, 1210x1613)
File: tmpf149cwns.png (1.26 MB, 1280x1280)
File: 01715-1175289790.jpg (250 KB, 1210x1613)
>healer with str build
File: z6.jpg (1.72 MB, 1978x1978)
adetailer, no inpaint - https://files.catbox.moe/ls00yc.png
File: z7.jpg (890 KB, 1981x1981)
very nice, anon
File: z8.jpg (1.01 MB, 1981x1981)
a wonderful night to you all
Part of me wants to get one of those portable scanners to build out loras from old media you can't find anywhere else but in a physical format
Does it work tho
why wouldn't it? it's just some prompt interpolation
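To see what that score_schizo line would do per step, here is a toy model, not the webui's actual parser. It assumes `[A : B : 0.5]` uses side A for the first half of the steps and side B afterwards, and that `a|b|c` inside each side cycles one option per step. Whether the real parser treats an unbracketed `|` inside a schedule as alternation is not verified here.

```python
def active_token(step, total_steps, first, second, switch=0.5):
    """Return the option active at a 1-based step under the toy model."""
    switch_step = int(switch * total_steps)  # last step that still uses `first`
    options = first if step <= switch_step else second
    # alternation: cycle through the options, one per step
    return options[(step - 1) % len(options)]

up = ["score_9", "score_8_up", "score_7_up", "score_6", "score_5", "score_4"]
down = list(reversed(up))

# With 12 steps: walk down the scores once, then back up once.
schedule = [active_token(s, 12, up, down) for s in range(1, 13)]
```

Under these assumptions every score tag is only active for a fraction of the steps, which is why the result differs from just listing them all.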
Meaning, is that the best way to use score tags? kek, I need the secret sauce
There is no "best" way. There are negative tokens that can universally be used for different gens.
At the end of the day you just want to use what works best for you. That anon's approach might not necessarily tease out the look you are trying to achieve for your gens.
And to me that's the great thing about this
File: 1691442239817489.jpg (1.72 MB, 3024x1728)
patiently awaiting the next collage (it can't come soon enough)
Cool gen!
Is that str8 prompt?
File: 0.jpg (466 KB, 1024x1024)
Yep - just an old SDXL gen on a model called Cheyenne
File: 1710508786492999.png (2.88 MB, 1432x1840)
you have to i2i this with (veiny breasts) + (exaggerated features)
Very nice booba
Me in the background, in awe of the lamp's texture.
pretty cool anon
>in awe of the lamp's texture
don't tell me it was made from the j's
I've been enlightened by the lamp, it's got a wedding dress, so I'm going to marry it.
Anybody tried chameleon yet?

>mixed-modal model which can produce both images and text
unless something changed recently, the weights aren't available for imagegen
File: 0.jpg (183 KB, 1024x1024)
Is that first one the behind the dune artist style?
Why is she in onion skins?
File: 0.jpg (470 KB, 1024x1024)
File: classic_1girl.png (1.71 MB, 1024x1024)
This is good. I like this.
File: classic_1girl_2.jpg (145 KB, 1024x1024)
Thanks. Found it in the junk pile and touched up the planet a little.
File: tmp4yar2hax.png (308 KB, 608x384)
>touched up the planet a little
File: 0.jpg (300 KB, 1024x1024)
File: 1543768295364123.gif (1.6 MB, 498x373)
>decide to try zluda from directml because VRAM was maxxing out and OOMing despite owning 24gb 7900xtx
>Noteworthy: The first image generation can take from 15 up to 40 minutes. First time only.
>no errors, so i installed it correctly at least

Damn, he wasn't lying, that first image generation can take from 15 up to 40 minutes.
File: tmp2gkst834.png (323 KB, 541x380)
>from 15 up to 40
God I love tits
I'd live with rocm and potential OOM shit than ever having to deal with that madness
File: output_image.jpg (281 KB, 744x744)
I'm going to use this gif in a music video, I swear. Feels like an artfag's wet dream.
File: 1710986567664550.jpg (1007 KB, 3024x1728)
File: 1876243587346.png (622 KB, 956x628)
>Startup Time 205.5s
>Model loaded in 465.2s
Niggers said this was faster.
I will assume this is just applying to the first gen.

native windows rocm support would fix all of this needless suffering.
unbridled soul
I'm sure you are already aware, and probably have been using it, but https://rocm.docs.amd.com/projects/radeon/en/latest/docs/install/wsl/install-radeon.html
But yeah, native would be pretty beneficial for windows users
File: moar_1girl.png (1.65 MB, 1024x1024)
>native windows rocm support would fix all of this needless suffering.
meanwhile me as intelfag, I still remember that last year ipex couldn't run on Windows without wsl, and now it runs natively on Windows just fine, heh...
File: 645376548457621.gif (316 KB, 250x180)
>already installed
It just finished and yeah, holy shit, that load took a while, but the gen time was fast as fuck.
>It only applied to the first gen
>18 sec image gen
>15 sec highresfix time
what native latent size & scale? steps?
not bad #'s
File: 1236455427653486.gif (3.69 MB, 640x364)
Double checked, and for no reason at all it just decided to use my 7800xt instead.
I mean, that's not the card I wanted to use, but I'm still impressed either way.
What's the arg to tell it to use a different device?
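A hedged pointer rather than a confirmed answer: A1111-style launchers accept a `--device-id` argument, and HIP/ROCm-based stacks usually honour the `HIP_VISIBLE_DEVICES` environment variable (`CUDA_VISIBLE_DEVICES` on NVIDIA). Whether the ZLUDA setup in question respects either one is an assumption, and the index of the 7900 XTX on that machine is unknown. The helper below is a hypothetical sketch that only prepares the environment for a launch.

```python
import os

def env_for_gpu(index, variable="HIP_VISIBLE_DEVICES", base_env=None):
    """Return a copy of the environment that exposes only one GPU index.

    `variable` is an assumption: use whichever name your backend reads.
    """
    env = dict(os.environ if base_env is None else base_env)
    env[variable] = str(index)  # e.g. "1" to hide every device except index 1
    return env
```

Pass the result as `env=` to `subprocess.run` with whatever launch script you use. Note that once the other card is hidden, the remaining one is typically seen by the program as device 0.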


1024x1024, 6.5, 30 steps,
1.2 to 1228x1228, 17 steps, 0.52 denoising
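Quick sanity check on those numbers, assuming the second-pass size is just the base size times the scale, truncated to an integer (which matches the 1228 quoted above; the function names are made up):

```python
def hires_size(width, height, scale):
    """Target size of the second pass, assuming plain truncation to an integer."""
    return int(width * scale), int(height * scale)

def pixel_ratio(width, height, scale):
    """How many times more pixels the second pass works on than the first."""
    new_w, new_h = hires_size(width, height, scale)
    return (new_w * new_h) / (width * height)
```

So a 1.2x scale per side means the second pass handles roughly 1.44x the pixels of the first.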
File: file.png (74 KB, 290x301)
Alright, so what kind of local model can I run with the following?

>nvidia laptop 4060
>32 gigs of ram

Keep in mind I know enough about computers to be able to work an .exe install/figure out which directory to install whatever to, and I need a gui
You should be able to go up to Pony/SDXL with something like the Forge or SD.Next branch of auto, or comfy based stuff like StableSwarm or MetaStable.
You are essentially able to run just about everything.
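A back-of-the-envelope check of that claim, using numbers that are not from this thread: roughly 2.6 billion parameters for the SDXL UNet and 8 GB of VRAM on a laptop 4060. It only counts UNet weights in fp16. Text encoders, VAE and activations come on top, so treat it as a lower bound and not a guarantee.

```python
def weights_gib(params, bytes_per_param=2):
    """Size of the raw weights in GiB (fp16 = 2 bytes per parameter)."""
    return params * bytes_per_param / 2**30

SDXL_UNET_PARAMS = 2.6e9   # assumed figure, approximate
LAPTOP_4060_VRAM_GIB = 8   # assumed figure for the laptop card

unet_gib = weights_gib(SDXL_UNET_PARAMS)
fits = unet_gib < LAPTOP_4060_VRAM_GIB
```

That leaves a few GiB of headroom for everything else, which is why the UIs mentioned above with their low-VRAM options are the usual suggestion.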
File: 1714971711090510.png (230 KB, 421x453)
I-I can?? Looks like my shitposting will be enhanced.
How is that relevant. I am just unaware.
install linux
kek I thought it's 6.5 it/s
File: ComfyUI_SDXL_0085.jpg (2.23 MB, 1664x2432)
File: 4435527464522345.png (45 KB, 889x1175)
In this case Linux is faster at sampling
do we have any good online image generators like bing? I have too shitty a laptop for stable diffusion
they'll help you out
this is why we need to keep low parameter models in mind, gotta look after our vramlet bros
File: file.png (31 KB, 686x146)
File: 2541316153.jpg (113 KB, 1280x720)
i will tell you my it/s rate as soon as it loads into my OTHER gpu....
Damn, that first image generation can take from 15 up to 40 minutes.
File: tmpkjoi1epd.png (413 KB, 600x534)
You vill c00nquer ze bread und yuo vill liek it:
File: ComfyUI_temp_pphdd_00036_.png (1.45 MB, 1024x1024)
new pixart animetune lookin nice
I can dig it, that's a very aesthetic vibe
TY baker!
Very nice. Link?
released under AGPL3.0/FAIPL-1.0
I've been using this
I didn't realize he released a 1024 version. Neat.
File: 0.jpg (390 KB, 1024x1024)
File: 0.jpg (310 KB, 1024x1024)
new thread idiot


All trademarks and copyrights on this page are owned by their respective parties. Images uploaded are the responsibility of the Poster. Comments are owned by the Poster.