/g/ - Technology

Thread archived.
You cannot reply anymore.


File: long dick general.jpg (2.76 MB, 3264x3264)
General dedicated to free and open source text-to-image models.

Previous /ldg/ bread : >>101118874

Arguable Edition

>Beginner UI
Fooocus: https://github.com/lllyasviel/fooocus
Metastable: https://metastable.studio
EasyDiffusion: https://easydiffusion.github.io

>Advanced UI
Automatic1111: https://github.com/automatic1111/stable-diffusion-webui
StableSwarmUI: https://github.com/Stability-AI/StableSwarmUI
InvokeAI: https://github.com/invoke-ai/InvokeAI
ComfyUI: https://github.com/comfyanonymous/ComfyUI

>Auto1111 forks
SD.Next: https://github.com/vladmandic/automatic
Forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
Anapnoe UX: https://github.com/anapnoe/stable-diffusion-webui-ux

>Use a VAE if your images look washed out

>Models, LoRAs & training

>Pixart Sigma & Hunyuan DIT
Comfy Nodes: https://github.com/city96/ComfyUI_ExtraModels
*SD.Next also works with PixArt-Sigma


>Index of guides and other tools

>View and submit GPU performance data

>Try online without registration
txt2img: https://www.mage.space
img2img: https://huggingface.co/spaces/huggingface/diffuse-the-rest
sd3: https://huggingface.co/spaces/stabilityai/stable-diffusion-3-medium

>Share image prompt info

>Related boards
>they were so scared that people could directly compare XL outputs to the chink models that they made up an excuse to ban it
File: Sigma_02600_.jpg (2.34 MB, 1536x2560)
And they almost got away with it. Good job scoob
it was just an attempt of the sdg schizo to move people back into sdg, he has done this before in the last split. spammed the thread with bs. then samefagged bs about an XL ban and made a OP banning XL/SD3 during the dead hours where few would contest it.
Blessed thread of frenship
where's the first thread? was having a convo with an anon there, any link?
File: tmpp7upndmq.png (2.09 MB, 1744x992)
It's all so tiresome, but if it helps even one or two anons become artists in this new and fascinating medium, it will all have been worth the bit of trouble.
May your gens remain bountiful.
sure it's right here https://desuarchive.org/g/thread/101128489/
Why did the thread get jannied? I see no suspect material
File: ComfyUI_14113_.jpg (247 KB, 1536x1536)
Good morning
collageanon failed us.
terrible choices today
You can always make your own collage. No need for this
jannied thread had a sick collage
this is pretty weak, discord-tier stuff
It was old collage someone else made. 0/10 for the effort.
>thats a bad collage
>use our discord collaeg
>thats a bad thread
>use our discord thread
>thats bad doxx
>use our discord doxx
File: just.png (7 KB, 1446x32)
Another day, another failure. Let's see

I don't get it but it's ok
No other way around it, than through trial and error. Best of luck anon. Just remember to switch up your methodology. Can hardly expect different results from the same approach.
any idea why that thread got deleted?
Mass reports from discord
Nice collage TY baker!
>Just remember to switch up your methodology
I've been running prodigy for ages. Should probably switch it for a while
Great collage! Loving it
(Posted in old thread on accident, oops.)

Just an FYI, the fag that's crying over adetailer usage/bad gens and saying people need to be shipped back to /sdg/ is actually a furfag who has been harassing the /aco/ threads as well, and admitted to being a rapefugee from the /trash/ threads. You can see him cry over it here -

Prodigy sucks unless you liked it fried. For all around CAME is best, for "can't fuck it up" use AdamW. AdamW is even better when using it 8 bit because you get efficiency.
Not necessarily, but you might know better. Even changes in your dataset or the way it's captioned could be enough.
>It's all so tiresome
true but they will never stop us
>Prodigy sucks unless you liked it fried
Lower d_coef setting and scale weight norms should prevent it

>AdamW is even better when using it 8 bit because you get efficiency
With what settings, rates etc?

>Even changes in your dataset or the way it's captioned could be enough
Indeed. Some of the best loras I've made were pretty much manually captioned. It just becomes too much work with 500+ images
File: tmp3595dwsj.png (1.09 MB, 768x1344)
horrible collage
>Prodigy sucks unless you liked it fried.
Works for me on SD/SDXL with a slow learning rate at the start and the rate adjustments like d_coef.

But not well on Onetrainer/Sigma, something is IMO going wrong there. It also unexpectedly fails with other optimizers tho.
trolls really have nothing better to do, huh
>a slow learning rate at the start
do you mean warmup or what?
Mostly just a low d0 like 1e-7. I think I prefer it to warmup.
I had the assumption that it was just 1 automatically with prodigy
>le collage
File: ComfyUI_temp_vopdm_00094_.png (2.69 MB, 1120x1440)
File: ComfyUI_temp_vopdm_00096_.png (2.69 MB, 1120x1440)
I think LR is 1 (adjusted by prodigy); d_coef, d0 and the other stuff are configurable.
File: 00287-4042871332.png (682 KB, 600x800)
File: 00289-4042871334.png (699 KB, 600x800)
File: ComfyUI_temp_vopdm_00101_.png (2.67 MB, 1120x1448)
Asking for a friend what does it mean?
File: file.png (4 KB, 153x81)
if a checkpoint creator suggests i use dpmpp_sde for the sampler, what scheduler is he implying i should use? on auto1111 i didnt have this option
File: ComfyUI_temp_vopdm_00109_.png (2.79 MB, 1120x1448)
SDE Karras?
File: ComfyUI_temp_lmvrj_00138_.png (1.91 MB, 1128x1448)
File: image.jpg (442 KB, 1890x2520)
File: tmpdjn0v6fm.png (1.51 MB, 1680x960)
File: ComfyUI_00058_tmp.jpg (1.17 MB, 1894x1894)
File: tmpel2_s27h.png (1.61 MB, 1608x899)
do you inpaint to get two different people or do you have some neat workflow?
File: lowres.jpg (242 KB, 850x708)
its the initial d at d=0
Sorry, I meant epoch 0. d gets changed by prodigy according to d_coef
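For reference, the Prodigy knobs discussed above (LR left at 1, a low d0, a lowered d_coef, scale weight norms) could be passed to a kohya-style training script roughly like this. A minimal sketch: the helper name and the default values are made up for illustration (only d0=1e-7 comes from the thread), and the flag names are assumed to match kohya sd-scripts, so check your trainer's docs before copying.

```python
def prodigy_args(d0=1e-7, d_coef=0.5, scale_weight_norms=1.0):
    """Build CLI arguments for a Prodigy run (hypothetical helper).

    Values are illustrative, not recommendations, except d0=1e-7,
    which one anon reported using.
    """
    return [
        "--optimizer_type", "Prodigy",
        # Prodigy estimates the step size itself, so LR stays at 1.
        "--learning_rate", "1.0",
        # d0 is the initial step-size estimate; d_coef scales the estimate.
        "--optimizer_args", f"d0={d0}", f"d_coef={d_coef}",
        # Caps LoRA weight norms, suggested above as a guard against frying.
        "--scale_weight_norms", str(scale_weight_norms),
    ]

print(" ".join(prodigy_args()))
```

Lowering d_coef below 1 makes the adapted step size more conservative, which is the direction suggested above for avoiding fried results.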
File: ComfyUI.jpg (313 KB, 2965x1340)
In this case with well known people i started with a simple prompt and spun the RNG a while.
Then i started to dig in with pic related.
If you have any tips&tricks im all ears, because i cant get a single gen without inpainting.

TY. Gotta switch to Onetrainer for now. CAME and Huber loss look pretty interesting
ong bak gives some nice scenes
cool, I think the main guy almost got killed in this or some other movie
Can someone do a gen of Yup or Ran sucking on that debodick
that could be, some stunts are quite nice
Huber loss is very helpful as far as I can tell, particularly the SNR variant in kohya_ss: it rarely seems to hurt and sometimes seems to help massively. CAME also seems good, but likewise I only tried it in the Sigma training scripts.
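To see why Huber loss is more forgiving than plain squared error: it is quadratic for small errors and linear for large ones, so outlier samples pull less on the gradient. A minimal sketch of the textbook form for a single residual; the delta value is arbitrary, and the SNR-scheduled variant mentioned above is not reproduced here.

```python
def half_squared_error(error):
    """Plain L2-style loss, grows quadratically everywhere."""
    return 0.5 * error ** 2

def huber(error, delta=1.0):
    """Textbook Huber loss for a single residual."""
    a = abs(error)
    if a <= delta:
        return 0.5 * a ** 2           # identical to the L2 form near zero
    return delta * (a - 0.5 * delta)  # linear tail for large errors

for e in (0.5, 1.0, 4.0):
    print(e, half_squared_error(e), huber(e))
```

For small residuals the two agree; at a residual of 4 the squared form is more than twice the Huber value, which is where the robustness comes from.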
maybe it's more Mortal Kombat than Ong Bak
>python.exe installer.py local' returned non-zero exit status 1
just fuck my shit up
make sure they all look underage
File: tmpmxx9btzf.png (1.25 MB, 1344x768)
File: tmpl02pm1fc.png (2.09 MB, 1680x960)
how about Xena
File: tmpwlfcdq51.png (3.26 MB, 1152x2016)
oh wow, this one turned out much more interesting, and all it took was a vertical swap
it's fun to put images like these through an interrogator

>a painting of a tree in the middle of a lake, concept art, inspired by Cyril Rolando, fantasy art, ferrofluid oceans, floating kelp, 4 k detail fantasy, peter mohrbacher and dan mumford, flowing tendrils, cute detailed artwork
>peter mohrbacher
oh I like this one

Wonder how the new microsoft? interrogation model would do. What was it called again?
I think one anon who is finetuning sigma is using it, it looked really decent from his screenshot
Ah! Florence 2, wasn't it.
File: 1708149105085696.jpg (1.53 MB, 3024x1728)
File: 1709573430575807.jpg (1.11 MB, 3024x1728)
File: 1712674790193628.png (3.34 MB, 1432x1840)
sorry that gay stuff is reserved for the other thread
File: 01035-1183159440.png (2.74 MB, 1152x1536)
File: 01040-244814509.png (1.49 MB, 864x1152)
Any illustrators that img2img their stuff here?
File: 01056-2551274253.png (1.31 MB, 864x1152)
probably not, and they may not say so even if they did
File: ComfyUI_temp_rpvdq_00012_.png (1.96 MB, 1120x1440)
File: 01062-442147701.png (1.3 MB, 864x1152)
I know some who use this for tracing and making quick drafts
controlnet is much more useful
File: 7 (8).png (352 KB, 512x512)
Does anyone know what models nemusona was using? I know it was anythingv4.5, but the few checkpoints called anything4.5 that I tried (and v3 too) look very different. Or does anyone have suggestions on how to make something that looks similar?
File: 1.png (755 KB, 1931x2296)
Depending on the model i find that img2img helps with exploration, and rigging the control net skeleton was such a pain in the ass for me, to each their own tho. Pretty sick how i can just download a Rei lora for this sketch
File: body1.png (1.03 MB, 1056x1272)
Bada bing bada boom, stellar reference for the body no lora required. Shit is freelo
i produce internal concept art for some mundane applications. usually i produce some starting images either traditionally or gen and then produce a set of variants with a weakened controlnet, get feedback on the preferred results, and recycle until done
jesus christ how horrifying
sweet jesus how nice
You could've just scribbled a blob man approximating the pose if you're taking it that far
Horrifying is the direction anon
File: ComfyUI_temp_lnauq_00021_.png (2.4 MB, 1120x1440)
File: 00047.jpg (443 KB, 1944x2592)
I enjoy her personality
more please
File: 00605-3062461947.png (1.23 MB, 1024x1024)
Based Boomer.
File: ComfyUI_temp_lnauq_00032_.png (2.29 MB, 1120x1440)
File: ComfyUI_temp_lnauq_00033_.png (2.39 MB, 1120x1440)
File: 00083-2284758461.png (1.08 MB, 1024x1024)
File: 00328-2284758706.png (1.26 MB, 1024x1024)
File: ComfyUI_temp_lnauq_00041_.png (2.5 MB, 1120x1440)
File: ComfyUI_temp_lnauq_00043_.png (2.2 MB, 1120x1440)
File: tmp_3g8a1qz.png (1.26 MB, 1544x880)
File: ComfyUI_14246_.jpg (623 KB, 1536x1536)
moody bloodborne prompts
File: 00544-2284758922.png (1.28 MB, 1024x1024)
File: ComfyUI_00925_.png (1.04 MB, 1024x1024)
File: 00029-3354090524.png (1.41 MB, 1024x1024)
File: tmpwdp7yz3v.png (1.82 MB, 1680x960)
File: UI_0222.jpg (2.64 MB, 1664x2432)
File: 22.png (1.07 MB, 1931x2296)
Some color, I wonder what the machine will spit out from this. I leave the layer with the wings out cause thats just gonna make a mess
nice vertical split gen
File: ComfyUI_temp_rpvdq_00078_.png (1.71 MB, 1120x1440)
File: ComfyUI_temp_rpvdq_00079_.png (1.68 MB, 1120x1440)
File: 1711625953083517.jpg (1.53 MB, 3024x1728)
File: UI_0225.jpg (2.55 MB, 1664x2432)
Thanks m8
File: ComfyUI_temp_rpvdq_00088_.png (1.76 MB, 1120x1440)
File: 1689517967569780.jpg (1.32 MB, 3024x1728)
File: UI_0226.jpg (2.38 MB, 1664x2432)
cheers m8
File: 1709658458779105.jpg (1.59 MB, 3024x1728)
File: 1708029462116047.jpg (1.22 MB, 3024x1728)
File: 1692517624936754.jpg (1.68 MB, 3024x1728)
File: UI_0232.jpg (2.54 MB, 1664x2432)
File: 1706580928337104.jpg (1.33 MB, 3024x1728)
File: 1689332275989556.jpg (956 KB, 3024x1728)
File: UI_0238.jpg (2.06 MB, 1664x2432)
File: 1690401493862236.jpg (839 KB, 3024x1728)
File: UI_0239.jpg (2.11 MB, 1664x2432)
File: 1695896025380131.jpg (1.05 MB, 3024x1728)
File: UI_0242.jpg (1.12 MB, 1248x1824)
File: UI_0249.jpg (1.55 MB, 1248x1824)
I miss chang
decent gens ITT
File: PAPi_0012.jpg (2.34 MB, 2560x1536)
I'm tired of building these sand castles anon.
File: PAPi_0013.jpg (2.33 MB, 2560x1536)
theyre very nice castles tho

prompt ?
1grill, face down ass up, cooking
File: PAPi_0001.jpg (3.82 MB, 3840x2304)
what model?
PixArt -> Pony -> imageupscale
pony model over photorealism?
File: PAPi_0006.jpg (2.85 MB, 2560x1536)
I need something to stabilize the image and if I put it through another PixArt it just eats my Video card like a fat kid through 2 cakes and asks for more.
Ah alright. That looks very good, even canvas texture is top notch
File: PAPi_0007.jpg (719 KB, 1280x768)
Base Image
File: PAPi_0008.jpg (2.34 MB, 2560x1536)
File: 05967-2462015412-1_4_4.png (2.04 MB, 1152x1536)
File: PAPi_0009.jpg (2.58 MB, 2560x1536)
Upscaled using pixart again (half power because I don't have the juice)
File: 05966-2462015409-1_4_1.png (2 MB, 1152x1536)
>hot moms in your area
careful you got a pokie
A good sign of a woman who gives zero fucks is the how dingy the carpet is
File: PAPi_0013.jpg (1.89 MB, 2560x1536)
File: PAPi_0016.jpg (2.25 MB, 2560x1536)
cool style in that catbox
File: PAPi_0017.jpg (1.88 MB, 2560x1536)
File: PAPi_0018.jpg (2.77 MB, 2560x1536)
Had to put it there because it's over 4mb
File: PAPi_0019.jpg (2.42 MB, 2560x1536)
File: PAPi_0021.jpg (2.84 MB, 2560x1536)
File: PAPi_0022.jpg (2.67 MB, 2560x1536)
Anyone know how novelAI's vibe transfer works, or more specifically how to recreate it with stable diffusion A1111? I tried t2i-adapter, base controlnet, and IP adapter with the Ponyv6 model but they didn't work very well at transferring styles let alone specific characters.
File: PAPi_0023.jpg (2.63 MB, 2560x1536)
Sorry m8, don't know what you're talking about so I can't even help.
We can only guess at their secret sauce. What have you tried doing so far?
File: PAPi_0025.jpg (2.46 MB, 2560x1536)
File: PAPi_0027.jpg (2.23 MB, 2560x1536)
File: PAPi_0030.jpg (2.72 MB, 2560x1536)
I wonder why Kohya uses folder names for repeats etc. instead of settings from the GUI. Very unintuitive.
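The folder convention in question encodes the repeat count in the directory name, as in `10_mychar` for 10 repeats. A minimal sketch of how such a name can be parsed; the function name, the example folder and the image count are hypothetical, and this is not kohya's actual code.

```python
def parse_repeat_folder(name):
    """Split '<repeats>_<concept>' into (repeats, concept)."""
    prefix, sep, concept = name.partition("_")
    if not sep or not prefix.isdigit():
        raise ValueError(f"expected '<repeats>_<concept>', got {name!r}")
    return int(prefix), concept

repeats, concept = parse_repeat_folder("10_mychar style")
images = 40  # hypothetical number of images in that folder
print(repeats, concept, "->", repeats * images, "samples per epoch")
```

One upside of the convention is that each concept folder can carry its own repeat count, which lets a small folder be weighted up against a large one.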
IP-adapter clip, some T2i-adapter models, and openpose, lineart, and depth control net models. They don't seem to have a huge impact on the result unless the prompt is simple. NovelAI seems to be able to generate pretty complex stuff
I've just split mine into Model\date\Seed#

File: PAPi_0031.jpg (1.88 MB, 2560x1536)
File: PAPi_0032.jpg (1.95 MB, 2560x1536)
IPAdapter, if used in a certain way, leads to the input images having a GREATER impact on the output than the prompt regardless of its length. I may be wrong but it sounds like you're expecting the free tools to be as easy to use as the paid ones, which will never be the case.
I understand MJ and NAI make it incredibly simple to drop a few reference images in and press go (and obviously whatever they're doing behind the scenes far outclasses what one can accomplish quickly on their own computer) but IMO you're missing out on the breadth of customizability afforded when you own the tools, among other things.
Digging through the links in OP will likely bring elucidation but if all else fails: experimentation is key.
pony might be shit with it have you tried a different model?
File: PAPi_0038.jpg (2.63 MB, 2560x1536)
>Digging through the links in OP will likely bring elucidation but if all else fails: experimentation is key.
I've been experimenting, but LoRAs seem to be the only way to get stable diffusion to generate characters accurately, and that's a lot less flexible since I would have to train my own. I haven't had a lot of luck with complex poses even with loras

The base SDXL model, but it's not that great at anime characters. Also some pony derived checkpoints
Karras most of the time. auto1111 has a scheduler select now.
sometimes Exponential is good with dpmpp_sde[_gpu]
Give euler_ancestral with ddim_uniform a try
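On the sampler vs. scheduler question: the sampler (e.g. dpmpp_sde) decides how each denoising step is taken, while the scheduler decides which noise levels (sigmas) the steps visit. A minimal sketch of the Karras and exponential spacings; the sigma_min/sigma_max defaults are placeholder assumptions, since real UIs take them from the loaded model.

```python
import math

def karras_sigmas(n, sigma_min=0.03, sigma_max=14.6, rho=7.0):
    """Karras et al. spacing: interpolate in sigma**(1/rho) space (n >= 2)."""
    lo, hi = sigma_min ** (1 / rho), sigma_max ** (1 / rho)
    return [(hi + i / (n - 1) * (lo - hi)) ** rho for i in range(n)]

def exponential_sigmas(n, sigma_min=0.03, sigma_max=14.6):
    """Exponential spacing: interpolate linearly in log(sigma) (n >= 2)."""
    lo, hi = math.log(sigma_min), math.log(sigma_max)
    return [math.exp(hi + i / (n - 1) * (lo - hi)) for i in range(n)]

for name, fn in (("karras", karras_sigmas), ("exponential", exponential_sigmas)):
    print(name, [round(s, 3) for s in fn(8)])
```

Both schedules start at sigma_max and end at sigma_min; they differ only in how the steps in between are distributed, which is why the same sampler can look different under each.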
File: PAPi_0042.jpg (2.02 MB, 1408x2816)
Good night
File: 00029.png (1.56 MB, 1280x768)
ty collageanon
ipadapter is the poormans lora after all
Very blatant indeed.
File: grid.jpg (1.23 MB, 1792x2176)
What are you trying to accomplish?
File: 00011_t.png (1.03 MB, 600x600)
File: 00011_t-2.png (1 MB, 600x600)
File: GQ28yjGbwAM5qQG.jpg (182 KB, 512x512)
A new image model will be open-sourced; this one is a 5.6B "replication" of SD3
>The goal of Open Diffusion is to create Open Source resources and models for all generative AI creators to freely use. Unrestricted, uncensored models built by the community with the single purpose of being as good as they can be.
I'm sorry if this is retarded, but how do I train a style LoRA rather than a character one? Same shit just don't tag characters and tag artist/style instead?
now that's interesting
Kind of the opposite. When you train a style lora, I think you should just tag pretty much everything, except traits characteristic of the style. If it's dark and moody, or bright and colourful, or the characters always have big heads, don't mention any of those. Caption everything else, from subject, to action, context, background, etc. Caption it like you would prompt for this very image whilst using your desired lora.
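The captioning advice above amounts to: keep tags for content, drop tags that describe the style itself, so the style gets absorbed into the LoRA rather than tied to a prompt word. A minimal sketch of that filtering step; the function name and the tag lists are invented examples.

```python
def strip_style_tags(caption, style_tags):
    """Remove style-describing tags from a comma-separated caption."""
    banned = {t.strip().lower() for t in style_tags}
    kept = [t.strip() for t in caption.split(",")
            if t.strip() and t.strip().lower() not in banned]
    return ", ".join(kept)

style = ["dark", "moody", "big head"]  # traits typical of the style
caption = "1girl, dark, sitting, forest, moody, looking at viewer"
print(strip_style_tags(caption, style))
```

The content tags (subject, action, background) survive, so they stay promptable when the finished LoRA is used.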
Oh wow! It's Simo and his friends while at it. Now that's a real OG for /sdg/, rather than their thread celebrities.
I don't know those guys? Are they actually good?
File: XL_gen_tmp_13.jpg (510 KB, 1536x1200)
Morning anons
File: 1691556934086553.jpg (1.03 MB, 3024x1728)
Caption drop rate should also work well with this
Don't know about others, but Simo themself is one of the folks, if not THE person behind our tools for lora training.
File: 1693003234830115.jpg (1.09 MB, 3024x1728)
>Caption drop rate
Never tried it, but now that you mention it, does make sense..
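Caption drop rate means that, for a fraction of training samples, the caption is replaced with an empty string so the model also learns the style without any prompt attached. A minimal sketch of the idea; the function name and the 0.1 rate are assumptions, not any trainer's actual implementation.

```python
import random

def maybe_drop_caption(caption, drop_rate, rng=random):
    """Return '' with probability drop_rate, else the caption unchanged."""
    return "" if rng.random() < drop_rate else caption

rng = random.Random(0)  # seeded so runs are repeatable
captions = ["1girl, sitting, forest"] * 1000
dropped = sum(maybe_drop_caption(c, 0.1, rng) == "" for c in captions)
print(f"{dropped} of {len(captions)} captions dropped")
```

A rate of 0 leaves every caption intact and a rate of 1 trains fully uncaptioned, so the setting sits between the "caption everything" and "no captions at all" approaches mentioned in the thread.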
Holy shit, that sounds good, I also saw that from their statement, that's based af.
>As *Open* Diffusion, we wish to produce models that are useful for the entire community. Questions of morality and ethics beyond the law are beyond the scope of this project. We are not an ethics board or a group of philosophers.
File: 00000-0-63.png (245 KB, 512x512)
Can anyone recommend an anime dataset with "proper" descriptions? I don't mind if they were generated by llava or other image-to-text model, I do mind if they are just tags from yetanotherbooru.

There's Borismile/Anime-dataset but it's too small and most of it has no captions (or upper cases).
Picrelated: "a painting of a person walking in a field with a flag in the background and people walking in the distance"
I hope this works out and we can finally let SAI die in peace
>implying you need more than 1girl
I wonder what the person behind Pony uses, especially now that they mentioned better comprehension of non-booru tags.
It's a nice compromise since those style loras can work pretty well without captioning at all
That's what SAI should've done in the first place, train their model however they want in the scope of legality, the rest is useless and they shouldn't act like they are the master of morality, that's fucked up
Man, I can't wait to switch off Pony for something less vram hungry. I would love to finally try and train my own finetune on the breathtaking 8vram of my gpu.
what's preventing you trying the SD1.5 models again?
>I can't wait to switch off Pony
Same, but not due to vram but because I hate how ugly pony looks in terms of the actual image quality.
It's like the whole dataset had jpeg artifacts and a piss filter thrown over it.
Nothing really, I've just spent enough time with them. As much nostalgia as I have for 1.5, and as much as I have done with them, even for the reduced speeds of PDXL, the bump in quality has been more than worth staying with it. Even something like hands became way less of a hassle. If I'm to return to lower parameter counts similar to that of 1.5, I'd rather wait for more pixart support, or any other alternatives for anons on a vram budget. Also 1.5 was indeed very hard to finetune, from what little experience I had with it. Tried making my own base by training my own loras/lycoris and merging them into it, but I failed miserably.
>It's like the whole dataset had jpeg artifacts
Because it actually did!
>JPEG Artifacts
>An issue I hadn't initially noticed in V6, which was brought to my attention by several users, is the presence of JPEG artifacts. Although this problem is only evident in certain styles, I am committed to addressing it. The issue appears to stem from two main sources: some of the source material already contains artifacts, and my pipeline, which involves saving images at 95% quality twice, likely exacerbates the problem.
>To resolve this, I am making adjustments to the pipeline to ensure images are directly transferred from the source to VAE encoding without intermediate quality reductions. Additionally, I am developing methods to detect and either automatically correct or exclude images with noticeable artifacts. This should significantly reduce the presence of JPEG artifacts in the output of V7.
Sauce: https://civitai.com/articles/5069/towards-pony-diffusion-v7

I personally don't take issue with its quality, but I have plenty of gripes of my own with it. Still, probably the model I've had the most fun with.
>and my pipeline, which involves saving images at 95% quality twice, likely exacerbates the problem.
wtf, that's retarded, how could he make such a basic mistake in the first place?
the simplest mistakes are also the simplest to make
I'm not so hyped of his V7 anymore, he started cucking his models on V6 by removing the artist tags, his models will be more and more cucked as the versions will go on
that thread is a month old
The tweet is 1 day old, and this announcement is also 1 day old
I remain optimistic, since I don't see the lack of artist tags as an issue. These very same artists likely remain in the dataset, meaning it's learning good traits from them anyway. Instead I'm looking forward to seeing how it better handles non-booru prompts, any improvements on realism, and getting rid of the lengthy score_schizo.
i think he said he uses wd and then expands those tags with llava
Sounds plausible. I just checked one of the datasets for that Open Diffusion model and it did include both.
that's a quite terrible approach desu, you can have a shit ton of solutions when transforming tags into real sentences

1girl, table, sitting, chair could be "a girl is sitting on a chair in front of a table" or "a girl is sitting on a table in front of a chair"
sure but it just has to be better than the alternatives
why can't he use CogVLM or Florence instead?
Less efficient in tokens, but there is merit in this meaning better results for using natural language in prompting, rather than trying to fit into a particular captioning system.
Didn't it come out like just a couple of days ago?
ask him, its pretty possible hes using something else now
but i really dont see why florence is so hyped, sure it is decent for its size and is fast, but the results i was getting when trying it werent really anything amazing compared to the better captioners
>the better captioners
which one allows NSFW in the first place?
they might not know nsfw, but i dont think they reject you either, and florence isnt really descriptive about sex either and might fuck up completely when trying to describe it, since i assume there are not very many images involving it in the dataset
someone should finetune CogVLM or florence (or whatever good captioner model) only with nsfw to make them better at describing people doing anything else than just standing imo
someone tried/is trying to make a nsfw tune vlm but i assume its a huge undertaking to caption it alone
What happened to it anyway? I checked and there was no announcement in his twitter but he's still active.
File: PAPi_0051.jpg (2.16 MB, 2560x1536)
File: PAPi_0053.jpg (1.75 MB, 2560x1536)
File: PAPi_0055.jpg (2.55 MB, 2560x1536)
File: PAPi_0056.jpg (1.65 MB, 2560x1536)
File: PAPi_0058.jpg (1.74 MB, 2560x1536)
debo btw
>debo btw
chill. no one cares.
it's weird that he is allowed to post here while pretending to be other anons, man
im just pointing it out
we can't disallow anyone from posting on a Mongolian basketweaving forum, now please either stay on topic, or don't bother posting
thanks, adding to the filters
i can only understand that you are somebody incredibly unfamiliar with this general and ill excuse your ignorance as the ignorance of a newfriend
but, to be clear
you have absolutely no idea the person you're defending right now or their ideals and past actions. you are basically asking this comfy, fine, good general to go exactly the same way as /sdg/, a literal and metaphorical cess pit of retardation
Whom? Why are you labeling me with some random cunts name?
I'm one of the folks who bakes these threads, and among other reasons, I bake them precisely to avoid this off-topic bullshit slipping over from the other general. I only have a vague idea of what's going on there, and I want none of it here. We're not janitors, you're free to report if there's a valid reason. The less it's mentioned here, the better for this general, and let's leave it at that.
Don't try to use grammar you clearly don't understand. It's just "Who?", you're not using the pronoun in the objective sense in your post.
Classic Debo move and an absolutely pathetic attempt to pretend to be someone more educated than you are to try and throw us off the obvious scent.
Back to /sdg/ with you, please. And promptly.
Fuck off with this weird corpo personality debo. Please keep this dumb ass shit in /sdg/. No need for it here.
>/sdg/ containment status: breached
Chinese models are superior to anything StabilityAI has ever created because China believes in workers.
Hunyan Lubu is going to be released this Friday. I cannot reveal more details except it is The Dream.
File: ComfyUI_10259_.png (1.86 MB, 1080x1920)
Here we can see true work of an auteur. This what only Chinese models can achieve.
Very funny gens. I know you tried your best but I can't stop laughing.
That's not all of them. Keep tagging all my guns in the thread
I don't care.
Ya you do you cheeky cunt. Otherwise you wouldn't tag me.
The most obvious sign that Debo is trolling a thread is exactly this type of response
what's that?
sounds like troonshit
Debo is going wild both here and /sdg/
File: 1718719896366249.png (102 KB, 356x359)
just gen
your jealous cuz only certain people are skilled enough to create interesting images
that's why debo is all over this thread
File: 1717006044112598.png (165 KB, 1024x750)
>all it takes is one retard replying on cooldown
So PAPi is the new debo tag for /ldg/ to add to the filter?
We might consider adding some of this stuff to the pastebin, has anyone seen Ran lately?
You're absolutely correct sir.
File: tmpgnijakhm.png (988 KB, 1344x768)
>just gen
I endorse this message.
I like the style on this one
Oh, I can give you all my tags.


PAPi and SDXL end in .jpg since they are high quality. The rest are .png
Can we add this to the pastebin?
Cheers anon
The IQ of these posters is through the roof here.
File: 1715990326659378.jpg (102 KB, 1024x1024)
remember gyate
File: tmpm6dfjyel.png (707 KB, 642x1024)
Bring out your gens, it's here to collect, and soon you will reap fruits of what you've sown.
Don't you mean gyat?
tfw it's almost time for new collage
You are too stupid to know what gyate means, slav nigger.
File: 1715995862219103.jpg (160 KB, 1024x1024)
not very gyate of you
File: file.png (1.26 MB, 896x1152)
lora training on onetrainer fell flat
Teams behind Invoke, ComfyOrg, CivitAI and Laion are coordinating together with a couple of goals in mind:
>True open source: Permissively licensed using an approved Open Source Initiative license, and developed with open and transparent principles
>Capable: A competitive model built to provide the creative flexibility and extensibility needed by creatives
>Ethical: Addressing major, substantiated complaints about unconsented references to artists and other individuals in the base model while recognizing training activities as fair use.

File: notgyate.jpg (87 KB, 896x1152)
>/sdg/ discordtroon involved
it's shit
Looks gyate to me desu
Time to scrape all loras and good models off civitai
Ethical means death. The only ethics is: "was your image available in a public space, yes or no". If your name and art is public and publicly available, it's fair game.
File: file.png (1.04 MB, 896x1152)
that's just about the closest it got to being gyate
So what does this mean for the little man (me)
It's an initiative focused on actually open AI image models but run by cowards. But honestly if they just focus more on the tech and training and less on the training itself, that's fine. We really need more people like Pixart that focus on proofs of concepts that try to push the bounds of efficiency. For example, they really should figure out distributed training.
when will they learn?
File: file.png (1.3 MB, 896x1152)
so basically I'm waiting on kohya or something for another attempt
Then Nvidia will stop making big bucks
don't even know where to start, looking at a pony diffusion v6 model that's lora but i want to run it locally and no idea where to fucking start.
looking at multiple tensors, and i have webui by auto1111 but idk where to go from here
File: file.png (1.39 MB, 896x1152)
someone else can make it unethical, I suppose?
They'll make more than enough big bucks selling 5090s and anything with raw compute. A million 3060s will never be good even if they could theoretically be used.
Mend the schism
Unite the threads
That is all
obviously you load pony diffusion as main model / checkpoint and the lora as lora in your UI

in comfyui its checkpoint->lora->sampler
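In Auto1111 terms, a sketch of the same setup: the Pony checkpoint goes in the checkpoint folder and is picked in the model dropdown, the LoRA goes in the LoRA folder and is activated from the prompt. The folder names and the `<lora:name:weight>` syntax below are how a stock Auto1111 install is assumed to behave; the file names and the 0.8 weight are hypothetical.

```python
from pathlib import PurePosixPath

WEBUI = PurePosixPath("stable-diffusion-webui")
CHECKPOINT_DIR = WEBUI / "models" / "Stable-diffusion"  # base model goes here
LORA_DIR = WEBUI / "models" / "Lora"                    # LoRA files go here

def prompt_with_lora(prompt, lora_file, weight=0.8):
    """Append an Auto1111-style LoRA tag using the file name without extension."""
    name = PurePosixPath(lora_file).stem
    return f"{prompt} <lora:{name}:{weight}>"

print(CHECKPOINT_DIR / "ponyDiffusionV6XL.safetensors")
print(LORA_DIR / "my_style.safetensors")
print(prompt_with_lora("1girl, sitting", "my_style.safetensors"))
```

A LoRA trained on Pony should be paired with a Pony-based checkpoint; loading it on top of an unrelated base model generally gives poor results.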
The SAI shills and autists are not worth being with.
it's already in the process >>101141573
File: s.jpg (74 KB, 896x1152)
you can't be the sole judge of ethics, that's dangerous to think you are some kind of god who know better than the others what's "good" and "bad"
Put 'em on the torrents.
Ethics are for the little people.
A fresh loaf of bread? Save me a loaf:
File: sy.jpg (85 KB, 896x1152)
society isn't ready for excellent ethics
There's already one floating around with 2022 and down models and loras
File: 8679305.jpg (67 KB, 1200x675)
oh no, I epic brainfarted
