>>106523962
it's getting there, or maybe it's already faster? I wouldn't know for sure, I got rid of all the pytorch shit on my machine, but I remember it was like ~7 seconds for 1024x1024 sdxl on a 4090. I'm not using lightning or anything.
[INFO]: stable-diffusion.cpp:1523 - sampling completed, taking 4.86s
[INFO]: stable-diffusion.cpp:1531 - generating 1 latent images completed, taking 4.92s
[INFO]: stable-diffusion.cpp:1534 - decoding 1 latents
[DEBUG]: ggml_extend.hpp:1148 - vae compute buffer size: 6656.00 MB(VRAM)
[DEBUG]: stable-diffusion.cpp:1127 - computing vae [mode: DECODE] graph completed, taking 0.66s
[INFO]: stable-diffusion.cpp:1544 - latent 1 decoded, taking 0.67s
[INFO]: stable-diffusion.cpp:1548 - decode_first_stage completed, taking 0.67s
[INFO]: stable-diffusion.cpp:1682 - txt2img completed in 5.67s
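for reference, the run was roughly something like this (standard stable-diffusion.cpp CLI flags; the model/vae paths, prompt, and step count here are placeholders, not my exact settings):
./sd -m sd_xl_base_1.0.safetensors --vae sdxl_vae.safetensors -p "your prompt here" -W 1024 -H 1024 --steps 20 -o output.png -v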