Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, & Björn Ommer (2022)
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 10674-10685.
DOI: https://doi.org/10.1109/cvpr52688.2022.01042
Abstract. Introduces Latent Diffusion Models (the architecture behind Stable Diffusion), which run the diffusion process in the compressed latent space of a pretrained autoencoder rather than in pixel space, dramatically reducing the computational cost of high-resolution image generation.
Tags: generative diffusion stable-diffusion latent-diffusion