Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, & Björn Ommer (2022)

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 10674-10685.

DOI: https://doi.org/10.1109/cvpr52688.2022.01042

Abstract. Introduces Latent Diffusion Models (the architecture behind Stable Diffusion), which run the diffusion process in the compressed latent space of a pretrained autoencoder rather than in pixel space, dramatically reducing the computational cost of high-resolution image generation.

Tags: generative diffusion stable-diffusion latent-diffusion

AI tools used: Claude (research, coding, text), ChatGPT (diagrams, images), Grammarly (editing).

High-Resolution Image Synthesis with Latent Diffusion Models