Prafulla Dhariwal & Alex Nichol (2021)
Advances in Neural Information Processing Systems 34.
URL: https://arxiv.org/abs/2105.05233
Abstract. The paper that decisively shifted image synthesis from GANs to diffusion models. It introduces architectural improvements (a reworked U-Net, adaptive group normalization (AdaGN), and multi-head attention at multiple resolutions) and classifier guidance, which uses gradients from a separate classifier trained on noisy images to steer the diffusion sampling process. The result: an FID of 4.59 on ImageNet 256×256, beating BigGAN-deep, and 7.72 on ImageNet 512×512; the models match BigGAN-deep with as few as 25 forward passes per sample. Classifier guidance was soon largely superseded by classifier-free guidance, but this paper established diffusion as the dominant generative paradigm for images.
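Note. To make the classifier-guidance idea concrete, here is a minimal NumPy sketch of a single guided sampling step: the mean of the reverse-step Gaussian is shifted by the guidance scale times the variance times the gradient of log p(y | x_t). Everything concrete in it is an assumption made for illustration, not the paper's code: the linear-softmax classifier (W, b) stands in for the noisy-image classifier, `mu` and `sigma2` stand in for the diffusion model's predicted mean and diagonal variance, and the function names are hypothetical.

```python
import numpy as np

def log_softmax(z):
    # Numerically stable log-softmax over a 1-D logit vector.
    z = z - z.max()
    return z - np.log(np.exp(z).sum())

def classifier_log_prob_grad(x, W, b, y):
    # Gradient of log p(y | x) w.r.t. x for a toy linear-softmax classifier.
    # Hypothetical stand-in for a classifier trained on noisy images.
    probs = np.exp(log_softmax(W @ x + b))
    # d/dx log softmax(Wx + b)[y] = W[y] - sum_k p_k * W[k]
    return W[y] - probs @ W

def guided_mean(mu, sigma2, grad, scale):
    # Shift the reverse-step mean by scale * (diagonal variance) * gradient.
    return mu + scale * sigma2 * grad

rng = np.random.default_rng(0)
dim, n_classes = 4, 3
W = rng.normal(size=(n_classes, dim))   # toy classifier weights (assumed)
b = np.zeros(n_classes)                 # toy classifier biases (assumed)

x_t = rng.normal(size=dim)              # current noisy sample
mu = 0.9 * x_t                          # placeholder for the model's predicted mean
sigma2 = np.full(dim, 0.1)              # placeholder diagonal variance
y = 1                                   # target class label

grad = classifier_log_prob_grad(x_t, W, b, y)
mu_guided = guided_mean(mu, sigma2, grad, scale=2.0)

# Draw x_{t-1} from the shifted Gaussian N(mu_guided, diag(sigma2)).
x_prev = mu_guided + np.sqrt(sigma2) * rng.normal(size=dim)
```

With `scale=0.0` the step reduces to ordinary unguided sampling; larger scales push samples harder toward the target class at the expense of diversity.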
Tags: generative-models diffusion
Cited in: