Aäron van den Oord, Nal Kalchbrenner, & Koray Kavukcuoglu (2016), References, Textbook of AI

Venue: International Conference on Machine Learning (ICML), 2016.

URL: https://arxiv.org/abs/1601.06759

Abstract. Introduces PixelRNN and PixelCNN, autoregressive models for image generation. PixelRNN scans an image row by row with an LSTM, factorising the joint distribution over pixels as a product of per-pixel categorical conditionals, each conditioned on the pixels that precede it in raster-scan order. PixelCNN replaces the recurrence with masked convolutions, so that all conditionals can be computed in parallel during training; sampling remains sequential, one pixel at a time. The models produce sharp samples and provide exact log-likelihoods, in contrast to GANs, which give up a tractable likelihood in exchange for sample quality. PixelRNN and PixelCNN were the dominant likelihood-based image-generation framework before diffusion models rose to prominence, and they remain pedagogically important as the simplest non-trivial autoregressive image models.
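
The masking idea can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names `causal_mask` and `masked_conv2d` are hypothetical, the image is single-channel, and the additional ordering over colour channels used in the paper is omitted. Mask type "A" hides the centre pixel (used in the first layer, so a pixel never sees itself), while type "B" keeps it (used in later layers).

```python
def causal_mask(kernel_size, mask_type):
    """Build a 0/1 mask that keeps only kernel positions above the centre row,
    or in the centre row to the left of the centre (raster-scan order).
    Type "B" additionally keeps the centre position itself."""
    if kernel_size % 2 != 1:
        raise ValueError("kernel_size must be odd")
    if mask_type not in ("A", "B"):
        raise ValueError("mask_type must be 'A' or 'B'")
    centre = kernel_size // 2
    mask = [[0] * kernel_size for _ in range(kernel_size)]
    for r in range(kernel_size):
        for c in range(kernel_size):
            above = r < centre
            left = r == centre and c < centre
            is_centre = r == centre and c == centre
            if above or left or (is_centre and mask_type == "B"):
                mask[r][c] = 1
    return mask


def masked_conv2d(image, weights, mask):
    """Zero-padded, same-size cross-correlation using weights * mask.
    `image` is a list of rows; `weights` and `mask` are k x k lists."""
    k = len(mask)
    pad = k // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for r in range(k):
                for c in range(k):
                    y, x = i + r - pad, j + c - pad
                    if 0 <= y < h and 0 <= x < w:
                        total += weights[r][c] * mask[r][c] * image[y][x]
            out[i][j] = total
    return out


# Toy 3 x 3 image and an all-ones kernel, chosen only for illustration.
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ones = [[1.0] * 3 for _ in range(3)]
out_a = masked_conv2d(image, ones, causal_mask(3, "A"))
```

With a type "A" mask, the output at a given position depends only on pixels strictly earlier in raster-scan order, so changing a pixel's own value leaves the output at that position unchanged. This is the property that allows every conditional to be evaluated in one parallel pass during training.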

Tags: generative-models vision

Cited in:

Chapter 14: Generative Models


Pixel Recurrent Neural Networks