Ian J. Goodfellow, Jonathon Shlens, & Christian Szegedy (2015)
International Conference on Learning Representations.
URL: https://arxiv.org/abs/1412.6572
Abstract. The paper that turned adversarial examples from a curiosity into a research programme. Argues that adversarial examples arise from the locally linear behaviour of models in high-dimensional input spaces, where many tiny per-dimension perturbations add up to a large change in output, rather than from overfitting or any peculiarity of overparameterised neural networks. Introduces the Fast Gradient Sign Method (FGSM), a one-step attack that perturbs the input by epsilon times the sign of the gradient of the loss with respect to the input, and demonstrates it on MNIST, CIFAR-10 and ImageNet. The canonical panda-to-gibbon example originated here. FGSM remains the standard one-step adversarial baseline (sketched below).
Tags: adversarial safety robustness
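The attack is a single gradient step, x_adv = x + epsilon * sign(grad_x J(theta, x, y)). Below is a minimal PyTorch sketch, not code from the paper: the `model`, `image`, and `label` names are placeholders, and the epsilon value is illustrative (the paper's panda example uses 0.007 on GoogLeNet).

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, label, epsilon=0.007):
    """One-step FGSM: x_adv = x + epsilon * sign(grad_x loss(model(x), y))."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step in the direction that increases the loss, epsilon per input dimension.
    x_adv = image + epsilon * image.grad.sign()
    # Keep the perturbed image in the valid input range (assumed [0, 1] here).
    return x_adv.clamp(0.0, 1.0).detach()
```

The sign operation, rather than the raw gradient, is what ties the attack to the paper's linearity argument: it maximises the change in a linear model's output subject to an L-infinity bound of epsilon on the perturbation.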