Nicholas Carlini & David Wagner (2017)

IEEE Symposium on Security and Privacy.

DOI: https://doi.org/10.1109/SP.2017.49

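Abstract. Introduces the Carlini-Wagner (C&W) attack, a family of optimisation-based adversarial attacks that minimise the $\ell_p$ norm of the perturbation (with variants for $\ell_0$, $\ell_2$ and $\ell_\infty$) subject to misclassification. The attack replaces the hard misclassification constraint with a surrogate loss based on the logit gap, weights that loss against the perturbation size using a constant chosen by binary search, and minimises the combined objective with a gradient-based optimiser (Adam). The C&W attack consistently found smaller perturbations than the baselines of the time, including FGSM, iterative gradient sign, JSMA and DeepFool. It showed that defensive distillation gave little real robustness, and in follow-up work it exposed further "broken" defences that had appeared to work only because their evaluators had used weaker attacks. C&W remains a standard reference white-box attack in the adversarial-robustness literature.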
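The logit-gap surrogate described in the abstract can be illustrated with a minimal sketch. The function names (`cw_margin`, `cw_objective`), the toy linear model and the default values of `c` and `kappa` are assumptions made for this illustration, not code from the paper, and the sketch only evaluates the targeted $\ell_2$ objective: it omits the optimiser, the binary search over `c` and the box constraint on pixel values.

```python
import numpy as np

def cw_margin(logits, target, kappa=0.0):
    """Logit-gap surrogate for a targeted attack:
    max(max_{i != target} Z_i - Z_target, -kappa)."""
    logits = np.asarray(logits, dtype=float)
    other = np.delete(logits, target)  # logits of every non-target class
    return max(other.max() - logits[target], -kappa)

def cw_objective(x, delta, target, model, c=1.0, kappa=0.0):
    """Squared L2 size of the perturbation plus c times the surrogate."""
    return float(np.dot(delta, delta)) + c * cw_margin(model(x + delta), target, kappa)

# Hypothetical two-class linear "network" whose logits are W @ x.
W = np.array([[1.0, 0.0],
              [0.0, 1.0]])
model = lambda x: W @ x

x = np.array([2.0, 0.0])  # classified as class 0; the attack targets class 1
print(cw_objective(x, np.zeros(2), target=1, model=model))
print(cw_objective(x, np.array([-1.0, 1.0]), target=1, model=model))
```

The surrogate is positive while some other class still beats the target and is clipped at `-kappa` once the target leads by that margin, so an optimiser stops being rewarded for pushing the logits further and spends its effort shrinking the perturbation instead. A larger `kappa` asks for a more confident misclassification.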

Tags: adversarial safety robustness

AI tools used: Claude (research, coding, text), ChatGPT (diagrams, images), Grammarly (editing).

Title: Towards Evaluating the Robustness of Neural Networks