A hands-on walkthrough for crafting adversarial examples against a pre-trained Inception v3 neural network using TensorFlow. Covers the theory behind adversarial inputs as constrained optimization problems solved via projected gradient descent, then implements a basic targeted attack that fools the classifier into labeling a cat image as 'guacamole'. Extends the approach to synthesize rotation-robust adversarial examples by optimizing over an ensemble of stochastically transformed inputs, demonstrating that a single perturbation can remain adversarial across a range of rotation angles.
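As a rough illustration of the constrained-optimization idea summarized above, here is a minimal sketch of a targeted attack using projected gradient descent. It is not the post's TensorFlow code: it assumes a toy two-class linear softmax classifier in place of Inception v3, an L-infinity bound `epsilon` on the perturbation, and pixel values in [0, 1]. All names (`predict_probs`, `targeted_pgd`, `W`, `b`) are hypothetical and chosen for this example only.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def predict_probs(W, b, x):
    """Class probabilities of a toy linear classifier (assumed stand-in for a real network)."""
    logits = [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]
    return softmax(logits)

def targeted_pgd(W, b, x, target, epsilon=0.2, lr=0.1, steps=50):
    """Minimize -log P(target | x_hat) subject to |x_hat - x|_inf <= epsilon and x_hat in [0, 1]."""
    x_hat = list(x)
    for _ in range(steps):
        p = predict_probs(W, b, x_hat)
        # Gradient of the cross-entropy loss w.r.t. the input of a linear model:
        # dL/dx_i = sum_k (p_k - 1[k == target]) * W[k][i]
        grad = [
            sum((p[k] - (1.0 if k == target else 0.0)) * W[k][i] for k in range(len(W)))
            for i in range(len(x))
        ]
        # Gradient descent step on the loss.
        x_hat = [v - lr * g for v, g in zip(x_hat, grad)]
        # Projection step: clip into the epsilon-box around x, then into the valid range.
        x_hat = [min(max(v, x0 - epsilon), x0 + epsilon) for v, x0 in zip(x_hat, x)]
        x_hat = [min(max(v, 0.0), 1.0) for v in x_hat]
    return x_hat

# Toy setup: class 0 is predicted for x; the attack targets class 1.
W = [[2.0, -2.0], [-2.0, 2.0]]
b = [0.0, 0.0]
x = [0.6, 0.4]
x_adv = targeted_pgd(W, b, x, target=1)
```

In this toy setup the perturbed input stays within `epsilon` of the original in every coordinate, yet the classifier's most likely class switches to the target. A rotation-robust variant would, under the same assumptions, average the loss over randomly sampled transformations of `x_hat` before taking each gradient step.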

7m read time
From anishathalye.com
Table of contents
A Step-by-Step Guide to Synthesizing Adversarial Examples
Setup
Adversarial examples
Robust adversarial examples
