A Spectral View of Adversarially Robust Features
Zhang, Brian Hu
2018
While great progress has been made in image classification using machine learning, often achieving near-human or even superhuman accuracy on image classification tasks, recent studies have found that image classification models are vulnerable to adversarial attacks: special images crafted to fool the models into mislabeling the picture. In this work, we investigate the problem of creating an adversarially robust feature: a feature f whose value at any point x cannot be changed much by perturbing x slightly. We establish strong connections between adversarially robust features and a natural spectral property of the geometry of the dataset and metric of interest. This connection can be leveraged both to provide robust features and to provide a lower bound on the robustness of any function that has significant variance across the dataset. Finally, we provide empirical evidence that the adversarially robust features yielded via this spectral approach can be fruitfully leveraged to learn a robust (and accurate) model.
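The abstract does not spell out the construction, so the following is only an illustrative sketch of one common way to read a "spectral property of the geometry of the dataset": build a similarity graph over the data points under the chosen metric and take the second eigenvector of its graph Laplacian as a feature. The function name `spectral_feature`, the Gaussian similarity kernel, the Euclidean metric, and the bandwidth `sigma` are all assumptions made for illustration and are not taken from the work itself.

```python
import numpy as np

def spectral_feature(points, sigma=1.0):
    """Hypothetical helper: second-smallest Laplacian eigenpair of a similarity graph.

    Assumes a Euclidean metric and a Gaussian kernel; neither is stated in the abstract.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances between all data points.
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # Gaussian similarity weights, with self-loops removed.
    weights = np.exp(-(dists ** 2) / (2.0 * sigma ** 2))
    np.fill_diagonal(weights, 0.0)
    # Unnormalized graph Laplacian L = D - W.
    laplacian = np.diag(weights.sum(axis=1)) - weights
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    return eigenvalues[1], eigenvectors[:, 1]

# Toy data: two well-separated clusters in the plane.
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=0.0, scale=0.1, size=(20, 2))
cluster_b = rng.normal(loc=3.0, scale=0.1, size=(20, 2))
data = np.vstack([cluster_a, cluster_b])

lam2, feature = spectral_feature(data, sigma=1.0)
# Points in the same cluster receive nearly equal feature values, so a small
# perturbation that keeps a point inside its cluster barely changes the feature.
```

On this toy data the feature takes one sign on the first cluster and the opposite sign on the second, which gives an intuition for why a feature that varies slowly along the graph is hard to change with small perturbations.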