Humans can decipher adversarial images

Zhenglong Zhou; Chaz Firestone

Journal Article, Open Access

Nature Communications (2019) 10(1)

DOI: 10.1038/s41467-019-08931-6

83 Citations

202 Readers

Abstract

Does the human mind resemble the machine-learning systems that mirror its performance? Convolutional neural networks (CNNs) have achieved human-level benchmarks in classifying novel images. These advances support technologies such as autonomous vehicles and machine diagnosis; but beyond this, they serve as candidate models for human vision itself. However, unlike humans, CNNs are “fooled” by adversarial examples—nonsense patterns that machines recognize as familiar objects, or seemingly irrelevant image perturbations that nevertheless alter the machine’s classification. Such bizarre behaviors challenge the promise of these new advances; but do human and machine judgments fundamentally diverge? Here, we show that human and machine classification of adversarial images are robustly related: In 8 experiments on 5 prominent and diverse adversarial imagesets, human subjects correctly anticipated the machine’s preferred label over relevant foils—even for images described as “totally unrecognizable to human eyes”. Human intuition may be a surprisingly reliable guide to machine (mis)classification—with consequences for minds and machines alike.

Cite

Zhou, Z., & Firestone, C. (2019). Humans can decipher adversarial images. Nature Communications, 10(1). https://doi.org/10.1038/s41467-019-08931-6
