Generating robust audio adversarial examples with temporal dependency

Abstract

Audio adversarial examples, imperceptible to humans, have been constructed to attack automatic speech recognition (ASR) systems. However, the adversarial examples generated by existing approaches usually contain noticeable noise, especially during periods of silence and pauses. Moreover, the added noise often breaks the temporal dependency property of the original audio, which makes the examples easy to detect with state-of-the-art defense mechanisms. In this paper, we propose a new Iterative Proportional Clipping (IPC) algorithm that preserves temporal dependency in audio to generate more robust adversarial examples. We are motivated by the observation that temporal dependency in audio has a significant effect on human perception. Following this observation, we use a proportional clipping strategy to reduce noise during low-intensity periods. Experimental results and a user study both suggest that the generated adversarial examples contain significantly less human-perceptible noise and resist defenses based on temporal structure.
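
The abstract does not spell out the clipping rule, so the following is only a minimal sketch of one plausible reading of "proportional clipping": each perturbation sample is bounded by a fraction of the magnitude of the original sample at the same position, so that silent or quiet stretches receive little or no added noise. The function name `proportional_clip`, the `ratio` parameter, and the sample values are hypothetical and are not taken from the paper.

```python
def proportional_clip(original, perturbation, ratio=0.1):
    """Clip each perturbation sample to +/- ratio * |original sample|.

    Assumption (not from the paper): the bound is per-sample and
    linear in the original amplitude.
    """
    if len(original) != len(perturbation):
        raise ValueError("original and perturbation must have equal length")
    clipped = []
    for x, d in zip(original, perturbation):
        bound = ratio * abs(x)  # quiet samples get a small bound
        clipped.append(max(-bound, min(bound, d)))
    return clipped


# Hypothetical waveform: a silent sample, a quiet one, and two loud ones.
original = [0.0, 0.01, 0.5, -0.8]
perturbation = [0.05, 0.05, 0.05, -0.2]

clipped = proportional_clip(original, perturbation, ratio=0.1)
adversarial = [x + d for x, d in zip(original, clipped)]
```

In an iterative attack, a step like this would presumably be applied after every gradient update, so the perturbation never exceeds the proportional bound; under this reading, a silent sample stays silent because its bound is zero.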

Cite

Zhang, H., Yan, Q., Zhou, P., & Liu, X. Y. (2020). Generating robust audio adversarial examples with temporal dependency. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2021-January, pp. 3167–3173). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2020/438
