Improving Spoofing Capability for End-to-end Any-to-many Voice Conversion

Hua Hua; Ziyi Chen; Yuxiang Zhang; Ming Li; Pengyuan Zhang

Conference Proceedings, Open Access

DDAM 2022 - Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia (2022) 93-100

DOI: 10.1145/3552466.3556532

5 Citations

17 Readers

Abstract

Audio deep synthesis techniques can now generate high-quality speech whose authenticity is difficult for humans to judge. Meanwhile, many anti-spoofing systems have been developed to capture artifacts in synthesized speech that are imperceptible to human hearing, so a continuously escalating race of 'attacking and defending' in voice deepfakes has begun. To further improve the probability of successfully deceiving anti-spoofing systems, we propose a fully end-to-end, any-to-many voice conversion method based on a non-autoregressive structure, together with two lightweight but effective post-processing strategies: silence replacement and global noise perturbation. Experimental results show that the proposed method outperforms current baselines in fooling several state-of-the-art anti-spoofing systems. It also achieves better naturalness and speaker similarity, and therefore high deception performance against human listeners.
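The abstract names the two post-processing strategies but does not specify them. The following is a minimal sketch of one plausible reading, not the authors' implementation: silence replacement is assumed to overwrite low-energy frames of the converted waveform with time-aligned frames from a natural reference recording, and global noise perturbation is assumed to add white Gaussian noise over the whole utterance at a target signal-to-noise ratio. All function names, the frame length, the energy threshold, and the SNR value are hypothetical.

```python
import numpy as np


def frame_energy_db(wave, frame_len):
    """Return per-frame energy in dB for non-overlapping frames."""
    n_frames = len(wave) // frame_len
    frames = wave[: n_frames * frame_len].reshape(n_frames, frame_len)
    energy = np.mean(frames ** 2, axis=1)
    return 10.0 * np.log10(energy + 1e-12)


def replace_silence(converted, reference, frame_len=400, threshold_db=-50.0):
    """Assumed silence replacement: frames of `converted` whose energy is
    below `threshold_db` are overwritten with the time-aligned frames of
    `reference` (assumed to be a natural recording of similar length)."""
    out = converted.copy()
    n = min(len(converted), len(reference))
    energy_db = frame_energy_db(converted[:n], frame_len)
    for i in np.flatnonzero(energy_db < threshold_db):
        start = i * frame_len
        out[start:start + frame_len] = reference[start:start + frame_len]
    return out


def add_global_noise(wave, snr_db=40.0, seed=0):
    """Assumed global noise perturbation: add white Gaussian noise to the
    entire utterance so that the result has roughly the requested SNR."""
    rng = np.random.default_rng(seed)
    signal_power = np.mean(wave ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=wave.shape)
    return wave + noise
```

Under these assumptions, a converted utterance would be passed through `replace_silence` and then `add_global_noise`. The order, threshold, and noise level are illustrative choices and would need to be set from the paper itself.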

Citation (APA)

Hua, H., Chen, Z., Zhang, Y., Li, M., & Zhang, P. (2022). Improving Spoofing Capability for End-to-end Any-to-many Voice Conversion. In DDAM 2022 - Proceedings of the 1st International Workshop on Deepfake Detection for Audio Multimedia (pp. 93–100). Association for Computing Machinery, Inc. https://doi.org/10.1145/3552466.3556532
