We compare two approaches to the automatic detection of annotation errors in single-speaker read-speech corpora used for speech synthesis: anomaly-based and classification-based detection. The two approaches differ principally in their training data: the classification-based approach requires both correctly annotated and misannotated words, whereas the anomaly-based approach needs only correctly annotated words (plus a few misannotated words for validation). We show that the two approaches yield statistically comparable results when all available misannotated words are used during detector/classifier development. However, when fewer misannotated words are available, the anomaly detection framework clearly outperforms the classification-based approach. A final listening test confirmed the effectiveness of annotation error detection for improving the quality of synthetic speech.
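The core contrast in the abstract — an anomaly detector trained on correct words only versus a classifier that needs both classes — can be sketched with toy scores. This is a minimal illustration, not the authors' method: the per-word scores, the 3-sigma rule, and the midpoint threshold are all invented assumptions for the sketch.

```python
# Hypothetical illustration of the two detection setups; all scores,
# thresholds, and data below are invented for the sketch.
from statistics import mean, stdev

# One confidence-style score per word (hypothetical values): correctly
# annotated words cluster high, misannotated words score low.
correct_scores = [0.91, 0.88, 0.93, 0.90, 0.87, 0.92, 0.89, 0.94]
misannotated_scores = [0.41, 0.35, 0.52]  # scarce in practice

# --- Anomaly-based detection: trained on correct words only -----------
mu, sigma = mean(correct_scores), stdev(correct_scores)

def is_anomaly(score, k=3.0):
    """Flag a word whose score deviates > k sigma from the model of
    correctly annotated words (no misannotated training data needed)."""
    return abs(score - mu) > k * sigma

# --- Classification-based detection: needs BOTH classes to train ------
mid = (mean(correct_scores) + mean(misannotated_scores)) / 2.0

def is_misannotated(score):
    """Threshold halfway between the two class means."""
    return score < mid

print(all(is_anomaly(s) for s in misannotated_scores))       # → True
print(all(is_misannotated(s) for s in misannotated_scores))  # → True
```

With plentiful misannotated examples both detectors flag the same words; the practical difference, echoed in the abstract's result, is that only the anomaly detector can be built when misannotated training data are scarce.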
Citation:
Matoušek, J., & Tihelka, D. (2017). Annotation error detection: Anomaly detection vs. classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10458 LNAI, pp. 141–151). Springer Verlag. https://doi.org/10.1007/978-3-319-66429-3_13