We explore the possibility of using word-level transcription to detect the phoneme mispronunciation tendencies of non-native English speakers (NNES). We focus on word-level rather than phoneme-level transcription because the former is readily accessible and mature. We define a phoneme mispronunciation tendency as the recurring imperfect pronunciation of a phoneme across different words. We use an Automatic Speech Recognition (ASR) service to generate alternative transcripts from audio recordings of speakers reading aloud. We build features based on the divergence between the audio transcriptions and the texts being read, as well as on the confidence of the audio transcriptions. We find that these features are informative for detecting phoneme mispronunciation tendencies.
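To make the feature idea concrete, the following is a minimal sketch of how divergence and confidence features might be aggregated per phoneme. It is not the authors' implementation. The toy lexicon `LEXICON`, the function names, the two feature definitions (`mismatch_rate` and `weighted_mismatch_rate`), and the sample transcripts and confidence values are all assumptions made for illustration. The ASR output is represented simply as a list of (transcript, confidence) pairs.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical toy pronunciation lexicon (word -> phonemes). A real system
# would use a full pronouncing dictionary; this one is only for illustration.
LEXICON = {
    "think": ["TH", "IH", "NG", "K"],
    "three": ["TH", "R", "IY"],
    "things": ["TH", "IH", "NG", "Z"],
    "red": ["R", "EH", "D"],
}


def matched_indices(ref_words, hyp_words):
    """Indices of reference words that align exactly with the hypothesis."""
    matcher = SequenceMatcher(a=ref_words, b=hyp_words, autojunk=False)
    matched = set()
    for block in matcher.get_matching_blocks():
        matched.update(range(block.a, block.a + block.size))
    return matched


def phoneme_features(reference, alternatives):
    """Aggregate word-level divergence into per-phoneme features.

    reference:    the text the speaker was asked to read.
    alternatives: list of (transcript, confidence) pairs from an ASR service.
    Both feature definitions below are assumptions, not the paper's.
    """
    ref_words = reference.lower().split()
    total = defaultdict(int)        # (word, alternative) pairs containing the phoneme
    missed = defaultdict(int)       # ... where the word was not transcribed as written
    w_total = defaultdict(float)    # the same counts, weighted by ASR confidence
    w_missed = defaultdict(float)

    for transcript, confidence in alternatives:
        matched = matched_indices(ref_words, transcript.lower().split())
        for i, word in enumerate(ref_words):
            # Words missing from the lexicon contribute no phoneme evidence.
            for phoneme in set(LEXICON.get(word, [])):
                total[phoneme] += 1
                w_total[phoneme] += confidence
                if i not in matched:
                    missed[phoneme] += 1
                    w_missed[phoneme] += confidence

    return {
        p: {
            "mismatch_rate": missed[p] / total[p],
            "weighted_mismatch_rate": (
                w_missed[p] / w_total[p] if w_total[p] > 0 else 0.0
            ),
        }
        for p in total
    }


# Made-up example: two alternative transcripts of one read-aloud sentence.
features = phoneme_features(
    "I think three things are red",
    [("I sink tree things are red", 0.6), ("I think tree things are red", 0.4)],
)
for phoneme in sorted(features):
    print(phoneme, features[phoneme])
```

In this sketch a phoneme that recurs in several mis-transcribed words accumulates a higher mismatch rate than one that appears only in correctly transcribed words, which mirrors the definition of a tendency as a recurring imperfection across different words.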
CITATION STYLE
Pu, S., Becker, L., & Kato, M. (2022). Automatic Identification of Non-native English Speaker’s Phoneme Mispronunciation Tendencies. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13356 LNCS, pp. 608–611). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-11647-6_126