Speech-to-text applications’ accuracy in English language learners’ speech transcription

Akiyo Hirai; Angelina Kovalyova

Journal Article (Open Access)

Language Learning and Technology (2024) 28(1) 1-22

DOI: 10.64152/10125/73555

6 Citations

22 Readers

Abstract

Speech-to-text applications have great potential for helping students with English language comprehension and pronunciation practice. This study explores the functionality of five speech-to-text (STT) applications (Google Docs voice typing tool, Apple Dictation, Windows 10 Dictation, Dictation.io [a website service], and “Transcribe” [an app on iOS]) to measure their transcription accuracy for American English. The experiment involved 30 nonnative speakers, who performed four speaking tasks; their speech was recorded and transcribed with these applications. The transcriptions produced by the applications were then compared with human-made transcriptions to evaluate each application's transcription accuracy rate. The results revealed that the accuracy rate depends not only on the applications’ automatic speech recognition ability but also on the type of speech produced and on the influence of each speaker's L1 on their L2 (English). The study also offers examples of Japanese speakers’ pronunciation errors identified through STT transcription, demonstrating great pedagogical potential for pronunciation practice and assessment in English classrooms.
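The abstract does not state how the accuracy rate was computed. As an illustration only, the comparison of an STT transcript with a human-made transcript could be sketched with a word-level edit distance, taking accuracy as one minus the word error rate. The function names, the tokenization rule, and the example sentences below are assumptions made for this sketch and are not taken from the study.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_edit_distance(ref, hyp):
    """Levenshtein distance between two lists of word tokens.

    Counts the minimum number of substitutions, insertions, and
    deletions needed to turn `ref` into `hyp`.
    """
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference, hypothesis):
    """Hypothetical accuracy score: 1 - WER, floored at 0.

    `reference` is the human-made transcript and `hypothesis` is the
    STT output. This is an assumed metric, not the study's own.
    """
    ref = tokenize(reference)
    hyp = tokenize(hypothesis)
    if not ref:
        raise ValueError("reference transcript is empty")
    errors = word_edit_distance(ref, hyp)
    return max(0.0, 1.0 - errors / len(ref))


# Invented example: one substituted word out of seven.
human = "I think the light is very bright"
stt = "I think the right is very bright"
print(round(word_accuracy(human, stt), 3))
```

Under this assumed metric, a single substituted word in a seven-word utterance yields an accuracy of 6/7. Substitutions of this kind are also where an L1-related pronunciation pattern might surface in the transcript.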

Cite

Hirai, A., & Kovalyova, A. (2024). Speech-to-text applications’ accuracy in English language learners’ speech transcription. Language Learning and Technology, 28(1), 1–22. https://doi.org/10.64152/10125/73555
