Abstract
Speech-to-text applications have great potential for helping students with English language comprehension and pronunciation practice. This study examines five speech-to-text (STT) applications (Google Docs voice typing tool, Apple Dictation, Windows 10 Dictation, Dictation.io [a website service], and “Transcribe” [an app on iOS]) to measure how accurately they transcribe American English speech. The experiment involved 30 nonnative speakers, who performed four speaking tasks and whose speech was recorded and transcribed with these applications. The transcriptions produced by the applications were then compared with human-made transcriptions to evaluate each application's transcription accuracy rate. The results revealed that the accuracy rate depends not only on the applications’ automatic speech recognition ability but also on the type of speech produced and on the influence of each speaker's L1 on their L2 (English). The study also offers examples of Japanese speakers’ pronunciation errors identified through STT transcription, demonstrating great pedagogical potential for pronunciation practice and assessment in English classrooms.
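The comparison of application output with human-made transcriptions can be illustrated with a minimal sketch. The abstract does not specify how the accuracy rate was computed, so the metric below (word accuracy derived from word-level edit distance, a common choice in speech recognition evaluation) is an assumption, and the function names and sample sentences are hypothetical.

```python
def word_edit_distance(reference, hypothesis):
    """Levenshtein distance between two lists of words
    (substitutions, insertions, and deletions each cost 1)."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word missing from hypothesis
                            curr[j - 1] + 1,      # extra word in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(human_transcript, stt_transcript):
    """Assumed metric: 1 - (word errors / reference length), floored at 0.

    This is not necessarily the measure used in the study.
    """
    reference = human_transcript.lower().split()
    hypothesis = stt_transcript.lower().split()
    if not reference:
        raise ValueError("human transcript must contain at least one word")
    errors = word_edit_distance(reference, hypothesis)
    return max(0.0, 1.0 - errors / len(reference))


# Hypothetical example: /th/ and /r/ realized in ways the STT tool hears as "sink" and "light".
human = "I think it is right"
stt = "I sink it is light"
print(f"{word_accuracy(human, stt):.2f}")
```

In this invented example two of the five reference words are substituted, so the sketch reports a word accuracy of 0.60. Mismatched words of this kind are also what would surface candidate pronunciation errors for classroom feedback.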
Cite
Hirai, A., & Kovalyova, A. (2024). Speech-to-text applications’ accuracy in English language learners’ speech transcription. Language Learning and Technology, 28(1), 1–22. https://doi.org/10.64152/10125/73555