Speech-to-text applications’ accuracy in English language learners’ speech transcription


Abstract

Speech-to-text applications have great potential for helping students with English language comprehension and pronunciation practice. This study explores the functionality of five speech-to-text (STT) applications (Google Docs voice typing tool, Apple Dictation, Windows 10 Dictation, Dictation.io [a website service], and “Transcribe” [an app on iOS]) to measure how accurately they transcribe American English. The experiment involved 30 nonnative speakers, who were asked to perform four speaking tasks and whose speech was recorded and transcribed with these applications. The transcriptions produced by the applications were then compared with human-made transcriptions to evaluate each application's accuracy rate. The results revealed that the accuracy rate depends not only on the applications’ automatic speech recognition ability but also on the type of speech produced, as well as on the influence of each speaker's L1 on L2 (English). The study also offers examples of Japanese speakers’ pronunciation errors obtained through STT transcription, demonstrating great pedagogical potential for pronunciation practice and assessment in English classrooms.
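
The abstract does not state how the accuracy rate was calculated. As an illustration only, one common way to compare an STT transcript with a human-made reference is word-level edit distance, with accuracy taken as 1 minus the word error rate. The sketch below assumes that metric; the function names `word_errors` and `word_accuracy` and the sample sentences are hypothetical and do not come from the study.

```python
def word_errors(reference: str, hypothesis: str) -> tuple[int, int]:
    """Return (word-level edit distance, number of reference words).

    Assumption: simple lowercasing and whitespace tokenization; the
    study's own scoring rules are not given in the abstract.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()

    # Classic Levenshtein distance over words, keeping one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (word missing from the STT output)
                curr[j - 1] + 1,     # insertion (extra word in the STT output)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1], len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Accuracy as 1 - WER, floored at 0 (an assumed definition)."""
    errors, n_ref = word_errors(reference, hypothesis)
    if n_ref == 0:
        raise ValueError("reference transcript is empty")
    return max(0.0, 1.0 - errors / n_ref)


# Hypothetical example: one substituted word out of six.
human = "I think the light is right"
stt = "I think the right is right"
print(round(word_accuracy(human, stt), 3))
```

Under this definition, a single substituted word in a six-word reference gives an accuracy of 5/6. Substitutions surfaced this way are the kind of evidence the abstract describes as useful for pronunciation practice.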

Cite

Hirai, A., & Kovalyova, A. (2024). Speech-to-text applications’ accuracy in English language learners’ speech transcription. Language Learning and Technology, 28(1), 1–22. https://doi.org/10.64152/10125/73555
