Public institutions now routinely publish videos containing important information on their web pages. However, people with hearing impairments have difficulty accessing content delivered this way, and manually transcribing those videos is time-consuming. Automatic Speech Recognition (ASR) systems can address this problem. In this work, we evaluate the performance of several ASR systems on videos from the Government of La Rioja, Spain. Our study shows that, among the systems evaluated, the Whisper medium model provides the best trade-off between accuracy and speed. Using this model, we have generated transcriptions of all the videos on the YouTube channel of the Government of La Rioja. In addition, we have created a tool that facilitates this task for other Spanish YouTube channels. This work can therefore be seen as a step towards improving the accessibility of the information and content produced by Spanish public administrations.
CITATION
Martín, M. S., Heras, J., & Mata, G. (2023). Automatic Generation of Subtitles for Videos of the Government of La Rioja. In Communications in Computer and Information Science (Vol. 1824 CCIS, pp. 393–402). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-34020-8_30