Towards customized automatic segmentation of subtitles

17Citations
Citations of this article
18Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Automatic subtitling through speech recognition technology has become an important topic in recent years, where the effort has mostly centered on improving core speech technology to obtain better recognition results. However, subtitling quality also depends on other parameters aimed at favoring the readability and quick understanding of subtitles, like correct subtitle line segmentation. In this work, we present an approach to automate the segmentation of subtitles through machine learning techniques, allowing the creation of customized models adapted to the specific segmentation rules of subtitling companies. Support Vector Machines and Logistic Regression classifiers were trained over a reference corpus of subtitles manually created by professionals and used to segment the output of speech recognition engines. We describe the performance of both classifiers and discuss the merits of the approach for the automatic segmentation of subtitles.

Cite

CITATION STYLE

APA

Álvarez, A., Arzelus, H., & Etchegoyhen, T. (2014). Towards customized automatic segmentation of subtitles. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8854, 229–238. https://doi.org/10.1007/978-3-319-13623-3_24

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free