Camera-based character recognition applications equipped with a voice synthesizer help blind users read text in their surroundings. However, such applications currently on the market, as well as similar research prototypes, require the user's active reading actions, which hamper other activities. At ICCHP2014 we presented a different approach: the user can remain passive while the device actively finds useful text in the scene. A text tracking feature was introduced to avoid reading the same text more than once. This report presents an improved system with two key components, scene text detection and tracking, that can handle text in various languages including Japanese and Chinese and resolves scene analysis problems such as the merging of text lines. We employ the MSER (Maximally Stable Extremal Regions) algorithm to obtain better text images, and we have developed a new text validation filter. Technical challenges for future device design are also presented.
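The abstract does not give the details of the text validation filter, but filters of this kind typically accept or reject candidate regions (e.g. MSER output) on simple geometric evidence. The sketch below is a hypothetical illustration of that idea; the function name and all thresholds are assumptions for illustration, not the authors' actual parameters.

```python
# Hypothetical geometric validation filter for text candidate regions
# (e.g. MSER output). All thresholds are illustrative assumptions,
# not the parameters used in the paper.

def is_text_candidate(w, h, pixel_count,
                      min_size=8, max_size=300,
                      min_aspect=0.1, max_aspect=10.0,
                      min_fill=0.2, max_fill=0.95):
    """Accept a region only if its geometry is plausible for a character."""
    if w < min_size or h < min_size:      # too small to be a glyph
        return False
    if w > max_size or h > max_size:      # too large to be a glyph
        return False
    aspect = w / h                        # reject extreme elongation
    if not (min_aspect <= aspect <= max_aspect):
        return False
    fill = pixel_count / (w * h)          # stroke density inside the box
    return min_fill <= fill <= max_fill   # reject solid blobs and noise

# Example: a 20x30 region with 240 foreground pixels (fill = 0.4) passes
print(is_text_candidate(20, 30, 240))  # True
```

A real system would combine such geometric tests with appearance cues (stroke width, contrast) before passing regions to the tracker and the OCR stage.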
CITATION STYLE
Goto, H., & Liu, K. (2016). Scene text detection and tracking for wearable text-to-speech translation camera. In Lecture Notes in Computer Science (Vol. 9759, pp. 23–26). Springer. https://doi.org/10.1007/978-3-319-41267-2_4