Fast text caption localization on video using visual rhythm

Citations: 11 · Mendeley readers: 51
Abstract

In this paper, a fast DCT-based algorithm is proposed to efficiently locate text captions embedded in specific areas of a video sequence using visual rhythm, which can be constructed quickly by sampling certain portions of a DC image sequence and accumulating the samples over time. The proposed approach is based on the observation that text captions carrying information suitable for indexing often appear in specific areas of the video frames; sampling strategies for the visual rhythm are derived from these areas. The method then uses a combination of contrast and temporal coherence information on the visual rhythm to detect text frames, such that each detected text frame represents a run of consecutive frames containing the same text string, significantly reducing the number of frames that must be examined for text localization in a video sequence. Finally, it exploits several characteristic properties of text captions to locate the captions within the detected frames.
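The core ideas in the abstract — building a visual rhythm by stacking one sampled line per DC frame, then grouping runs of near-identical samples into a single representative text frame — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the sampling path, the frame-difference measure, and the threshold `diff_thresh` are all simplifying assumptions.

```python
import numpy as np

def visual_rhythm(dc_frames, row):
    """Stack one horizontal sample line per DC frame into a 2-D image.

    dc_frames : sequence of 2-D arrays (DC images, one per frame)
    row       : index of the horizontal line to sample (e.g. a line
                through the caption band near the bottom of the frame)

    Returns an array of shape (num_frames, frame_width); each row is
    the sampled line of one frame, so time runs vertically.
    """
    return np.stack([f[row, :] for f in dc_frames])

def detect_text_runs(rhythm, diff_thresh=5.0):
    """Group consecutive frames whose sampled lines barely change.

    A run of near-identical rows in the visual rhythm suggests a
    static caption (temporal coherence); one representative frame per
    run is enough for the later text-localization step.
    Returns a list of (start_frame, end_frame) index pairs.
    """
    runs, start = [], 0
    for t in range(1, len(rhythm)):
        # Mean absolute difference between adjacent sampled lines.
        if np.mean(np.abs(rhythm[t].astype(float) - rhythm[t - 1])) > diff_thresh:
            runs.append((start, t - 1))
            start = t
    runs.append((start, len(rhythm) - 1))
    return runs

# Synthetic example: 40 tiny 8x8 DC frames; a bright caption band
# occupies row 6 of frames 10..29 only.
frames = [np.zeros((8, 8), dtype=np.uint8) for _ in range(40)]
for f in frames[10:30]:
    f[6, :] = 200

rhythm = visual_rhythm(frames, row=6)
runs = detect_text_runs(rhythm)
print(runs)  # three runs: before, during, and after the caption
```

Only the middle run, (10, 29), would then be passed to the per-frame caption localization stage, instead of all 20 frames it covers.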

Citation (APA)

Chun, S. S., Kim, H., Kim, J. R., Oh, S., & Sull, S. (2002). Fast text caption localization on video using visual rhythm. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2314, pp. 259–268). Springer Verlag. https://doi.org/10.1007/3-540-45925-1_24
