PEDIVHANDI: Multimodal indexation and retrieval system for lecture videos

Nhu Van Nguyen; Jean Marc Ogier; Franck Charneau

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2013), Vol. 7725 LNCS (Part 2), pp. 382–393

DOI: 10.1007/978-3-642-37444-9_30

3 Citations

6 Readers

Abstract

Since the text in slides and the teacher's speech represent lecture content in complementary ways, lecture videos can be indexed and retrieved by a fully automatic, complete system based on multimodal analysis of speech and text. In this paper, we present the multimodal lecture content indexing approach used in the PEDIVHANDI project. We use the discretization of speech and changes in slide text to identify lecture slides in the video. We also propose a duplicate verification step to remove near-duplicate slides. After the Stroke Width Transform (SWT) text detector has located the text regions, a standard OCR engine is used for text recognition. Finally, a context-based spell check is proposed to correct the recognized words. Our system achieves a recognition precision of 71% and a recall of 57% on a corpus of 6 presentation videos with a total duration of 8 hours. © 2013 Springer-Verlag.
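The abstract reports word recognition quality as precision and recall. As a rough illustration of how such figures are commonly computed, the sketch below compares a list of recognized words against a reference transcription at the word level. It is not the authors' evaluation procedure: the function name `precision_recall`, the case-insensitive matching, and the sample word lists are all assumptions made for illustration.

```python
from collections import Counter


def precision_recall(recognized, reference):
    """Word-level precision and recall (hypothetical helper, not from the paper)."""
    # Assumption: matching is case-insensitive and ignores word order.
    rec_counts = Counter(w.lower() for w in recognized)
    ref_counts = Counter(w.lower() for w in reference)
    # Multiset intersection: a recognized word counts as correct only as
    # many times as it occurs in the reference.
    correct = sum((rec_counts & ref_counts).values())
    total_recognized = sum(rec_counts.values())
    total_reference = sum(ref_counts.values())
    precision = correct / total_recognized if total_recognized else 0.0
    recall = correct / total_reference if total_reference else 0.0
    return precision, recall


# Made-up sample data, not taken from the paper's corpus.
recognized = ["stroke", "width", "transfrom", "text"]
reference = ["stroke", "width", "transform", "text", "detector"]
precision, recall = precision_recall(recognized, reference)
```

In this made-up sample, three of the four recognized words appear in the reference, and three of the five reference words are recovered, so a misrecognized word lowers precision while a missed word lowers recall.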

Cite

APA

Van Nguyen, N., Ogier, J. M., & Charneau, F. (2013). PEDIVHANDI: Multimodal indexation and retrieval system for lecture videos. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7725 LNCS, pp. 382–393). https://doi.org/10.1007/978-3-642-37444-9_30
