Speaker change detection using variable segments for video indexing

Abstract

Video indexing based on shots obtained from visual features is useful for content-based video browsing but has had limited success in supporting semantic search of videos. Recent developments in speech recognition make it possible to sidestep many of the difficulties of detecting semantic meaning from visual features by operating directly on the verbal content. Language-based indexing motivates a new video segmentation technique based on speaker change detection. This paper improves existing speaker change detectors by introducing an extra preprocessing step that aligns the audio features with syllables. We investigate the benefits of such synchronization and propose a variable presegmentation scheme that uses both magnitude and frequency information to attain the alignment. The experimental results show that the quality of the extracted audio features improves, resulting in a better recall rate. © 2011 Springer-Verlag Berlin Heidelberg.
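
The abstract does not specify how the variable presegmentation is computed, so the following is only a rough illustration of the general idea and not the authors' method. It assumes short-time energy as the "magnitude" cue and zero-crossing rate as the "frequency" cue, and cuts the signal into variable-length segments wherever the voiced/unvoiced decision changes. The function names, the frame length, and both thresholds are hypothetical choices made for this sketch.

```python
import math


def frame_features(samples, frame_len):
    """Return (energy, zero_crossing_rate) for each non-overlapping frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Magnitude cue: mean squared amplitude of the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Frequency cue: fraction of adjacent sample pairs that change sign.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        feats.append((energy, crossings / (frame_len - 1)))
    return feats


def variable_segments(samples, frame_len=80, energy_ratio=0.1, zcr_max=0.3):
    """Split samples into variable-length (start, end, voiced) segments.

    A frame counts as voiced (a crude stand-in for a syllable nucleus) when
    its energy is at least energy_ratio times the peak frame energy and its
    zero-crossing rate is at most zcr_max. Both thresholds are assumptions.
    """
    feats = frame_features(samples, frame_len)
    if not feats:
        return []
    peak = max(e for e, _ in feats)
    voiced = [peak > 0 and e >= energy_ratio * peak and z <= zcr_max
              for e, z in feats]
    segments = []
    seg_start = 0
    for i in range(1, len(voiced)):
        # Close the current segment whenever the voiced decision flips.
        if voiced[i] != voiced[i - 1]:
            segments.append((seg_start * frame_len, i * frame_len, voiced[i - 1]))
            seg_start = i
    segments.append((seg_start * frame_len, len(voiced) * frame_len, voiced[-1]))
    return segments


# Synthetic example: silence, a low-frequency tone, then silence again.
demo_tone = [math.sin(2 * math.pi * n / 40) for n in range(240)]
demo_signal = [0.0] * 160 + demo_tone + [0.0] * 160
demo_segments = variable_segments(demo_signal, frame_len=80)
```

In a speaker change detector, features would then be extracted per variable segment instead of per fixed-length window, so that each feature vector covers roughly one syllable-like unit. Real speech would need smoothing and tuned thresholds that this sketch omits.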

Cite

Tam, K. Y., Lay, J., & Levy, D. (2011). Speaker change detection using variable segments for video indexing. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6523 LNCS, pp. 296–306). https://doi.org/10.1007/978-3-642-17832-0_28
