Speaker change detection using variable segments for video indexing

King Yiu Tam; Jose Lay; David Levy

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 6523 LNCS, Part 1, pp. 296–306 (2011)

DOI: 10.1007/978-3-642-17832-0_28

Abstract

Video indexing based on shots obtained from visual features is useful for content-based video browsing, but it is less successful at facilitating semantic search of videos. Meanwhile, recent developments in speech recognition make it possible to avoid many of the difficulties of detecting semantic meaning from visual features by operating directly on the verbal content. The use of language-based indexing inspires a new video segmentation technique based on speaker change detection. This paper improves existing speaker change detectors by introducing an extra preprocessing step that aligns the audio features with syllables. We investigate the benefits of such synchronization and propose a variable presegmentation scheme that uses both magnitude and frequency information to attain this alignment. The experimental results show that the quality of the extracted audio features is improved, resulting in a better recall rate. © 2011 Springer-Verlag Berlin Heidelberg.

Cite

Tam, K. Y., Lay, J., & Levy, D. (2011). Speaker change detection using variable segments for video indexing. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6523 LNCS, pp. 296–306). https://doi.org/10.1007/978-3-642-17832-0_28
