Semantic understanding of video is an important frontier in content-based retrieval. The research literature has given significant attention to the visual aspect of video; however, relatively little work directly uses audio content for video retrieval. This paper gives an overview of our current research directions in semantic video retrieval using audio content. We discuss the effectiveness of classifying audio into semantic categories by combining global and local audio features based on the frequency spectrum. Furthermore, we introduce two novel features, Frequency Spectrum Differentials (FSD) and Differential Swap Rate (DSR), both of which model the shape of the spectrum.
CITATION STYLE
Bakker, E. M., & Lew, M. S. (2002). Semantic video retrieval using audio analysis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2383, pp. 271–277). Springer Verlag. https://doi.org/10.1007/3-540-45479-9_29