Structural and semantic modeling of audio for content-based querying and browsing

Abstract

A typical content-based audio management system deals with three aspects: audio segmentation and classification, audio analysis, and content-based retrieval of audio. In this paper, we integrate these three aspects into a single framework and propose an efficient method for flexible querying and browsing of auditory data. More specifically, we use two robust feature sets, MPEG-7 Audio Spectrum Flatness (ASF) and Mel Frequency Cepstral Coefficients (MFCC), as the underlying features in order to improve content-based retrieval accuracy, since each feature has advantages for distinct types of audio (e.g., music and speech). The proposed system provides a wide range of ways to query and browse audio data by content, such as querying and browsing for a chorus section or sound effects, and query-by-example. In addition, clients can express their queries as point, range, and k-nearest neighbor queries, which are particularly significant in the multimedia domain. © Springer-Verlag Berlin Heidelberg 2006.
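The three query forms named in the abstract can be illustrated with a minimal sketch over fixed-length feature vectors. This is not the paper's implementation: the Euclidean distance, the linear scan, the function names, and the toy two-dimensional vectors are all assumptions made for illustration, and a real system would store ASF or MFCC features and would likely use an index structure rather than scanning every clip.

```python
import math

def euclidean(a, b):
    # Distance measure is an assumption; the abstract does not name one.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def point_query(index, q, tol=1e-9):
    # Point query: clips whose feature vector matches q (within a tolerance).
    return [name for name, v in index.items() if euclidean(v, q) <= tol]

def range_query(index, q, radius):
    # Range query: all clips within `radius` of q, returned in name order.
    return sorted(name for name, v in index.items() if euclidean(v, q) <= radius)

def knn_query(index, q, k):
    # k-nearest neighbor query: the k clips closest to q, nearest first.
    ranked = sorted(index.items(), key=lambda item: euclidean(item[1], q))
    return [name for name, _ in ranked[:k]]

# Hypothetical toy index: clip name -> feature vector (stand-in for ASF/MFCC).
index = {
    "speech_clip": [0.0, 0.0],
    "music_clip": [3.0, 4.0],
    "effect_clip": [6.0, 8.0],
}

if __name__ == "__main__":
    print(point_query(index, [3.0, 4.0]))
    print(range_query(index, [0.0, 0.0], 5.0))
    print(knn_query(index, [0.0, 0.0], 2))
```

In a query-by-example setting, `q` would be the feature vector extracted from the example clip, and the range or k-nearest neighbor form would return the clips most similar to it.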

Cite

Sert, M., Baykal, B., & Yazici, A. (2006). Structural and semantic modeling of audio for content-based querying and browsing. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4027 LNAI, pp. 319–330). Springer Verlag. https://doi.org/10.1007/11766254_27
