Nearest-neighbor automatic sound annotation with a WordNet taxonomy


Abstract

Sound engineers need access to vast collections of sound effects for their film and video productions. Sound effects providers rely on text-retrieval techniques to give access to their collections. Currently, audio content is annotated manually, which is an arduous task. Automatic annotation methods, normally tuned to restricted domains such as musical instruments or limited sound-effects taxonomies, are not mature enough to label any possible sound in great detail. A general sound recognition tool would require, first, a taxonomy that represents the world and, second, thousands of classifiers, each specialized in distinguishing small details. We report experimental results on a general sound annotator. To tackle the taxonomy definition problem we use WordNet, a semantic network that organizes real-world knowledge. To overcome the need for a huge number of classifiers to distinguish many different sound classes, we use a nearest-neighbor classifier with a database of isolated sounds unambiguously linked to WordNet concepts. A concept prediction accuracy of 30% is achieved on a database of over 50,000 sounds and over 1,600 concepts. © Springer Science + Business Media, Inc. 2005.
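The nearest-neighbor labeling scheme described in the abstract can be sketched as follows. This is a minimal illustration only: the feature vectors, database entries, and synset-style concept names are invented assumptions, not the paper's actual features or data, and the paper's real system operates over tens of thousands of sounds.

```python
import math

# Hypothetical toy database of isolated sounds: each entry pairs a
# feature vector with the WordNet-style concepts it is linked to.
# All values and synset names below are illustrative assumptions.
SOUND_DB = [
    ([0.9, 0.1, 0.0], ["dog.n.01", "bark.n.04"]),
    ([0.1, 0.8, 0.1], ["door.n.01", "slam.n.02"]),
    ([0.0, 0.2, 0.9], ["rain.n.01", "water.n.01"]),
]

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def annotate(query, db=SOUND_DB):
    """1-NN annotation: return the concepts of the closest database sound."""
    _, concepts = min(db, key=lambda entry: euclidean(query, entry[0]))
    return concepts

# A query sound close to the first database entry inherits its concepts.
print(annotate([0.85, 0.15, 0.05]))
```

Because the labels come from the single nearest sound, no per-concept classifier is trained; the taxonomy coverage is whatever the annotated database covers, which is the trade-off the abstract describes.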

Citation (APA)

Cano, P., Koppenberger, M., Le Groux, S., Ricard, J., Wack, N., & Herrera, P. (2005). Nearest-neighbor automatic sound annotation with a WordNet taxonomy. Journal of Intelligent Information Systems, 24(2–3), 99–111. https://doi.org/10.1007/s10844-005-0318-4
