
Extending the Folksonomies of using content-based Audio Analysis

by Elena Martínez, Oscar Celma, Mohamed Sordo
Sound and Music … ()


This paper presents an in-depth study of the social tagging mechanisms used in, an online community where users share and browse audio files by means of tags and content-based audio similarity search. We performed two analyses of the sound collection. The first examines how users tag sounds; it reveals several well-known problems of collaborative tagging systems (namely polysemy, synonymy, and the scarcity of existing annotations). Moreover, we show that more than 10% of the collection is scarcely annotated, with only one or two tags per sound, which hampers retrieval. The second analysis therefore focuses on enhancing the semantic annotations of these sounds by means of content-based audio similarity (autotagging). To “autotag” the sounds, we use a k-NN classifier that selects the available tags from the most similar sounds. Human assessment is performed to evaluate the perceived quality of the candidate tags. The results show that, for 77% of the sounds used, the annotations were correctly extended with the proposed tags derived from audio similarity.
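
The k-NN autotagging idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the two-dimensional feature vectors, the Euclidean distance, the function name `propose_tags`, and the parameters `k` and `min_votes` are all assumptions made for illustration; the abstract does not specify the audio features, the distance measure, or the rule used to select tags.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def propose_tags(query_features, query_tags, collection, k=3, min_votes=2):
    """Propose new tags for a sparsely annotated sound.

    collection: list of (feature_vector, set_of_tags) pairs for annotated sounds.
    k and min_votes are hypothetical parameters, not values from the paper.
    """
    # Take the k sounds whose features are closest to the query sound.
    neighbours = sorted(
        collection, key=lambda item: euclidean(query_features, item[0])
    )[:k]
    # Each neighbour votes once for every tag it carries.
    votes = Counter(tag for _, tags in neighbours for tag in set(tags))
    # Keep tags shared by enough neighbours that the query does not already have.
    return sorted(
        tag for tag, n in votes.items()
        if n >= min_votes and tag not in query_tags
    )


# Toy data: made-up 2-D "audio features" with made-up tags.
collection = [
    ((0.10, 0.20), {"rain", "water", "field-recording"}),
    ((0.20, 0.10), {"rain", "storm"}),
    ((0.15, 0.25), {"water", "rain", "drops"}),
    ((0.90, 0.80), {"guitar", "chord"}),
    ((0.85, 0.90), {"guitar", "acoustic"}),
]

print(propose_tags((0.12, 0.18), {"water"}, collection))  # ['rain']
```

In this toy setup, a sound tagged only "water" receives the candidate tag "rain" because all three of its nearest neighbours carry it. In the study, candidate tags of this kind were then judged by human assessors.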

Authors on Mendeley

  1. Mohamed Sordo
    Ph.D. Student
    Departament de Tecnologies de la Informació i la Comunicació, Universitat Pompeu Fabra
