Theory and evaluation of a Bayesian music structure extractor

Abstract

We introduce a new model for extracting classified structural segments, such as intro, verse, chorus, break and so forth, from recorded music. Our approach is to classify signal frames on the basis of their audio properties and then to agglomerate contiguous runs of similarly classified frames into texturally homogeneous (or 'self-similar') segments which inherit the classification of their constituent frames. Our work extends previous work on automatic structure extraction by addressing the classification problem with an unsupervised Bayesian clustering model, the parameters of which are estimated using a variant of the expectation maximisation (EM) algorithm that includes deterministic annealing to help avoid local optima. The model identifies and classifies all the segments in a song, not just the chorus or longest segment. We discuss the theory, implementation, and evaluation of the model, and test its performance against a ground truth of human judgements. Using an analogue of a precision-recall graph for segment boundaries, our results indicate an optimal trade-off point at approximately 80% precision for 80% recall. © 2005 Queen Mary, University of London.
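The agglomeration step described above, collapsing contiguous runs of identically classified frames into labelled segments, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the frame labels here are hard-coded stand-ins for the output of the Bayesian clustering model, and the function name `agglomerate` is hypothetical.

```python
def agglomerate(frame_labels):
    """Collapse contiguous runs of identical frame labels into
    (label, start, end) segments, with `end` exclusive.

    Each segment inherits the classification of its constituent frames.
    """
    segments = []
    start = 0
    for i in range(1, len(frame_labels) + 1):
        # A segment boundary occurs at the end of the sequence or
        # wherever the label changes between adjacent frames.
        if i == len(frame_labels) or frame_labels[i] != frame_labels[start]:
            segments.append((frame_labels[start], start, i))
            start = i
    return segments

# Toy per-frame labels, as might be produced by the clustering model:
labels = ["intro", "intro", "verse", "verse", "verse",
          "chorus", "chorus", "verse"]
print(agglomerate(labels))
# → [('intro', 0, 2), ('verse', 2, 5), ('chorus', 5, 7), ('verse', 7, 8)]
```

The boundaries of these segments are what the precision-recall evaluation against human ground truth would then score.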

APA

Abdallah, S., Noland, K., Sandler, M., Casey, M., & Rhodes, C. (2005). Theory and evaluation of a Bayesian music structure extractor. In ISMIR 2005 - 6th International Conference on Music Information Retrieval (pp. 420–425).
