- Abdallah S
- Plumbley M

IEEE Transactions on Neural Networks (2006) 17(1) 179-196

We investigate a data-driven approach to the analysis and transcription of polyphonic music, using a probabilistic model which is able to find sparse linear decompositions of a sequence of short-term Fourier spectra. The resulting system represents each input spectrum as a weighted sum of a small number of "atomic" spectra chosen from a larger dictionary; this dictionary is, in turn, learned from the data in such a way as to represent the given training set in an (information theoretically) efficient way. When exposed to examples of polyphonic music, most of the dictionary elements take on the spectral characteristics of individual notes in the music, so that the sparse decomposition can be used to identify the notes in a polyphonic mixture. Our approach differs from other methods of polyphonic analysis based on spectral decomposition by combining all of the following: a) a formulation in terms of an explicitly given probabilistic model, in which the process of estimating which notes are present corresponds naturally with the inference of latent variables in the model; b) a particularly simple generative model, motivated by very general considerations about efficient coding, that makes very few assumptions about the musical origins of the signals being processed; and c) the ability to learn a dictionary of atomic spectra (most of which converge to harmonic spectral profiles associated with specific notes) from polyphonic examples alone; no separate training on monophonic examples is required.
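
As a rough illustration of what "a weighted sum of a small number of atomic spectra" means, the sketch below decomposes a synthetic two-note spectrum over a small fixed dictionary. It is not the authors' probabilistic model or inference procedure: it swaps in a standard nonnegative L1-penalised least-squares solver (ISTA), and the dictionary is hand-built rather than learned. The names `harmonic_atom` and `sparse_decompose`, the bin indices, and the penalty `lam` are all assumptions made for the example.

```python
import numpy as np

def harmonic_atom(f0_bin, n_bins, n_harmonics=5):
    """Hypothetical 'note' spectrum: peaks at multiples of f0_bin, amplitude 1/h."""
    atom = np.zeros(n_bins)
    for h in range(1, n_harmonics + 1):
        k = f0_bin * h
        if k < n_bins:
            atom[k] = 1.0 / h
    return atom / np.linalg.norm(atom)  # unit-norm atoms

def sparse_decompose(x, D, lam=0.05, n_iter=500):
    """Nonnegative ISTA (a stand-in, not the paper's inference):
    minimise 0.5 * ||x - D s||^2 + lam * sum(s) subject to s >= 0."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1 / Lipschitz constant of the gradient
    s = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ s - x)
        # Gradient step followed by the nonnegative soft-threshold
        s = np.maximum(s - step * (grad + lam), 0.0)
    return s

# Hand-built dictionary of eight assumed "notes" (one column per atom).
n_bins = 128
f0_bins = [8, 9, 10, 11, 12, 13, 15, 17]
D = np.stack([harmonic_atom(f, n_bins) for f in f0_bins], axis=1)

# Synthetic "polyphonic" spectrum: two notes sounding together.
true_s = np.zeros(len(f0_bins))
true_s[[1, 4]] = [1.0, 0.7]
x = D @ true_s

s_hat = sparse_decompose(x, D)
active = np.flatnonzero(s_hat > 0.1)  # indices of atoms judged to be present
print("active atoms (f0 bins):", [f0_bins[i] for i in active])
print("estimated weights:", np.round(s_hat, 3))
```

In this toy setting the nonzero weights mark which dictionary atoms (notes) are present; the L1 penalty shrinks the recovered weights slightly below their true values. Learning the dictionary itself from polyphonic data, as the abstract describes, is outside what this sketch shows.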

- Learning overcomplete dictionaries
- Polyphonic music
- Probabilistic modeling
- Redundancy reduction
- Sparse factorial coding
- Unsupervised learning
