We evaluate convolutive nonnegative matrix factorization (NMF) in the context of automatic music transcription of polyphonic piano recordings and the associated problem of note isolation. Our aim is to find out whether the temporal continuity of piano notes is faithfully captured by the convolutional kernels and how performance scales with complexity. Systematic studies of this kind are lacking in the existing literature. We make use of established measures of accuracy and similarity. NMF dictionaries covering the piano's pitch range are learned from a given sample bank of isolated notes. The kernel size, also referred to as the patch size, is varied. Using a measure of performance advantage, we show that the improvements due to convolved bases do not justify the extra computational effort compared to standard NMF. In particular, this holds for the more realistic case in which the dictionary does not fully correspond to the mixture signal. Further pertinent conclusions are drawn as well.
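The supervised setting described above, in which a pre-learned dictionary is held fixed and only the note activations are estimated, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state the cost function or update rule, so the Kullback-Leibler divergence, the multiplicative updates, and all names (`reconstruct`, `estimate_activations`, the `(T, F, K)` layout of `W`) are assumptions made for illustration. With a kernel size of `T = 1` the model reduces to standard NMF.

```python
import numpy as np

def shift_right(X, t):
    """Shift the columns of X right by t frames, zero-filling on the left."""
    if t == 0:
        return X.copy()
    out = np.zeros_like(X)
    out[:, t:] = X[:, :-t]
    return out

def shift_left(X, t):
    """Shift the columns of X left by t frames, zero-filling on the right."""
    if t == 0:
        return X.copy()
    out = np.zeros_like(X)
    out[:, :-t] = X[:, t:]
    return out

def reconstruct(W, H):
    """Convolutive model: sum over t of W[t] @ (H shifted right by t).

    W has shape (T, F, K): T frames per kernel, F frequency bins, K notes.
    H has shape (K, N): one activation row per note, N time frames.
    """
    return sum(W[t] @ shift_right(H, t) for t in range(W.shape[0]))

def kl_divergence(V, L, eps=1e-12):
    """Generalized Kullback-Leibler divergence between V and its model L."""
    return float(np.sum(V * np.log((V + eps) / (L + eps)) - V + L))

def estimate_activations(V, W, n_iter=100, eps=1e-12, seed=0):
    """Estimate H for a fixed dictionary W (assumed KL multiplicative update)."""
    rng = np.random.default_rng(seed)
    T, F, K = W.shape
    H = rng.random((K, V.shape[1])) + eps
    ones = np.ones_like(V)
    for _ in range(n_iter):
        ratio = V / (reconstruct(W, H) + eps)
        # Numerator and denominator are accumulated over all kernel frames.
        num = sum(W[t].T @ shift_left(ratio, t) for t in range(T))
        den = sum(W[t].T @ shift_left(ones, t) for t in range(T)) + eps
        H *= num / den
    return H
```

In this sketch, `V` would be a magnitude spectrogram of the mixture and each `W[:, :, k]` a short spectrogram patch for one piano note. The cost per iteration grows roughly linearly with the kernel size `T`, which is the extra computational effort that the abstract weighs against the gain in accuracy.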
Gorlow, S., & Janer, J. (2015). Evaluation of the convolutional NMF for supervised polyphonic music transcription and note isolation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9237, pp. 437–445). Springer Verlag. https://doi.org/10.1007/978-3-319-22482-4_51