This paper addresses blind unmixing of multichannel speech recordings in the underdetermined, convolutive case. The power spectrogram of each source is modeled as a superposition of nonnegative rank-1 basic spectrograms, which leads to a Nonnegative Matrix Factorization (NMF) model for each source. Because the number of recording channels may be lower than the number of true sources (speakers), the problem can be underdetermined. Hence, any meaningful a priori information about the sources or the mixing operator can improve the results of blind separation. In our approach, we assume that the basic rank-1 power spectrograms are locally smooth in both the frequency and time domains. To enforce this local smoothness, we incorporate a Markov Random Field (MRF) model, in the form of a Gibbs prior, into the complete-data likelihood function. Simulations demonstrate that this approach considerably improves the separation results. © 2011 Springer-Verlag.
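The two modeling ideas in the abstract can be illustrated with a minimal sketch: a source power spectrogram built as a sum of nonnegative rank-1 spectrograms, and a Gibbs prior whose energy penalizes differences between neighboring time-frequency bins. This is not the paper's algorithm. The quadratic potential, the first-order neighborhood, the weight `alpha`, and all function names (`rank1`, `superpose`, `gibbs_energy`, `log_gibbs_prior`) are assumptions chosen for illustration, and the toy profiles `w`, `h`, and `w_rough` are made up.

```python
import math

def rank1(w, h):
    """Outer product w h^T: one nonnegative rank-1 basic spectrogram."""
    return [[wf * hn for hn in h] for wf in w]

def superpose(components):
    """Elementwise sum of rank-1 spectrograms (the NMF model of one source)."""
    rows, cols = len(components[0]), len(components[0][0])
    return [[sum(c[f][n] for c in components) for n in range(cols)]
            for f in range(rows)]

def gibbs_energy(v):
    """Assumed quadratic MRF energy: squared differences between
    first-order neighbours along frequency (rows) and time (columns)."""
    rows, cols = len(v), len(v[0])
    u = 0.0
    for f in range(rows):
        for n in range(cols):
            if f + 1 < rows:
                u += (v[f + 1][n] - v[f][n]) ** 2
            if n + 1 < cols:
                u += (v[f][n + 1] - v[f][n]) ** 2
    return u

def log_gibbs_prior(v, alpha=1.0):
    """Unnormalised log of the Gibbs prior: -alpha * U(v)."""
    return -alpha * gibbs_energy(v)

# Toy data (hypothetical): a smooth spectral profile w and temporal activation h.
w = [math.exp(-((f - 4) / 2.0) ** 2) for f in range(8)]
h = [math.exp(-((n - 5) / 3.0) ** 2) for n in range(10)]

# The same spectral profile with a deterministic zig-zag distortion.
w_rough = [wf * (1.5 if f % 2 else 0.5) for f, wf in enumerate(w)]

smooth = rank1(w, h)
rough = rank1(w_rough, h)
source = superpose([smooth, rough])  # a two-component source spectrogram

# The locally smooth component receives the higher (less negative) log prior.
print(log_gibbs_prior(smooth), log_gibbs_prior(rough))
```

In an estimation procedure, a term like `log_gibbs_prior` would be added to the log-likelihood so that smooth rank-1 components are favored; how the paper combines the prior with its update rules is not stated in the abstract.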
Zdunek, R. (2011). Convolutive nonnegative matrix factorization with Markov random field smoothing for blind unmixing of multichannel speech recordings. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7015 LNAI, pp. 25–32). https://doi.org/10.1007/978-3-642-25020-0_4