Abstract
Speech localization and enhancement involves sound source mapping and reconstruction from noisy recordings of speech mixtures with microphone arrays. Conventional beamforming methods suffer from low resolution, especially with a limited number of microphones. In practice, there are only a few sources compared to the possible directions-of-arrival (DOA). Hence, DOA estimation is formulated as a sparse signal reconstruction problem and solved with sparse Bayesian learning (SBL). SBL uses a hierarchical two-level Bayesian inference to reconstruct sparse estimates from a small set of observations. The first level derives the posterior probability of the complex source amplitudes from the data likelihood and the prior. The second level tunes the prior towards sparse solutions with hyperparameters which maximize the evidence, i.e., the data probability. The adaptive learning of the hyperparameters from the data auto-regularizes the inference problem towards sparse robust estimates. Simulations and experimental data demonstrate that SBL beamforming provides high-resolution DOA maps outperforming traditional methods especially for correlated or non-stationary signals. Specifically for speech signals, the high-resolution SBL reconstruction offers not only speech enhancement but effectively speech separation.
Cite
CITATION STYLE
Xenaki, A., Bünsow Boldt, J., & Græsbøll Christensen, M. (2018). Sound source localization and speech enhancement with sparse Bayesian learning beamforming. The Journal of the Acoustical Society of America, 143(6), 3912–3921. https://doi.org/10.1121/1.5042222
Register to see more suggestions
Mendeley helps you to discover research relevant for your work.