In this paper, we propose a new framework called independent deeply learned matrix analysis (IDLMA), which unifies deep neural network (DNN) source modeling and independence-based multichannel audio source separation. IDLMA exploits both pretrained DNN source models and statistical independence between sources: the time-frequency structure of each source is iteratively refined by a DNN, which in turn improves the estimation accuracy of the spatial demixing filters. As the source generative model, we introduce a complex heavy-tailed distribution to improve the separation performance. In addition, we address a semi-supervised situation in which a solo-recorded audio dataset can be prepared for only one source in the mixture. To cope with this limited-data problem, we propose a data augmentation method that adapts the DNN source models to the observed signal, which enables IDLMA to work even in the semi-supervised situation. Experiments are conducted on music signals in both the supervised and semi-supervised situations. The results confirm the validity of the proposed method in terms of separation accuracy.
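As a concrete illustration of the alternating scheme described above, the following is a minimal Python/NumPy sketch. It assumes per-source DNNs that map the magnitude spectrogram of the current source estimate to a time-frequency variance (power) model, and an iterative-projection-style update (as in ILRMA/AuxIVA) for the spatial demixing filters. The function name, the `dnn_models` callables, and the array shapes are illustrative placeholders, not the authors' implementation.

```python
import numpy as np

def idlma_sketch(X, dnn_models, n_iter=20, eps=1e-6):
    """Hedged sketch of IDLMA-style alternating updates (not the authors' code).

    X          : observed mixture STFT, shape (freq, time, channels).
    dnn_models : list of callables, one per source; each maps a magnitude
                 spectrogram (freq, time) to a nonnegative power spectrogram
                 of the same shape (placeholder for the pretrained DNNs).
    Returns the separated source STFTs, shape (freq, time, sources).
    """
    I, J, M = X.shape                                   # determined case: sources = channels
    W = np.tile(np.eye(M, dtype=complex), (I, 1, 1))    # demixing matrix per frequency bin

    for _ in range(n_iter):
        # Separation with the current demixing filters: y_ij = W_i x_ij.
        Y = np.einsum('inm,ijm->ijn', W, X)

        # Source-model step: each DNN refines the time-frequency variance
        # (power spectrogram) of its own source estimate.
        R = np.stack(
            [np.maximum(dnn_models[n](np.abs(Y[:, :, n])), eps) for n in range(M)],
            axis=2)

        # Spatial step: iterative-projection-style update of the demixing
        # filters using the variance-weighted covariance of the mixture.
        for i in range(I):
            for n in range(M):
                V = (X[i].T * (1.0 / R[i, :, n])) @ X[i].conj() / J
                w = np.linalg.solve(W[i] @ V, np.eye(M)[:, n])
                w /= np.sqrt(np.real(np.conj(w) @ V @ w)) + eps
                W[i, n, :] = np.conj(w)                 # n-th row of W_i is w^H

    return np.einsum('inm,ijm->ijn', W, X)
```

In practice, the variance models R would come from networks trained on solo recordings (or, in the semi-supervised case, adapted via data augmentation), and scale ambiguity would be resolved by a back-projection step; both are omitted here for brevity.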
CITATION STYLE
Makishima, N., Mogami, S., Takamune, N., Kitamura, D., Sumino, H., Takamichi, S., … Ono, N. (2019). Independent deeply learned matrix analysis for determined audio source separation. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 27(10), 1601–1615. https://doi.org/10.1109/TASLP.2019.2925450