Deep learning-based direction-of-arrival estimation for multiple speech sources using a small scale array

  • Zhang M
  • Pan X
  • Shen Y
  • et al.

Abstract

A high-resolution direction-of-arrival (DOA) estimation approach based on deep neural networks (DNNs) is presented for localizing multiple speech sources with a small-scale array. First, three invariant features are extracted from the time-frequency spectrum of the input signal: generalized cross-correlation (GCC) coefficients, GCC coefficients in mel-scaled subbands, and the combination of GCC coefficients with the logarithmic mel spectrogram. Then the DNN labels are designed to follow a Gaussian distribution, so that they resemble the spatial spectrum of multiple signal classification (MUSIC). Finally, DOAs are predicted by performing peak detection on the DNN outputs, where the maxima correspond to the speech signals of interest. In numerical simulations, the DNN-based DOA estimation method outperforms existing high-resolution beamforming techniques. Implemented with a four-element microphone array, the proposed framework can effectively localize multiple speech sources in an indoor environment.
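
The three stages named in the abstract (GCC features, Gaussian-shaped labels, peak detection on the network output) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract gives no parameters, so the PHAT weighting, the 1-degree circular angle grid, the label width `sigma_deg`, the threshold `min_height`, the assumption that the number of sources is known, and all function names here are hypothetical choices made for illustration. The DNN itself is omitted, and the label vector stands in for its output.

```python
import numpy as np

def gcc_phat(x1, x2, max_lag):
    """GCC coefficients for one microphone pair.
    PHAT weighting is an assumption; the abstract only says 'GCC'."""
    n = len(x1) + len(x2)                      # zero-pad to avoid circular wrap
    cross = np.fft.rfft(x1, n=n) * np.conj(np.fft.rfft(x2, n=n))
    cross /= np.abs(cross) + 1e-12             # keep phase, discard magnitude
    cc = np.fft.irfft(cross, n=n)
    # Reorder so the output covers lags -max_lag .. +max_lag.
    return np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))

def gaussian_labels(doas_deg, grid_deg, sigma_deg=5.0):
    """Target vector with one Gaussian bump per source (sigma is assumed)."""
    grid = np.asarray(grid_deg, dtype=float)
    label = np.zeros_like(grid)
    for doa in doas_deg:
        # Wrapped angular difference in [-180, 180), assuming a 360-degree grid.
        diff = (grid - doa + 180.0) % 360.0 - 180.0
        label = np.maximum(label, np.exp(-0.5 * (diff / sigma_deg) ** 2))
    return label

def pick_doas(output, grid_deg, num_sources, min_height=0.3):
    """Return the angles of the largest local maxima of a spectrum-like output."""
    out = np.asarray(output, dtype=float)
    # Local maxima on a circular grid, ignoring values below min_height.
    is_peak = (out > np.roll(out, 1)) & (out >= np.roll(out, -1)) & (out >= min_height)
    idx = np.flatnonzero(is_peak)
    idx = idx[np.argsort(out[idx])[::-1][:num_sources]]   # keep the strongest
    return np.sort(np.asarray(grid_deg)[idx])

# Usage: a label vector for two sources stands in for the DNN output.
grid = np.arange(0, 360)
pseudo_output = gaussian_labels([40, 200], grid)
print(pick_doas(pseudo_output, grid, num_sources=2))
```

Taking the element-wise maximum of the Gaussian bumps, rather than their sum, keeps every peak at height 1 even when two sources are close, which is one simple way to make the targets resemble a spatial spectrum with one peak per source.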

Cite

Zhang, M., Pan, X., Shen, Y., & Qiu, J. (2021). Deep learning-based direction-of-arrival estimation for multiple speech sources using a small scale array. The Journal of the Acoustical Society of America, 149(6), 3841–3850. https://doi.org/10.1121/10.0005127
