Deep learning has achieved major breakthroughs in the field of artificial intelligence. However, speech recognition in the time domain currently lacks robustness, and the spectrograms used for speech recognition in the frequency domain remain too complex. This paper therefore presents a Faster R-CNN-based target detection method that recognizes spectrograms for speech recognition in the time and frequency domains. The method attends only to the local regions of interest in the spectrogram (distinct voiceprints), which filters out high-frequency noise and improves performance. Experimental results show that the method achieves higher accuracy and robustness than existing methods and performs well even in a noisy factory environment.
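The spectrogram that such a detector takes as input can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no framing parameters, so the function name `log_spectrogram`, the frame length, the hop size, the Hann window, and the synthetic test tone are all assumptions chosen for illustration. A naive DFT is used to keep the example dependency-free; a practical pipeline would use an FFT library.

```python
import cmath
import math


def log_spectrogram(signal, frame_len=64, hop=32):
    """Return log-magnitude frames, each with frame_len // 2 + 1 frequency bins.

    frame_len and hop are illustrative assumptions, not values from the paper.
    """
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    n_frames = 1 + (len(signal) - frame_len) // hop
    spec = []
    for f in range(n_frames):
        frame = [signal[f * hop + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT for bin k (an FFT would be used in practice).
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            # Small offset avoids log10(0) on silent bins.
            row.append(20.0 * math.log10(abs(acc) + 1e-10))
        spec.append(row)
    return spec


# Synthetic input (assumption): 0.1 s of a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(800)]
spec = log_spectrogram(tone)

# Average each frequency bin over time and locate the strongest one.
n_bins = len(spec[0])
mean_per_bin = [sum(row[k] for row in spec) / len(spec) for k in range(n_bins)]
peak_bin = max(range(n_bins), key=lambda k: mean_per_bin[k])
peak_hz = peak_bin * sample_rate / 64
```

Each row of `spec` is one time frame and each column one frequency bin, so the result can be rendered as an image. A region-based detector such as Faster R-CNN would then be applied to that image to localize the high-energy voiceprint regions, a step this sketch does not cover.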
CITATION STYLE
Li, Y., Pi, S., & Xiao, N. (2019). Speech recognition method based on spectrogram. In Advances in Intelligent Systems and Computing (Vol. 856, pp. 889–897). Springer Verlag. https://doi.org/10.1007/978-3-030-00214-5_110