A Comprehensive Review on Machine Learning Approaches for Enhancing Human Speech Recognition

Maha Adnan Shanshool; Husam Ali Abdulmohsin

Article (open access)

Traitement du Signal

DOI: 10.18280/ts.400529

Abstract

As a fundamental element of human-computer interaction, speech recognition (the ability of software systems to identify and interpret human language) has attracted considerable attention in recent years. This review examines machine learning techniques used to improve speech recognition performance. It surveys the use of prominent datasets, such as LibriSpeech, TIMIT, and VoxForge, in speech recognition research and highlights their contribution to improving the accuracy of recognition systems. It also evaluates the effectiveness of several classification techniques, including deep neural networks (DNN), convolutional neural networks (CNN), support vector machines (SVM), and random forests (RF), in the context of voice recognition. Mel-Frequency Cepstral Coefficients (MFCC) are observed to often provide superior discriminative ability in human voice recognition experiments. The review aims to offer useful insights to researchers and practitioners in speech recognition and to inform future work in the field.

Citation (APA)

Shanshool, M. A., & Abdulmohsin, H. A. (2023). A Comprehensive Review on Machine Learning Approaches for Enhancing Human Speech Recognition. Traitement du Signal. International Information and Engineering Technology Association. https://doi.org/10.18280/ts.400529
