A Comprehensive Review on Machine Learning Approaches for Enhancing Human Speech Recognition

Abstract

As a fundamental element of human-computer interaction, speech recognition (the ability of software systems to identify and interpret human language) has attracted considerable attention in recent years. This review examines machine learning techniques used to improve speech recognition performance. It surveys the use of prominent datasets, such as LibriSpeech, TIMIT, and VoxForge, in speech recognition research and highlights their contributions to improving the accuracy of recognition systems. It also evaluates the efficacy of several classification techniques, including deep neural networks (DNN), convolutional neural networks (CNN), support vector machines (SVM), and random forests (RF), in the context of voice recognition. The review observes that Mel-Frequency Cepstral Coefficients (MFCC) often provide superior discriminative power in human voice recognition experiments. These findings are intended to offer useful insights to researchers and practitioners in speech recognition and to inform future work in the field.

Cite

Shanshool, M. A., & Abdulmohsin, H. A. (2023). A Comprehensive Review on Machine Learning Approaches for Enhancing Human Speech Recognition. Traitement Du Signal. International Information and Engineering Technology Association. https://doi.org/10.18280/ts.400529
