Voice disorder classification using convolutional neural network based on deep transfer learning

Abstract

Voice disorders are very common in the global population. Many studies have applied machine learning to the identification and classification of voice disorders. As a data-driven approach, machine learning requires a large number of training samples. However, because medical data are sensitive and specialized, it is difficult to obtain enough samples for model training. To address this challenge, this paper proposes a pre-trained OpenL3-SVM transfer learning framework for the automatic recognition of multi-class voice disorders. The framework combines a pre-trained convolutional neural network, OpenL3, with a support vector machine (SVM) classifier. The Mel spectrum of the given voice signal is first extracted and then input into the OpenL3 network to obtain a high-level feature embedding. Because redundant and uninformative high-dimensional features make the model prone to overfitting, linear local tangent space alignment (LLTSA) is used to reduce the feature dimension. Finally, the reduced-dimension features are used to train the SVM for voice disorder classification. Fivefold cross-validation is used to verify the classification performance of OpenL3-SVM. The experimental results show that OpenL3-SVM can classify voice disorders automatically and effectively, and that its performance exceeds that of existing methods. With continued improvements in research, it is expected to serve as an auxiliary diagnostic tool for physicians in the future.
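
The last three stages of the framework (dimension reduction, SVM training, fivefold cross-validation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the embeddings are synthetic random vectors standing in for OpenL3 outputs (the 512-dimensional size, the three classes, and the sample counts are assumptions), and PCA is swapped in for LLTSA because scikit-learn does not provide LLTSA. The helper names make_synthetic_embeddings, build_classifier, and fivefold_accuracy are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_embeddings(n_per_class=40, n_classes=3, dim=512, seed=0):
    """Return random vectors that stand in for OpenL3 embeddings.

    ASSUMPTION: real features would come from a pre-trained OpenL3 model
    applied to Mel spectrograms; here each class is a Gaussian blob.
    """
    rng = np.random.default_rng(seed)
    centers = rng.normal(scale=2.0, size=(n_classes, dim))
    X = np.vstack([c + rng.normal(size=(n_per_class, dim)) for c in centers])
    y = np.repeat(np.arange(n_classes), n_per_class)
    return X, y


def build_classifier(n_components=20):
    """Scale, reduce dimension, then classify with an RBF-kernel SVM.

    ASSUMPTION: PCA replaces LLTSA, and n_components, C, and gamma are
    illustrative values rather than settings from the paper.
    """
    return make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components, random_state=0),
        SVC(kernel="rbf", C=1.0, gamma="scale"),
    )


def fivefold_accuracy(X, y, seed=0):
    """Return the accuracy of each of five stratified folds."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    return cross_val_score(build_classifier(), X, y, cv=cv, scoring="accuracy")


X, y = make_synthetic_embeddings()
scores = fivefold_accuracy(X, y)
print("per-fold accuracy:", scores)
print("mean accuracy:", scores.mean())
```

Placing the scaler and the dimension-reduction step inside the pipeline means they are refit on the training portion of each fold, so no information from the held-out fold leaks into the reduced features.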


Citation (APA)

Peng, X., Xu, H., Liu, J., Wang, J., & He, C. (2023). Voice disorder classification using convolutional neural network based on deep transfer learning. Scientific Reports, 13(1). https://doi.org/10.1038/s41598-023-34461-9
