Transformer-Based Approach to Pathology Diagnosis Using Audio Spectrogram

0Citations
Citations of this article
29Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Early detection of infant pathologies by non-invasive means is a critical aspect of pediatric healthcare. Audio analysis of infant crying has emerged as a promising method to identify various health conditions without direct medical intervention. In this study, we present a cutting-edge machine learning model that employs audio spectrograms and transformer-based algorithms to classify infant crying into distinct pathological categories. Our innovative model bypasses the extensive preprocessing typically associated with audio data by exploiting the self-attention mechanisms of the transformer, thereby preserving the integrity of the audio’s diagnostic features. When benchmarked against established machine learning and deep learning models, our approach demonstrated a remarkable 98.69% accuracy, 98.73% precision, 98.71% recall, and an F1 score of 98.71%, surpassing the performance of both traditional machine learning and convolutional neural network models. This research not only provides a novel diagnostic tool that is scalable and efficient but also opens avenues for improving pediatric care through early and accurate detection of pathologies.

Cite

CITATION STYLE

APA

Tami, M., Masri, S., Hasasneh, A., & Tadj, C. (2024). Transformer-Based Approach to Pathology Diagnosis Using Audio Spectrogram. Information (Switzerland), 15(5). https://doi.org/10.3390/info15050253

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free