Early, non-invasive detection of infant pathologies is a critical aspect of pediatric healthcare. Audio analysis of infant crying has emerged as a promising method to identify various health conditions without direct medical intervention. In this study, we present a machine learning model that applies a transformer-based architecture to audio spectrograms to classify infant cries into distinct pathological categories. By exploiting the transformer’s self-attention mechanism, the model bypasses the extensive preprocessing typically associated with audio data, thereby preserving the audio’s diagnostic features. When benchmarked against established machine learning and deep learning models, our approach achieved 98.69% accuracy, 98.73% precision, 98.71% recall, and an F1 score of 98.71%, surpassing both traditional machine learning and convolutional neural network models. This research provides a novel diagnostic tool that is scalable and efficient, and it opens avenues for improving pediatric care through early and accurate detection of pathologies.
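To make the spectrogram-plus-transformer idea concrete, the following is a minimal sketch of how an audio clip might be turned into a sequence of tokens for a self-attention model. It is not the authors' pipeline: the function names `log_spectrogram` and `to_patches`, the 16 kHz sample rate, the 400-sample window, the 160-sample hop, and the 16 x 16 patch size are all illustrative assumptions, and random noise stands in for a real cry recording.

```python
import numpy as np


def log_spectrogram(signal, n_fft=400, hop=160):
    """Log-magnitude spectrogram from a Hann-windowed short-time Fourier transform.

    n_fft and hop are assumed values, not taken from the paper.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))  # (n_frames, n_fft // 2 + 1)
    # Small constant avoids log(0); transpose to (freq_bins, n_frames).
    return 20.0 * np.log10(magnitude + 1e-10).T


def to_patches(spec, patch=16):
    """Cut a spectrogram into non-overlapping patch x patch tiles.

    Each tile is flattened into one vector, giving the token sequence a
    transformer encoder would attend over. Edge rows/columns that do not
    fill a whole tile are dropped (an assumption made for simplicity).
    """
    f = (spec.shape[0] // patch) * patch
    t = (spec.shape[1] // patch) * patch
    tiles = spec[:f, :t].reshape(f // patch, patch, t // patch, patch)
    tiles = tiles.transpose(0, 2, 1, 3)  # (freq_tiles, time_tiles, patch, patch)
    return tiles.reshape(-1, patch * patch)


# Synthetic stand-in for one second of audio at an assumed 16 kHz sample rate.
rng = np.random.default_rng(0)
sample_rate = 16000
cry = rng.standard_normal(sample_rate)

spec = log_spectrogram(cry)
tokens = to_patches(spec)
print(spec.shape, tokens.shape)
```

In a full model, each token would typically be linearly projected, combined with a positional embedding, and passed through transformer encoder layers before a classification head; those stages are omitted here because the abstract does not specify them.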
CITATION
Tami, M., Masri, S., Hasasneh, A., & Tadj, C. (2024). Transformer-Based Approach to Pathology Diagnosis Using Audio Spectrogram. Information (Switzerland), 15(5). https://doi.org/10.3390/info15050253