Abstract
Worldwide, heart disease is the leading cause of mortality. Cardiac auscultation, when conducted by a trained professional, is a non-invasive, cost-effective, and readily available method for the initial assessment of cardiac health. Automated heart sound analysis offers a promising and accessible approach to supporting cardiac diagnosis. This work introduces a novel method for classifying heart sounds as normal or abnormal by leveraging time-frequency representations. Our approach combines three distinct time-frequency representations—short-time Fourier transform (STFT), mel-scale spectrogram, and wavelet synchrosqueezed transform (WSST)—to create images that enhance classification performance. These images are used to train five convolutional neural networks (CNNs): AlexNet, VGG-16, ResNet50, a CNN specialized in STFT images, and our proposed CNN model. The method was trained and tested using three public heart sound datasets: PhysioNet/CinC Challenge 2016, CirCor DigiScope Phonocardiogram Dataset 2022, and another open database. While individual representations achieve maximum accuracy of ≈85.9%, combining STFT, mel, and WSST boosts accuracy to ≈99%. By integrating complementary time-frequency features, our approach demonstrates robust heart sound analysis, achieving consistent classification performance across diverse CNN architectures, thus ensuring reliability and generalizability.
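The abstract does not say how the three time-frequency representations are merged into one image. The following is a minimal sketch of one plausible reading, in which each representation is min-max normalized, resized to a common grid, and stacked as one channel of a three-channel image. The channel-stacking scheme, the function names (`stft_magnitude`, `normalize`, `resize_nearest`, `combine_representations`), the 128×128 target size, and the frame parameters are all assumptions for illustration, not the authors' pipeline. The mel and WSST inputs are random placeholder arrays, since computing them would need additional libraries.

```python
import numpy as np

def stft_magnitude(signal, frame_len=256, hop=64):
    """Log-magnitude STFT with a Hann window; rows are frequency bins, columns are frames."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.log1p(np.abs(np.fft.rfft(frames, axis=0)))

def normalize(tfr):
    """Min-max scale a 2-D representation to [0, 1]; a constant input maps to zeros."""
    lo, hi = tfr.min(), tfr.max()
    return (tfr - lo) / (hi - lo) if hi > lo else np.zeros_like(tfr, dtype=float)

def resize_nearest(tfr, shape):
    """Nearest-neighbour resize to `shape` (rows, cols) using index sampling."""
    rows = np.linspace(0, tfr.shape[0] - 1, shape[0]).round().astype(int)
    cols = np.linspace(0, tfr.shape[1] - 1, shape[1]).round().astype(int)
    return tfr[np.ix_(rows, cols)]

def combine_representations(tfrs, shape=(128, 128)):
    """Stack normalized, resized representations as image channels (assumed scheme)."""
    return np.stack([resize_nearest(normalize(t), shape) for t in tfrs], axis=-1)

# Synthetic 2-second tone at an assumed 2 kHz sampling rate (not real heart sound data).
fs = 2000
t = np.arange(2 * fs) / fs
signal = np.sin(2 * np.pi * 50 * t)

rng = np.random.default_rng(0)
stft_img = stft_magnitude(signal)
mel_placeholder = rng.random((64, stft_img.shape[1]))   # stand-in for a mel spectrogram
wsst_placeholder = rng.random((200, len(signal)))       # stand-in for a WSST magnitude

image = combine_representations([stft_img, mel_placeholder, wsst_placeholder])
print(image.shape)
```

An array built this way has the height × width × 3 layout that image CNNs such as AlexNet, VGG-16, and ResNet50 conventionally take as input, although the input size each network expects differs from the 128×128 used here.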
Cite
Orozco-Reyes, L., Alonso-Arévalo, M. A., García-Canseco, E., Ibarra-Hernández, R. F., & Conte-Galván, R. (2025). A Deep-Learning Approach to Heart Sound Classification Based on Combined Time-Frequency Representations. Technologies, 13(4). https://doi.org/10.3390/technologies13040147