Segmentation of expiratory and inspiratory sounds in baby cry audio recordings using hidden Markov models

  • Aucouturier J
  • Nonaka Y
  • Katahira K
  • et al.

Abstract

The paper describes an application of machine learning techniques to identify expiratory and inspiratory phases in audio recordings of human baby cries. Crying episodes were recorded from 14 infants across four vocalization contexts during their first 12 months of age; recordings from three individuals were annotated manually to identify expiratory and inspiratory sounds and used as training examples to automatically segment the recordings of the other 11 individuals. The proposed algorithm uses a hidden Markov model architecture in which state likelihoods are estimated either with Gaussian mixture models or by converting the classification decisions of a support vector machine. The algorithm yields up to 95% classification precision (86% average), and it generalizes over different babies, different ages, and vocalization contexts. The technique offers an opportunity to quantify expiration duration, crying rate, and other time-related characteristics of baby crying for screening, diagnosis, and research purposes over large populations of infants.
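To make the architecture concrete, the sketch below shows the core decoding step such an approach relies on: Viterbi decoding of a two-state hidden Markov model (expiration vs. inspiration), where per-frame state log-likelihoods would come from a Gaussian mixture model or from converted SVM decisions. This is a minimal illustration with hypothetical probability values, not the authors' implementation.

```python
import numpy as np

def viterbi(log_emission, log_trans, log_init):
    """Most-likely state path given per-frame state log-likelihoods.

    log_emission: (T, S) array of per-frame log-likelihoods, e.g. from a
        GMM or from SVM decisions converted to likelihoods.
    log_trans: (S, S) array of log transition probabilities.
    log_init: (S,) array of log initial-state probabilities.
    """
    T, S = log_emission.shape
    delta = np.empty((T, S))          # best path score ending in each state
    psi = np.zeros((T, S), dtype=int) # backpointers
    delta[0] = log_init + log_emission[0]
    for t in range(1, T):
        # scores[i, j] = score of being in state i at t-1 then moving to j
        scores = delta[t - 1][:, None] + log_trans
        psi[t] = np.argmax(scores, axis=0)
        delta[t] = scores[psi[t], np.arange(S)] + log_emission[t]
    # Backtrack the best path
    path = np.empty(T, dtype=int)
    path[-1] = np.argmax(delta[-1])
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path

# Toy example: state 0 = expiration, state 1 = inspiration, "sticky"
# transitions that discourage rapid switching between phases.
log_trans = np.log(np.array([[0.9, 0.1],
                             [0.1, 0.9]]))
log_init = np.log(np.array([0.5, 0.5]))
# Hypothetical frame likelihoods: first 5 frames favor expiration,
# last 5 favor inspiration.
log_emission = np.log(np.array([[0.8, 0.2]] * 5 + [[0.2, 0.8]] * 5))
path = viterbi(log_emission, log_trans, log_init)
print(path)  # one switch from state 0 to state 1 at frame 5
```

The sticky transition matrix is what makes the HMM smooth noisy frame-level decisions into contiguous expiratory and inspiratory segments, which is the advantage of this architecture over classifying each frame independently.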

Citation (APA)

Aucouturier, J.-J., Nonaka, Y., Katahira, K., & Okanoya, K. (2011). Segmentation of expiratory and inspiratory sounds in baby cry audio recordings using hidden Markov models. The Journal of the Acoustical Society of America, 130(5), 2969–2977. https://doi.org/10.1121/1.3641377
