Most laryngectomees use the electrolarynx as their primary mode of verbal communication after total laryngectomy. However, the conventional electrolarynx produces a monotonous tone and requires manual control. This paper examines the potential of pattern recognition to support electrolarynx use by predicting fundamental frequency (F0) and voicing state (VS) from surface EMG of the infrahyoid and suprahyoid muscle groups, as well as from a respiratory trace. Surface EMG signals and the respiratory trace were collected from 10 able-bodied adult males (18-60 years old), who performed three kinds of vocal tasks: tones, legatos and phrases. Signal features were extracted from the EMG and respiratory trace, and a Support Vector Machine (SVM) classifier with radial basis function kernels was employed to predict F0 and voicing state. An average root mean squared error of 2.81 ± 0.6 semitones was achieved for the estimation of vocal frequency in the range of 90-360 Hz. An average cross-validation (CV) accuracy of 78.05 ± 6.3% was achieved for the prediction of voicing state from EMG, and 65.24 ± 7.8% from the respiratory trace. The proposed method is non-invasive, unlike earlier studies that relied on intramuscular electrodes, while still maintaining an accuracy above chance. Pattern classification of neck-muscle surface EMG therefore has merit in predicting fundamental frequency and voicing state during vocalization, which encourages further study of automatic pitch modulation for electrolarynges and silent speech interfaces. © 2014 Elsevier B.V. All rights reserved.
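To put the reported semitone error in context, the following minimal sketch shows one common way to express a frequency error in semitones and to compute a root mean squared error over paired estimates. It assumes the usual equal-tempered definition (12 semitones per octave); the abstract does not state the exact error formula used in the study, so this is an illustration rather than the authors' procedure. The function names and the example frequencies are hypothetical and are not data from the paper.

```python
import math


def semitone_error(f_pred_hz, f_true_hz):
    """Signed pitch difference in semitones (assumed definition: 12 * log2 of the ratio)."""
    return 12.0 * math.log2(f_pred_hz / f_true_hz)


def rmse_semitones(pred_hz, true_hz):
    """Root mean squared error, in semitones, over paired frequency estimates."""
    if len(pred_hz) != len(true_hz) or len(pred_hz) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [semitone_error(p, t) ** 2 for p, t in zip(pred_hz, true_hz)]
    return math.sqrt(sum(squared) / len(squared))


# Width of the 90-360 Hz range given in the abstract: a ratio of 4, i.e. two octaves.
range_semitones = semitone_error(360.0, 90.0)

# Hypothetical values for illustration only, not measurements from the study.
example_true = [110.0, 220.0, 330.0]
example_pred = [f * 2 ** (2 / 12) for f in example_true]  # each estimate two semitones sharp
example_rmse = rmse_semitones(example_pred, example_true)
```

Under this definition the 90-360 Hz range spans 24 semitones, so an error of 2.81 semitones is a little under one eighth of the full range.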
De Armas, W., Mamun, K. A., & Chau, T. (2014). Vocal frequency estimation and voicing state prediction with surface EMG pattern recognition. Speech Communication, 63–64, 15–26. https://doi.org/10.1016/j.specom.2014.04.004