This study analyzes voice and speech recordings for the task of Parkinson’s disease detection. The voice modality corresponds to sustained phonation of /a/, and the speech modality to a short sentence in the Lithuanian language. Diverse information is extracted from the recordings using 22 well-known audio feature sets. A random forest is used as the learner, both for individual feature sets and for decision-level fusion. Essentia descriptors were found to be the best individual feature set, achieving an equal error rate of 16.3% for voice and 13.3% for speech. Fusion of feature sets and modalities improved detection, achieving an equal error rate of 10.8%. Variable importance in the fusion revealed the speech modality to be more important than voice.
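The results above are reported as equal error rates (EER), the operating point at which the false positive rate and the false negative rate coincide. As a minimal sketch of how such a figure can be approximated from detector scores, assuming binary labels (1 for patient, 0 for healthy control) and higher scores meaning "more likely patient", the function below scans candidate thresholds. The function name `equal_error_rate` and the threshold-scanning approach are illustrative assumptions, not the evaluation procedure used in the study.

```python
def equal_error_rate(labels, scores):
    """Approximate the EER by scanning thresholds.

    Assumption: labels are 0/1 and a score >= threshold means "predicted 1".
    Returns the mean of FPR and FNR at the threshold where they are closest.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one example of each class")

    # Candidate thresholds: every observed score, plus one above the maximum
    # so that the "reject everything" operating point is also considered.
    candidates = sorted(set(scores)) + [max(scores) + 1.0]

    best_gap, best_eer = None, None
    for t in candidates:
        fpr = sum(s >= t for s in negatives) / len(negatives)  # false alarms
        fnr = sum(s < t for s in positives) / len(positives)   # misses
        gap = abs(fpr - fnr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (fpr + fnr) / 2
    return best_eer
```

With a finite test set the two error rates rarely cross exactly, so this sketch returns their average at the closest threshold; interpolating along the ROC curve is a common alternative.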
Citation
Vaiciukynas, E., Verikas, A., Gelzinis, A., Bacauskiene, M., Vaskevicius, K., Uloza, V., … Ciceliene, J. (2016). Fusing various audio feature sets for detection of Parkinson’s disease from sustained voice and speech recordings. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9811 LNCS, pp. 328–337). Springer Verlag. https://doi.org/10.1007/978-3-319-43958-7_39