This research explores the viability of multimodal fusion of linguistic and acoustic biomarkers in speech for identifying persons with probable Alzheimer’s dementia. To capture the effects of dementia on a person’s language and verbal abilities, a novel approach to disease detection was explored, based on visual analysis of spectrogram images extracted from patients’ interview recordings. We put forward three fusion methods that leverage recent advances in representation learning. The objective of the empirical study and ensuing discussion presented in this paper was threefold: 1) to examine the potential of state-of-the-art transformer-based architectures and transfer learning to assist disease diagnosis; 2) to map the problem of acoustic analysis into the realm of image processing by transforming spectrograms into images and employing pretrained deep neural networks, such as ResNet, to extract visual patterns; and 3) to investigate the interplay of multimodal biomarkers of Alzheimer’s dementia when fusing the learned representations of the different modalities. We present the results of independent evaluations of the unimodal methods against which the fusion methods were compared.
Krstev, I., Pavikjevikj, M., Toshevska, M., & Gievska, S. (2022). Multimodal Data Fusion for Automatic Detection of Alzheimer’s Disease. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13320 LNCS, pp. 79–94). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-06018-2_6