Abstract
We present a study of multi-modal freehand gesture recognition that relies on three sensory modalities: RGB images, depth data, and acceleration data from an inertial measurement unit (IMU) attached to the hand. Based on a new self-recorded dataset, we first establish that a deep Long Short-Term Memory (LSTM) network can correctly classify the individual data stream of each modality; notably, classifying the IMU stream alone already yields strong results. We then investigate two different multi-modal fusion strategies, since the literature does not agree on which strategy is preferable. Combining the modalities leads to better recognition performance. Most importantly, fusion considerably improves ahead-of-time classification, i.e., gesture class estimates produced before a sequence is complete, for classes that are difficult to classify on their own.
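The abstract does not detail the two fusion strategies that are compared. As a rough illustration only, one common option is late (decision-level) fusion, which combines the per-modality class scores of separately trained classifiers, e.g., by averaging their softmax probabilities. The sketch below uses made-up logits and class counts and is not the authors' implementation:

```python
import numpy as np

def softmax(logits):
    # numerically stable softmax over the last axis
    z = logits - np.max(logits, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def late_fusion(per_modality_logits):
    # decision-level fusion: average class probabilities across modalities
    probs = np.stack([softmax(l) for l in per_modality_logits])
    return probs.mean(axis=0)

# hypothetical per-modality logits for one gesture sequence, 4 classes
rgb   = np.array([2.0, 0.5, 0.1, -1.0])
depth = np.array([1.5, 1.0, 0.0, -0.5])
imu   = np.array([0.8, 2.2, -0.3, 0.1])

fused = late_fusion([rgb, depth, imu])
predicted_class = int(np.argmax(fused))  # → 0 for these example logits
```

The alternative family of strategies, early (feature-level) fusion, would instead concatenate the modality features before a single classifier; which variant works better is exactly the open question the paper addresses.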
Schak, M., & Gepperth, A. (2020). On Multi-modal Fusion for Freehand Gesture Recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12396 LNCS, pp. 862–873). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-61609-0_68