The question of how to model spatiotemporal similarity between gestures arising in 3D motion capture data streams is of major significance in currently ongoing research in the domain of human communication. While qualitative perceptual analyses of co-speech gestures, which are manual gestures emerging spontaneously and unconsciously during face-to-face conversation, are feasible in a small-to-moderate scale, these analyses are inapplicable to larger scenarios due to the lack of efficient query processing techniques for spatiotemporal similarity search. In order to support qualitative analyses of co-speech gestures, we propose and investigate a simple yet effective distance-based similarity model that leverages the spatial and temporal characteristics of co-speech gestures and enables similarity search in 3D motion capture data streams in a query-by-example manner. Experiments on real conversational 3D motion capture data evidence the appropriateness of the proposal in terms of accuracy and efficiency.
Beecks, C., Hassani, M., Hinnell, J., Schüller, D., Brenger, B., Mittelberg, I., & Seidl, T. (2015). Spatiotemporal similarity search in 3D motion capture gesture streams. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9239, pp. 355–372). Springer Verlag. https://doi.org/10.1007/978-3-319-22363-6_19