Human action recognition from visual data is a popular topic in Computer Vision, with applications in a wide range of domains. State-of-the-art solutions often rely on deep-learning approaches based on RGB videos and pre-computed optical flow maps. Recently, projections onto 3D Gray-Code Kernels (GCKs) have been assessed as an alternative way of representing motion, as they efficiently capture space-time structures. In this work, we investigate the use of GCK pooling maps, which we call GCK-Maps, as input for addressing Human Action Recognition with CNNs. We provide an experimental comparison with RGB and optical flow in terms of accuracy, efficiency, and scene-bias dependency. Our results show that GCK-Maps generally represent a valuable alternative to optical flow and RGB frames, with a significant reduction of the computational burden.
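To make the idea of GCK-based motion representation concrete, the following is a minimal, hypothetical sketch of the pipeline the abstract describes: a grayscale video volume is filtered separably with 1D Walsh-Hadamard kernels (one family of Gray-Code Kernels) along the x, y, and t axes, and the absolute projection responses are temporally pooled into 2D maps that could serve as CNN input. The function names, the Sylvester construction of the kernels, and the use of temporal max-pooling are illustrative assumptions; the paper's exact pooling scheme and the efficient incremental computation of GCK projections are not reproduced here.

```python
import numpy as np

def walsh_hadamard_kernels(order):
    # Build a 1D Walsh-Hadamard basis of length 2**order via the
    # Sylvester construction; each row is one 1D kernel.
    H = np.array([[1.0]])
    for _ in range(order):
        H = np.block([[H, H], [H, -H]])
    return H

def gck_projection(video, kx, ky, kt):
    # Separable filtering of a (T, H, W) volume: convolve with the
    # temporal kernel kt along axis 0, then ky along rows, kx along columns.
    out = np.apply_along_axis(lambda v: np.convolve(v, kt, mode="same"), 0, video)
    out = np.apply_along_axis(lambda v: np.convolve(v, ky, mode="same"), 1, out)
    out = np.apply_along_axis(lambda v: np.convolve(v, kx, mode="same"), 2, out)
    return out

def gck_map(video, kernels):
    # Pool the absolute response of each non-DC temporal kernel over time,
    # producing one 2D map per kernel (illustrative max-pooling).
    maps = []
    for kt in kernels[1:]:  # skip the DC kernel (all ones)
        proj = gck_projection(video, kernels[0], kernels[0], kt)
        maps.append(np.abs(proj).max(axis=0))
    return np.stack(maps)  # shape: (num_kernels - 1, H, W)
```

For a clip of shape (T, H, W), this yields a small stack of 2D maps, which is what makes the representation cheap to feed to a standard 2D CNN compared with dense optical flow.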
Citation
Nicora, E., Pastore, V. P., & Noceti, N. (2023). GCK-Maps: A Scene Unbiased Representation for Efficient Human Action Recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 14233 LNCS, pp. 62–73). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-43148-7_6