Combining per-frame and per-track cues for multi-person action recognition

Sameh Khamis; Vlad I. Morariu; Larry S. Davis

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012) 7572 LNCS(PART 1) 116-129

DOI: 10.1007/978-3-642-33718-5_9

20 Citations

58 Readers

Abstract

We propose a model to combine per-frame and per-track cues for action recognition. With multiple targets in a scene, our model simultaneously captures the natural harmony of an individual's action in a scene and the flow of actions of an individual in a video sequence, inferring valid tracks in the process. Our motivation is based on the unlikely discordance of an action in a structured scene, both at the track level and the frame level (e.g., a person dancing in a crowd of joggers). While we can utilize sampling approaches for inference in our model, we instead devise a global inference algorithm by decomposing the problem and solving the subproblems exactly and efficiently, recovering a globally optimal joint solution in several cases. Finally, we improve on the state-of-the-art action recognition results for two publicly available datasets. © 2012 Springer-Verlag.

Citation (APA)

Khamis, S., Morariu, V. I., & Davis, L. S. (2012). Combining per-frame and per-track cues for multi-person action recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7572 LNCS, pp. 116–129). https://doi.org/10.1007/978-3-642-33718-5_9
