This paper addresses the problem of recognizing and localizing coherent activities of a group of people, called collective activities, in video. Related work has argued the benefits of capturing long-range and higher-order dependencies among video features for robust recognition. To this end, we formulate a new deep model, called Hierarchical Random Field (HiRF). HiRF models only hierarchical dependencies between model variables, which effectively amounts to modeling higher-order temporal dependencies of video features. We specify an efficient inference algorithm for HiRF that, in each iteration, uses linear programming to estimate the latent variables. Learning of HiRF parameters is specified within the max-margin framework. Our evaluation on the benchmark New Collective Activity and Collective Activity datasets demonstrates that HiRF yields superior recognition and localization compared to the state of the art. © 2014 Springer International Publishing.
CITATION STYLE
Amer, M. R., Lei, P., & Todorovic, S. (2014). HiRF: Hierarchical Random Field for collective activity recognition in videos. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8694 LNCS, pp. 572–585). Springer Verlag. https://doi.org/10.1007/978-3-319-10599-4_37