We present spatio-temporal feature descriptors that can be inferred from video and used as building blocks in action recognition systems. They capture the evolution of "elementary action elements" under a set of assumptions on the image-formation model and are designed to be insensitive to nuisance variability (absolute position, contrast) while retaining discriminative statistics due to the fine-scale motion and the local shape in compact regions of the image. Despite their simplicity, these descriptors, used in conjunction with basic classifiers, attain state-of-the-art performance in the recognition of actions on benchmark datasets. © 2010 Springer-Verlag.
Raptis, M., & Soatto, S. (2010). Tracklet descriptors for action modeling and video analysis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6311 LNCS, pp. 577–590). Springer Verlag. https://doi.org/10.1007/978-3-642-15549-9_42