We propose a deep neural network that captures latent temporal features suitable for temporally localizing actions in streaming videos. The network uses unsupervised generative models, namely autoencoders and conditional restricted Boltzmann machines, to model the temporal structure present in an action. Human motion is non-linear in nature and therefore requires a continuous temporal representation of motion, which is crucial for streaming videos. The generative ability helps predict features at future time steps, which gives an indication of how far an action has progressed at any instant. To accommodate M action classes, we train an autoencoder to separate the action spaces and learn a generative model per action space. The final layer accumulates statistics from each model and estimates the action class and the percentage of completion over a segment of frames. Experimental results show that this network provides the predictive and recognition capability required for action localization in streaming videos.
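The conditional restricted Boltzmann machine mentioned above can be sketched as follows. This is a minimal illustrative implementation, not the paper's actual model: the feature dimensions, model order, and the mean-field CD-1 training rule are assumptions chosen for clarity. A CRBM adds history-dependent ("dynamic") biases to a standard RBM, so the model can be driven by past frames alone to predict features at the next time step.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class CRBM:
    """Sketch of a conditional RBM: the visible and hidden biases are
    shifted by linear functions of the past `order` visible frames."""

    def __init__(self, n_vis, n_hid, order, lr=0.01):
        self.W = rng.normal(0, 0.01, (n_vis, n_hid))          # visible-hidden weights
        self.A = rng.normal(0, 0.01, (n_vis * order, n_vis))  # past -> visible (autoregressive)
        self.B = rng.normal(0, 0.01, (n_vis * order, n_hid))  # past -> hidden
        self.b_v = np.zeros(n_vis)
        self.b_h = np.zeros(n_hid)
        self.order = order
        self.lr = lr

    def _dyn_biases(self, past):
        # Dynamic biases: static bias plus a linear function of history.
        return self.b_v + past @ self.A, self.b_h + past @ self.B

    def cd1(self, v, past):
        """One contrastive-divergence (CD-1) update on a batch (rows = samples)."""
        bv, bh = self._dyn_biases(past)
        h_prob = sigmoid(v @ self.W + bh)
        h_samp = (rng.random(h_prob.shape) < h_prob).astype(float)
        v_recon = sigmoid(h_samp @ self.W.T + bv)     # mean-field reconstruction
        h_recon = sigmoid(v_recon @ self.W + bh)
        n = v.shape[0]
        self.W += self.lr * (v.T @ h_prob - v_recon.T @ h_recon) / n
        self.b_v += self.lr * (v - v_recon).mean(axis=0)
        self.b_h += self.lr * (h_prob - h_recon).mean(axis=0)
        self.A += self.lr * past.T @ (v - v_recon) / n
        self.B += self.lr * past.T @ (h_prob - h_recon) / n
        return float(np.mean((v - v_recon) ** 2))     # reconstruction error

    def predict(self, past):
        """Predict the next frame's features from history alone
        (one mean-field up-down pass starting from the dynamic visible bias)."""
        bv, bh = self._dyn_biases(past)
        v0 = sigmoid(bv)
        h = sigmoid(v0 @ self.W + bh)
        return sigmoid(h @ self.W.T + bv)
```

Comparing `predict(past)` against the actually observed frame gives the prediction-error signal that, per the abstract, indicates how close the streaming action is to completion under the generative model for each action class.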
Citation
Nair, B. M. (2016). Unsupervised deep networks for temporal localization of human actions in streaming videos. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10073 LNCS, pp. 143–155). Springer Verlag. https://doi.org/10.1007/978-3-319-50832-0_15