This paper presents a novel framework, based on maximum likelihood, for training models to recognise simple spatial-motion events, such as those described by the verbs pick up, put down, push, pull, drop, and throw, and for classifying novel observations into previously trained classes. The model that we employ does not presuppose prior recognition or tracking of 3D object pose, shape, or identity. We describe our general framework for using maximum-likelihood techniques for visual event classification, the details of the generative model that we use to characterise observations as instances of event types, and the implemented computational techniques used to support training and classification for this generative model. We conclude by illustrating the operation of our implementation on a small example.
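The train-then-classify pattern summarised above can be illustrated with a minimal sketch. The abstract does not specify the generative model, so the sketch substitutes an independent Gaussian per feature as a stand-in; the fixed-length feature vectors and the names `fit_gaussian`, `log_likelihood`, `train`, and `classify` are hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
# Assumed stand-in model: per-feature means and variances (diagonal Gaussian).
Model = Tuple[List[float], List[float]]


def fit_gaussian(samples: List[Vector], min_var: float = 1e-6) -> Model:
    """Maximum-likelihood estimate of per-feature mean and variance."""
    n = len(samples)
    dim = len(samples[0])
    means = [sum(s[d] for s in samples) / n for d in range(dim)]
    # The ML variance divides by n; a small floor avoids division by zero.
    variances = [
        max(sum((s[d] - means[d]) ** 2 for s in samples) / n, min_var)
        for d in range(dim)
    ]
    return means, variances


def log_likelihood(x: Vector, model: Model) -> float:
    """Log-density of x under the diagonal Gaussian."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, means, variances)
    )


def train(labelled: Dict[str, List[Vector]]) -> Dict[str, Model]:
    """Fit one model per event class from its training observations."""
    return {label: fit_gaussian(samples) for label, samples in labelled.items()}


def classify(x: Vector, models: Dict[str, Model]) -> str:
    """Assign x to the class whose model gives it the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(x, models[label]))
```

Calling `train` on a dictionary that maps each verb to its example feature vectors yields one model per class, and `classify` then labels a novel observation with the best-scoring class. Any richer generative model could replace the Gaussian without changing this outer structure.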
CITATION STYLE
Siskind, J. M., & Morris, Q. (1996). A maximum-likelihood approach to visual event classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1065, pp. 347–360). Springer Verlag. https://doi.org/10.1007/3-540-61123-1_152