We present a novel approach for automatically learning models of temporal trajectories extracted from video data. Instead of representing trajectories as linearly time-normalised, fixed-length vectors, our approach uses the Dynamic Time Warp distance as a similarity measure, which captures the underlying ordered structure of variable-length temporal data while removing the non-linear warping of the time scale. We reformulate the structure learning problem as an optimal graph partitioning of the dataset, so that only Dynamic Time Warp similarity weights are exploited and no intermediate cluster centroid representations are needed. We extend the graph partitioning method, in particular the Normalised Cut model originally introduced for static image segmentation, to unsupervised clustering of temporal trajectories with fully automated model order selection. By computing a hierarchical average Dynamic Time Warp for each cluster, we learn warp-free trajectory models and recover the time warp profiles and structural variance in the data. We demonstrate the approach by modelling trajectories of continuous hand gestures and moving objects in an indoor environment.
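As an illustration of the similarity measure described above, the following is a minimal sketch of a textbook Dynamic Time Warp distance between two variable-length trajectories, together with a pairwise affinity matrix of the kind a graph-partitioning method could take as input. It is not the authors' implementation: the function names `dtw_distance` and `dtw_affinity`, the Euclidean local cost, the Gaussian kernel, and the `sigma` parameter are all assumptions made for the example, and the Normalised Cut step itself is not shown.

```python
import numpy as np


def dtw_distance(a, b):
    """Classic O(n*m) Dynamic Time Warp distance (hypothetical helper).

    `a` and `b` are sequences of points, possibly of different lengths.
    The local cost is the Euclidean distance (an assumption).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Treat 1-D inputs as sequences of scalar observations.
    if a.ndim == 1:
        a = a[:, None]
    if b.ndim == 1:
        b = b[:, None]
    n, m = len(a), len(b)
    # cost[i, j] = minimal accumulated cost aligning a[:i] with b[:j].
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # a advances alone
                                 cost[i, j - 1],      # b advances alone
                                 cost[i - 1, j - 1])  # both advance
    return cost[n, m]


def dtw_affinity(trajectories, sigma=1.0):
    """Pairwise similarity weights from DTW distances (hypothetical helper).

    The Gaussian kernel and `sigma` are assumptions for this sketch.
    """
    k = len(trajectories)
    dist = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            dist[i, j] = dist[j, i] = dtw_distance(trajectories[i],
                                                   trajectories[j])
    return np.exp(-(dist ** 2) / (2.0 * sigma ** 2))
```

For example, `dtw_distance([0, 1, 2], [0, 0, 1, 1, 2, 2])` is zero because the second sequence is a time-stretched copy of the first, which is the sense in which the measure ignores warping of the time scale.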
CITATION STYLE
Ng, J., & Gong, S. (2006). Computer Vision — ECCV 2002. (A. Heyden, G. Sparr, M. Nielsen, & P. Johansen, Eds.), Computer Vision — ECCV 2002 (Vol. 2353, pp. 397–401). Berlin, Heidelberg: Springer Berlin Heidelberg. https://doi.org/10.1007/3-540-47979-1