We present an algorithm for automatically clustering tagged videos. Collections of tagged videos are commonplace; however, discovering video clusters within them is not trivial. Direct methods that operate on visual features ignore the regularly available and valuable source of tag information, while clustering videos solely on their tags is error-prone since tags are typically noisy. To address these problems, we develop a structured model that captures the interactions among visual features, video tags, and video clusters. We model tags from visual features and correct noisy tags by checking visual appearance consistency. Finally, videos are clustered using the refined tags as well as the visual features. We learn the clustering in a max-margin framework and demonstrate empirically that this algorithm produces more accurate clusterings than baseline methods based on tags, visual features, or both. Further, qualitative results show that the discovered clusters can reveal sub-categories and more specific instances of a given video category. © 2014 Springer International Publishing.
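The pipeline the abstract describes — refine noisy tags by enforcing visual-appearance consistency, then cluster on the refined tags together with visual features — can be illustrated with a deliberately simplified sketch. This is not the paper's max-margin structured model: the tag-refinement step here is a plain nearest-neighbour vote in visual-feature space, and the clustering is naive k-means on the concatenated representation; all function names and parameters are illustrative.

```python
import numpy as np

def refine_tags(features, tags, k=3):
    """Soften each video's noisy tag vector toward the average tags of its
    k visually nearest neighbours (a stand-in for the paper's learned
    appearance-consistency check)."""
    dists = np.linalg.norm(features[:, None] - features[None, :], axis=2)
    refined = np.empty_like(tags, dtype=float)
    for i in range(len(features)):
        nn = np.argsort(dists[i])[1:k + 1]  # skip the video itself
        refined[i] = (tags[i] + tags[nn].mean(axis=0)) / 2.0
    return refined

def cluster_videos(features, tags, n_clusters=2, iters=20, seed=0):
    """Naive k-means over [visual features | refined tags] — a simple
    proxy for the paper's max-margin structured clustering."""
    X = np.hstack([features, refine_tags(features, tags)])
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), n_clusters, replace=False)]
    for _ in range(iters):
        labels = np.argmin(
            np.linalg.norm(X[:, None] - centers[None, :], axis=2), axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):  # keep empty clusters at old center
                centers[c] = X[labels == c].mean(axis=0)
    return labels
```

In the paper itself, both the tag-correction and the cluster assignment are latent variables optimized jointly under a max-margin objective, rather than the two independent heuristic passes shown here.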
CITATION STYLE
Vahdat, A., Zhou, G. T., & Mori, G. (2014). Discovering video clusters from visual features and noisy tags. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8694 LNCS, pp. 526–539). Springer Verlag. https://doi.org/10.1007/978-3-319-10599-4_34