Current research shows that detecting semantic concepts (e.g., animal, bus, person, dancing) in multimedia documents such as videos requires several types of complementary descriptors to achieve good results. In this work, we explore strategies for efficiently combining dozens of complementary content descriptors (or “experts”) through late fusion approaches for concept detection in multimedia documents. We explore two fusion approaches that share a common structure: both start with an expert-clustering stage, continue with an intra-cluster fusion and finish with an inter-cluster fusion; we also experiment with other state-of-the-art methods. The first fusion approach relies on a priori knowledge about the internals of each expert to group the available experts by similarity. The second approach automatically derives expert-similarity measures from the experts' outputs, groups the experts using agglomerative clustering, and then combines the results of this fusion with those from other methods. In the end, we show that an additional performance boost can be obtained by also considering the context of multimedia elements.
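The second approach described above can be illustrated with a minimal sketch: experts are represented by their score vectors over a common set of test samples, pairwise similarity is measured by Pearson correlation of outputs, similar experts are grouped by single-linkage agglomerative clustering, and fusion is a plain average within each cluster (intra-cluster) followed by an average across clusters (inter-cluster). The similarity measure, linkage, threshold, and averaging rule here are illustrative assumptions, not the exact choices of the chapter.

```python
def correlation(a, b):
    """Pearson correlation between two expert score vectors."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a) ** 0.5
    vb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (va * vb) if va and vb else 0.0

def agglomerate(experts, threshold=0.8):
    """Single-linkage agglomerative clustering of experts: repeatedly
    merge the two most similar clusters until no pair of clusters
    exceeds the (assumed) similarity threshold."""
    clusters = [[i] for i in range(len(experts))]

    def sim(c1, c2):  # single linkage: best pairwise similarity
        return max(correlation(experts[i], experts[j])
                   for i in c1 for j in c2)

    while len(clusters) > 1:
        a, b = max(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda p: sim(clusters[p[0]], clusters[p[1]]))
        if sim(clusters[a], clusters[b]) < threshold:
            break
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters)
                    if k not in (a, b)] + [merged]
    return clusters

def hierarchical_fusion(experts, threshold=0.8):
    """Intra-cluster fusion (average within each cluster of experts),
    then inter-cluster fusion (average of the cluster-level scores)."""
    clusters = agglomerate(experts, threshold)
    n = len(experts[0])
    cluster_scores = [
        [sum(experts[i][k] for i in c) / len(c) for k in range(n)]
        for c in clusters
    ]
    return [sum(s[k] for s in cluster_scores) / len(cluster_scores)
            for k in range(n)]
```

For example, two near-identical experts and one dissimilar expert should form two clusters, so the dissimilar expert keeps the same weight in the final fusion as the redundant pair combined, which is the point of fusing hierarchically rather than averaging all experts directly.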
Strat, S. T., Benoit, A., Lambert, P., Bredin, H., & Quénot, G. (2014). Hierarchical late fusion for concept detection in videos. In Advances in Computer Vision and Pattern Recognition (Vol. 64, pp. 53–77). Springer London. https://doi.org/10.1007/978-3-319-05696-8_3