NoisyActions2M: A Multimedia Dataset for Video Understanding from Noisy Labels


Abstract

Deep learning has shown remarkable progress on a wide range of problems. However, training such models effectively requires large-scale datasets, and obtaining annotations for them can be challenging and costly. In this work, we explore freely available, user-generated labels from web videos for video understanding. We create a benchmark dataset of around 2 million videos with associated user-generated annotations and other meta information. We use the collected dataset for action classification and demonstrate its usefulness in combination with existing small-scale annotated datasets, UCF101 and HMDB51. We study different loss functions and two pretraining strategies, simple and self-supervised learning. We also show how a network pretrained on the proposed dataset improves robustness to video corruption and label noise in downstream datasets. We present this as a benchmark dataset for learning from noisy labels in video understanding. The dataset, code, and trained models are publicly available here for future research. A longer version of our paper is also available here.

Citation (APA)

Sharma, M., Patra, R. A., Desai, H., Vyas, S., Rawat, Y., & Shah, R. R. (2021). NoisyActions2M: A Multimedia Dataset for Video Understanding from Noisy Labels. In ACM International Conference Proceeding Series. Association for Computing Machinery. https://doi.org/10.1145/3469877.3490580
