Abstract
An effective method for video classification from a small number of samples is urgently needed. The shortage of samples can be alleviated by generating additional samples with generative adversarial networks (GANs). However, generating videos of a given category remains underexplored, because complex actions and changing viewpoints are difficult to simulate; applying GANs directly to video augmentation is therefore difficult. In this study, we propose a generative data augmentation method for video classification based on dynamic images. A dynamic image compresses the motion information of a video into a single still image, removing interfering factors such as the background. Augmenting dynamic images with GANs therefore preserves category-specific motion information and saves memory compared with generating full videos. To cope with the uneven quality of the generated images, we propose a self-paced selection method that automatically selects high-quality generated samples for training. The selected dynamic images are used to enhance features, provide regularization, and ultimately achieve video augmentation. Our method is evaluated on two benchmark datasets, HMDB51 and UCF101. Experimental results show that it remarkably improves video classification accuracy under sample insufficiency and sample imbalance.
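The dynamic image used here is a known construction: Bilen et al. (CVPR 2016) compress a clip into one still image via approximate rank pooling, whose frame weights have a closed form. Below is a minimal NumPy sketch of that construction; the paper may use a different variant, and the function name `dynamic_image` and the final rescaling step are illustrative.

```python
import numpy as np

def dynamic_image(frames):
    """Compress a clip into one still image via approximate rank
    pooling (Bilen et al., CVPR 2016). `frames` has shape
    (T, H, W, C); the result is a single (H, W, C) image whose
    pixels encode the temporal evolution of the clip."""
    T = frames.shape[0]
    t = np.arange(1, T + 1)
    # Harmonic numbers H_0..H_T, used by the closed-form coefficients.
    harmonic = np.concatenate(([0.0], np.cumsum(1.0 / t)))
    # alpha_t = 2(T - t + 1) - (T + 1)(H_T - H_{t-1}):
    # later frames get positive weights, earlier frames negative ones.
    alpha = 2.0 * (T - t + 1) - (T + 1) * (harmonic[T] - harmonic[t - 1])
    d = np.tensordot(alpha, frames.astype(np.float64), axes=(0, 0))
    # Rescale to a displayable 8-bit image (illustrative choice).
    d = (d - d.min()) / (d.max() - d.min() + 1e-8)
    return (255 * d).astype(np.uint8)

# Usage: a 16-frame dummy clip collapses to one still image.
clip = np.random.randint(0, 256, (16, 112, 112, 3), dtype=np.uint8)
img = dynamic_image(clip)  # shape (112, 112, 3)
```

The abstract does not spell out the self-paced selection criterion, so the following sketch shows only the generic self-paced learning recipe: admit generated samples whose loss under the current classifier falls below an age parameter that grows each round. The names `classifier_loss`, `train_step`, and the growth factor `mu` are assumptions, not the authors' API.

```python
import numpy as np

def self_paced_select(losses, lam):
    """Hard self-paced weighting: keep sample i iff loss_i < lam,
    i.e., admit easy (likely high-quality) samples first. The paper's
    actual selection criterion may differ."""
    return np.flatnonzero(np.asarray(losses) < lam)

# Sketch of one training schedule (all names here are illustrative):
# lam = lam0
# for epoch in range(num_epochs):
#     losses = [classifier_loss(x) for x in generated_samples]
#     keep = self_paced_select(losses, lam)
#     train_step([generated_samples[i] for i in keep])
#     lam *= mu  # grow the age parameter to admit harder samples
```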
Citation
Zhang, Y., Jia, G., Chen, L., Zhang, M., & Yong, J. (2020). Self-Paced Video Data Augmentation by Generative Adversarial Networks with Insufficient Samples. In MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia (pp. 1652–1660). Association for Computing Machinery, Inc. https://doi.org/10.1145/3394171.3414003