Video-Based Crowd Counting Using a Multi-scale Optical Flow Pyramid Network

Abstract

This paper presents a novel approach to video-based crowd counting, which can be formalized as a regression problem: learning a mapping from an input image to an output crowd density map. Convolutional neural networks (CNNs) have demonstrated striking accuracy gains in a range of computer vision tasks, including crowd counting. However, the crowd counting literature has focused predominantly on the single-frame case, or on applying CNNs to videos frame by frame without leveraging motion information. This paper proposes an architecture that exploits the spatiotemporal information captured in a video stream by combining an optical flow pyramid with an appearance-based CNN. Extensive empirical evaluation on five public datasets, comparing against numerous state-of-the-art approaches, demonstrates the efficacy of the proposed architecture, with our methods reporting the best results on all datasets.
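The density-map formulation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the convention commonly used in crowd counting, where each annotated head position contributes a unit-mass Gaussian to the target map, so that summing the map gives the crowd count. The function name `density_map`, the `sigma` value, and the sample annotations are hypothetical and are not taken from the paper; the sketch covers only how a regression target could be built, not the proposed network.

```python
import numpy as np

def density_map(points, height, width, sigma=4.0):
    """Place a unit-mass Gaussian at each (row, col) head annotation.

    Assumption: every annotation lies inside the image, so each
    normalized blob sums to 1 and the map sums to the head count.
    """
    rows = np.arange(height)[:, None]   # column vector of row indices
    cols = np.arange(width)[None, :]    # row vector of column indices
    density = np.zeros((height, width), dtype=np.float64)
    for r, c in points:
        blob = np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
        density += blob / blob.sum()    # normalize: one person adds mass 1
    return density

# Hypothetical head annotations for a 64 x 128 frame.
heads = [(20, 30), (40, 80), (45, 85)]
dmap = density_map(heads, 64, 128)
estimated_count = dmap.sum()            # integral of the map = crowd count
```

A counting network trained under this formulation would regress a map like `dmap` from the input frame, and the predicted count would be read off as the sum of the predicted map.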

Cite

Hossain, M. A., Cannons, K., Jang, D., Cuzzolin, F., & Xu, Z. (2021). Video-Based Crowd Counting Using a Multi-scale Optical Flow Pyramid Network. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12626 LNCS, pp. 3–20). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-69541-5_1
