Abstract
In recent years, machine-learning and feature-based approaches have been developed to count shoppers in retail stores using RGB-D sensors in an occlusion-free top-view configuration. With the advent of large-scale media, deep learning approaches have become very popular and are used in a wide variety of applications, such as detecting and identifying people in crowded scenes. Detecting and counting people is a difficult task, especially in cluttered and crowded environments such as malls, airports, and retail stores. Understanding human behavior in a retail store is crucial to the efficient functioning of the business. We present a novel semantic segmentation approach, based on a convolutional neural network, to segment and count people's heads in heavily occluded environments using top-view depth images acquired by a depth sensor (ASUS Xtion Pro). Semantic segmentation is usually applied to RGB images; here, we instead use depth images to segment human heads. The proposed architecture uses a pre-trained ResNet50 as the encoder, followed by a decoder network. The framework is evaluated on the publicly available TVHeads dataset, which contains depth images of people collected with an RGB-D sensor positioned in a top-view configuration. The results show good accuracy and indicate that our approach is efficient and appropriate.
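The abstract does not say how a head count is derived from the segmentation output. One common post-processing step, shown here only as a minimal sketch and not as the authors' method, is to count connected foreground regions in the binary head mask. The function name `count_heads`, the `min_area` noise threshold, and the toy mask are all hypothetical and chosen for illustration.

```python
from collections import deque


def count_heads(mask, min_area=1):
    """Count 4-connected foreground regions with at least min_area pixels.

    mask is a 2D list of 0/1 values, standing in for a thresholded
    segmentation output (an assumption; the paper's output format is unknown).
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood-fill this region with BFS and measure its area.
            seen[r][c] = True
            queue = deque([(r, c)])
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Regions smaller than min_area are treated as noise.
            if area >= min_area:
                count += 1
    return count


# Toy mask: two 2x2 blobs plus two isolated single-pixel specks.
example_mask = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 0],
]
print(count_heads(example_mask, min_area=2))  # prints 2
```

With `min_area=1` the same mask yields 4 regions, which shows why some area threshold is usually needed. Heads that touch in the mask would be merged into one region by this approach, so it should be read as a baseline rather than a solution for heavy occlusion.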
Cite
Abed, A., Akrout, B., & Amous, I. (2022). A Novel Deep Convolutional Neural Network Architecture for Customer Counting in the Retail Environment. In Communications in Computer and Information Science (Vol. 1589 CCIS, pp. 327–340). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-08277-1_27