Gesture understanding is one of the most challenging problems in computer vision. Within this field, traffic hand signal recognition must account for both processing speed and the validity of the commanding signal. The lack of available datasets is also a serious problem. Most classifiers approach these problems using the skeletons of target actors in an image. Extracting the three-dimensional coordinates of skeletons is simpler when depth information accompanies the images. However, depth cameras cost significantly more than RGB cameras, and the skeleton must be extracted before classification can begin. Here, we present a hand signal detection algorithm that does not use skeletons. Instead, we use simple object detectors trained to acquire hand directions. Variation in the duration of gestures, mixed with random pauses and noise, is handled with a recurrent neural network (RNN). Furthermore, we have developed a flag sequence algorithm to assess the validity of the commanding signal. Overall, the computed hand directions are sent to the RNN, which identifies six types of hand signals given by traffic controllers and is able to handle time variations and intermittent, randomly appearing noise. We constructed a hand signal dataset of 100 thousand RGB images and made it publicly available. The method correctly recognized hand signals against various backgrounds with 91% accuracy. It achieved a processing speed of 30 FPS on FHD video streams, a 52% improvement over the best previous work. Despite the extra burden of deciding the validity of the hand signals, this method surpasses methods that use only RGB video streams. Our method is designed to work with nonstationary viewpoints, such as those taken from moving vehicles. To accomplish this goal, we gave higher priority to speed and to the validity assessment of the recognized commanding signals.
The collected dataset is publicly available through the Korean government portal at "data.go.kr/data/15075814/fileData.do".
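To illustrate how a recurrent network can absorb gestures of different durations, the following is a minimal sketch in which per-frame hand directions are fed one at a time into a plain Elman-style RNN that outputs probabilities over six signal classes. It is not the authors' implementation: the abstract does not specify the network architecture, so the number of direction codes (8 per hand), the hidden size (16), the omission of bias terms, the random untrained weights, and all function names here are assumptions chosen only for illustration. The flag sequence algorithm is not described in the abstract and is not sketched.

```python
import math
import random

NUM_DIRECTIONS = 8   # assumed number of discrete direction codes per hand
NUM_SIGNALS = 6      # six hand-signal classes, as stated in the abstract
HIDDEN = 16          # assumed hidden-state size


def random_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these by training."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def init_params(seed=0):
    rng = random.Random(seed)
    in_dim = 2 * NUM_DIRECTIONS  # left-hand code + right-hand code
    return {
        "W_xh": random_matrix(HIDDEN, in_dim, rng),
        "W_hh": random_matrix(HIDDEN, HIDDEN, rng),
        "W_hy": random_matrix(NUM_SIGNALS, HIDDEN, rng),
    }


def encode(left, right):
    """One-hot encode the (left, right) hand-direction codes of one frame."""
    x = [0.0] * (2 * NUM_DIRECTIONS)
    x[left] = 1.0
    x[NUM_DIRECTIONS + right] = 1.0
    return x


def classify_sequence(frames, params):
    """Run the RNN over a sequence of any length and return class probabilities."""
    h = [0.0] * HIDDEN
    for left, right in frames:
        from_input = matvec(params["W_xh"], encode(left, right))
        from_state = matvec(params["W_hh"], h)
        h = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
    logits = matvec(params["W_hy"], h)
    peak = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


params = init_params()
# The same hypothetical gesture performed quickly and slowly (with pauses).
short_gesture = [(0, 2), (1, 2), (2, 2)]
long_gesture = [(0, 2)] * 5 + [(1, 2)] * 7 + [(2, 2)] * 4
probs_short = classify_sequence(short_gesture, params)
probs_long = classify_sequence(long_gesture, params)
```

Because the hidden state is updated once per frame, the same function accepts a 3-frame and a 16-frame sequence without padding or resampling. With untrained weights the output probabilities carry no meaning; only the data flow is shown.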
Baek, T., & Lee, Y. G. (2022). Traffic control hand signal recognition using convolution and recurrent neural networks. Journal of Computational Design and Engineering, 9(2), 296–309. https://doi.org/10.1093/jcde/qwab080