Emotion detection based on facial expressions is an important procedure in high-risk tasks such as criminal investigation and lie detection. To reduce the impact of the inconsistent durations of macro- and micro-expressions, we propose a Spatio-temporal Convolutional Attention Network (STCAN) for spotting macro- and micro-expression intervals in long video sequences. First, a convolutional neural network extracts the spatial features of each frame in the video sequence. Then, to account for the differing durations of macro- and micro-expressions, a multi-head self-attention model weights the spatial features of each frame along the temporal dimension. Finally, the time intervals of emotional change are determined from the weight of each frame, and the macro- and micro-expression intervals are obtained through a threshold segmentation model. Because Leave-One-Subject-Out cross-validation requires a long training time, we verified the effectiveness of our model on the SAMM Long Video and CAS(ME)² datasets using Leave-Half-Subject-Out (LHSO) cross-validation. The experiments show that the STCAN model achieves competitive results on the Facial Micro-Expression (FME) Challenge 2021.
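The final step, turning per-frame weights into expression intervals, can be illustrated with a minimal sketch. The abstract does not specify the thresholding rule, so this example assumes a fixed threshold and a minimum run length; the function name `spot_intervals`, its parameters, and the sample weights are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Sequence, Tuple


def spot_intervals(
    scores: Sequence[float],
    threshold: float,
    min_len: int = 1,
) -> List[Tuple[int, int]]:
    """Return inclusive (onset, offset) frame indices of runs above threshold.

    Assumption: a frame belongs to an expression when its weight is strictly
    greater than a fixed threshold, and runs shorter than min_len frames are
    discarded. The paper's actual segmentation rule may differ.
    """
    intervals: List[Tuple[int, int]] = []
    start: Optional[int] = None  # first frame of the current run, if any
    for i, score in enumerate(scores):
        if score > threshold:
            if start is None:
                start = i
        elif start is not None:
            # The run ended at frame i - 1; keep it if it is long enough.
            if i - start >= min_len:
                intervals.append((start, i - 1))
            start = None
    # Close a run that extends to the last frame.
    if start is not None and len(scores) - start >= min_len:
        intervals.append((start, len(scores) - 1))
    return intervals


# Hypothetical per-frame weights for a 12-frame clip.
weights = [0.1, 0.2, 0.7, 0.8, 0.9, 0.3, 0.1, 0.6, 0.2, 0.7, 0.8, 0.9]
intervals = spot_intervals(weights, threshold=0.5, min_len=2)
print(intervals)  # [(2, 4), (9, 11)]
```

With these sample weights, the single above-threshold frame at index 7 is dropped by the minimum-length filter, while the two longer runs are kept as candidate intervals.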
Pan, H., Xie, L., & Wang, Z. (2021). Spatio-temporal Convolutional Attention Network for Spotting Macro-and Micro-expression Intervals. In FME 2021 - Proceedings of the 1st Workshop on Facial Micro-Expression: Advanced Techniques for Facial Expressions Generation and Spotting (pp. 25–30). Association for Computing Machinery, Inc. https://doi.org/10.1145/3476100.3484463