Single-stage activity recognition methods have been gaining popularity within the construction domain. However, their low per-frame accuracy necessitates additional post-processing to link the per-frame detections, which limits their capability for real-time monitoring, an indispensable component of emerging construction digital twins. This study proposes knowledge DIstillation of temporal Gradient data for construction Entity activity Recognition (DIGER), which builds upon the You Only Watch Once (YOWO) method and improves its activity recognition and localization performance. Activity recognition is improved by designing an auxiliary backbone that exploits the complementary information in temporal gradient data and transferring that information into YOWO using knowledge distillation, while localization is improved primarily by integrating the complete intersection over union (CIoU) loss. DIGER achieved a per-frame activity recognition accuracy of 93.6% and a localization mean average precision at 50% (mAP@50) of 79.8% on a large custom dataset, outperforming state-of-the-art methods without requiring additional computation during inference, which makes it highly effective for real-time monitoring of construction site activities.
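To make the localization component more concrete, the following is a minimal sketch of the complete intersection over union (CIoU) loss for a single pair of boxes, written in plain Python. It follows the commonly published CIoU formulation (an IoU term, a normalized center-distance term, and an aspect-ratio consistency term) and is not taken from the DIGER implementation. The function name `ciou_loss`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions made for illustration.

```python
import math


def ciou_loss(box_pred, box_true, eps=1e-9):
    """Illustrative CIoU loss for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper: assumes x2 > x1 and y2 > y1 for both boxes.
    """
    px1, py1, px2, py2 = box_pred
    tx1, ty1, tx2, ty2 = box_true
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Intersection over union of the two boxes.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Squared distance between box centers, normalized by the squared
    # diagonal of the smallest box enclosing both.
    rho2 = ((px1 + px2) / 2 - (tx1 + tx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (ty1 + ty2) / 2) ** 2
    enclose_w = max(px2, tx2) - min(px1, tx1)
    enclose_h = max(py2, ty2) - min(py1, ty1)
    c2 = enclose_w ** 2 + enclose_h ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(tw / (th + eps))
                              - math.atan(pw / (ph + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v


if __name__ == "__main__":
    print(ciou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # near zero for identical boxes
    print(ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))  # above 1 for disjoint boxes
```

Unlike a plain IoU loss, the center-distance term keeps the loss informative even when the predicted and ground-truth boxes do not overlap, which is one commonly cited reason for preferring CIoU in box regression.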
Ghelmani, A., & Hammad, A. (2024). Improving single-stage activity recognition of excavators using knowledge distillation of temporal gradient data. Computer-Aided Civil and Infrastructure Engineering, 39(13), 2028–2053. https://doi.org/10.1111/mice.13157