Infant engagement during guided play is a reliable indicator of early learning outcomes, psychiatric issues and familial wellbeing. An obstacle to using such information in real-world scenarios is the need for a domain expert to assess the data. We show that an end-to-end Deep Learning approach can perform well in automatic infant engagement detection from a single video source, without requiring a clear view of the face or the whole body. To tackle the problem of explainability in learning methods, we evaluate how four common attention mapping techniques can be used to perform subjective evaluation of the network's decision process and identify multimodal cues used by the network to discriminate engagement levels. We further propose a quantitative comparison approach, by collecting a human attention baseline and evaluating its similarity to each technique.
Fraile, M., Fawcett, C., Lindblad, J., Sladoje, N., & Castellano, G. (2022). End-to-End Learning and Analysis of Infant Engagement During Guided Play: Prediction and Explainability. In ACM International Conference Proceeding Series (pp. 444–454). Association for Computing Machinery. https://doi.org/10.1145/3536221.3556629