Abstract
We investigate the effect of observed data modality on human and machine scoring of informative presentations in the context of oral English communication training and assessment. Three sets of raters scored the content of three-minute presentations by college students on the basis of either the video, the audio, or the text transcript, using a custom scoring rubric. We find significant differences between the scores assigned when raters view a transcript or listen to audio recordings compared to watching a video of the same presentation, and we present an analysis of those differences. Using the human scores, we train machine learning models to score a given presentation using text, audio, and video features separately. We analyze the distribution of machine scores against the modality and label bias we observe in human scores, discuss the implications for machine scoring, and recommend best practices for future work in this direction. Our results demonstrate the importance of checking and correcting for bias across different modalities in evaluations of multi-modal performances.
Lepp, H., Leong, C. W., Roohr, K., Martin-Raugh, M., & Ramanarayanan, V. (2020). Effect of Modality on Human and Machine Scoring of Presentation Videos. In ICMI 2020 - Proceedings of the 2020 International Conference on Multimodal Interaction (pp. 630–634). Association for Computing Machinery, Inc. https://doi.org/10.1145/3382507.3418880