Fooling automatic short answer grading systems

Abstract

With the rising success of adversarial attacks on many NLP tasks, systems that actually operate in an adversarial scenario need to be reevaluated. To this end, we pose the following research question: how difficult is it to fool automatic short answer grading systems? In particular, we investigate the robustness of the state-of-the-art automatic short answer grading system proposed by Sung et al. against cheating with universal adversarial triggers: short token sequences that can be prepended to students’ answers in an exam to artificially improve their automatically assigned grade. Such triggers are especially critical because, once found, they can easily be used by anyone. In our experiments, we discovered triggers that allow students to pass exams with a passing threshold of 50% without answering a single question correctly. Furthermore, we show that such triggers generalize across models and datasets in this scenario, nullifying the defense strategy of keeping grading models or data secret.
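Once a trigger is known, applying the attack amounts to prepending it to an answer before the grader sees it. The following is a minimal sketch of that employment step, not the authors' code: a generic BERT sequence classifier stands in for the grader of Sung et al., and the model name, label count, trigger tokens, and example answers are all hypothetical.

```python
# Sketch of universal adversarial trigger *employment* (not the search):
# prepend a fixed token sequence to a student answer and compare the
# grader's prediction with and without it.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "bert-base-uncased"  # stand-in; the actual system fine-tunes BERT on grading data
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=3)
model.eval()

def grade(reference_answer: str, student_answer: str) -> int:
    """Return the predicted grade label for a (reference, answer) pair."""
    inputs = tokenizer(reference_answer, student_answer,
                       return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return logits.argmax(dim=-1).item()

trigger = "none trigger tokens"  # hypothetical; real triggers are found by gradient-guided search
reference = "Photosynthesis converts light energy into chemical energy."
wrong_answer = "I do not know."

print(grade(reference, wrong_answer))                  # grade of the honest wrong answer
print(grade(reference, trigger + " " + wrong_answer))  # grade after prepending the trigger
```

Note that nothing in the employment step requires access to the model: this is what makes the triggers critical, since any student can copy a published trigger into an exam answer.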

Citation (APA)

Filighera, A., Steuer, T., & Rensing, C. (2020). Fooling automatic short answer grading systems. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12163 LNAI, pp. 177–190). Springer. https://doi.org/10.1007/978-3-030-52237-7_15
