Previous work has shown that automated essay scoring systems, in particular machine learning-based systems, do not assess the quality of essays but instead rely on essay length, a factor irrelevant to writing proficiency. In this work, we first show that state-of-the-art systems, namely recent neural essay scoring systems, might also be influenced by the correlation between essay length and scores in a standard dataset. In our evaluation, a very simple neural model achieves state-of-the-art performance on the standard dataset. To consider essay content without taking essay length into account, we introduce a simple neural model that assesses the similarity of content between an input essay and essays assigned different scores. This model achieves performance comparable to the state of the art on the standard dataset as well as on a second dataset. Our findings suggest that neural essay scoring systems should take the characteristics of their datasets into account in order to focus on text quality.
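The correlation between essay length and score mentioned in the abstract can be checked on any scored essay collection before training a model. The following is a minimal sketch, not the authors' procedure: it assumes essay length is measured in whitespace-separated tokens and that Pearson's r is the statistic of interest. The function names `pearson` and `length_score_correlation` are hypothetical and introduced only for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def length_score_correlation(essays, scores):
    """Correlate essay length (assumed: whitespace token count) with scores."""
    lengths = [len(essay.split()) for essay in essays]
    return pearson(lengths, scores)
```

A value close to 1 on a given dataset would indicate that length alone is a strong predictor of the assigned score, which is the dataset characteristic the abstract warns about.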
Jeon, S., & Strube, M. (2021). Countering the Influence of Essay Length in Neural Essay Scoring. In SustaiNLP 2021 - 2nd Workshop on Simple and Efficient Natural Language Processing, Proceedings of SustaiNLP (pp. 32–38). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.sustainlp-1.4