MCScript2.0: A machine comprehension corpus focused on script events and participants

Simon Ostermann; Michael Roth; Manfred Pinkal

Conference Proceedings (Open Access)

*SEM@NAACL-HLT 2019 - 8th Joint Conference on Lexical and Computational Semantics (2019) 103-117

DOI: 10.18653/v1/s19-1012

Abstract

We introduce MCScript2.0, a machine comprehension corpus for the end-to-end evaluation of script knowledge. MCScript2.0 contains approx. 20,000 questions on approx. 3,500 texts, crowdsourced based on a new collection process that results in challenging questions. Half of the questions cannot be answered from the reading texts, but require the use of commonsense and, in particular, script knowledge. We give a thorough analysis of our corpus and show that while the task is not challenging to humans, existing machine comprehension models fail to perform well on the data, even if they make use of a commonsense knowledge base. The dataset is available at http://www.sfb1102.uni-saarland.de/?page_id=2582.

Cite

Ostermann, S., Roth, M., & Pinkal, M. (2019). MCScript2.0: A machine comprehension corpus focused on script events and participants. In *SEM@NAACL-HLT 2019 - 8th Joint Conference on Lexical and Computational Semantics (pp. 103–117). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/s19-1012
