Recent question answering and machine reading benchmarks frequently reduce the task to one of pinpointing spans within a certain text passage that answers the given question. Typically, these systems are not required to actually understand the text on a deeper level that allows for more complex reasoning on the information contained. We introduce a new dataset called BiQuAD that requires deeper comprehension in order to answer questions in both extractive and deductive fashion. The dataset consist of 4; 190 closed-domain texts and a total of 99; 149 question-answer pairs. The texts are synthetically generated soccer match reports that verbalize the main events of each match. All texts are accompanied by a structured Datalog program that represents a (logical) model of its information. We show that state-of-the-art QA models do not perform well on the challenging long form contexts and reasoning requirements posed by the dataset. In particular, transformer based state-of-theart models achieve F1-scores of only 39:0. We demonstrate how these synthetic datasets align structured knowledge with natural text and aid model introspection when approaching complex text understanding.
CITATION STYLE
Grimm, F., & Cimiano, P. (2021). BiQuAD: Towards QA based on deeper text understanding. In *SEM 2021 - 10th Conference on Lexical and Computational Semantics, Proceedings of the Conference (pp. 105–115). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.starsem-1.10
Mendeley helps you to discover research relevant for your work.