SMART: A Stratified Machine Reading Test

2Citations
Citations of this article
1Readers
Mendeley users who have this article in their library.
Get full text

Abstract

We present a Stratified MAchine Reading Test (SMART) data set for Chinese in which each question is assigned a “level” that reflects the type of reasoning that is needed to answer the question. This data set consists of close to 40 K question-answer pairs and its stratified design allows machine reading researchers to quickly focus in on areas that present the most challenge for a machine comprehension system. We further establish a baseline for future research with BERT, and present results that show the levels we have designed correspond well with the level of difficulty that BERT experiences in answering these questions, as reflected by the lower accuracy for higher levels. We have also collected human answers to the questions in the test portion of this data set, and show that humans and the machine have different challenges when answering these questions. This means that even though the machine is approaching human-level performance on this task, humans and the machine perform this task with very different mechanisms.

Cite

CITATION STYLE

APA

Yao, J., Feng, M., Feng, H., Wang, Z., Zhang, Y., & Xue, N. (2019). SMART: A Stratified Machine Reading Test. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11838 LNAI, pp. 67–79). Springer. https://doi.org/10.1007/978-3-030-32233-5_6

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free