EQUATE: A benchmark evaluation framework for quantitative reasoning in natural language inference

40 citations · 105 Mendeley readers

Abstract

Quantitative reasoning is a higher-order reasoning skill that any intelligent natural language understanding system can reasonably be expected to handle. We present EQUATE (Evaluating Quantitative Understanding Aptitude in Textual Entailment), a new framework for quantitative reasoning in textual entailment. We benchmark the performance of 9 published NLI models on EQUATE, and find that on average, state-of-the-art methods do not achieve an absolute improvement over a majority-class baseline, suggesting that they do not implicitly learn to reason with quantities. We establish a new baseline Q-REAS that manipulates quantities symbolically. In comparison to the best performing NLI model, it achieves success on numerical reasoning tests (+24.2%), but has limited verbal reasoning capabilities (-8.1%). We hope our evaluation framework will support the development of models of quantitative reasoning in language understanding.
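The abstract reports model performance as an absolute improvement over a majority-class baseline. The sketch below is not from the paper; it is a minimal illustration, with made-up `gold` and `preds` labels, of how such a comparison can be computed for an EQUATE-style entailment test set.

```python
# Minimal sketch (illustrative, not the authors' evaluation code):
# compare an NLI model's accuracy to a majority-class baseline.
from collections import Counter


def majority_class_accuracy(gold_labels):
    """Accuracy of always predicting the most frequent gold label."""
    most_common_count = Counter(gold_labels).most_common(1)[0][1]
    return most_common_count / len(gold_labels)


def model_accuracy(predictions, gold_labels):
    """Fraction of examples where the predicted label matches the gold label."""
    correct = sum(p == g for p, g in zip(predictions, gold_labels))
    return correct / len(gold_labels)


# Toy labels standing in for an EQUATE-style test set
# (entailment / contradiction / neutral).
gold = ["entailment", "neutral", "contradiction", "entailment", "neutral"]
preds = ["entailment", "entailment", "contradiction", "neutral", "neutral"]

baseline = majority_class_accuracy(gold)
model = model_accuracy(preds, gold)
print(f"majority-class baseline: {baseline:.2%}")
print(f"model accuracy:          {model:.2%}")
print(f"absolute improvement:    {model - baseline:+.2%}")
```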

Cite (APA)

Ravichander, A., Naik, A., Rose, C., & Hovy, E. (2019). EQUATE: A benchmark evaluation framework for quantitative reasoning in natural language inference. In CoNLL 2019 - 23rd Conference on Computational Natural Language Learning, Proceedings of the Conference (pp. 349–361). Association for Computational Linguistics. https://doi.org/10.18653/v1/k19-1033
