MedNLI Is Not Immune: Natural Language Inference Artifacts in the Clinical Domain

Christine Herlihy; Rachel Rudinger

Conference Proceedings

ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference (2021), Vol. 2, pp. 1020-1027

DOI: 10.18653/v1/2021.acl-short.129

Abstract

Crowdworker-constructed natural language inference (NLI) datasets have been found to contain statistical artifacts associated with the annotation process that allow hypothesis-only classifiers to achieve better-than-random performance (Poliak et al., 2018; Gururangan et al., 2018; Tsuchiya, 2018). We investigate whether MedNLI, a physician-annotated dataset with premises extracted from clinical notes, contains such artifacts (Romanov and Shivade, 2018). We find that entailed hypotheses contain generic versions of specific concepts in the premise, as well as modifiers related to responsiveness, duration, and probability. Neutral hypotheses feature conditions and behaviors that co-occur with, or cause, the condition(s) in the premise. Contradiction hypotheses feature explicit negation of the premise and implicit negation via assertion of good health. Adversarial filtering demonstrates that performance degrades when evaluated on the difficult subset. We provide partition information and recommendations for alternative dataset construction strategies for knowledge-intensive domains.
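
To make the notion of a hypothesis-only classifier concrete, the following is a minimal sketch, assuming a bag-of-words naive Bayes model with add-one smoothing. It is not the classifier used by the authors. The function names and the toy sentences are hypothetical, invented for illustration, and not drawn from MedNLI. The point is only that the model never sees a premise, so any accuracy above chance must come from regularities in the hypotheses themselves.

```python
import math
from collections import Counter, defaultdict


def train_hypothesis_only(examples):
    """Fit a naive Bayes model on (hypothesis, label) pairs.

    Premises are deliberately never passed in: the model can only
    exploit word-label regularities in the hypotheses.
    """
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for hypothesis, label in examples:
        label_counts[label] += 1
        for tok in hypothesis.lower().split():
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(model, hypothesis):
    """Return the label with the highest smoothed log-probability."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in hypothesis.lower().split():
            if tok in vocab:  # tokens unseen in training are skipped
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy hypotheses (invented for illustration, NOT from MedNLI).
toy_train = [
    ("the patient has no pain", "contradiction"),
    ("the patient is healthy", "contradiction"),
    ("the patient has a chronic condition", "entailment"),
    ("the patient has a cardiac condition", "entailment"),
    ("the patient has a history of smoking", "neutral"),
    ("the patient has a history of diabetes", "neutral"),
]

model = train_hypothesis_only(toy_train)
print(predict(model, "the patient has no fever"))
print(predict(model, "the patient has a history of asthma"))
```

On real data, the accuracy of such a model on held-out hypotheses would be compared against the majority-class baseline; a clear gap suggests annotation artifacts of the kind the abstract describes.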

Citation (APA)

Herlihy, C., & Rudinger, R. (2021). MedNLI Is Not Immune: Natural Language Inference Artifacts in the Clinical Domain. In ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference (Vol. 2, pp. 1020–1027). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.acl-short.129
