CaRB: A Crowdsourced Benchmark for Open IE

Abstract

Open Information Extraction (Open IE) systems have traditionally been evaluated via manual annotation. Recently, an automated evaluator with a benchmark dataset (OIE2016) was released; it scores Open IE systems automatically by matching system predictions against the predictions in the benchmark dataset (Stanovsky and Dagan, 2016). Unfortunately, our analysis reveals that its data is rather noisy and that the tuple matching in the evaluator has issues, making the results of automated comparisons less trustworthy. We contribute CaRB, an improved dataset and framework for testing Open IE systems. To the best of our knowledge, CaRB is the first crowdsourced Open IE dataset, and it also makes substantive changes to the matching code and metrics. NLP experts judge CaRB's dataset to be more accurate than OIE2016. Moreover, we find that for one pair of Open IE systems, the CaRB framework and OIE2016 produce contradictory rankings. Human assessment verifies that CaRB's ranking of the two systems is the correct one. We release the CaRB framework along with its crowdsourced dataset.
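
To make the matching step concrete, the sketch below shows a toy token-overlap matcher for (arg1, relation, arg2) tuples in Python. It is an illustration only: the bag-of-words overlap, the best-match scoring, and the function names (token_precision, token_recall, score_sentence) are our assumptions, not CaRB's actual matcher or metrics, which are defined in the released framework.

```python
# Illustrative sketch only: a toy token-overlap matcher for Open IE tuples.
# This is NOT CaRB's matching code; the scoring choices here are assumptions
# for exposition. Consult the released CaRB framework for the real metrics.

from typing import List, Tuple

Extraction = Tuple[str, str, str]  # (arg1, relation, arg2)

def _tokens(t: Extraction) -> set:
    # Bag of lowercased whitespace-separated tokens across all three slots.
    return set(" ".join(t).lower().split())

def token_precision(pred: Extraction, gold: Extraction) -> float:
    # Fraction of predicted tokens that also appear in the gold tuple.
    p, g = _tokens(pred), _tokens(gold)
    return len(p & g) / len(p) if p else 0.0

def token_recall(pred: Extraction, gold: Extraction) -> float:
    # Fraction of gold tokens that are covered by the predicted tuple.
    p, g = _tokens(pred), _tokens(gold)
    return len(p & g) / len(g) if g else 0.0

def score_sentence(preds: List[Extraction], golds: List[Extraction]) -> Tuple[float, float]:
    """Score each predicted tuple against its best-matching gold tuple
    (and each gold tuple against its best prediction), then average."""
    if not preds or not golds:
        return 0.0, 0.0
    prec = sum(max(token_precision(p, g) for g in golds) for p in preds) / len(preds)
    rec = sum(max(token_recall(p, g) for p in preds) for g in golds) / len(golds)
    return prec, rec
```

For example, with the prediction ("Obama", "was born in", "Hawaii") and the gold tuple ("Barack Obama", "was born in", "Hawaii"), score_sentence returns precision 1.0 and recall of about 0.83, since every predicted token appears in the gold tuple but the prediction misses the gold token "Barack".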

Citation (APA)

Bhardwaj, S., Aggarwal, S., & Mausam. (2019). CaRB: A crowdsourced benchmark for open IE. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019) (pp. 6262–6267). Association for Computational Linguistics. https://doi.org/10.18653/v1/d19-1651
