DuEE: A Large-Scale Dataset for Chinese Event Extraction in Real-World Scenarios

Abstract

This paper introduces DuEE, a new dataset for Chinese event extraction (EE) in real-world scenarios. DuEE has three advantages over previous EE datasets. (1) Scale: DuEE consists of 19,640 events categorized into 65 event types, along with 41,520 event arguments mapped to 121 argument roles, making it, to our knowledge, the largest Chinese EE dataset so far. (2) Quality: All the data is human-annotated and checked through crowdsourced review, ensuring that the annotation accuracy is higher than 95%. (3) Reality: The schema covers trending topics from Baidu Search, and the data is collected from news on Baijiahao. The task is also close to real-world scenarios, e.g., a single instance may contain multiple events, different event arguments may share the same argument role, and a single argument may play different roles. To advance research on Chinese EE, we release DuEE together with a baseline system to the community. We also organized a shared competition on the basis of DuEE, which attracted 1,206 participants. We analyze the results of the top-performing systems and hope to shed light on further improvements.
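
The three task properties named in the abstract (several events per instance, several arguments sharing one role, one argument playing different roles) can be made concrete with a small data-structure sketch. Everything here is an illustrative assumption: the class names, field names, event types, role labels, and the English example sentence are hypothetical and are not taken from the released DuEE schema or file format.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Argument:
    role: str  # hypothetical role label, e.g. "acquirer"
    text: str  # text span that fills the role


@dataclass
class Event:
    event_type: str  # hypothetical event type label
    trigger: str     # word or phrase that signals the event
    arguments: List[Argument] = field(default_factory=list)


@dataclass
class Instance:
    text: str
    events: List[Event] = field(default_factory=list)


def roles_by_span(instance: Instance) -> Dict[str, Set[Tuple[str, str]]]:
    """Map each argument span to the (event_type, role) pairs it fills."""
    out = defaultdict(set)
    for event in instance.events:
        for arg in event.arguments:
            out[arg.text].add((event.event_type, arg.role))
    return dict(out)


def spans_sharing_role(event: Event) -> Dict[str, List[str]]:
    """Map each role to its spans, keeping only roles filled more than once."""
    grouped = defaultdict(list)
    for arg in event.arguments:
        grouped[arg.role].append(arg.text)
    return {role: spans for role, spans in grouped.items() if len(spans) > 1}


# Invented example sentence (not from the dataset).
example = Instance(
    text="Company A acquired Company B and Company C, then dismissed 200 employees.",
    events=[
        Event(
            event_type="Acquisition",
            trigger="acquired",
            arguments=[
                Argument("acquirer", "Company A"),
                Argument("acquired", "Company B"),  # two arguments,
                Argument("acquired", "Company C"),  # one shared role
            ],
        ),
        Event(
            event_type="Layoff",
            trigger="dismissed",
            arguments=[
                Argument("employer", "Company A"),  # same span, different role
                Argument("headcount", "200 employees"),
            ],
        ),
    ],
)

if __name__ == "__main__":
    print(len(example.events))                    # events in one instance
    print(spans_sharing_role(example.events[0]))  # role shared by two spans
    print(roles_by_span(example)["Company A"])    # one span, two roles
```

Nesting arguments under their event, rather than keeping one flat list per sentence, is one way to let the same span appear under different roles without ambiguity.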

Cite

Li, X., Li, F., Pan, L., Chen, Y., Peng, W., Wang, Q., … Zhu, Y. (2020). DuEE: A Large-Scale Dataset for Chinese Event Extraction in Real-World Scenarios. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12431 LNAI, pp. 534–545). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-60457-8_44
