Abstract
In this work, we present MAILEX, the first dataset for event extraction from conversational email threads. To this end, we first propose a new taxonomy covering 10 event types and 76 arguments in the email domain. Our final dataset includes 1.5K email threads and ∼4K emails, annotated with ∼8K event instances in total. To understand the task challenges, we conduct a series of experiments comparing three types of approaches: fine-tuned sequence labeling, fine-tuned generative extraction, and few-shot in-context learning. Our results show that email event extraction is far from solved, owing to challenges such as extracting non-continuous, shared trigger spans, extracting non-named-entity arguments, and modeling the email conversational history. Our work thus calls for further investigation of this domain-specific event extraction task.
Cite
Srivastava, S., Singh, G., Matsumoto, S., Raz, A., Costa, P., Poore, J., & Yao, Z. (2023). MAILEX: Email Event and Argument Extraction. In EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 12964–12987). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.emnlp-main.801