In this thesis proposal, we explore event extraction and event representation in literary texts. With its variety of genres and widely varying document lengths, literature is a challenging domain, yet the representation of literary content has received relatively little attention. Because most individual events contribute little to the overall semantics of a literary document, we model events at different granularities. Conceptually, we adapt the previous definition of schemas as sequences of events that describe a single process and are connected through shared participants, and we extend this notion so that a document’s content can be modeled as a sequence of schemas. Technically, we approach the segmentation of event sequences into schemas by modeling these sequences with the narrative cloze task, that is, the prediction of masked events from their surrounding event sequence. We propose building schema representations from sequences of event embeddings, thereby summarizing sections of documents in a fixed-size representation. This approach will enable comparisons at the level of schema structure, from sections such as chapters up to entire literary works, paving the way for a computational approach to quantitative literary research.
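As a rough illustration of the two technical ideas in the abstract, the sketch below pools a sequence of event embeddings into one fixed-size schema vector and ranks candidate events for a masked position, in the spirit of the narrative cloze task. Everything in it is an assumption made for illustration: the abstract does not say how embeddings are aggregated, so mean pooling is only one possible choice, the similarity-based ranking is a stand-in for a trained sequence model, and the event names and three-dimensional vectors are invented toy data.

```python
import math


def mean_pool(event_vectors):
    """Average a non-empty list of equal-length vectors into one fixed-size vector.

    Mean pooling is an assumed aggregation; the proposal does not prescribe one.
    """
    dim = len(event_vectors[0])
    count = len(event_vectors)
    return [sum(vec[i] for vec in event_vectors) / count for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def narrative_cloze_rank(context_vectors, candidates):
    """Rank candidate events for a masked slot by similarity to the pooled context.

    This replaces a learned cloze model with a simple similarity heuristic.
    """
    context = mean_pool(context_vectors)
    return sorted(
        candidates,
        key=lambda name: cosine(context, candidates[name]),
        reverse=True,
    )


# Hypothetical toy event embeddings (not taken from the paper).
EVENTS = {
    "board_ship": [1.0, 0.0, 0.0],
    "set_sail": [0.9, 0.1, 0.0],
    "reach_port": [0.8, 0.2, 0.0],
    "attend_ball": [0.0, 1.0, 0.2],
    "dance": [0.0, 0.9, 0.3],
}

# Two schemas, each summarized as a single fixed-size vector.
schema_voyage = mean_pool([EVENTS["board_ship"], EVENTS["set_sail"]])
schema_ball = mean_pool([EVENTS["attend_ball"], EVENTS["dance"]])

# Narrative cloze: which event best fills the masked slot after the voyage events?
ranked = narrative_cloze_rank(
    [EVENTS["board_ship"], EVENTS["set_sail"]],
    {"reach_port": EVENTS["reach_port"], "dance": EVENTS["dance"]},
)

print("best cloze candidate:", ranked[0])
print("schema similarity:", round(cosine(schema_voyage, schema_ball), 3))
```

Because every schema vector has the same size regardless of how many events it summarizes, sections of different lengths can be compared directly, which is the property the abstract relies on for comparing chapters or whole works.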
Hatzel, H. O., & Biemann, C. (2021). Towards Layered Events and Schema Representations in Long Documents. In NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Student Research Workshop (pp. 32–39). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.naacl-srw.5