ProppLearner: Deeply annotating a corpus of Russian folktales to enable the machine learning of a Russian formalist theory

Mark A. Finlayson

Journal Article (Open Access)

Digital Scholarship in the Humanities (2017) 32(2) 284-300

DOI: 10.1093/llc/fqv067

Abstract

I describe the collection and deep annotation of the semantics of a corpus of Russian folktales. This corpus, which I call the 'ProppLearner' corpus, was assembled to provide data for an algorithm designed to learn Vladimir Propp's morphology of Russian hero tales. The corpus is the most deeply annotated narrative corpus available at this time. The algorithm and learning results are described elsewhere; here, I provide detail on the layers of annotation and how they were chosen, novel layers of annotation required for successful learning, the selection of the texts for annotation, the annotation process itself, and the resulting inter-annotator agreement measures. In particular, the corpus comprised fifteen texts totaling 18,862 words. There were eighteen layers of annotation, five of which were developed specifically to support learning Propp's morphology: referent attributes, context relationships, event valences, Propp's 'dramatis personae', and Propp's functions. All annotations were created by trained annotators with the Story Workbench annotation tool, following a double-annotation paradigm. I discuss lessons learned from this effort and what they mean for future digital humanities efforts when working with the semantics of natural language text.

Cite

Finlayson, M. A. (2017). ProppLearner: Deeply annotating a corpus of Russian folktales to enable the machine learning of a Russian formalist theory. Digital Scholarship in the Humanities, 32(2), 284–300. https://doi.org/10.1093/llc/fqv067
