Novels are a rich source of data for information extraction. Besides the plot, the characters of a novel are the most important elements shaping the story and its message. An interesting task is to extract these characters from novels in the form of the personas they embody. In this paper, we define such personas and introduce a method to extract them from fiction novels as descriptive phrases. Each persona description is classified into one of three types: facts, states and feelings. The method achieves an extraction precision of 91% and an average classification accuracy of 80%. The algorithm uses universal dependency trees, POS tags and WordNet to capture semantically meaningful descriptions of the characters portrayed. The results can serve as input for future NLP tasks on literary fiction, such as character clustering and classification using techniques like sentence embeddings.
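To illustrate the general idea of reading character descriptions off a dependency tree, the following is a minimal sketch and not the algorithm of the paper. It assumes sentences are already parsed into Universal Dependencies, and the parses here are written by hand. The names `Token`, `extract_descriptions`, `classify` and the tiny `FEELING_WORDS` set (a stand-in for a WordNet lookup) are all hypothetical, as is the rule used to separate facts, states and feelings.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    upos: str    # universal POS tag, e.g. "ADJ", "NOUN", "PROPN"
    head: int    # index of the syntactic head in the sentence, -1 for the root
    deprel: str  # universal dependency relation, e.g. "nsubj", "cop"


# Hypothetical stand-in for a WordNet-based check that a word denotes an emotion.
FEELING_WORDS = {"happy", "sad", "angry", "afraid"}

# Dependents of the predicate that are kept in the descriptive phrase.
MODIFIERS = {"advmod", "amod", "det"}


def classify(predicate: Token) -> str:
    """Assumed heuristic: nominal predicates are facts, emotion adjectives
    are feelings, and every other adjective is a state."""
    if predicate.upos == "NOUN":
        return "fact"
    if predicate.text.lower() in FEELING_WORDS:
        return "feeling"
    return "state"


def extract_descriptions(sentence: list[Token], character: str) -> list[tuple[str, str]]:
    """Return (phrase, type) pairs for copular predicates whose subject is `character`."""
    results = []
    for tok in sentence:
        if tok.deprel != "nsubj" or tok.text != character:
            continue
        predicate = sentence[tok.head]
        # In UD the adjective or noun of a copular clause is the head of the subject.
        if predicate.upos not in {"ADJ", "NOUN"}:
            continue
        # Keep the predicate and its direct modifiers in surface order.
        words = [
            t.text
            for i, t in enumerate(sentence)
            if i == tok.head or (t.head == tok.head and t.deprel in MODIFIERS)
        ]
        results.append((" ".join(words), classify(predicate)))
    return results


# Hand-annotated parse of "Elizabeth was very happy ."
s1 = [
    Token("Elizabeth", "PROPN", 3, "nsubj"),
    Token("was", "AUX", 3, "cop"),
    Token("very", "ADV", 3, "advmod"),
    Token("happy", "ADJ", -1, "root"),
    Token(".", "PUNCT", 3, "punct"),
]

# Hand-annotated parse of "Darcy is a gentleman ."
s2 = [
    Token("Darcy", "PROPN", 3, "nsubj"),
    Token("is", "AUX", 3, "cop"),
    Token("a", "DET", 3, "det"),
    Token("gentleman", "NOUN", -1, "root"),
    Token(".", "PUNCT", 3, "punct"),
]

print(extract_descriptions(s1, "Elizabeth"))
print(extract_descriptions(s2, "Darcy"))
```

In practice the hand-written parses would come from a dependency parser and the feeling check from a lexical resource such as WordNet. The sketch only covers copular clauses of the form "X is Y".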
Prabhu, N., & Natarajan, S. (2019). Extraction of Character Personas from Novels Using Dependency Trees and POS Tags. In Advances in Intelligent Systems and Computing (Vol. 882, pp. 65–74). Springer Verlag. https://doi.org/10.1007/978-981-13-5953-8_6