This paper describes the identification of prominent characters and their adjectives in English texts of the Indian mythological epic, the Mahabharata. In contrast to traditional approaches to named entity identification, the present system also extracts hidden attributes associated with each character (e.g., character adjectives). We observed distinct phrase-level linguistic patterns that hint at the presence of characters in different text spans, and used six such patterns to extract the characters. In addition, a distinguishing set of novel features (e.g., multi-word expressions, nodes and paths of the parse tree, immediate ancestors) was employed. The correlation of the features was also measured in order to identify the important ones. Finally, we applied various machine learning algorithms (e.g., Naive Bayes, KNN, Logistic Regression, Decision Tree, Random Forest) along with deep learning to classify the patterns as characters or non-characters with decent accuracy. Evaluation shows that the phrase-level linguistic patterns as well as the adopted features are highly effective in capturing characters and their adjectives.
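The correlation step mentioned above can be illustrated with a minimal sketch. The abstract does not say which correlation measure was used or how it was applied, so the choice here (absolute Pearson correlation between each feature and the binary character/non-character label) is an assumption. The feature names and toy values are also hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column carries no linear signal.
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_features(rows, labels, names):
    """Rank features by absolute correlation with the label (assumed criterion)."""
    scores = {}
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        scores[name] = abs(pearson(column, labels))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical feature names and toy values, for illustration only.
names = ["is_multiword", "parse_depth", "token_length"]
rows = [
    [1, 2, 5],
    [1, 3, 7],
    [0, 5, 6],
    [0, 6, 5],
    [1, 2, 7],
    [0, 5, 6],
]
labels = [1, 1, 0, 0, 1, 0]  # 1 = character, 0 = non-character

ranking = rank_features(rows, labels, names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Features at the top of such a ranking would be candidates to keep. A real pipeline might also check correlations between pairs of features to drop redundant ones.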
Paul, A., & Das, D. (2017). Identification of character adjectives from Mahabharata. In International Conference Recent Advances in Natural Language Processing, RANLP (Vol. 2017-September, pp. 569–576). Incoma Ltd. https://doi.org/10.26615/978-954-452-049-6_074