Persian pronoun resolution using data driven approaches

Abstract

Pronoun resolution is one of the challenges of natural language processing (NLP). Proposed solutions range from heuristic rule-based methods to data-driven machine learning approaches. In this article, we build on a previous machine learning approach to Persian pronoun anaphora resolution. The primary goal of this paper is to improve on its results, mainly by extracting more balanced data through heuristic rules in instance sampling and by using more relevant features in classification. Using the PCAC2008 dataset, we consider noun phrase structure as a way to extract more suitable training data. The incorporated features include syntactic and semantic information. Finally, we train and test different classifiers and compare their results. The best result is achieved with the C4.5 decision tree classifier. The results show a significant improvement over the previous work: an F-measure of 75%, compared with 45%. An analysis of the extracted features and their contribution is also presented.
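The abstract gives no implementation details, so the following is only a minimal sketch of the general mention-pair idea it alludes to: pairing each pronoun with preceding noun-phrase candidates, limiting the candidates with a heuristic to keep negative instances in check, and scoring with the F-measure. The `Mention` type, the `sample_pairs` function, and the `max_distance` window are hypothetical names and assumptions made for illustration; they are not the authors' sampling rules, features, or the PCAC2008 format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Mention:
    text: str
    is_pronoun: bool
    entity_id: int  # gold coreference chain id (hypothetical annotation)


def sample_pairs(
    mentions: List[Mention], max_distance: int = 3
) -> List[Tuple[Mention, Mention, bool]]:
    """Build (pronoun, candidate, label) instances.

    Assumed heuristic: only non-pronoun mentions that precede the pronoun
    by at most `max_distance` positions are kept as candidates, which
    limits the number of negative instances per pronoun.
    """
    pairs = []
    for i, mention in enumerate(mentions):
        if not mention.is_pronoun:
            continue
        start = max(0, i - max_distance)
        for candidate in mentions[start:i]:
            if candidate.is_pronoun:
                continue
            label = candidate.entity_id == mention.entity_id
            pairs.append((mention, candidate, label))
    return pairs


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy document in English glosses; entity 1 is referred to by "he".
mentions = [
    Mention("Ali", False, 1),
    Mention("book", False, 2),
    Mention("table", False, 3),
    Mention("garden", False, 4),
    Mention("he", True, 1),
]
```

In this toy example, a smaller window produces fewer negative instances but can also drop the true antecedent, which is the trade-off any such sampling heuristic has to manage. The resulting labeled pairs would then be turned into feature vectors and passed to a classifier such as a decision tree.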

Nourbakhsh, A., & Bahrani, M. (2017). Persian pronoun resolution using data driven approaches. In Communications in Computer and Information Science (Vol. 756, pp. 574–585). Springer Verlag. https://doi.org/10.1007/978-3-319-67642-5_48
