Abstract
Personal attribute extraction plays a significant role in information mining, event tracing, and personal name disambiguation. It mainly involves two problems: recognizing attributes, and deciding whether a recognized attribute belongs to the person being extracted. Personal attributes generally involve named entities, which are recognized mainly by adjusting word segmentation software. Attributes that cannot be recognized by word segmentation are recognized with a combination of feature words and rules. A combination of sentence classification and rules is employed for the attribute ownership decision. First, all sentences in the document are classified into those with attribute words and those without, and the latter are omitted. The former are then classified into description sentences with one person and description sentences with multiple persons, according to whether more than one person is described in the sentence. According to statistics on description sentences with one person, anaphora resolution is not necessary, which avoids recognition errors caused by anaphora resolution failures. Minimum slicing is used for description sentences with multiple persons, and the attribute ownership decision is made within the minimum language segment in which the person and the attribute co-occur. This method achieves SF values of 0.507388780 and 0.489505010 in the lenient and strict evaluations, respectively, of the CIPS-SIGHAN 2014 Bakeoff, which turned out to be the best results. These results show that the method is effective.
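The ownership decision described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that person names and attribute values are already available as plain strings (in the paper they come from word segmentation, feature words, and rules), that sentences end at sentence-final punctuation, and that the "minimum language segment" can be approximated by a clause delimited by commas or semicolons. The function names split_sentences, split_segments, and assign_attributes are hypothetical.

```python
import re
from typing import List, Tuple


def split_sentences(text: str) -> List[str]:
    """Split text on sentence-final punctuation (Chinese or Western)."""
    return [s.strip() for s in re.split(r"[。！？.!?]", text) if s.strip()]


def split_segments(sentence: str) -> List[str]:
    """Approximate minimum slicing: cut a sentence into clauses (assumption)."""
    return [s.strip() for s in re.split(r"[，；,;]", sentence) if s.strip()]


def assign_attributes(
    text: str, persons: List[str], attributes: List[str]
) -> List[Tuple[str, str]]:
    """Return (person, attribute) pairs using sentence classification."""
    pairs: List[Tuple[str, str]] = []
    for sentence in split_sentences(text):
        attrs_here = [a for a in attributes if a in sentence]
        if not attrs_here:
            continue  # sentence without attribute words: omitted
        persons_here = [p for p in persons if p in sentence]
        if len(persons_here) == 1:
            # One-person description sentence: every attribute goes to
            # that person, with no anaphora resolution.
            pairs.extend((persons_here[0], a) for a in attrs_here)
        elif len(persons_here) > 1:
            # Multi-person description sentence: decide ownership inside
            # each segment where exactly one person co-occurs with attributes.
            for segment in split_segments(sentence):
                seg_persons = [p for p in persons if p in segment]
                seg_attrs = [a for a in attributes if a in segment]
                if len(seg_persons) == 1:
                    pairs.extend((seg_persons[0], a) for a in seg_attrs)
        # Sentences that mention an attribute but no person are skipped here.
    return pairs


example = (
    "Alice was born in Paris. "
    "Bob lives in Berlin, and Carol works in Rome. "
    "The weather was fine."
)
print(assign_attributes(example, ["Alice", "Bob", "Carol"], ["Paris", "Berlin", "Rome"]))
# [('Alice', 'Paris'), ('Bob', 'Berlin'), ('Carol', 'Rome')]
```

In this sketch the third sentence is dropped because it contains no attribute word, the first is handled as a one-person sentence, and the second is sliced into clauses so that each attribute is paired with the only person in its clause. Substring matching stands in for real entity recognition and would need to be replaced for actual Chinese text.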
Cite
Cheng, N. C., Zong, C. Q., Hou, M., & Teng, Y. L. (2014). A Study on Personal Attributes Extraction Based on the Combination of Sentences Classifications and Rules. In CLP 2014 - 3rd CIPS-SIGHAN Joint Conference on Chinese Language Processing (pp. 192–201). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/w14-6831