Integrating punctuation rules and naïve Bayesian model for Chinese creation title recognition

Conrad Chen; Hsin Hsi Chen

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2005) 3651 LNAI 838-848

DOI: 10.1007/11562214_73

Abstract

Creation titles, i.e., titles of literary and/or artistic works, comprise over 7% of the named entities in Chinese documents. They are the fourth largest class of named entities in Chinese, after personal names, location names, and organization names. However, they have rarely been mentioned or studied. Chinese title recognition is challenging for the following reasons. Titles have few internal features and almost no restrictions on naming style. Their lengths and structures vary. Worst of all, they are generally composed of common words, so they look like ordinary sentence fragments. In this paper, we integrate punctuation rules, a lexicon, and naïve Bayesian models to recognize creation titles in Chinese documents. This pioneering study achieves a precision of 0.510 and a recall of 0.685. These promising results can be integrated into Chinese segmentation, used to retrieve relevant information for specific titles, and so on. © Springer-Verlag Berlin Heidelberg 2005.
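
The abstract names the components of the approach but not their details, so the following is only a minimal sketch of how a punctuation rule and a naïve Bayesian filter could be combined. Everything specific in it is an assumption made for illustration and is not taken from the paper: the choice of 《》 and 「」 as title-marking punctuation, the character-level features, the add-one smoothing, the toy training examples, and the names `extract_candidates` and `CharNaiveBayes`. The lexicon component is omitted.

```python
import math
import re
from collections import Counter

# Assumed punctuation rule: text enclosed in book-title marks or corner
# quotes is treated as a candidate title (hypothetical, not from the paper).
CANDIDATE_PATTERN = re.compile(r"《([^《》]+)》|「([^「」]+)」")


def extract_candidates(text):
    """Return the strings enclosed by the assumed title punctuation, in order."""
    return [m.group(1) or m.group(2) for m in CANDIDATE_PATTERN.finditer(text)]


class CharNaiveBayes:
    """Character-level naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.char_counts = {True: Counter(), False: Counter()}
        self.doc_counts = Counter()
        self.vocab = set()

    def train(self, examples):
        # examples: iterable of (text, is_title) pairs; both classes must occur.
        for text, is_title in examples:
            self.doc_counts[is_title] += 1
            self.char_counts[is_title].update(text)
            self.vocab.update(text)

    def log_score(self, text, is_title):
        # log P(class) + sum of log P(char | class), with add-one smoothing.
        total_docs = sum(self.doc_counts.values())
        counts = self.char_counts[is_title]
        denom = sum(counts.values()) + len(self.vocab) + 1  # +1 slot for unseen chars
        score = math.log(self.doc_counts[is_title] / total_docs)
        for ch in text:
            score += math.log((counts[ch] + 1) / denom)
        return score

    def is_title(self, text):
        return self.log_score(text, True) > self.log_score(text, False)


# Toy training data, invented for this sketch.
model = CharNaiveBayes()
model.train([("紅樓夢", True), ("西遊記", True), ("很好", False), ("謝謝", False)])

text = "他讀過《紅樓夢》,也說「謝謝」。"
candidates = extract_candidates(text)
titles = [c for c in candidates if model.is_title(c)]
```

In this sketch the punctuation rule proposes candidates and the classifier filters them, which is one plausible reading of "integrating" the two; the paper may combine them differently.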

Citation (APA)

Chen, C., & Chen, H. H. (2005). Integrating punctuation rules and naïve Bayesian model for Chinese creation title recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3651 LNAI, pp. 838–848). https://doi.org/10.1007/11562214_73
