Enhancing Chinese word segmentation with character clustering

Abstract

In the semi-supervised learning framework, clustering has proven to be a helpful feature for improving system performance in NER and other NLP tasks. However, no prior work has employed clustering in word segmentation. In this paper, we propose a new approach that computes clusters of characters and uses them to assist a character-based Chinese word segmentation system. Contextual information is taken into account during character clustering to address character ambiguity. Experiments show that our character clusters improve performance. We also compare our cluster features with the widely used mutual information (MI) feature. When the two features are combined, further improvement is achieved. © Springer-Verlag 2013.

Cite

Liu, Y., Che, W., & Liu, T. (2013). Enhancing Chinese word segmentation with character clustering. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8202 LNAI, pp. 52–60). https://doi.org/10.1007/978-3-642-41491-6_6
