Chinese novelty mining

Yi Zhang; Flora S. Tsai

Conference Proceedings

EMNLP 2009 - Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: A Meeting of SIGDAT, a Special Interest Group of ACL, Held in Conjunction with ACL-IJCNLP 2009 (2009) 1561-1570

DOI: 10.3115/1699648.1699703

36 Citations

91 Readers

Abstract

Automatically mining novel documents or sentences from a chronologically ordered stream of documents or sentences is an open challenge in text mining. In this paper, we describe preprocessing techniques for detecting novel Chinese text and discuss the influence of different part-of-speech (POS) filtering rules on detection performance. Experimental results on the APWSJ and TREC 2004 Novelty Track data show that Chinese novelty mining performance differs considerably between two dissimilar POS filtering rules. Thus, the selection of words used to represent Chinese text is vital to the success of Chinese novelty mining. Moreover, we compare Chinese novelty mining performance with that of English and investigate the impact of preprocessing steps on detecting novel Chinese text, which should be helpful for developing a Chinese novelty mining system. © 2009 ACL and AFNLP.

Cite

APA

Zhang, Y., & Tsai, F. S. (2009). Chinese novelty mining. In EMNLP 2009 - Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: A Meeting of SIGDAT, a Special Interest Group of ACL, Held in Conjunction with ACL-IJCNLP 2009 (pp. 1561–1570). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1699648.1699703
