Text window denoising autoencoder: Building deep architecture for Chinese word segmentation

Ke Wu; Zhiqiang Gao; Cheng Peng; Xiao Wen

Conference Proceedings

Communications in Computer and Information Science (2013) 400 1-12

DOI: 10.1007/978-3-642-41644-6_1

Abstract

Deep learning is the new frontier of machine learning research and has led to many recent breakthroughs in English natural language processing. However, there are inherent differences between Chinese and English, and little work has been done to apply deep learning techniques to Chinese natural language processing. In this paper, we propose a deep neural network model, the text window denoising autoencoder, together with a complete pre-training solution, as a new way to solve classical Chinese natural language processing problems. This method does not require any linguistic knowledge or manual feature design, and can be applied to various Chinese natural language processing tasks, such as Chinese word segmentation. On the PKU dataset of the Chinese word segmentation bakeoff 2005, applying this method decreases the F1 error rate by 11.9% for deep neural network based models. To the best of our knowledge, we are the first to apply deep learning methods to Chinese word segmentation. © Springer-Verlag Berlin Heidelberg 2013.
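The abstract does not describe the model's internals, so the following is only a rough sketch of the general idea behind a denoising autoencoder over text windows: fixed-size character windows are extracted from a sentence, each window is corrupted with noise, and a network would then be trained to reconstruct the clean window from the corrupted one. The window size, the padding and mask symbols, the masking-noise scheme, and the function names `text_windows` and `corrupt` are all assumptions made for illustration, not details taken from the paper.

```python
import random

PAD = "<PAD>"    # hypothetical padding symbol for positions outside the sentence
MASK = "<MASK>"  # hypothetical symbol that stands in for a corrupted character


def text_windows(sentence, size=5):
    """Return one fixed-size character window centred on each character."""
    if size % 2 == 0:
        raise ValueError("size must be odd so that each window has a centre")
    half = size // 2
    padded = [PAD] * half + list(sentence) + [PAD] * half
    return [padded[i:i + size] for i in range(len(sentence))]


def corrupt(window, drop_prob=0.2, rng=random):
    """Masking noise (an assumed scheme): replace each character with MASK
    independently with probability drop_prob."""
    return [MASK if rng.random() < drop_prob else ch for ch in window]


# Build (noisy input, clean target) pairs for a short example sentence.
sentence = "我爱北京"
windows = text_windows(sentence, size=3)
rng = random.Random(0)  # seeded so the corruption is repeatable
pairs = [(corrupt(w, 0.3, rng), w) for w in windows]

for noisy, clean in pairs:
    print(noisy, "->", clean)
```

In a full pipeline, each noisy window would be mapped to vectors and fed to an autoencoder trained to reproduce the clean window; the learned representation could then initialize a segmentation model. Those later stages are omitted here because the abstract gives no details about them.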

Citation (APA)

Wu, K., Gao, Z., Peng, C., & Wen, X. (2013). Text window denoising autoencoder: Building deep architecture for Chinese word segmentation. In Communications in Computer and Information Science (Vol. 400, pp. 1–12). Springer Verlag. https://doi.org/10.1007/978-3-642-41644-6_1