Recurrent neural networks (RNNs) have been widely applied to natural language processing (NLP) problems. This kind of network is designed for modeling sequential data and has proven effective in sequence tagging tasks. In this paper, we propose using a bi-directional RNN with long short-term memory (LSTM) units for Chinese word segmentation, a crucial task for modeling Chinese sentences and articles. Classical methods focus on designing and combining hand-crafted features from context, whereas a bi-directional LSTM (BLSTM) network needs no prior knowledge or feature pre-design and is well suited to building hierarchical feature representations of contextual information from both directions. Experimental results show that our approach achieves state-of-the-art performance in word segmentation on both traditional Chinese and simplified Chinese datasets.
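To make the sequence tagging view of word segmentation concrete, the following minimal sketch shows how a segmented sentence can be encoded as one tag per character and decoded back into words. It assumes the common B/M/E/S scheme (begin, middle, end, single-character word), which the abstract itself does not specify; the function names and the example sentence are hypothetical, and the tagger that would predict these tags (for instance a BLSTM) is not shown.

```python
# Assumption: B/M/E/S character tagging; the abstract does not name the tag set.

def words_to_tags(words):
    """Map a list of segmented words to one B/M/E/S tag per character."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            # first character, zero or more middle characters, last character
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def tags_to_words(chars, tags):
    """Rebuild words from characters and their (predicted) tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word boundary follows this character
            words.append(current)
            current = ""
    if current:  # tolerate a tag sequence that ends without E or S
        words.append(current)
    return words


# Hypothetical example sentence, already segmented by hand.
words = ["我们", "喜欢", "自然语言", "处理"]
tags = words_to_tags(words)
restored = tags_to_words("".join(words), tags)
```

Under this encoding, a tagger only has to assign one of four labels to each character, and the segmentation follows directly from the label sequence.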
Yao, Y., & Huang, Z. (2016). Bi-directional LSTM recurrent neural network for Chinese word segmentation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9950 LNCS, pp. 345–353). Springer Verlag. https://doi.org/10.1007/978-3-319-46681-1_42