Prevalent approaches to Chinese word segmentation mostly rely on the Bi-LSTM neural network. However, Bi-LSTM-based methods have an inherent drawback: vanishing gradients, which make them inefficient at capturing information from distant characters in a long sentence. In this work, we propose a novel sequence-to-sequence transformer model for Chinese word segmentation premised on a type of convolutional neural network called a temporal convolutional network (TCN). The model uses the TCN to construct an encoder, a fully-connected neural network to build a decoder, and the Viterbi algorithm as an inference layer that infers the final segmentation result. Meanwhile, the model captures distant character information in long sentences by stacking additional encoder layers. To achieve superior segmentation results, the model incorporates a Conditional Random Fields (CRF) model for training its parameters. Experiments on Chinese corpora show that the model outperforms the Bi-LSTM model on Chinese word segmentation and handles long sentences better than the Bi-LSTM.
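The inference layer described above can be illustrated with a minimal sketch of Viterbi decoding over the standard BMES tagging scheme for word segmentation. This is a hypothetical illustration, not the paper's implementation: the tag set, emission scores, and transition scores shown here are assumptions for demonstration; in the actual model the scores would come from the TCN encoder, decoder, and CRF parameters.

```python
# Illustrative sketch: Viterbi inference over BMES tag scores.
# TAGS, emissions, and transitions are hypothetical, not from the paper.
TAGS = ["B", "M", "E", "S"]  # begin / middle / end of word, single-character word

def viterbi(emissions, transitions):
    """emissions: list of {tag: score}, one dict per character.
    transitions: {(prev_tag, cur_tag): score}.
    Returns the highest-scoring tag sequence (max-sum decoding)."""
    n = len(emissions)
    # score[t] = best path score ending in tag t at the current position
    score = {t: emissions[0][t] for t in TAGS}
    backpointers = []
    for i in range(1, n):
        new_score, ptr = {}, {}
        for cur in TAGS:
            prev = max(TAGS, key=lambda p: score[p] + transitions[(p, cur)])
            new_score[cur] = score[prev] + transitions[(prev, cur)] + emissions[i][cur]
            ptr[cur] = prev
        score = new_score
        backpointers.append(ptr)
    # Backtrack from the best final tag to recover the full sequence
    best = max(TAGS, key=score.get)
    path = [best]
    for ptr in reversed(backpointers):
        path.append(ptr[path[-1]])
    return list(reversed(path))
```

For example, with two characters whose emission scores favor "B" then "E", and transition scores that reward valid BMES transitions, the decoder returns `["B", "E"]`, segmenting the two characters as one word.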
CITATION STYLE
Jiang, W., Wang, Y., & Tang, Y. (2020). A sequence-to-sequence transformer premised temporal convolutional network for Chinese word segmentation. In Communications in Computer and Information Science (Vol. 1163, pp. 541–552). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-15-2767-8_47