Word segmentation is an integral step in many knowledge discovery applications. However, existing word segmentation methods run into two problems when applied to Chinese judicial documents: (1) they rely on large-scale labeled data, which is typically unavailable for judicial documents, and (2) judicial documents have their own linguistic features and writing formats. In this paper, a word segmentation method is proposed for Chinese judicial documents. The proposed method consists of two steps: (1) automatically generating labeled data to serve as legal dictionaries, and (2) applying a hybrid multi-layer neural network that incorporates the legal dictionaries to perform word segmentation. Experiments on a dataset of Chinese judicial documents show that the proposed model achieves better results than existing methods.
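The abstract does not say how the legal dictionaries are fed into the network. As a rough illustration only, one common way to incorporate a dictionary into a character-level segmenter is to compute per-character lexicon-match flags and pass them to the model alongside the character embeddings. The sketch below assumes that approach; the function name `dictionary_features`, the four-flag layout, the `max_len` window, and the toy lexicon are all hypothetical and are not taken from the paper.

```python
from typing import List, Set


def dictionary_features(text: str, lexicon: Set[str], max_len: int = 4) -> List[List[int]]:
    """Return one [begin, inside, end, single] flag vector per character.

    A flag is set when some lexicon entry of at most `max_len` characters
    covers the character in that position. This is an assumed feature
    scheme for illustration, not the method described in the paper.
    """
    feats = [[0, 0, 0, 0] for _ in text]
    n = len(text)
    for i in range(n):
        # Try every candidate span starting at i, up to max_len characters.
        for j in range(i + 1, min(n, i + max_len) + 1):
            if text[i:j] not in lexicon:
                continue
            if j - i == 1:
                feats[i][3] = 1          # single-character entry
            else:
                feats[i][0] = 1          # first character of a match
                feats[j - 1][2] = 1      # last character of a match
                for k in range(i + 1, j - 1):
                    feats[k][1] = 1      # interior characters


    return feats


# Hypothetical toy lexicon of legal terms.
legal_lexicon = {"被告人", "盗窃罪", "盗窃", "犯"}
features = dictionary_features("被告人张三犯盗窃罪", legal_lexicon)
```

In such a setup, each flag vector would typically be concatenated with the corresponding character embedding before the sequence-labeling layers, so the network can use dictionary evidence without depending on it entirely.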
CITATION STYLE
Yao, L., Ge, J., Li, C., Yao, Y., Li, Z., Zeng, J., … Chang, V. (2019). Word Segmentation for Chinese Judicial Documents. In Communications in Computer and Information Science (Vol. 1058, pp. 466–478). Springer Verlag. https://doi.org/10.1007/978-981-15-0118-0_36