This paper presents an efficient dictionary structure for Part-of-Speech (POS) tagging of Japanese and Korean, obtained by extending Aho and Corasick's pattern matching machine. The proposed method is a simple and fast algorithm that finds all possible morphemes in an input sentence in a single pass, and it stores the grammatical connectivity relations between neighboring morphemes in the output functions of the machine. As a result, the method reduces the cost of both the dictionary lookup and the connection check needed to find the most suitable word segmentation. Simulation results show that the proposed method was 21.8% faster (in CPU time) than the general approach using a trie structure. The number of candidates for connection checking was 27.4% lower than in the original morphological analysis.
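To make the single-pass lookup concrete, the following is a minimal sketch of a standard Aho-Corasick machine whose output function carries POS tags, followed by a simple connection check over adjacent candidates. It is not the authors' implementation: the abstract does not describe how connectivity is encoded in the output functions, so the check here is done after matching rather than pre-computed. The function names, the toy dictionary, and the `CONNECTABLE` table are all assumptions made for illustration.

```python
from collections import deque

def build_machine(dictionary):
    """Build goto, failure and output functions from (surface, pos) pairs."""
    goto = [{}]      # goto[state][char] -> next state
    fail = [0]       # failure function
    output = [[]]    # output[state] -> list of (surface, pos)
    for surface, pos in dictionary:
        state = 0
        for ch in surface:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].append((surface, pos))
    # Breadth-first pass: fill in failure links and merge outputs.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            output[nxt] = output[nxt] + output[fail[nxt]]
    return goto, fail, output

def find_morphemes(text, machine):
    """Return every dictionary match as (start, end, surface, pos), one pass."""
    goto, fail, output = machine
    state = 0
    found = []
    for end, ch in enumerate(text, start=1):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for surface, pos in output[state]:
            found.append((end - len(surface), end, surface, pos))
    return found

def connectable_pairs(found, table):
    """Keep adjacent candidate pairs whose POS tags are allowed to connect."""
    return [(a, b) for a in found for b in found
            if a[1] == b[0] and (a[3], b[3]) in table]

# Hypothetical toy dictionary and connection table (illustration only).
DICTIONARY = [("すもも", "noun"), ("もも", "noun"), ("も", "particle")]
CONNECTABLE = {("noun", "particle"), ("particle", "noun")}

machine = build_machine(DICTIONARY)
found = find_morphemes("すももも", machine)
pairs = connectable_pairs(found, CONNECTABLE)
```

In this sketch `connectable_pairs` compares every pair of candidates after the scan, which is the kind of work the paper aims to reduce by storing connectivity relations in the output functions ahead of time.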
CITATION STYLE
Ando, K., Lee, T. H., Shishibori, M., & Aoe, J. I. (2001). A method of pre-computing connectivity relations for Japanese/Korean POS tagging. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2004, pp. 363–374). Springer Verlag. https://doi.org/10.1007/3-540-44686-9_36