Recent demands for translating Japanese statutes into foreign languages necessitate the compilation of standard bilingual dictionaries. To support this costly task, we propose Monaka, a bootstrapping-based lexical knowledge extraction algorithm that automatically extracts dictionary term candidates from unsegmented Japanese legal text. The algorithm is based on the Tchai algorithm and extracts reliable patterns and instances iteratively, but it instead uses character n-grams as contextual patterns and introduces a special constraint to ensure proper segmentation of the extracted terms. Experimental results show that the algorithm extracts correctly segmented, important dictionary terms with higher accuracy than conventional methods. © Springer-Verlag Berlin Heidelberg 2009.
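The iterative pattern-and-instance loop described above can be illustrated with a minimal sketch. This is not the published Monaka algorithm: the reliability scoring inherited from Tchai and the segmentation constraint are not specified in this abstract and are omitted here. The function names, the toy corpus, the seed term, and the parameters `n`, `top_k`, and `max_len` are all hypothetical, and patterns are ranked by raw frequency only as a stand-in for a real scoring function.

```python
import re
from collections import Counter


def find_patterns(corpus, instances, n=1):
    """Count (left, right) character n-gram contexts around known instances."""
    counts = Counter()
    for text in corpus:
        for inst in instances:
            start = text.find(inst)
            while start != -1:
                end = start + len(inst)
                left = text[max(0, start - n):start]
                right = text[end:end + n]
                # Keep only contexts that have a full n-gram on both sides.
                if len(left) == n and len(right) == n:
                    counts[(left, right)] += 1
                start = text.find(inst, start + 1)
    return counts


def find_instances(corpus, patterns, max_len=4):
    """Extract the shortest strings (up to max_len chars) enclosed by each pattern."""
    found = set()
    for left, right in patterns:
        regex = re.compile(
            re.escape(left) + "(.{1,%d}?)" % max_len + re.escape(right)
        )
        for text in corpus:
            found.update(m.group(1) for m in regex.finditer(text))
    return found


def bootstrap(corpus, seeds, iterations=2, top_k=1, n=1):
    """Alternate between pattern induction and instance extraction."""
    instances = set(seeds)
    for _ in range(iterations):
        counts = find_patterns(corpus, instances, n)
        # Assumption: raw frequency stands in for a reliability score.
        best = [pattern for pattern, _ in counts.most_common(top_k)]
        instances |= find_instances(corpus, best)
    return instances


# Hypothetical unsegmented toy corpus and a single seed term.
corpus = [
    "原告は訴状を提出する",
    "被告は答弁書を提出する",
    "裁判所は判決を言い渡す",
    "原告は証拠を提出する",
]
terms = bootstrap(corpus, {"訴状"})
print(sorted(terms))
```

In this toy run, the seed 訴状 induces the context pattern (は, を), which in turn picks up 答弁書, 判決, and 証拠. On real text, frequency-ranked patterns with no segmentation check would readily extract fragments that cross word boundaries, which is the problem the constraint mentioned in the abstract is meant to address.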
CITATION STYLE
Hagiwara, M., Ogawa, Y., & Toyama, K. (2009). Bootstrapping-Based extraction of dictionary terms from unsegmented legal text. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5447 LNAI, pp. 213–227). https://doi.org/10.1007/978-3-642-00609-8_19