To realize high-quality machine translation, we proposed a Non-Compositional Language Model and developed a sentence pattern dictionary of 226,800 pattern pairs for Japanese compound and complex sentences consisting of two or three clauses. When patterns were generated from a parallel corpus, the Compositional Constituents that could be generalized amounted to 74% of independent words, 24% of phrases, and only 15% of clauses. This means that in Japanese-to-English MT, most of the translations shown in the parallel corpus could not be obtained by methods based on Compositional Semantics. The dictionary achieved a syntactic coverage of 98% and a semantic coverage of 78%, and it is expected to substantially improve translation quality. © 2008 Licensed under the Creative Commons.
Ikehara, S., Tokuhisa, M., & Murakami, J. (2008). Non-compositional language model and pattern dictionary development for Japanese compound and complex sentences. In Coling 2008 - 22nd International Conference on Computational Linguistics, Proceedings of the Conference (Vol. 1, pp. 353–360). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1599081.1599126