Abstract
For historical and cultural reasons, English phrases, especially proper nouns and new words, frequently appear in Web pages written primarily in Asian languages such as Chinese and Korean. Although these English terms and their equivalents in the Asian languages refer to the same concept, they are erroneously treated as independent index units in traditional Information Retrieval (IR). This paper describes the degree to which the problem arises in IR and suggests a novel technique to solve it. Our method first extracts an English phrase from Asian language Web pages, and then unifies the extracted phrase and its equivalent(s) in the language as one index unit. Experimental results show that the high precision of our concept unification approach greatly improves IR performance. © 2006 Association for Computational Linguistics.
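The idea of treating an English term and its equivalents as one index unit can be illustrated with a minimal sketch. This is not the authors' implementation: the equivalence table `CONCEPT_OF`, the concept identifier format, the substring matching, and the sample documents are all assumptions made for illustration, and the extraction step that would produce such a table is not shown.

```python
from collections import defaultdict

# Hypothetical equivalence table, hand-written for illustration only.
# Each surface form (English, Chinese, Korean) maps to one concept identifier.
CONCEPT_OF = {
    "internet": "CONCEPT:internet",
    "互联网": "CONCEPT:internet",
    "인터넷": "CONCEPT:internet",
}


def index_units(text):
    """Return the set of concept units whose surface forms occur in the text."""
    lowered = text.lower()
    return {concept for form, concept in CONCEPT_OF.items() if form in lowered}


def build_index(docs):
    """Build an inverted index from concept unit to the set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for unit in index_units(text):
            index[unit].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents sharing at least one concept unit with the query."""
    hits = set()
    for unit in index_units(query):
        hits |= index.get(unit, set())
    return hits


# Made-up sample documents mixing English, Chinese, and Korean.
DOCS = {
    1: "Internet 用户数量持续增长",
    2: "互联网用户数量持续增长",
    3: "인터넷 사용자가 늘고 있다",
    4: "今天天气很好",
}
INDEX = build_index(DOCS)
```

With this sketch, a query in any of the three surface forms, for example `search(INDEX, "互联网")`, reaches documents 1, 2, and 3, whereas an index keyed on raw strings would match only the document containing that exact form.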
Cite
Li, Q., Myaeng, S. H., Jin, Y., & Kang, B. Y. (2006). Concept unification of terms in different languages for IR. In COLING/ACL 2006 - 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Vol. 1, pp. 641–648). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1220175.1220256