Homonym normalisation by word sense clustering: a case in Japanese

Abstract

This work presents a method of word sense clustering that differentiates homonyms and merges homophones, taking Japanese as an example, where orthographical variation causes problems for language processing. The method uses contextualised embeddings (BERT) to cluster tokens into distinct sense groups, and we use these groups to normalise synonymous instances to a single representative form. We observe the benefit of this normalisation in language modelling as well as in transliteration.
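
The cluster-then-normalise idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not name a clustering algorithm, so the greedy cosine-threshold clustering below, the `threshold` value, the function names, and the choice of the most frequent spelling as the representative form are all assumptions made for illustration. The three-dimensional vectors are hand-made stand-ins for BERT token embeddings, which in practice would come from a pretrained model.

```python
from collections import Counter

import numpy as np


def cluster_by_cosine(vectors, threshold=0.8):
    """Greedy one-pass clustering (an assumed stand-in, not the paper's method).

    Each vector joins the existing cluster whose centroid is most similar,
    provided the cosine similarity is at least `threshold`; otherwise it
    opens a new cluster. Returns one integer label per input vector.
    """
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    centroid_sums = []  # running sum of unit vectors for each cluster
    labels = []
    for v in unit:
        best, best_sim = -1, threshold
        for i, c in enumerate(centroid_sums):
            sim = float(v @ c / np.linalg.norm(c))
            if sim >= best_sim:
                best, best_sim = i, sim
        if best < 0:
            centroid_sums.append(v.copy())
            labels.append(len(centroid_sums) - 1)
        else:
            centroid_sums[best] += v
            labels.append(best)
    return labels


def normalise(surfaces, labels):
    """Rewrite each token as the most frequent surface form in its cluster."""
    counts = {}
    for surface, label in zip(surfaces, labels):
        counts.setdefault(label, Counter())[surface] += 1
    representative = {l: c.most_common(1)[0][0] for l, c in counts.items()}
    return [representative[l] for l in labels]


# Tokens that all share the reading "atsui". The kana spelling is ambiguous
# between the senses "hot (weather)" and "thick".
surfaces = ["暑い", "あつい", "暑い", "厚い", "あつい", "厚い"]

# Toy vectors standing in for contextual embeddings (hypothetical values):
# the first three occur in "weather" contexts, the last three in "thick" ones.
embeddings = np.array([
    [1.00, 0.10, 0.00],
    [0.90, 0.20, 0.00],
    [0.95, 0.00, 0.10],
    [0.00, 1.00, 0.10],
    [0.10, 0.90, 0.00],
    [0.00, 0.95, 0.00],
])

labels = cluster_by_cosine(embeddings)
normalised = normalise(surfaces, labels)
print(labels)
print(normalised)
```

In this toy input the two occurrences of the kana spelling あつい land in different clusters because their context vectors differ, and each is rewritten to the dominant kanji spelling of its cluster. That mirrors, in miniature, separating the senses of one surface form while merging spelling variants that share a sense.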

Cite

Sato, Y., & Heffernan, K. (2020). Homonym normalisation by word sense clustering: a case in Japanese. In COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference (pp. 3324–3332). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.coling-main.295
