Controlled and balanced dataset for Japanese lexical simplification

Abstract

We propose a new dataset for evaluating Japanese lexical simplification methods. Previous datasets have several deficiencies: all of them substitute only a single target word, and some extract sentences only from a newswire corpus. In addition, most do not allow ties and integrate the simplification rankings of all annotators without considering annotation quality. In contrast, our dataset has the following advantages: (1) it is the first controlled and balanced dataset for Japanese lexical simplification with high correlation with human judgment, and (2) the consistency of the simplification ranking is improved by allowing candidates to tie and by considering the reliability of annotators.

Cite

Kodaira, T., Kajiwara, T., & Komachi, M. (2016). Controlled and balanced dataset for Japanese lexical simplification. In 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Student Research Workshop (pp. 1–7). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/p16-3001
