Abstract
Question generation (QG) is the problem of automatically generating questions from inputs such as declarative sentences. The Question Generation Shared Task Evaluation Challenge (QG-STEC) Task B, which took place in 2010, evaluated several state-of-the-art QG systems. However, analysis of the evaluation results was hampered by low inter-rater reliability. We adapted Nonaka & Takeuchi's knowledge creation cycle to the task of improving the evaluation annotation guidelines; a preliminary test shows clearly improved inter-rater reliability.
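For illustration, inter-rater reliability between two annotators is commonly quantified with a chance-corrected agreement statistic such as Cohen's kappa. The sketch below is a minimal, self-contained example of computing kappa over hypothetical quality ratings of generated questions; the labels and scale are illustrative assumptions, and the paper may report a different agreement statistic.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over the same items."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical ratings of generated questions on a 1-3 quality scale.
rater_a = [1, 2, 2, 3, 1, 2, 3, 3]
rater_b = [1, 2, 3, 3, 1, 2, 2, 3]
print(f"kappa = {cohens_kappa(rater_a, rater_b):.2f}")  # ~0.62 here
```

Values near 1 indicate strong agreement beyond chance; values near 0 indicate agreement no better than chance, which is the kind of outcome that motivates revising annotation guidelines.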
Citation
Godwin, K., & Piwek, P. (2016). Collecting reliable human judgements on machine-generated language: The case of the QG-STEC data. In INLG 2016 - 9th International Natural Language Generation Conference, Proceedings of the Conference (pp. 212–216). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w16-6634