A Proposal for a Coherence Corpus in Machine Translation

Karin Sim Smith; Wilker Aziz; Lucia Specia

Conference Proceedings, Open Access

DiscoMT 2015 - Discourse in Machine Translation, Proceedings of the Workshop (2015) 52-58

DOI: 10.18653/v1/w15-2507

Abstract

Coherence in Machine Translation (MT) has received little attention to date. One of the main obstacles to work in this area is the lack of labelled data. Coherent (human-authored) texts are abundant, and incoherent texts could be taken from MT output, but MT output also contains other errors that are not specifically related to coherence, which makes it difficult to identify and quantify coherence issues in those texts. We introduce an initiative to create a corpus of data artificially manipulated to contain the coherence errors common in MT output. Such a corpus could then serve as a benchmark for coherence models in MT, and potentially as training data for coherence models in supervised settings.

Citation (APA)

Sim Smith, K., Aziz, W., & Specia, L. (2015). A Proposal for a Coherence Corpus in Machine Translation. In DiscoMT 2015 - Discourse in Machine Translation, Proceedings of the Workshop (pp. 52–58). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w15-2507
