Corpus annotation by generation

Abstract

As the interest in annotated corpora is spreading, there is increasing concern with using existing language technology for corpus processing. In this paper we explore the idea of using natural language generation systems for corpus annotation. Resources for generation systems often focus on areas of linguistic variability that are under-represented in analysis-directed approaches. Therefore, making use of generation resources promises some significant extensions in the kinds of annotation information that can be captured. We focus here on exploring the use of the KPML (Komet-Penman MultiLingual) generation system for corpus annotation. We describe the kinds of linguistic information covered in KPML and show the steps involved in creating a standard XML corpus representation from KPML's generation output.

Cite

Teich, E., Bateman, J. A., & Eckart, R. (2006). Corpus annotation by generation. In COLING ACL 2006 - Frontiers in Linguistically Annotated Corpora 2006, A Merged Workshop with 7th International Workshop on Linguistically Interpreted Corpora, LINC 2006 and Frontiers in Corpus Annotation III, Proceedings of the Workshop (pp. 86–93). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1641991.1642002
