A web-framework for ODIN annotation

Ryan Georgi; Michael Wayne Goodman; Fei Xia

Conference Proceedings, Open Access

54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - System Demonstrations (2016) 31-36

DOI: 10.18653/v1/p16-4006

Abstract

The current release of the ODIN (Online Database of Interlinear Text) database contains over 150,000 linguistic examples, from nearly 1,500 languages, extracted from PDFs found on the web, representing a significant source of data for language research, particularly for low-resource languages. Errors introduced during PDF-to-text conversion or poorly formatted examples can make the task of automatically analyzing the data more difficult, so we aim to clean and normalize the examples in order to maximize accuracy during analysis. In this paper we describe a system that allows users to automatically and manually correct errors in the source data in order to get the best possible analysis of the data. We also describe a RESTful service for managing collections of linguistic examples on the web. All software is distributed under an open-source license.

Cite

Georgi, R., Goodman, M. W., & Xia, F. (2016). A web-framework for ODIN annotation. In 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - System Demonstrations (pp. 31–36). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/p16-4006
