A web-framework for ODIN annotation

Abstract

The current release of ODIN (the Online Database of Interlinear Text) contains over 150,000 linguistic examples from nearly 1,500 languages, extracted from PDFs found on the web. It represents a significant source of data for language research, particularly for low-resource languages. Errors introduced during PDF-to-text conversion, as well as poorly formatted examples, can make automatic analysis of the data more difficult, so we aim to clean and normalize the examples to maximize accuracy during analysis. In this paper we describe a system that allows users to correct errors in the source data, both automatically and manually, in order to get the best possible analysis of the data. We also describe a RESTful service for managing collections of linguistic examples on the web. All software is distributed under an open-source license.

Cite

Georgi, R., Goodman, M. W., & Xia, F. (2016). A web-framework for ODIN annotation. In 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - System Demonstrations (pp. 31–36). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/p16-4006
