Handling noisy training and testing data

Abstract

In the field of empirical natural language processing, researchers constantly deal with large amounts of marked-up data; whether the markup is done by the researcher or someone else, human nature dictates that it will have errors in it. This paper will more fully characterise the problem and discuss whether and when (and how) to correct the errors. The discussion is illustrated with specific examples involving function tagging in the Penn treebank.

Citation

Blaheta, D. (2002). Handling noisy training and testing data. In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing, EMNLP 2002 (pp. 111–116). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1118693.1118708
