Error mining is a useful technique for identifying forms that cause incomplete parses of sentences. We extend the iterative method of Sagot and de la Clergerie (2006) to treat n-grams of arbitrary length. An inherent problem of incorporating longer n-grams is data sparseness. Our new method takes sparseness into account, producing n-grams that are as long as necessary to identify problematic forms, but no longer. Not every cause of parsing errors can be captured effectively by looking at word n-grams. We therefore report on an algorithm for building more general patterns for mining, consisting of words and part-of-speech tags. Evaluating error mining techniques is not easy. We propose a new evaluation metric that enables us to compare different error miners.
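To make the idea of iterative error mining concrete, the following is a minimal sketch restricted to single word forms (unigrams). It is written in the spirit of the iterative scheme the abstract builds on, not as the authors' n-gram method. The function name `mine_errors`, the input format (pairs of a token list and a parse-success flag), and the fixed iteration count are all assumptions made for illustration. Each failed sentence starts by spreading its blame evenly over its forms, a form's suspicion is the mean blame over all of its occurrences (occurrences in parsed sentences count as zero), and the blame inside each failed sentence is then redistributed in proportion to those form suspicions.

```python
from collections import defaultdict


def mine_errors(sentences, iterations=10):
    """Return a dict mapping each form to a suspicion score in [0, 1].

    `sentences` is a list of (tokens, parsed_ok) pairs. This input format
    is an assumption made for this sketch.
    """
    # Only sentences that failed to parse carry any blame.
    failed = [toks for toks, ok in sentences if not ok and toks]

    # Count every occurrence of every form, in parsed and failed sentences.
    counts = defaultdict(int)
    for toks, _ in sentences:
        for tok in toks:
            counts[tok] += 1

    # Per-occurrence suspicion: start uniform within each failed sentence.
    obs = [[1.0 / len(toks)] * len(toks) for toks in failed]

    form_susp = {}
    for _ in range(iterations):
        # Form suspicion: mean occurrence suspicion over all occurrences
        # (occurrences in parsed sentences contribute zero).
        sums = defaultdict(float)
        for toks, susp in zip(failed, obs):
            for tok, s in zip(toks, susp):
                sums[tok] += s
        form_susp = {form: sums[form] / counts[form] for form in counts}

        # Redistribute each failed sentence's blame over its occurrences
        # in proportion to the current form suspicions.
        for i, toks in enumerate(failed):
            total = sum(form_susp[tok] for tok in toks)
            if total > 0:
                obs[i] = [form_susp[tok] / total for tok in toks]

    return form_susp


# Toy corpus (invented for illustration): "glorps" occurs only in failures.
corpus = [
    (["the", "cat", "sleeps"], True),
    (["the", "dog", "glorps"], False),
    (["a", "cat", "glorps"], False),
    (["a", "dog", "sleeps"], True),
]
scores = mine_errors(corpus)
most_suspicious = max(scores, key=scores.get)
```

On this toy corpus the form that appears only in failed sentences ends up with the highest score, while forms that also occur in parsed sentences are progressively cleared. Treating longer n-grams the same way multiplies the number of rare candidates, which is the sparseness problem the abstract refers to.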
CITATION STYLE
de Kok, D., Ma, J., & van Noord, G. (2009). A generalized method for iterative error mining in parsing results. In ACL-IJCNLP 2009 - GEAF 2009: 2009 Workshop on Grammar Engineering Across Frameworks, Proceedings of the Workshop (pp. 71–79). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1690359.1690368