Abstract
In this paper we compare different approaches to extracting four types of definitions, combining a rule-based grammar with machine learning. We collected a Dutch text corpus containing 549 definitions and applied a grammar to it. Machine learning was then used to improve the results obtained with the grammar. Two machine learning experiments were carried out. The first experiment compares a standard classifier with a classifier designed specifically to deal with imbalanced datasets. For most definition types, the classifier designed for imbalanced datasets outperforms the standard classifier. The second experiment shows that classification results improve when information on definition structure is included. © 2009 Association for Computational Linguistics.
Cite
Westerhout, E. (2009). Extraction of definitions using grammar-enhanced machine learning. In EACL 2009 - 12th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings (pp. 88–96). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1609179.1609190