Sample Selection for Statistical Grammar Induction

Rebecca Hwa

Conference Proceedings (Open Access)

Proceedings of the 2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, SIGDAT-EMNLP 2000 - Held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics, ACL 2000 (2000) 45-52

DOI: 10.3115/1117794.1117800

Abstract

Corpus-based grammar induction relies on using many hand-parsed sentences as training examples. However, the construction of a training corpus with detailed syntactic analysis for every sentence is a labor-intensive task. We propose to use sample selection methods to minimize the amount of annotation needed in the training data, thereby reducing the workload of the human annotators. This paper shows that the amount of annotated training data can be reduced by 36% without degrading the quality of the induced grammars.

Citation (APA)

Hwa, R. (2000). Sample Selection for Statistical Grammar Induction. In Proceedings of the 2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, SIGDAT-EMNLP 2000 - Held in conjunction with the 38th Annual Meeting of the Association for Computational Linguistics, ACL 2000 (pp. 45–52). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1117794.1117800
