This paper studies the properties and performance of models for estimating local probability distributions that are used as components of larger probabilistic systems, namely history-based generative parsing models. We report experimental results showing that memory-based learning outperforms many commonly used methods for this task (Witten-Bell, Jelinek-Mercer with fixed weights, decision trees, and log-linear models). We then connect these results to the commonly used general class of deleted interpolation models by showing that certain types of memory-based learning, including the kind that performed best in our experiments, are instances of this class. In addition, we illustrate the divergence between joint and conditional data likelihood on the one hand and accuracy on the other for such models, suggesting that smoothing based on directly optimizing accuracy might greatly improve performance.
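To make the interpolation baseline concrete, here is a minimal sketch of fixed-weight Jelinek-Mercer smoothing for a conditional distribution P(y | h1, h2), mixing the full-context relative frequency with a backed-off estimate and a uniform floor. The two-feature context, the example weights, and all identifiers below are illustrative assumptions, not the paper's actual parser features or tuned weights.

```python
from collections import Counter

def train_counts(data):
    """Collect joint and context counts at two back-off levels.

    data: iterable of ((h1, h2), y) pairs, where (h1, h2) is a
    two-feature conditioning history (illustrative) and y the outcome.
    """
    joint2, ctx2 = Counter(), Counter()  # full context (h1, h2)
    joint1, ctx1 = Counter(), Counter()  # backed-off context (h1,)
    labels = set()
    for (h1, h2), y in data:
        joint2[(h1, h2, y)] += 1
        ctx2[(h1, h2)] += 1
        joint1[(h1, y)] += 1
        ctx1[h1] += 1
        labels.add(y)
    return joint2, ctx2, joint1, ctx1, labels

def p_interp(h1, h2, y, model, lam=(0.6, 0.3, 0.1)):
    """Fixed-weight Jelinek-Mercer estimate:
    lam[0]*P(y|h1,h2) + lam[1]*P(y|h1) + lam[2]*uniform.
    The weights lam are fixed by assumption here (they sum to 1);
    in practice they would be set on held-out data.
    """
    joint2, ctx2, joint1, ctx1, labels = model
    p2 = joint2[(h1, h2, y)] / ctx2[(h1, h2)] if ctx2[(h1, h2)] else 0.0
    p1 = joint1[(h1, y)] / ctx1[h1] if ctx1[h1] else 0.0
    p0 = 1.0 / len(labels)  # uniform distribution over seen labels
    return lam[0] * p2 + lam[1] * p1 + lam[2] * p0
```

Because each component distribution sums to one over the label set and the weights sum to one, the interpolated estimate is itself a proper distribution. Memory-based (nearest-neighbor) estimators can be cast in the same form, with the back-off levels playing the role of increasingly coarse neighborhoods.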
Citation:
Toutanova, K., Mitchell, M., & Manning, C. D. (2003). Optimizing local probability models for statistical parsing. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 2837, pp. 409–420). Springer Verlag. https://doi.org/10.1007/978-3-540-39857-8_37