Abstract
Many models in natural language processing define probability distributions over linguistic structures. We argue that (1) the quality of a model's posterior distribution can and should be directly evaluated, i.e., whether predicted probabilities correspond to empirical frequencies; and (2) NLP uncertainty can be projected not only to pipeline components, but also to exploratory data analysis, telling a user when to trust and not trust the NLP analysis. We present a method to analyze calibration and apply it to compare the miscalibration of several commonly used models. We also contribute a coreference sampling algorithm that can create confidence intervals for a political event extraction task.
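The core idea of calibration analysis (checking whether predicted probabilities match empirical frequencies) can be illustrated with a short sketch. This is a generic binned-calibration example, not the paper's exact procedure; the function names and the fixed-width binning scheme are illustrative assumptions.

```python
import numpy as np

def calibration_table(probs, labels, n_bins=10):
    """Bin predicted probabilities into fixed-width bins and compare
    each bin's mean predicted probability to the empirical frequency
    of positive labels. A well-calibrated model has mean prediction
    approximately equal to empirical frequency in every bin.
    (Illustrative sketch; binning scheme is an assumption.)"""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # clip so that p == 1.0 falls into the top bin
    bin_ids = np.clip(np.digitize(probs, edges[1:-1]), 0, n_bins - 1)
    rows = []
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # (mean predicted prob, empirical frequency, bin count)
            rows.append((probs[mask].mean(), labels[mask].mean(), int(mask.sum())))
    return rows

def expected_calibration_error(probs, labels, n_bins=10):
    """Count-weighted average of |mean predicted prob - empirical
    frequency| across bins; 0 indicates perfect binned calibration."""
    n = len(probs)
    return sum(w * abs(p - f)
               for p, f, w in calibration_table(probs, labels, n_bins)) / n
```

For example, a model that always predicts 0.9 on examples that are never positive would show an error near 0.9, while predictions of 0.2 that come true one time in five would score near 0.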
Citation
Nguyen, K., & O’Connor, B. (2015). Posterior calibration and exploratory analysis for natural language processing models. In Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing (pp. 1587–1598). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/d15-1182