Abstract
Many models in natural language processing define probability distributions over linguistic structures. We argue that (1) the quality of a model's posterior distribution can and should be directly evaluated, i.e., whether predicted probabilities correspond to empirical frequencies; and (2) NLP uncertainty can be projected not only to pipeline components, but also to exploratory data analysis, telling a user when to trust and not trust the NLP analysis. We present a method to analyze calibration and apply it to compare the miscalibration of several commonly used models. We also contribute a coreference sampling algorithm that can create confidence intervals for a political event extraction task.
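The core idea of calibration analysis (checking whether predicted probabilities match empirical frequencies) can be illustrated with a short sketch. This is a generic binned-calibration example, not the paper's exact procedure; the function names and the fixed-width binning scheme are illustrative assumptions.

```python
import numpy as np

def calibration_table(probs, labels, n_bins=10):
    """Bin predicted probabilities into fixed-width bins and compare
    each bin's mean predicted probability to the empirical frequency
    of positive labels. A well-calibrated model has mean prediction
    approximately equal to empirical frequency in every bin.
    (Illustrative sketch; binning scheme is an assumption.)"""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # clip so that p == 1.0 falls into the top bin
    bin_ids = np.clip(np.digitize(probs, edges[1:-1]), 0, n_bins - 1)
    rows = []
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # (mean predicted prob, empirical frequency, bin count)
            rows.append((probs[mask].mean(), labels[mask].mean(), int(mask.sum())))
    return rows

def expected_calibration_error(probs, labels, n_bins=10):
    """Count-weighted average of |mean predicted prob - empirical
    frequency| across bins; 0 indicates perfect binned calibration."""
    n = len(probs)
    return sum(w * abs(p - f)
               for p, f, w in calibration_table(probs, labels, n_bins)) / n
```

For example, a model that always predicts 0.9 on examples that are never positive would show an error near 0.9, while predictions of 0.2 that come true one time in five would score near 0.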
Citation
Nguyen, K., & O’Connor, B. (2015). Posterior calibration and exploratory analysis for natural language processing models. In Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing (pp. 1587–1598). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/d15-1182