Mining commonsense knowledge from corpora suffers from reporting bias, over-representing the rare at the expense of the trivial (Gordon and Van Durme, 2013). We study to what extent pre-trained language models overcome this issue. We find that while their generalization capacity allows them to better estimate the plausibility of frequent but rarely mentioned actions, outcomes, and properties, they also tend to overestimate the plausibility of the very rare, amplifying the bias that already exists in their training corpus.
Citation:
Shwartz, V., & Choi, Y. (2020). Do Neural Language Models Overcome Reporting Bias? In COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference (pp. 6863–6870). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.coling-main.605