When evaluating the quality of topics generated by a topic model, the convention is to score topic coherence, either manually or automatically, using the top-N topic words. This hyper-parameter N, or the cardinality of the topic, is often overlooked and selected arbitrarily. In this paper, we investigate the impact of this cardinality hyper-parameter on topic coherence evaluation. For two automatic topic coherence methodologies, we observe that the correlation with human ratings decreases systematically as the cardinality increases. More interestingly, we find that performance can be improved if the system scores and human ratings are aggregated over several topic cardinalities before computing the correlation. In contrast to the standard practice of using a fixed value of N (e.g. N = 5 or N = 10), our results suggest that calculating topic coherence over several different cardinalities and averaging the scores yields a substantially more stable and robust evaluation. We release the code and datasets used in this research for reproducibility.
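The aggregation idea described in the abstract can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' released code: the toy reference corpus, the example topics, the human ratings, and the NPMI-based coherence measure are all assumptions standing in for a real evaluation setup. It scores each topic at several cardinalities, averages the per-topic scores, and then correlates the averaged scores with human ratings.

```python
import math
from itertools import combinations

import numpy as np

# Toy reference corpus (sets of word types per document) -- purely illustrative.
reference_docs = [
    {"cat", "dog", "pet", "food"},
    {"stock", "market", "trade", "price"},
    {"dog", "pet", "vet", "food"},
    {"market", "price", "bank", "trade"},
]

def npmi(w1, w2, docs, eps=1e-12):
    """Normalised PMI of a word pair, estimated from document co-occurrence."""
    n = len(docs)
    p1 = sum(w1 in d for d in docs) / n
    p2 = sum(w2 in d for d in docs) / n
    p12 = sum(w1 in d and w2 in d for d in docs) / n
    if p12 == 0:
        return -1.0
    return math.log(p12 / (p1 * p2 + eps)) / -math.log(p12 + eps)

def coherence(topic_words, n, docs):
    """Automatic coherence of one topic: mean pairwise NPMI over its top-n words."""
    pairs = list(combinations(topic_words[:n], 2))
    return sum(npmi(a, b, docs) for a, b in pairs) / len(pairs)

def averaged_coherence(topics, cardinalities, docs):
    """Score each topic at several cardinalities and average the scores,
    instead of fixing a single top-N."""
    return [
        sum(coherence(t, n, docs) for n in cardinalities) / len(cardinalities)
        for t in topics
    ]

# Hypothetical topics (ranked word lists) and hypothetical human coherence ratings.
topics = [
    ["dog", "pet", "cat", "food", "vet"],            # coherent
    ["stock", "market", "trade", "price", "bank"],   # coherent
    ["stock", "dog", "price", "vet", "cat"],         # mixed
]
human_ratings = [3.0, 2.5, 1.0]

system_scores = averaged_coherence(topics, cardinalities=(3, 4, 5), docs=reference_docs)
r = np.corrcoef(system_scores, human_ratings)[0, 1]
print(system_scores, r)
```

Under this sketch, the only change from standard practice is that `averaged_coherence` loops over several cardinalities rather than committing to one fixed N before correlating with the human ratings.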
Lau, J. H., & Baldwin, T. (2016). The sensitivity of topic coherence evaluation to topic cardinality. In 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2016 - Proceedings of the Conference (pp. 483–487). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/n16-1057