Non-parametric Jensen-Shannon divergence

Abstract

Quantifying the difference between two distributions is a common problem in many machine learning and data mining tasks. Equally common is that we only have empirical data. That is, we know neither the true distributions nor their form, and hence, before we can measure their divergence, we first have to assume a distribution or perform estimation. For exploratory purposes this is unsatisfactory, as we want to explore the data, not our expectations. In this paper we study how to measure the divergence between two distributions non-parametrically. In particular, we formalise the well-known Jensen-Shannon divergence in terms of cumulative distribution functions. This allows us to compute divergences directly and efficiently from data, without the need for estimation. Moreover, empirical evaluation shows that our method detects differences between distributions very well, outperforming the state of the art in both statistical power and efficiency across a wide range of tasks.
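For background, the classical Jensen-Shannon divergence between two distributions P and Q is defined via their mixture M:

    JSD(P || Q) = 1/2 KL(P || M) + 1/2 KL(Q || M),   where M = (P + Q) / 2.

The abstract describes replacing probability functions with cumulative distribution functions so that the divergence can be computed directly from samples. The sketch below, in Python with NumPy, only illustrates that general idea on empirical CDFs; the function names are hypothetical and this is not the exact formulation proposed in the paper.

    import numpy as np

    def ecdf(sample, grid):
        # Empirical CDF of `sample`, evaluated at each point of `grid`.
        sample = np.sort(np.asarray(sample, dtype=float))
        return np.searchsorted(sample, grid, side="right") / len(sample)

    def cdf_js_divergence(x, y):
        # Illustrative JS-style divergence on empirical CDFs.
        # NOTE: a hypothetical sketch, not the exact measure of the paper.
        grid = np.union1d(x, y)            # pooled, sorted evaluation points
        p, q = ecdf(x, grid), ecdf(y, grid)
        m = 0.5 * (p + q)                  # pointwise average of the two CDFs
        eps = 1e-12                        # guard against log(0)
        kl_pm = np.sum(p * np.log((p + eps) / (m + eps)))
        kl_qm = np.sum(q * np.log((q + eps) / (m + eps)))
        return 0.5 * (kl_pm + kl_qm)

    # Usage: larger values indicate a larger difference between the samples.
    rng = np.random.default_rng(0)
    a = rng.normal(0.0, 1.0, size=500)
    b = rng.normal(0.5, 1.5, size=500)
    print(cdf_js_divergence(a, b))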

Citation (APA)

Nguyen, H. V., & Vreeken, J. (2015). Non-parametric Jensen-Shannon divergence. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9285, pp. 173–189). Springer Verlag. https://doi.org/10.1007/978-3-319-23525-7_11
