We study the construction of coresets for kernel density estimates. That is, we show how to approximate the kernel density estimate described by a large point set with another kernel density estimate over a much smaller point set. For characteristic kernels (including Gaussian and Laplace kernels), our approximation preserves the L∞ error between kernel density estimates within error ε, with coreset size 4/ε², independent of all other aspects of the data, including the dimension, the diameter of the point set, and the bandwidth of the kernel, on which other approximations commonly depend. When the dimension is unrestricted, we show this bound is tight for these kernels as well as a much broader set. This work provides a careful analysis of the iterative Frank-Wolfe algorithm adapted to this context, an algorithm called kernel herding. This analysis unites a broad line of work that spans statistics, machine learning, and geometry. When the dimension d is constant, we demonstrate much tighter bounds on the size of the coreset specifically for Gaussian kernels, showing that it is bounded by the size of the coreset for axis-aligned rectangles. Currently the best known constructive bound is O((1/ε) logᵈ(1/ε)), and non-constructively this can be improved by a factor of √(log(1/ε)). This improves the best constant-dimension bounds polynomially for d ≥ 3.
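For intuition, below is a minimal Python sketch of kernel herding, the greedy Frank-Wolfe-style procedure the abstract refers to, specialized to a Gaussian kernel with candidates restricted to the input points. The names gaussian_kernel and kernel_herding are illustrative; this is a sketch of the general technique, not the paper's exact algorithm, and it does not by itself certify the 4/ε² guarantee.

```python
# A minimal sketch of kernel herding for building a KDE coreset, assuming a
# Gaussian kernel and restricting the greedy step to candidates from the
# input set. Illustrative only; not the paper's exact algorithm or analysis.
import numpy as np

def gaussian_kernel(X, Y, bandwidth=1.0):
    """Pairwise Gaussian kernel matrix K[i, j] = exp(-||x_i - y_j||^2 / (2 h^2))."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def kernel_herding(P, m, bandwidth=1.0):
    """Greedily select m points of P so the coreset KDE tracks the full KDE.

    Step t+1 picks argmax_x [(t+1) * mu(x) - sum_{s<=t} k(x, q_s)], where
    mu(x) = (1/n) sum_j k(x, p_j) is the kernel mean of the full set.
    """
    K = gaussian_kernel(P, P, bandwidth)   # n x n Gram matrix
    mu = K.mean(axis=1)                    # kernel mean at each input point
    running = np.zeros(len(P))             # sum of k(x_i, q_s) over chosen q_s
    chosen = []
    for t in range(m):
        scores = (t + 1) * mu - running
        i = int(np.argmax(scores))
        chosen.append(i)
        running += K[:, i]
    return P[np.array(chosen)]

# Example: measure the worst-case (L_inf) gap between the two KDEs at the inputs.
rng = np.random.default_rng(0)
P = rng.normal(size=(1000, 3))
Q = kernel_herding(P, m=100)
kde_full = gaussian_kernel(P, P).mean(axis=1)
kde_coreset = gaussian_kernel(P, Q).mean(axis=1)
print("L_inf error at input points:", np.abs(kde_full - kde_coreset).max())
```

The score (t+1)·μ(x) − Σ k(x, q_s) is the standard herding objective: it rewards points where the full kernel density estimate is large and penalizes regions already covered by previously chosen points.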
CITATION STYLE
Phillips, J. M., & Tai, W. M. (2018). Improved coresets for kernel density estimates. In Proceedings of the Annual ACM-SIAM Symposium on Discrete Algorithms (pp. 2718–2727). Association for Computing Machinery. https://doi.org/10.1137/1.9781611975031.173