This paper introduces a novel data clustering algorithm based on Langevin dynamics, where the associated potential is constructed directly from the data. To obtain a self-consistent potential, we adopt the potential model from the established Quantum Clustering method. The first step is to use a radial basis function to construct a density distribution from the data. A potential function is then constructed such that this density distribution is the ground state solution to the time-independent Schrödinger equation. The second step is to use this potential function with Langevin dynamics at subcritical temperature to avoid ergodicity. The Langevin equations take a classical Gibbs distribution as the invariant measure, where the peaks of the distribution coincide with the minima of the potential surface. The time dynamics of individual data points lead to different metastable states, which are interpreted as cluster centers. Clustering is therefore achieved when subsets of the data, evolved under the Langevin dynamics for a moderate period of time, aggregate in the neighborhood of a particular potential minimum. While the data points are pushed towards potential minima by the potential gradient, Brownian motion allows them to effectively tunnel through local potential barriers and escape saddle points, reaching regions of the potential surface that would otherwise be inaccessible. The algorithm's feasibility is first established through several illustrative examples and theoretical analyses, followed by a stricter evaluation using a standard benchmark dataset.
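The two steps described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Gaussian radial basis function of width `sigma`, the standard Quantum Clustering form of the potential (defined up to an additive constant), overdamped Langevin dynamics at a fixed temperature, and an Euler-Maruyama time discretization. The function names and every parameter value are hypothetical choices made for illustration.

```python
import numpy as np

def qc_potential_and_grad(x, data, sigma):
    """Quantum Clustering potential (up to a constant) and its gradient.

    Assumes psi(x) = sum_i exp(-|x - x_i|^2 / (2 sigma^2)) and
    V(x) = (1 / (2 sigma^2)) * sum_i p_i |x - x_i|^2, with p_i = w_i / psi.
    x: (m, d) query points, data: (n, d) fixed data points.
    """
    diff = x[:, None, :] - data[None, :, :]        # (m, n, d)
    r2 = np.sum(diff ** 2, axis=2)                 # (m, n)
    logw = -r2 / (2.0 * sigma ** 2)
    logw -= logw.max(axis=1, keepdims=True)        # avoid underflow far from data
    p = np.exp(logw)
    p /= p.sum(axis=1, keepdims=True)              # normalized Gaussian weights
    u = np.sum(p * r2, axis=1)                     # weighted mean squared distance
    V = u / (2.0 * sigma ** 2)
    # Quotient rule applied to u = A / psi, with A = sum_i w_i |x - x_i|^2.
    term1 = np.einsum("mn,mnd->md", p * (2.0 - r2 / sigma ** 2), diff)
    term2 = u[:, None] * np.einsum("mn,mnd->md", p, diff) / sigma ** 2
    grad = (term1 + term2) / (2.0 * sigma ** 2)
    return V, grad

def langevin_cluster(data, sigma, temperature, dt, n_steps, seed=0):
    """Evolve a copy of each data point with overdamped Langevin dynamics.

    Euler-Maruyama step (an assumed discretization):
    x <- x - grad V(x) dt + sqrt(2 T dt) * N(0, I).
    The potential is built once from the original data and stays fixed.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    x = data.copy()
    for _ in range(n_steps):
        _, grad = qc_potential_and_grad(x, data, sigma)
        noise = rng.standard_normal(x.shape)
        x += -grad * dt + np.sqrt(2.0 * temperature * dt) * noise
    return x

if __name__ == "__main__":
    # Two synthetic, well-separated groups (illustrative data only).
    rng = np.random.default_rng(1)
    data = np.vstack([
        rng.normal([0.0, 0.0], 0.3, size=(20, 2)),
        rng.normal([5.0, 0.0], 0.3, size=(20, 2)),
    ])
    final = langevin_cluster(data, sigma=1.0, temperature=0.01,
                             dt=0.05, n_steps=400)
    print("group means after dynamics:")
    print(final[:20].mean(axis=0), final[20:].mean(axis=0))
```

In this sketch the temperature is held constant and low, so the noise is too weak to carry points over the large barrier between well-separated groups but can still move them past shallow local features. Points that end up close together may then be assigned to the same cluster, for example by thresholding pairwise distances between the final positions.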
Lafata, K., Zhou, Z., Liu, J.-G., & Yin, F.-F. (2018). Data clustering based on Langevin annealing with a self-consistent potential. Quarterly of Applied Mathematics, 77(3), 591–613. https://doi.org/10.1090/qam/1521