In this paper we propose a fast and efficient EM algorithm, Ascent-EM, for model-based clustering of large databases. Drawing on ideas from EM's stochastic descendant, the Monte Carlo EM algorithm, the method uses only a sub-sample of the entire database per iteration. It starts with small samples in the early iterations for computational efficiency and increases the sample size towards the end of the run to ensure accurate results. The sample size updating rule is centered around EM's likelihood-ascent property: it increases the sample only when no further improvement is possible based on the current sample. In several simulation studies we show that Ascent-EM outperforms regular EM implementations. We apply the method to an example of clustering online auctions. © 2005 Springer Science+Business Media, Inc.
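The idea of running EM on a growing sub-sample can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the actual updating rule, so the sketch assumes a one-dimensional Gaussian mixture, a fixed tolerance `tol` on the per-observation log-likelihood gain as the "no further improvement" test, and a simple multiplicative `growth` factor for the sample size. It also keeps one sample until progress stalls rather than redrawing at every iteration. All function and parameter names (`subsample_em`, `n0`, `growth`, `tol`) are hypothetical.

```python
import numpy as np

def _log_joint(x, w, mu, var):
    # log(w_k) + log N(x_i | mu_k, var_k), returned with shape (n, k)
    return np.log(w) - 0.5 * (np.log(2 * np.pi * var) + (x[:, None] - mu) ** 2 / var)

def loglik(x, w, mu, var):
    # Mixture log-likelihood of x, computed with a stable log-sum-exp
    a = _log_joint(x, w, mu, var)
    m = a.max(axis=1, keepdims=True)
    return float(np.sum(m[:, 0] + np.log(np.exp(a - m).sum(axis=1))))

def em_step(x, w, mu, var):
    # E-step: responsibilities of each component for each observation
    a = _log_joint(x, w, mu, var)
    r = np.exp(a - a.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)
    # M-step: weighted updates (small constants guard against degeneracy)
    nk = r.sum(axis=0) + 1e-12
    mu_new = (r * x[:, None]).sum(axis=0) / nk
    var_new = (r * (x[:, None] - mu_new) ** 2).sum(axis=0) / nk + 1e-6
    return nk / nk.sum(), mu_new, var_new

def subsample_em(data, k=2, n0=200, growth=2.0, tol=1e-4, max_iter=500, seed=0):
    """EM on a sub-sample that grows only when the likelihood stops improving.

    Assumed rule (not taken from the paper): enlarge the sample by `growth`
    when the per-observation log-likelihood gain on the current sample
    drops below `tol`; stop once that happens on the full data set.
    """
    rng = np.random.default_rng(seed)
    n = min(n0, data.size)
    sample = rng.choice(data, size=n, replace=False)
    w = np.full(k, 1.0 / k)
    mu = np.quantile(sample, (np.arange(k) + 0.5) / k)  # spread-out start
    var = np.full(k, sample.var() + 1e-6)
    sizes = []  # sample size used at each iteration
    for _ in range(max_iter):
        before = loglik(sample, w, mu, var) / n
        w, mu, var = em_step(sample, w, mu, var)
        after = loglik(sample, w, mu, var) / n
        sizes.append(n)
        if after - before < tol:       # no further ascent on this sample
            if n == data.size:
                break                  # stalled on the full data: done
            n = min(data.size, int(np.ceil(growth * n)))
            sample = rng.choice(data, size=n, replace=False)
    return w, mu, var, sizes
```

Called as `subsample_em(data, k=2)` on a one-dimensional NumPy array, the function returns the fitted weights, means, and variances together with the sample size used at each iteration, so the cost of the early iterations can be compared against full-data EM.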
CITATION STYLE
Jank, W. (2005). Fast and efficient model-based clustering with the Ascent-EM algorithm. Operations Research/Computer Science Interfaces Series, 29, 201–212. https://doi.org/10.1007/0-387-23529-9_14