The application of Spark-based Gaussian mixture model for farm environmental data analysis

Abstract

To account fully for the characteristics of environmental data sets, the Gaussian mixture model (GMM) is combined with the Dirichlet process (DP), which removes the need to specify the initial number of clusters. Gibbs sampling replaces the Expectation-Maximization algorithm for estimating the parameters of the model with the Dirichlet process. The clustering is implemented on the Spark framework so that it can handle farm environmental data sets stored on a distributed computer cluster. Experimental results evaluated with an external criterion show that the improved clustering method detects data anomalies better than other common clustering methods. The improved method is then applied to anomaly detection in farm environmental data.
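
The idea of letting a Dirichlet process choose the number of clusters, with Gibbs sampling in place of Expectation-Maximization, can be illustrated with a minimal sketch. This is not the paper's implementation. It is a single-machine, one-dimensional collapsed Gibbs sampler that assumes a known component variance and a Normal prior on each cluster mean, and it does not use Spark. The function names (`dp_gmm_gibbs`, `predictive`) and all parameter values (`alpha`, `sigma2`, `mu0`, `tau2`) are hypothetical choices made for illustration only.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Normal(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def predictive(x, n, total, sigma2, mu0, tau2):
    """Posterior predictive density of x for a cluster holding n points
    whose values sum to total (n = 0 gives the prior predictive)."""
    precision = 1.0 / tau2 + n / sigma2
    post_mean = (mu0 / tau2 + total / sigma2) / precision
    return normal_pdf(x, post_mean, 1.0 / precision + sigma2)


def dp_gmm_gibbs(data, alpha=1.0, sigma2=1.0, mu0=0.0, tau2=100.0,
                 sweeps=50, seed=0):
    """Collapsed Gibbs sampling for a 1-D DP Gaussian mixture.

    Assumptions (illustrative only): known variance sigma2, Normal(mu0, tau2)
    prior on cluster means, concentration parameter alpha.
    Returns one cluster label per data point.
    """
    rng = random.Random(seed)
    labels = [0] * len(data)      # start with every point in one cluster
    counts = {0: len(data)}       # cluster label -> number of points
    sums = {0: sum(data)}         # cluster label -> sum of point values
    next_label = 1
    for _ in range(sweeps):
        for i, x in enumerate(data):
            # Remove point i from its current cluster.
            k = labels[i]
            counts[k] -= 1
            sums[k] -= x
            if counts[k] == 0:
                del counts[k]
                del sums[k]
            # Existing clusters are weighted by size times predictive density.
            candidates = list(counts)
            weights = [counts[c] * predictive(x, counts[c], sums[c],
                                              sigma2, mu0, tau2)
                       for c in candidates]
            # A brand-new cluster is weighted by alpha times the prior predictive.
            candidates.append(next_label)
            weights.append(alpha * predictive(x, 0, 0.0, sigma2, mu0, tau2))
            choice = rng.choices(candidates, weights=weights)[0]
            if choice == next_label:
                counts[choice] = 0
                sums[choice] = 0.0
                next_label += 1
            counts[choice] += 1
            sums[choice] += x
            labels[i] = choice
    return labels


if __name__ == "__main__":
    # Made-up readings forming two well-separated groups.
    readings = [-5.2, -4.9, -5.1, -4.8, -5.0, 5.1, 4.8, 5.0, 5.2, 4.9]
    print(dp_gmm_gibbs(readings))
```

No cluster count is passed in: a new cluster opens whenever a point is better explained by the prior than by any existing cluster. In an anomaly-detection setting, points that end up in very small clusters would be natural candidates for inspection, although the abstract does not state which criterion the authors use.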

Cite

Pang, H., Deng, L., Wang, L., & Fei, M. (2016). The application of Spark-based Gaussian mixture model for farm environmental data analysis. In Communications in Computer and Information Science (Vol. 645, pp. 164–173). Springer Verlag. https://doi.org/10.1007/978-981-10-2669-0_18
