Data clustering using hidden variables in hybrid Bayesian networks

13Citations
Citations of this article
8Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

In this paper, we analyze the problem of data clustering in domains where discrete and continuous variables coexist. We propose the use of hybrid Bayesian networks with naïve Bayes structure and hidden class variable. The model integrates discrete and continuous features, by representing the conditional distributions as mixtures of truncated exponentials (MTEs). The number of classes is determined through an iterative procedure based on a variation of the data augmentation algorithm. The new model is compared with an EM-based clustering algorithm where each class model is a product of conditionally independent probability distributions and the number of clusters is decided by using a cross-validation scheme. Experiments carried out over real-world and synthetic data sets showthat the proposal is competitive with state-of-the-artmethods. Even though the methodology introduced in this manuscript is based on the use of MTEs, it can be easily instantiated to other similar models, like the Mixtures of Polynomials or the mixtures of truncated basis functions in general.

Cite

CITATION STYLE

APA

Fernández, A., Gámez, J. A., Rumí, R., & Salmerón, A. (2014). Data clustering using hidden variables in hybrid Bayesian networks. Progress in Artificial Intelligence, 2(2–3), 141–152. https://doi.org/10.1007/s13748-014-0048-3

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free