Bayesian networks for data mining

David Heckerman

Journal Article

Data Mining and Knowledge Discovery (1997) 1(1) 79-119

DOI: 10.1023/A:1009730122752

545 Citations

421 Readers

Abstract

A Bayesian network is a graphical model that encodes probabilistic relationships among variables of interest. When used in conjunction with statistical techniques, the graphical model has several advantages for data modeling. One, because the model encodes dependencies among all variables, it readily handles situations where some data entries are missing. Two, a Bayesian network can be used to learn causal relationships, and hence can be used to gain understanding about a problem domain and to predict the consequences of intervention. Three, because the model has both a causal and probabilistic semantics, it is an ideal representation for combining prior knowledge (which often comes in causal form) and data. Four, Bayesian statistical methods in conjunction with Bayesian networks offer an efficient and principled approach for avoiding the overfitting of data. In this paper, we discuss methods for constructing Bayesian networks from prior knowledge and summarize Bayesian statistical methods for using data to improve these models. With regard to the latter task, we describe methods for learning both the parameters and structure of a Bayesian network, including techniques for learning with incomplete data. In addition, we relate Bayesian-network methods for learning to techniques for supervised and unsupervised learning. We illustrate the graphical-modeling approach using a real-world case study. © 1997 Kluwer Academic Publishers.
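The abstract's first advantage (handling missing entries) can be illustrated with a minimal sketch. It is not taken from the paper: the three-variable network (Rain and Sprinkler both influencing WetGrass), the probability values, and the helper names `joint` and `posterior_rain` are hypothetical, chosen only for illustration. The joint distribution factorizes along the graph, so an unobserved variable is handled by summing it out instead of discarding the record.

```python
from itertools import product

# Hypothetical network (not from the paper): Rain -> WetGrass <- Sprinkler.
# All variables are binary; the numbers below are made up for illustration.
P_RAIN = {1: 0.2, 0: 0.8}
P_SPRINKLER = {1: 0.1, 0: 0.9}
# P(WetGrass = 1 | Rain, Sprinkler), keyed by (rain, sprinkler).
P_WET_GIVEN = {(1, 1): 0.99, (1, 0): 0.90, (0, 1): 0.80, (0, 0): 0.0}


def joint(rain, sprinkler, wet):
    """Joint probability from the network factorization P(R) P(S) P(W | R, S)."""
    p_wet = P_WET_GIVEN[(rain, sprinkler)]
    return P_RAIN[rain] * P_SPRINKLER[sprinkler] * (p_wet if wet == 1 else 1.0 - p_wet)


def posterior_rain(evidence):
    """P(Rain = 1 | evidence); variables absent from `evidence` are summed out."""
    names = ("rain", "sprinkler", "wet")
    numerator = 0.0
    denominator = 0.0
    for values in product((0, 1), repeat=3):
        assignment = dict(zip(names, values))
        # Skip assignments that contradict the observed entries.
        if any(assignment[name] != value for name, value in evidence.items()):
            continue
        p = joint(*values)
        denominator += p
        if assignment["rain"] == 1:
            numerator += p
    return numerator / denominator


# The sprinkler entry is missing in the first query and observed in the second.
print(posterior_rain({"wet": 1}))
print(posterior_rain({"wet": 1, "sprinkler": 1}))
```

Brute-force enumeration is used here only because the example is tiny; it scales exponentially in the number of variables, and practical systems rely on more efficient inference algorithms.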

Cite

Heckerman, D. (1997). Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1(1), 79–119. https://doi.org/10.1023/A:1009730122752
