The most challenging task in dealing with Bayesian networks is learning their structure. Two classical approaches are often used for learning Bayesian network structure: the constraint-based method and the score-and-search-based method. However, neither is completely satisfactory. Therefore, heuristic search, such as a Genetic Algorithm with a fitness score function, is considered for learning Bayesian network structure. To ensure that the genetic operators are closed, an ordering among the variables (nodes) must be determined. In this paper, we determine the node ordering using Principal Component Analysis (PCA). For this purpose, we first determine the appropriate correlation between the variables and then use the absolute values of the variables' coefficients in the first principal component. That is, a node Xi can have a node Xj as a parent only if the absolute value of the coefficient of Xj in the first component is higher than that of Xi. We then use the Genetic Algorithm, with BIC as the fitness score and subject to this node ordering, to construct the Bayesian network.
Experimental results on the well-known Asia, Alarm and Hailfinder networks show that our new technique achieves higher accuracy and a better fit to the data. In addition, we apply our technique to a real data set of debtors who owe more than 500 million Rials to Maskan Bank in Iran. The results also show that the proposed technique has greater modeling power than other node-ordering techniques, such as those of Hruschka et al. (2007) and Chen et al. (2008), and the K2 algorithm.
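The node-ordering rule stated in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes continuous data and the Pearson correlation matrix (the abstract only says an "appropriate" correlation is chosen), and the function names `pca_node_ordering` and `allowed_parents` are hypothetical.

```python
import numpy as np


def pca_node_ordering(data, names):
    """Order variables by the absolute value of their coefficient in the
    first principal component, largest first.

    Assumption: Pearson correlation is used; the paper may choose a
    different correlation measure depending on the variable types.
    """
    corr = np.corrcoef(data, rowvar=False)
    # eigh returns eigenvalues in ascending order, so the first principal
    # component is the eigenvector in the last column.
    _, eigvecs = np.linalg.eigh(corr)
    weights = np.abs(eigvecs[:, -1])  # abs() also removes the sign ambiguity
    order = np.argsort(-weights, kind="stable")
    ordering = [names[i] for i in order]
    return ordering, dict(zip(names, weights))


def allowed_parents(ordering):
    """Map each node to the nodes that precede it in the ordering.

    A node Xi may take Xj as a parent only if Xj has a larger absolute
    coefficient, i.e. only if Xj comes earlier in the ordering.
    """
    return {node: ordering[:k] for k, node in enumerate(ordering)}
```

A search procedure such as a Genetic Algorithm could then restrict each node's candidate parent set to `allowed_parents(ordering)[node]`, which keeps every candidate structure acyclic by construction.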
Tabar, V. R., Mahdavi, M., Heidari, S., & Naghizadeh, S. (2016). Learning Bayesian network structure using genetic algorithm with consideration of the node ordering via principal component analysis. Journal of the Iranian Statistical Society, 15(2), 45–61. https://doi.org/10.18869/acadpub.jirss.15.2.45