Abstract
BAYDA is a software package for flexible data analysis in predictive data mining tasks. The mathematical model underlying the program is based on a simple Bayesian network, the Naive Bayes classifier. It is well known that the Naive Bayes classifier performs well in predictive data mining tasks when compared to approaches using more complex models. However, the model makes strong independence assumptions that are frequently violated in practice. For this reason, the BAYDA software also provides a feature selection scheme which can be used for analyzing the problem domain and for improving the prediction accuracy of the models constructed by BAYDA. The scheme is based on a novel Bayesian feature selection criterion introduced in this paper. The suggested criterion is inspired by the Cheeseman-Stutz approximation for computing the marginal likelihood of Bayesian networks with hidden variables. The empirical results with several widely used data sets demonstrate that the automated Bayesian feature selection scheme can dramatically decrease the number of features deemed relevant, and lead to substantial improvements in prediction accuracy.
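As an illustration of the underlying model only, the following is a minimal sketch of a discrete Naive Bayes classifier restricted to a chosen subset of features. It is not BAYDA's code and does not implement the feature selection criterion of the paper; the function names `train_naive_bayes` and `predict`, the smoothing parameter `alpha`, and the toy data are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, features):
    """Collect class counts and per-class value counts for the chosen features."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature, class) -> value -> count
    values_seen = defaultdict(set)       # feature -> distinct values observed
    for row, label in zip(rows, labels):
        for f in features:
            value_counts[(f, label)][row[f]] += 1
            values_seen[f].add(row[f])
    return class_counts, value_counts, values_seen


def predict(model, row, features, alpha=1.0):
    """Return the class with the highest smoothed log posterior score."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_c in class_counts.items():
        score = math.log(n_c / total)  # log prior of the class
        for f in features:
            k = len(values_seen[f])    # number of distinct values of feature f
            count = value_counts[(f, label)][row[f]]
            # Features are treated as independent given the class.
            score += math.log((count + alpha) / (n_c + alpha * k))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: feature 0 determines the class, feature 1 is noise.
rows = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y"), ("a", "x"), ("b", "y")]
labels = ["pos", "pos", "neg", "neg", "pos", "neg"]

model_one = train_naive_bayes(rows, labels, features=[0])
model_all = train_naive_bayes(rows, labels, features=[0, 1])
print(predict(model_one, ("a", "y"), features=[0]))
print(predict(model_all, ("b", "x"), features=[0, 1]))
```

Passing a different `features` list is the only hook a selection scheme would need here; how candidate subsets are scored is left open, since the criterion itself is defined in the paper and not reproduced in this sketch.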
Cite
Kontkanen, P., Myllymäki, P., Silander, T., & Tirri, H. (1998). BAYDA: Software for Bayesian Classification and Feature Selection. In Proceedings of the 4th International Conference on Knowledge Discovery and Data Mining, KDD 1998. AAAI Press.