Classifying unseen cases with many missing values

Zijian Zheng; Boon Toh Low

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (1999) 1574 370-375

DOI: 10.1007/3-540-48912-6_50

Abstract

Handling missing attribute values is an important issue for classifier learning, since missing attribute values in either training data or test (unseen) data affect the prediction accuracy of learned classifiers. In many real KDD applications, attributes with missing values are very common. This paper studies the robustness of four recently developed committee learning techniques (Boosting, Bagging, Sasc, and SascMB) relative to C4.5 for tolerating missing values in test data. In terms of average error over a representative collection of natural domains, Boosting is found to have a level of robustness similar to that of C4.5. Bagging performs slightly better than Boosting, while Sasc and SascMB perform better than both, with SascMB performing best.

Cite

Zheng, Z., & Low, B. T. (1999). Classifying unseen cases with many missing values. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1574, pp. 370–375). Springer Verlag. https://doi.org/10.1007/3-540-48912-6_50
