Data reduction using multiple models integration

Aleksandar Lazarevic; Zoran Obradovic

Journal Article, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2001) 2168 301-313

DOI: 10.1007/3-540-44794-6_25

12 Citations

14 Readers

Abstract

A large amount of available information does not necessarily imply that induction algorithms must use all of it. Samples often provide the same accuracy at a lower computational cost. We propose several effective techniques based on the idea of progressive sampling, in which progressively larger samples are used for training as long as model accuracy improves. Our sampling procedures combine all the models constructed on previously considered data samples. In addition to random sampling, controllable sampling based on the boosting algorithm is proposed, where the models are combined using weighted voting. To improve model accuracy, an effective technique for pruning inaccurate models is also employed. Finally, a novel sampling procedure for spatial data domains is proposed, where the data examples are drawn not only according to the performance of previous models, but also according to the spatial correlation of the data. Experiments performed on several data sets showed that the proposed sampling procedures outperformed standard progressive sampling in both the achieved accuracy and the level of data reduction. © Springer-Verlag Berlin Heidelberg 2001.
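The core loop described in the abstract (train on progressively larger random samples, combine every model built so far, stop once accuracy no longer improves) can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid learner, the geometric sample schedule, the unweighted majority vote, the tolerance `tol`, and all function names are assumptions chosen only to keep the example self-contained. The boosting-based sampling, weighted voting, model pruning, and spatial sampling mentioned in the abstract are not reproduced here.

```python
import random
from collections import Counter

def train_centroid(sample):
    """Fit a nearest-centroid classifier on (features, label) pairs (stand-in learner)."""
    sums, counts = {}, Counter()
    for x, y in sample:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] += 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(x):
        # Return the label whose centroid is closest in squared Euclidean distance.
        return min(centroids,
                   key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))
    return predict

def vote(models, x):
    """Combine all models by unweighted majority vote (an assumed combiner)."""
    return Counter(m(x) for m in models).most_common(1)[0][0]

def accuracy(models, data):
    """Fraction of examples the combined models classify correctly."""
    return sum(vote(models, x) == y for x, y in data) / len(data)

def progressive_sampling(train, valid, start=20, growth=2, tol=0.005, seed=0):
    """Grow the sample geometrically while the combined models keep improving."""
    rng = random.Random(seed)
    models, best, size, used = [], 0.0, start, 0
    while size <= len(train):
        models.append(train_centroid(rng.sample(train, size)))
        acc = accuracy(models, valid)
        if models[1:] and acc - best < tol:
            models.pop()  # the newest model brought no meaningful gain: stop
            break
        best, used, size = acc, size, size * growth
    return models, best, used
```

Calling `progressive_sampling(train, valid)` on lists of `(features, label)` pairs returns the retained models, the validation accuracy of their combination, and the largest sample size that was kept. The ratio of that size to `len(train)` gives a rough sense of the data reduction achieved under these assumptions.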

Citation (APA)

Lazarevic, A., & Obradovic, Z. (2001). Data reduction using multiple models integration. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2168, 301–313. https://doi.org/10.1007/3-540-44794-6_25
