Rasgele Orman Yönteminde Eksik Veri Probleminin İncelenmesi (Investigation of the Missing Data Problem in the Random Forest Method)

Hülya Özen; Cengiz Bal

Journal Article, Open Access

OSMANGAZİ JOURNAL OF MEDICINE (2019) 00

DOI: 10.20515/otd.496524

Abstract

Random Forest is an ensemble method that combines many trees built from bootstrap samples of the original data. It is used for both classification and regression and offers several advantages: high accuracy, an estimate of the generalization error, identification of important variables and outliers, support for both supervised and unsupervised learning, and imputation of missing values through an algorithm based on the proximity matrix. In this study, we aimed to compare Random Forest's proximity-based imputation method with k-nearest-neighbor imputation performed prior to fitting. To this end, simulation studies were performed for a classification problem under various scenarios, including different percentages of missing values, numbers of neighbors, and correlation structures between predictor variables. The results showed that the proximity-matrix-based imputation method should be used when the predictors are highly correlated, whereas k-nearest-neighbor imputation should be preferred for low and medium correlation structures.
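To make the k-nearest-neighbor side of the comparison concrete, the following is a minimal sketch of k-nearest-neighbor imputation in plain Python. It is not the authors' implementation: the function names `nan_euclidean` and `knn_impute`, the toy data, and the choice of a rescaled Euclidean distance over the coordinates observed in both rows are all assumptions made for illustration. The proximity-based Random Forest imputation is not shown.

```python
import math


def nan_euclidean(a, b):
    """Euclidean distance over coordinates observed in both rows,
    rescaled to the full row length (an assumed, common convention)."""
    pairs = [(x, y) for x, y in zip(a, b) if not (math.isnan(x) or math.isnan(y))]
    if not pairs:
        return math.inf  # no shared coordinates: the rows cannot be compared
    sq = sum((x - y) ** 2 for x, y in pairs)
    return math.sqrt(sq * len(a) / len(pairs))


def knn_impute(rows, k=3):
    """Fill each NaN with the mean of that column over the k nearest donor rows."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if not math.isnan(value):
                continue
            # Donors: other rows in which column j is observed, nearest first.
            donors = sorted(
                (nan_euclidean(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and not math.isnan(other[j])
            )
            nearest = [v for d, v in donors[:k] if math.isfinite(d)]
            if nearest:  # leave the NaN in place when no donor is usable
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


# Hypothetical toy data: the first row is missing its third predictor.
nan = float("nan")
data = [
    [1.0, 2.0, nan],
    [1.1, 2.1, 3.0],
    [0.9, 1.9, 3.2],
    [8.0, 9.0, 10.0],
]
completed = knn_impute(data, k=2)
print(completed[0])
```

With `k=2`, the missing entry is filled from the two rows closest to the first row, so the distant fourth row does not contribute. The number of neighbors `k` is one of the factors the study varies in its simulations. The completed table would then be passed to a Random Forest classifier.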

Citation (APA)

Özen, H., & Bal, C. (2019). Rasgele Orman Yönteminde Eksik Veri Probleminin İncelenmesi. OSMANGAZİ JOURNAL OF MEDICINE, 00. https://doi.org/10.20515/otd.496524
