Abstract
Medical databases often contain missing values and redundant data, which makes medical data classification challenging. Because such data sets contain missing values, the classification and regression tree (CART) algorithm is a natural choice. However, when CART processes a data set with too many categories, its error rate rises rapidly and the model easily overfits. This paper proposes a solution that addresses both the characteristics of medical data sets and the shortcomings of the CART algorithm. To improve classification accuracy on medical data, the Boruta method is first used to reduce the dimensionality of the data. The CART algorithm is then used to classify the resulting feature subset. Experiments on a data set from the UCI repository show that the accuracy of the CART algorithm is improved.
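The two-stage idea in the abstract (filter features against shuffled "shadow" copies, then fit a CART-style tree on the survivors) can be illustrated with a minimal sketch. This is not the authors' implementation. As simplifying assumptions, it scores each feature by the Gini gain of its best single split instead of the random-forest importance that Boruta uses, it replaces Boruta's statistical test with a simple majority vote over rounds, and it runs on synthetic toy data, not a UCI data set. All function names (`gini`, `best_split_gain`, `shadow_select`, `build_tree`, `predict`) are hypothetical.

```python
import random

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_split_gain(column, labels):
    """Return (gain, threshold) of the best binary split x <= threshold."""
    parent, n = gini(labels), len(labels)
    best_gain, best_thr = 0.0, None
    for thr in sorted(set(column))[:-1]:  # skip the max so both sides are non-empty
        left = [y for x, y in zip(column, labels) if x <= thr]
        right = [y for x, y in zip(column, labels) if x > thr]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        if parent - child > best_gain:
            best_gain, best_thr = parent - child, thr
    return best_gain, best_thr

def shadow_select(X, y, n_rounds=20, seed=0):
    """Boruta-style filter (simplified): keep features that beat the best
    shuffled shadow feature in more than half of the rounds."""
    rng = random.Random(seed)
    columns = [[row[j] for row in X] for j in range(len(X[0]))]
    real_gain = [best_split_gain(col, y)[0] for col in columns]
    hits = [0] * len(columns)
    for _ in range(n_rounds):
        shadow_max = 0.0
        for col in columns:
            shadow = col[:]
            rng.shuffle(shadow)  # shuffling breaks any link to the labels
            shadow_max = max(shadow_max, best_split_gain(shadow, y)[0])
        for j, gain in enumerate(real_gain):
            if gain > shadow_max:
                hits[j] += 1
    return [j for j in range(len(columns)) if hits[j] > n_rounds / 2]

def build_tree(X, y, features, depth=0, max_depth=3):
    """CART-style tree: a leaf is a class label, an internal node is a
    tuple (feature, threshold, left, right). Labels must not be tuples."""
    majority = max(sorted(set(y)), key=y.count)
    if depth >= max_depth or gini(y) == 0.0:
        return majority
    best_gain, best_j, best_thr = 0.0, None, None
    for j in features:
        gain, thr = best_split_gain([row[j] for row in X], y)
        if gain > best_gain:
            best_gain, best_j, best_thr = gain, j, thr
    if best_j is None:  # no split reduces impurity
        return majority
    left = [(r, t) for r, t in zip(X, y) if r[best_j] <= best_thr]
    right = [(r, t) for r, t in zip(X, y) if r[best_j] > best_thr]
    return (best_j, best_thr,
            build_tree([r for r, _ in left], [t for _, t in left],
                       features, depth + 1, max_depth),
            build_tree([r for r, _ in right], [t for _, t in right],
                       features, depth + 1, max_depth))

def predict(tree, row):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, tuple):
        j, thr, left, right = tree
        tree = left if row[j] <= thr else right
    return tree

# Synthetic toy data: feature 0 separates the classes, feature 1 is noise.
data_rng = random.Random(1)
y = [i % 2 for i in range(40)]
X = [[label + data_rng.uniform(-0.3, 0.3), data_rng.uniform(0.0, 1.0)]
     for label in y]

selected = shadow_select(X, y)       # indices of the retained features
tree = build_tree(X, y, selected)    # tree fitted on the reduced feature set
train_accuracy = sum(predict(tree, r) == t for r, t in zip(X, y)) / len(y)
```

In practice one would more likely rely on an established Boruta implementation and a library decision tree; the sketch only makes the order of the two stages concrete.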
Cite
Tang, R., & Zhang, X. (2020). CART Decision Tree Combined with Boruta Feature Selection for Medical Data Classification. In 2020 5th IEEE International Conference on Big Data Analytics, ICBDA 2020 (pp. 80–84). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICBDA49040.2020.9101199