Abstract
The K-nearest neighbor interpolation method was used to fill in missing data for five indicators (coronary heart disease, diabetes, total cholesterol, triglycerides, and albumin), and the SMOTE algorithm was used to balance the number of variable indicators. The Relief-F algorithm was used to remove 18 variable indicators and retain 42. LASSO and ridge regression algorithms were used to remove eight variable indicators and retain 52. The linear-kernel support vector machine model trained on features screened by Relief-F and LASSO achieved high prediction accuracy, recall, and AUC values, and its prediction results were optimal. The test results of the random forest model with features screened by Relief-F and LASSO were better than those of the support vector machine model. It is concluded that the random forest model with features screened by Relief-F is the better predictor of lung cancer type. The research results provide theoretical data support for predicting lung cancer classification using machine learning methods.
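The two preprocessing steps named in the abstract, K-nearest neighbor filling of missing values and SMOTE balancing, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names `knn_impute` and `smote`, the toy data, the neighbor counts, and the distance definition are all assumptions made for illustration. The sketch applies SMOTE in its usual form, creating synthetic samples of an under-represented class.

```python
import math
import random


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that feature over the k nearest rows.

    Distance between two rows is the root mean squared difference over the
    features observed in both rows (an assumption of this sketch).
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                # A donor row must be a different row with feature j observed.
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda c: c[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic samples by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: math.dist(base, p))[:k]
        partner = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> partner
        synthetic.append([b + gap * (p - b) for b, p in zip(base, partner)])
    return synthetic


# Toy data (hypothetical): the second row is missing its second feature.
rows = [
    [1.0, 2.0],
    [1.1, None],
    [5.0, 6.0],
    [5.2, 6.2],
]
filled = knn_impute(rows, k=1)

# Toy minority class (hypothetical) to be oversampled.
minority = [[0.0, 0.0], [1.0, 1.0], [1.0, 0.0]]
new_points = smote(minority, n_new=4)
```

With `k=1`, the missing value is copied from the single closest row. Each synthetic point lies on a line segment between two existing minority samples, so it stays inside the region already covered by that class.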
Li, D., Li, G., Li, S., & Bang, A. (2023). Classification Prediction of Lung Cancer Based on Machine Learning Method. International Journal of Healthcare Information Systems and Informatics, 19(1). https://doi.org/10.4018/IJHISI.333631