Random forest classifier combined with feature selection for breast cancer diagnosis and prognostic

  • Nguyen C
  • Wang Y
  • Nguyen H

As the incidence of breast cancer has increased significantly in recent years, expert systems and machine learning techniques for its diagnosis have attracted considerable attention from researchers. This study aims to diagnose and prognosticate breast cancer with a machine learning method that combines a random forest classifier with a feature selection technique. The method weights the features in each dataset, keeps the useful ones, and removes the redundant ones. It is applied to diagnosis by classifying the Wisconsin Breast Cancer Diagnosis Dataset and to prognosis by classifying the Wisconsin Breast Cancer Prognostic Dataset. On these datasets we obtained a classification accuracy of 100% in the best case and around 99.8% on average, which is very promising compared with previously reported results. Although these results are specific to the Wisconsin Breast Cancer datasets, they indicate that the method can also be applied to other breast cancer diagnosis problems.
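The general idea of pairing a random forest with feature selection can be sketched in a few lines. This is only an illustrative sketch, not the authors' procedure: the abstract does not specify their weighting scheme, so the example assumes scikit-learn, ranks features by the forest's built-in importance scores, and uses scikit-learn's bundled copy of the Wisconsin diagnostic data. The function name `evaluate_rf_with_selection` and its default parameters are hypothetical choices made for the illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline


def evaluate_rf_with_selection(n_features=10, n_trees=100, seed=0):
    """Return per-fold accuracies for a forest trained on the top-ranked features.

    Hypothetical helper for illustration; not the method from the paper.
    """
    # scikit-learn's bundled Wisconsin diagnostic data (569 samples, 30 features).
    X, y = load_breast_cancer(return_X_y=True)

    # Rank features by forest importance and keep the n_features best.
    # threshold=-np.inf makes max_features the only selection criterion.
    selector = SelectFromModel(
        RandomForestClassifier(n_estimators=n_trees, random_state=seed),
        threshold=-np.inf,
        max_features=n_features,
    )

    # Selection sits inside the pipeline so it is refit on each training fold
    # and never sees the held-out samples.
    model = Pipeline([
        ("select", selector),
        ("forest", RandomForestClassifier(n_estimators=n_trees, random_state=seed)),
    ])

    folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    return cross_val_score(model, X, y, cv=folds, scoring="accuracy")


if __name__ == "__main__":
    scores = evaluate_rf_with_selection()
    print(f"mean accuracy: {scores.mean():.3f} (+/- {scores.std():.3f})")
```

Placing the selector inside the pipeline is a deliberate choice: selecting features on the full dataset before cross-validation would leak information from the test folds and inflate the reported accuracy.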

Author-supplied keywords

  • Breast Cancer
  • Diagnosis
  • Feature Selection
  • Prognosis
  • Random Forest

  • Cuong Nguyen

  • Yong Wang

  • Ha Nam Nguyen
