Extracting subset of relevant features for breast cancer to improve accuracy of classifier

Abstract

Data mining identifies hidden patterns in large data repositories, and medical diagnosis has become a major area of current data mining research. Machine learning techniques use statistical methods to let machines improve with experience and identify hidden patterns in data. Examples include regression, clustering, and classification algorithms, neural networks (ANN, CNN, DL), recommender system algorithms, Apriori algorithms, page ranking algorithms, text search, and natural language processing (NLP). For lack of evaluation, however, these algorithms have not succeeded in finding a better classifier for images or in estimating classification accuracy in medical image processing. Classification is a supervised learning task that predicts the class of an unknown object; its main purpose is to identify the unknown class by consulting the characteristics of neighboring classes. Clustering is an unsupervised learning task: it labels objects according to how similar their characteristics are, without consulting class labels. Its main principle is to measure how near or far objects are from one another, based on their similarities and dissimilarities, and to group them accordingly, so it can also be used to identify outliers (objects that lie far from the others). Feature extraction, or variable selection, is a method of obtaining a subset of relevant characteristics from a large dataset. Too many features may reduce the accuracy of a classifier, so feature extraction can be used to eliminate irrelevant attributes and increase classifier accuracy. In this paper we performed an induction to increase the accuracy of a classifier by applying mining techniques in the WEKA tool.
The Breast Cancer dataset was chosen from a learning repository, and an experimental analysis was conducted in WEKA using the training dataset by applying the Naïve Bayes, BayesNet, PART, ZeroR, J48, and Random Forest techniques to the Wisconsin breast cancer dataset. Finally, we present the classifier with the highest accuracy.
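
The idea in the abstract, that dropping irrelevant attributes can help a classifier, can be illustrated with a minimal Python sketch. It is not the authors' WEKA workflow. It assumes hypothetical synthetic data (two informative features plus eight noise features), a hand-written Gaussian naive Bayes, a ZeroR-style majority baseline, and a simple filter that ranks features by how far apart the class means are. All function names and parameters here are illustrative choices, not taken from the paper.

```python
import math
import random
from collections import Counter


def make_toy_data(n=200, n_noise=8, seed=0):
    """Hypothetical data: features 0 and 1 carry the class signal, the rest are noise."""
    rng = random.Random(seed)
    rows, labels = [], []
    for i in range(n):
        y = i % 2  # alternate classes so both are equally frequent
        informative = [rng.gauss(3.0 * y, 1.0), rng.gauss(3.0 * y, 1.0)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        rows.append(informative + noise)
        labels.append(y)
    return rows, labels


def zero_r(train_labels):
    """ZeroR-style baseline: always predict the most frequent training label."""
    return Counter(train_labels).most_common(1)[0][0]


def mean_var(values):
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var + 1e-9  # small floor avoids division by zero


def rank_features(rows, labels, k):
    """Filter selection (binary labels 0/1 assumed): keep the k features whose
    class means differ most relative to their spread."""
    scores = []
    for j, column in enumerate(zip(*rows)):
        m0, v0 = mean_var([v for v, y in zip(column, labels) if y == 0])
        m1, v1 = mean_var([v for v, y in zip(column, labels) if y == 1])
        scores.append((abs(m0 - m1) / (math.sqrt(v0) + math.sqrt(v1)), j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])


def fit_naive_bayes(rows, labels):
    """Store a prior and per-feature (mean, variance) for each class."""
    model = {}
    for cls in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == cls]
        stats = [mean_var(col) for col in zip(*members)]
        model[cls] = (len(members) / len(rows), stats)
    return model


def predict_naive_bayes(model, row):
    def log_posterior(cls):
        prior, stats = model[cls]
        score = math.log(prior)
        for v, (mean, var) in zip(row, stats):
            score -= 0.5 * math.log(2 * math.pi * var) + (v - mean) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)


def accuracy(predictions, labels):
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def project(rows, columns):
    """Keep only the chosen feature columns."""
    return [[row[j] for j in columns] for row in rows]


def run_demo(k=2, n_train=140):
    rows, labels = make_toy_data()
    x_train, y_train = rows[:n_train], labels[:n_train]
    x_test, y_test = rows[n_train:], labels[n_train:]

    majority = zero_r(y_train)
    results = {"ZeroR": accuracy([majority] * len(y_test), y_test)}

    full = fit_naive_bayes(x_train, y_train)
    results["NB, all features"] = accuracy(
        [predict_naive_bayes(full, r) for r in x_test], y_test)

    selected = rank_features(x_train, y_train, k)  # chosen on training data only
    reduced = fit_naive_bayes(project(x_train, selected), y_train)
    results["NB, selected features"] = accuracy(
        [predict_naive_bayes(reduced, r) for r in project(x_test, selected)], y_test)
    return selected, results


if __name__ == "__main__":
    selected, results = run_demo()
    print("selected feature indices:", selected)
    for name, acc in results.items():
        print(f"{name}: {acc:.3f}")
```

The features are ranked on the training split only, so the held-out accuracy is not influenced by the selection step. Whether the reduced feature set actually beats the full set depends on the data and the classifier, which is the comparison the paper reports running in WEKA.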

Cite

Saturi, R., Dara, R., & Prem Chand, P. (2019). Extracting subset of relevant features for breast cancer to improve accuracy of classifier. International Journal of Innovative Technology and Exploring Engineering, 8(11), 1670–1674. https://doi.org/10.35940/ijitee.K1507.0981119
