Feature selection for adaptive decision making in big data analytics

Abstract

Rapid growth in technology and its accessibility to the general public produce voluminous, heterogeneous and unstructured data, which has resulted in the emergence of new concepts, viz. Big Data and Big Data Analytics. The high dimensionality, variability, uncertainty and speed of generation of such data pose new challenges for data analysis using standard statistical methods, especially when Big Data contains redundant as well as important information. Devising intelligent methods to extract meaningful information from Big Data is the need of the hour. Computational tools that are often applied to analyse this kind of data, such as rough-set theory, fuzzy-set theory, fuzzy-rough-set theory and genetic algorithms, are the focus of this chapter. Because premature convergence sometimes yields only a locally optimal solution, hybridization of the genetic algorithm with local search methods is discussed. The genetic algorithm, a well-proven global optimization algorithm, is extended to search the fitness space more efficiently in order to select the globally optimal feature subset. Real-life data is often vague, so fuzzy logic and rough-set theory are applied to handle uncertainty and maintain consistency in the data sets. The aim of the fuzzy-rough-based method is to generate optimum variation in the range of membership functions of linguistic variables. As a next step, dimensionality reduction is performed to search the selected features for discovering knowledge from the given data set. The search for the most informative features may terminate at a local optimum, whereas the global optimum may lie elsewhere in the search space. To escape local minima, an algorithm is proposed using the fuzzy-rough-set concept and a genetic algorithm. The proposed algorithm searches for the most informative attribute set by utilising the optimal range of membership values used to design the objective function.
Finally, a case study is given in which the dimension reduction techniques are applied in agricultural science, a real-life application domain. Rice plant diseases infect leaves, stems, roots and other parts, which degrades production. Identifying diseases and taking precautions is a very important data analytic task in agriculture. The case study demonstrates how images are collected from the fields, how diseased features are extracted and preprocessed, and finally how important features are selected using a genetic algorithm-based local search technique and fuzzy-rough-set theory. These features are important for developing a decision support system that predicts the diseases and, accordingly, for devising methods to protect the most important crops.
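
The idea of hybridizing a genetic algorithm with local search for feature selection can be illustrated with a minimal sketch. This is not the chapter's algorithm: the abstract does not give its fuzzy-rough objective function, so the sketch substitutes the classical (crisp) rough-set dependency degree as the fitness, and every name, parameter value and the toy data set below are assumptions made for illustration only.

```python
import random

# Hypothetical toy data: 4 discretized features; the label is f0 XOR f2.
ROWS = [(0, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 1, 1, 1),
        (1, 0, 0, 0), (1, 1, 0, 1), (1, 0, 1, 1), (1, 1, 1, 1)]
LABELS = [0, 0, 1, 1, 1, 1, 0, 0]

def dependency_degree(rows, labels, subset):
    """Crisp rough-set dependency: the fraction of rows whose equivalence
    class under the chosen features carries a single decision label."""
    classes = {}
    for row, label in zip(rows, labels):
        classes.setdefault(tuple(row[i] for i in subset), []).append(label)
    consistent = sum(len(v) for v in classes.values() if len(set(v)) == 1)
    return consistent / len(rows)

def fitness(mask, rows, labels, penalty=0.1):
    """Reward dependency, penalize subset size (penalty weight is assumed)."""
    subset = [i for i, bit in enumerate(mask) if bit]
    return dependency_degree(rows, labels, subset) - penalty * len(subset) / len(mask)

def local_search(mask, rows, labels):
    """Hill climbing: keep any single bit flip that improves fitness."""
    best, best_fit = list(mask), fitness(mask, rows, labels)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            cand = best[:]
            cand[i] = 1 - cand[i]
            cand_fit = fitness(cand, rows, labels)
            if cand_fit > best_fit:
                best, best_fit, improved = cand, cand_fit, True
    return best

def select_features(rows, labels, pop_size=12, generations=20, seed=0):
    """Genetic algorithm whose individuals are refined by local search."""
    rng = random.Random(seed)
    n = len(rows[0])
    # Start from the full feature set plus random masks.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop = [local_search(ind, rows, labels) for ind in pop]
        pop.sort(key=lambda m: fitness(m, rows, labels), reverse=True)
        parents = pop[: pop_size // 2]          # elitist truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)           # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.2:              # occasional bit-flip mutation
                j = rng.randrange(n)
                child[j] = 1 - child[j]
            children.append(child)
        pop = parents + children
    best = max(pop, key=lambda m: fitness(m, rows, labels))
    return [i for i, bit in enumerate(best) if bit]

print(select_features(ROWS, LABELS))
```

On this toy data, hill climbing started from the empty subset stalls, because no single feature improves the dependency on its own. The population-based search still reaches the two jointly informative features, which is the kind of local optimum the hybrid approach is meant to escape.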

Cite

Sil, J., & Das, A. K. (2016). Feature selection for adaptive decision making in big data analytics. In Data Science and Big Data Computing: Frameworks and Methodologies (pp. 269–292). Springer International Publishing. https://doi.org/10.1007/978-3-319-31861-5_12
