Methods to investigate concept drift in big data streams

Nidhi; Veenu Mangat; Vishal Gupta; Renu Vig

Book Chapter

Springer Singapore, (2018), 51-74

DOI: 10.1007/978-981-10-6680-1_3

6 Citations

23 Readers

Abstract

The explosion of information from various social networking sites, Web clickstream, information retrieval, customers’ records, users’ reviews, business transactions, network event logs, etc. results in a continuous deluge of data generated at different rates, called streaming data. Organizing, indexing, analyzing, or mining hidden knowledge from such a data deluge becomes a critical functionality for a broad range of content analysis tasks, including emerging topic detection, interesting content identification, user interest profiling, and real-time Web search. Managing such ‘Big Data’ becomes even more challenging when streaming data must be analyzed and results produced in real time. Streaming data may include numeric, categorical, or mixed values. Most current research has addressed numeric data streams by exploiting the statistical properties of numeric data, but categorical/textual data streams have also gained researchers’ interest due to the high availability of textual data on the Internet. Applying classification to data streams is unrealistic because not every incoming data item has a class label, so clustering techniques are applied to manage unlabeled data streams. One property that can affect the results of any clustering algorithm is concept drift. Detecting and managing concept drift over time therefore poses a great challenge for cluster analysis. This chapter provides an in-depth critique of various algorithms that have been introduced to handle concept drift in a real environment. A framework for examining concept drift in big data streams is also proposed.

Citation (APA)

Nidhi, Mangat, V., Gupta, V., & Vig, R. (2018). Methods to investigate concept drift in big data streams. In Knowledge Computing and Its Applications: Knowledge Manipulation and Processing Techniques: Volume 1 (pp. 51–74). Springer Singapore. https://doi.org/10.1007/978-981-10-6680-1_3
