A top K relative outlier detection algorithm in uncertain datasets

Abstract

Focusing on outlier detection in uncertain datasets, we combine distance-based outlier detection techniques with classic uncertainty models. We consider both the variability of data values and the incompleteness of their probability distributions. All data objects in an uncertain dataset are described using the x-tuple model, each with its respective probabilities. We find that outliers in uncertain datasets are probabilistic: the neighbors of a data object differ from one possible world to another. Based on the possible world and x-tuple models, we propose a new definition of top K relative outliers and the RPOS algorithm. In the RPOS algorithm, all data objects are compared with each other to find the most probable outliers. Two pruning strategies improve efficiency, and we also construct auxiliary data structures for acceleration. We evaluate our method on both synthetic and real datasets. Experimental results demonstrate that it detects outliers more effectively than existing algorithms in uncertain environments, and that it is also more efficient. © 2014 Springer International Publishing Switzerland.
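
To make the possible-world idea concrete, the following is a minimal Python sketch, not the RPOS algorithm and not the paper's definition of relative outliers. It assumes one-dimensional values, a classic threshold-based notion of distance-based outlier (fewer than `min_neighbors` objects within `radius`), and brute-force enumeration of every possible world with no pruning. All names (`possible_worlds`, `outlier_probabilities`, `top_k_outliers`, `xtuples`) and the sample data are hypothetical.

```python
from itertools import product


def possible_worlds(xtuples):
    """Yield (world, probability) pairs.

    xtuples maps an object id to its alternatives [(value, prob), ...].
    If the probabilities sum to less than 1 (incomplete distribution),
    the object is absent from the world with the remaining probability.
    """
    ids = list(xtuples)
    choices = []
    for oid in ids:
        alts = list(xtuples[oid])
        missing = 1.0 - sum(p for _, p in alts)
        if missing > 1e-12:
            alts.append((None, missing))  # None marks "object absent"
        choices.append(alts)
    for combo in product(*choices):
        prob = 1.0
        world = {}
        for oid, (value, p) in zip(ids, combo):
            prob *= p
            if value is not None:
                world[oid] = value
        yield world, prob


def outlier_probabilities(xtuples, radius, min_neighbors):
    """Sum the probability of every world in which an object is an outlier."""
    scores = {oid: 0.0 for oid in xtuples}
    for world, prob in possible_worlds(xtuples):
        for oid, value in world.items():
            neighbors = sum(
                1
                for other, v in world.items()
                if other != oid and abs(v - value) <= radius
            )
            if neighbors < min_neighbors:
                scores[oid] += prob
    return scores


def top_k_outliers(xtuples, radius, min_neighbors, k):
    """Return the k object ids with the highest outlier probability."""
    scores = outlier_probabilities(xtuples, radius, min_neighbors)
    return sorted(scores, key=lambda oid: -scores[oid])[:k]


# Hypothetical sample data: "d" exists with probability 0.8 only.
xtuples = {
    "a": [(1.0, 0.6), (1.2, 0.4)],
    "b": [(1.1, 1.0)],
    "c": [(0.9, 0.5), (5.0, 0.5)],
    "d": [(9.0, 0.8)],
}
scores = outlier_probabilities(xtuples, radius=0.5, min_neighbors=1)
ranking = top_k_outliers(xtuples, radius=0.5, min_neighbors=1, k=2)
print(ranking)
```

In this sample, "d" is isolated in every world where it exists, and "c" is isolated only when it takes the value 5.0, so the two rank first and second. The number of possible worlds grows exponentially with the number of objects, which is why a practical method needs pruning strategies and supporting data structures such as those the abstract mentions.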

Cite

Liu, F., Yin, H., & Han, W. (2014). A top K relative outlier detection algorithm in uncertain datasets. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8709 LNCS, pp. 36–47). Springer Verlag. https://doi.org/10.1007/978-3-319-11116-2_4
