Abstract
Outlier detection is essential in data-based science. It aims to detect those itemsets that differ significantly from the rest of the data. Because of limits on equipment precision and network transmission, uncertain data are becoming more common in daily life. However, traditional outlier detection methods are not applicable to uncertain data streams, and the large volume of data makes outlier detection costly in terms of memory usage and time. Moreover, the multiple scans of the data stream required by Apriori-like methods are unrealistic. In this paper, a matrix structure is constructed to store the information of an uncertain data stream, and the subsequent mining process is conducted on the matrix structure; therefore, the whole data stream needs to be scanned only once. Then, the “upper cap” concept is used in the FIM-UDS method to mine the frequent itemsets more efficiently in support of outlier detection. In addition, two outlier factors and an outlier detection method called FIM-UDSOD are designed to detect potential outliers. Finally, two public datasets are used to verify the efficiency of the FIM-UDS method, and one synthetic dataset is used to evaluate the FIM-UDSOD method. The experimental results show that the proposed FIM-UDSOD method is more effective at detecting outliers than the methods it was compared with.
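The single-scan idea can be illustrated with a minimal sketch. It is not the authors' FIM-UDS implementation: the function names, the dictionary-based stream format, and the assumption that items occur independently (so an itemset's probability in a transaction is the product of its items' probabilities) are all illustrative choices, and the "upper cap" pruning and the two outlier factors are not reproduced because the abstract does not define them.

```python
from itertools import combinations


def build_matrix(stream, items):
    """Scan the stream once. Each row is a transaction, each column an item,
    and each cell holds the item's existence probability (0.0 if absent)."""
    index = {item: j for j, item in enumerate(items)}
    matrix = []
    for transaction in stream:
        row = [0.0] * len(items)
        for item, prob in transaction.items():
            row[index[item]] = prob
        matrix.append(row)
    return matrix, index


def expected_support(matrix, index, itemset):
    """Expected support of an itemset, assuming independent items:
    the sum over rows of the product of the member items' probabilities."""
    total = 0.0
    for row in matrix:
        p = 1.0
        for item in itemset:
            p *= row[index[item]]
        total += p
    return total


def frequent_itemsets(matrix, index, min_esup, max_len=2):
    """Enumerate itemsets up to max_len whose expected support reaches
    min_esup. All later passes read the matrix, never the stream."""
    result = {}
    for k in range(1, max_len + 1):
        for itemset in combinations(sorted(index), k):
            esup = expected_support(matrix, index, itemset)
            if esup >= min_esup:
                result[itemset] = esup
    return result


# Hypothetical uncertain stream: item -> existence probability.
stream = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.5, "c": 0.4},
    {"b": 1.0, "c": 0.5},
]
matrix, index = build_matrix(stream, ["a", "b", "c"])
frequent = frequent_itemsets(matrix, index, min_esup=0.6)
```

Under these assumptions, transactions that contain few of the frequent itemsets would be the natural outlier candidates; how the paper scores them is given by its own outlier factors.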
Cite
Hao, S., Cai, S., Sun, R., & Li, S. (2019). An efficient outlier detection approach over uncertain data stream based on frequent itemset mining. Information Technology and Control, 48(1), 34–46. https://doi.org/10.5755/j01.itc.48.1.21162