Effective Discovery of Meaningful Outlier Relationships

  • Bessa A
  • Freire J
  • Dasu T
  • Srivastava D

Abstract

We propose Predictable Outliers in Data-trendS (PODS), a method that, given a collection of temporal datasets, derives data-driven explanations for outliers by identifying meaningful relationships between them. First, we formalize the notion of meaningfulness, which so far has been informally framed in terms of explainability. Next, since outliers are rare and it is difficult to determine whether their relationships are meaningful, we develop a new criterion that does so by checking if these relationships could have been predicted from non-outliers, i.e., whether we could see the outlier relationships coming. Finally, searching for meaningful outlier relationships between every pair of datasets in a large data collection is computationally infeasible. To address that, we propose an indexing strategy that prunes irrelevant comparisons across datasets, making the approach scalable. We present the results of an experimental evaluation using real datasets and different baselines, which demonstrates the effectiveness, robustness, and scalability of our approach.
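
The idea of checking whether an outlier relationship "could have been predicted from non-outliers" can be illustrated with a small sketch. This is not the PODS criterion itself, which the abstract does not spell out. It assumes two aligned series, a precomputed outlier flag per time step, a plain linear trend fitted only on the non-outlier points, and a 3-sigma tolerance; the names `fit_line`, `predictable_fraction`, and `tol` are hypothetical.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def predictable_fraction(xs, ys, is_outlier, tol=3.0):
    """Fraction of outlier pairs that lie close to the trend learned
    from non-outlier pairs only.

    Assumption (not from the paper): "close" means within `tol`
    standard deviations of the non-outlier residuals.
    Returns None when there is too little data to decide.
    """
    inliers = [(x, y) for x, y, o in zip(xs, ys, is_outlier) if not o]
    outliers = [(x, y) for x, y, o in zip(xs, ys, is_outlier) if o]
    if len(inliers) < 3 or not outliers:
        return None

    # Learn the relationship from non-outliers only.
    slope, intercept = fit_line([x for x, _ in inliers],
                                [y for _, y in inliers])
    residuals = [y - (slope * x + intercept) for x, y in inliers]
    sigma = max(pstdev(residuals), 1e-12)  # guard against zero spread

    # Count the outlier pairs that the non-outlier trend anticipates.
    hits = sum(abs(y - (slope * x + intercept)) <= tol * sigma
               for x, y in outliers)
    return hits / len(outliers)
```

Under these assumptions, a fraction near 1 suggests the outliers in the two series follow the same relationship as the ordinary points, while a fraction near 0 suggests the co-occurrence is not anticipated by the non-outlier trend.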

Cite

Bessa, A., Freire, J., Dasu, T., & Srivastava, D. (2020). Effective Discovery of Meaningful Outlier Relationships. ACM/IMS Transactions on Data Science, 1(2), 1–33. https://doi.org/10.1145/3385192
