Quantiles on Streams

  • Buragohain C
  • Suri S

Abstract

SYNONYMS

Median; histogram; selection; order statistics

DEFINITION

Quantiles are order statistics of data: the φ-quantile (0 ≤ φ ≤ 1) of a set S is an element x such that φ|S| elements of S are less than or equal to x and the remaining (1 − φ)|S| are greater than x. This article describes data stream (single-pass) algorithms for computing an approximation of such quantiles.

HISTORICAL BACKGROUND

The need to summarize data has been around since the earliest days of data processing. Large volumes of raw, unstructured data easily overwhelm the human ability to comprehend or digest them, and tools that help identify the major underlying trends or patterns in data have enormous value. Quantiles characterize distributions of real-world data sets in ways that are less sensitive to outliers than simpler alternatives such as the mean and the variance. Consequently, quantiles are of interest to both database implementers and users: for instance, they are a fundamental tool for query optimization, splitting of data in parallel database systems, and statistical data analysis.

Quantiles are closely related to the familiar concepts of frequency distributions and histograms. The cumulative frequency distribution F is commonly used to summarize the distribution of a (totally ordered) set S. Specifically, for any value x,

    F(x) = number of values in S less than x.    (1)

The quantile Q(φ), or the φ-th quantile, is simply the inverse of F(x). Specifically, if the set S has n elements, then the element x has the property that

    Q(F(x)/n) = x.    (2)
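
As a small illustration of equations (1) and (2), the following Python sketch computes F and Q exactly on a fully sorted, in-memory list. It is not one of the single-pass approximation algorithms the article goes on to describe; it only shows the inverse relationship between F and Q. The function names cumulative_count and quantile are hypothetical, the values are assumed to be distinct, and the rank convention floor(φn) is an assumption chosen so that equation (2) holds.

```python
from bisect import bisect_left
from fractions import Fraction


def cumulative_count(sorted_values, x):
    """F(x) from equation (1): how many values are strictly less than x."""
    return bisect_left(sorted_values, x)


def quantile(sorted_values, phi):
    """Q(phi): the element at 0-based rank floor(phi * n).

    The rank is clamped to n - 1 so that phi = 1 returns the maximum.
    This rank convention is an assumption made for this sketch.
    """
    n = len(sorted_values)
    if n == 0:
        raise ValueError("quantile of an empty set is undefined")
    if not 0 <= phi <= 1:
        raise ValueError("phi must lie in [0, 1]")
    rank = min(int(phi * n), n - 1)
    return sorted_values[rank]


# Illustrative data only; distinct values are assumed.
data = sorted([17, 3, 42, 8, 25, 11, 30, 5])
n = len(data)

# Equation (2): Q(F(x)/n) = x for every element x of S.
# Fraction keeps F(x)/n exact, avoiding floating-point rounding in the rank.
round_trip_ok = all(
    quantile(data, Fraction(cumulative_count(data, x), n)) == x for x in data
)
```

Passing φ as a Fraction is a design choice for the sketch: with ordinary floats, (k/n)·n can round to just below k, which would shift the computed rank by one.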

Buragohain, C., & Suri, S. (2016). Quantiles on Streams. In Encyclopedia of Database Systems (pp. 1–6). Springer New York. https://doi.org/10.1007/978-1-4899-7993-3_290-2
