The remedian: A robust averaging method for large data sets


Abstract

It is often assumed that to compute a robust estimator on n data values one needs at least n storage elements (unlike the sample average, which may be calculated with an updating mechanism). This is one of the main reasons why robust estimators are seldom used for large data sets and why they are not included in most statistical packages. We introduce a new estimator that takes up little storage space, investigate its statistical properties, and provide an example on real-time curve “averaging” in a medical context. The remedian with base b proceeds by computing medians of groups of b observations, and then medians of these medians, until only a single estimate remains. This method merely needs k arrays of size b (where n = b^k), so the total storage is O(log n) for fixed b or, alternatively, O(n^(1/k)) for fixed k. Its storage economy makes it useful for robust estimation in large data bases, for real-time engineering applications in which the data themselves are not stored, and for resistant “averaging” of curves or images. The method is equivariant for monotone transformations. Optimal choices of b with respect to storage and finite-sample breakdown are derived. The remedian is shown to be a consistent estimator of the population median, and it converges at a nonstandard rate to a median-stable distribution. © 1990 Taylor & Francis Group, LLC.
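The following is a minimal streaming sketch of the procedure described in the abstract, assuming exactly n = b^k observations arrive one at a time: it keeps k buffers of length b, and whenever a buffer fills, its median is passed up one level and the buffer is reused. The class name, method names, and the simulated input data are illustrative assumptions, not the authors' code.

```python
import random
import statistics


class Remedian:
    """Sketch of the remedian with base b and k levels (n = b**k values).

    Reconstructed from the abstract's description; names are assumed.
    """

    def __init__(self, b, k):
        self.b = b
        self.k = k
        self.buffers = [[] for _ in range(k)]  # total storage: b * k cells
        self.result = None                     # set once b**k values have arrived

    def add(self, x):
        level = 0
        while True:
            buf = self.buffers[level]
            buf.append(x)
            if len(buf) < self.b:
                return
            # Buffer full: replace its b entries by their median one level up.
            x = statistics.median(buf)
            buf.clear()
            level += 1
            if level == self.k:
                # Median of the full top-level buffer is the remedian.
                self.result = x
                return


# Usage: remedian of n = 11**3 = 1331 simulated observations with base b = 11.
random.seed(0)
r = Remedian(b=11, k=3)
for _ in range(11 ** 3):
    r.add(random.gauss(0.0, 1.0))
print(r.result)  # close to the population median 0 for this symmetric sample
```

Only b*k numbers are ever held in memory, which is the O(log n) storage (for fixed b) claimed in the abstract; the incoming observations themselves are never stored.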

Citation (APA)

Rousseeuw, P. J., & Bassett, G. W. (1990). The remedian: A robust averaging method for large data sets. Journal of the American Statistical Association, 85(409), 97–104. https://doi.org/10.1080/01621459.1990.10475311
