The distribution of multivariate quantitative survey data is usually not normal. Skewed and semi-continuous distributions occur often. In addition, missing values and nonresponse are common. Together, these problems make multivariate outlier detection difficult. Examples of surveys where these problems occur are most business surveys and some household surveys, such as the Statistics on Income and Living Conditions (SILC) of the European Union. Several methods for multivariate outlier detection are collected in the R package modi. This paper gives an overview of the package modi and its functions for outlier detection and corresponding imputation. The use of the methods is explained with a business survey data set. The discussion covers pre- and post-processing to deal with skewness and zero-inflation, the advantages and disadvantages of the methods, and the choice of parameters.
Bill, M., & Hulliger, B. (2016). Treatment of multivariate outliers in incomplete business survey data. Austrian Journal of Statistics, 45(1), 3–23. https://doi.org/10.17713/ajs.v45i1.86