In this chapter, we explore difficulties commonly encountered when applying machine learning techniques to real-world data, which are frequently skewed. A typical industry example in which skewed data is an intrinsic problem is fraud detection in financial data. Where appropriate, we provide examples to facilitate the understanding of data mining with skewed data. The topics explored include, but are not limited to: data preparation, data cleansing, missing values, characteristics construction, variable selection, data skewness, objective functions, bottom line expected prediction, limited resource situations, parametric optimisation, model robustness and model stability.
Alonso Gadi, M. F., do Lago, A. P., & Mehne, J. (2010). Data Mining with Skewed Data. In New Advances in Machine Learning. InTech. https://doi.org/10.5772/9382