An n-spheres based synthetic data generator for supervised classification

Javier Sánchez-Monedero; Pedro Antonio Gutiérrez; María Pérez-Ortiz; César Hervás-Martínez

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2013) 7902 LNCS(PART 1) 613-621

DOI: 10.1007/978-3-642-38679-4_62

14 Citations

11 Readers

Abstract

Synthetic datasets are useful in a variety of situations, particularly when new machine learning models and training algorithms are being developed or when probing the weaknesses of a specific method. In contrast to real-world data, synthetic datasets provide a controlled environment for analysing specific critical issues such as outlier tolerance, the influence of data dimensionality and class imbalance, among others. In this paper, a framework for synthetic data generation is developed with special attention to the ordering of patterns in the input space, data dimensionality, class overlap and data multimodality. The position, width and overlap of the data distributions in the n-dimensional space are controlled by modelling them as n-spheres. The method is tested in the context of ordinal regression, a classification paradigm in which the categories follow a natural order. The contribution of the paper is full control over the data topology and over a set of relevant statistical properties of the data. © 2013 Springer-Verlag Berlin Heidelberg.
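
To make the idea of controlling position, width and overlap through n-spheres more concrete, the following is a minimal illustrative sketch in Python. It is not the authors' generator: the function names (`sample_in_ball`, `make_ordinal_spheres`), the `overlap` parameter, the uniform sampling inside each ball and the placement of class centres along the first axis are all assumptions made for this example.

```python
import numpy as np


def sample_in_ball(rng, center, radius, n_points):
    """Draw points uniformly from the n-ball with the given center and radius."""
    dim = center.shape[0]
    # Normalised Gaussian vectors are uniformly distributed on the unit sphere.
    directions = rng.standard_normal((n_points, dim))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Scaling radii by U^(1/dim) yields a uniform density inside the ball.
    radii = radius * rng.random(n_points) ** (1.0 / dim)
    return center + directions * radii[:, None]


def make_ordinal_spheres(n_classes=4, dim=3, radius=1.0, overlap=0.25,
                         n_per_class=100, seed=0):
    """Hypothetical generator: one ball per class, ordered along axis 0.

    overlap=0 gives touching balls; larger values move the centres of
    consecutive classes closer together (an assumed parametrisation).
    """
    rng = np.random.default_rng(seed)
    spacing = 2.0 * radius * (1.0 - overlap)
    X, y = [], []
    for k in range(n_classes):
        center = np.zeros(dim)
        center[0] = k * spacing  # class order is encoded by position
        X.append(sample_in_ball(rng, center, radius, n_per_class))
        y.append(np.full(n_per_class, k))
    return np.vstack(X), np.concatenate(y)


X, y = make_ordinal_spheres()
```

Here `radius` plays the role of the width of each class distribution, `overlap` controls how much consecutive classes intersect, and `dim` sets the data dimensionality. Multimodal classes could be approximated in the same spirit by assigning several balls to one label.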

Cite

Sánchez-Monedero, J., Gutiérrez, P. A., Pérez-Ortiz, M., & Hervás-Martínez, C. (2013). An n-spheres based synthetic data generator for supervised classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7902 LNCS, pp. 613–621). https://doi.org/10.1007/978-3-642-38679-4_62
