Using subclasses to improve classification learning

Achim Hoffmann; Rex Kwok; Paul Compton

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2001), vol. 2167, pp. 203–213

DOI: 10.1007/3-540-44795-4_18

11 Citations

17 Readers

Abstract

We propose to use systematic simulation studies as opposed to the use of real-world benchmark datasets to better understand the behaviour, strengths and weaknesses of machine learning algorithms. Simulated data sets allow much better control and understanding of the nature of the learning problem than empirical benchmark data sets. To demonstrate the value of our proposed research methodology, we describe in this paper the results of our studies concerning the problem of learning multiple classes. We derived the following hypothesis: “Learning classification functions using decision tree learners can be helped by providing additional subclass labels.” To illustrate, for learning a two class problem “car is OK/car needs service” it can be helpful to provide a finer-grained classification in the training data such as “car OK”, “faulty brakes”, “faulty engine”, “faulty lights”, etc. This hypothesis was corroborated using a number of ‘real-world’ multi-class data sets from the UCI ML repository. Our empirical studies demonstrate the usefulness of the proposed research methodology using artificial data sets as an important methodological complement to using real-world datasets.
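The hypothesis in the abstract can be illustrated with a small simulation. The sketch below is not the authors' experimental setup: the two features (`brake`, `engine`), the 0.7 thresholds, the sample sizes, and the tiny Gini-based tree learner are all assumptions made for illustration. It trains one tree on the two-class labels and another on the finer subclass labels, then maps the subclass predictions back to the two classes before scoring.

```python
import random
from collections import Counter

# Hypothetical mapping from subclass labels to the two-class problem.
SUPERCLASS = {"car OK": "OK",
              "faulty brakes": "needs service",
              "faulty engine": "needs service"}

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(X, y, depth=0, max_depth=4):
    # Leaf (a plain label) when the node is pure or the depth limit is hit.
    if len(set(y)) == 1 or depth == max_depth:
        return Counter(y).most_common(1)[0][0]
    best = None
    for f in range(len(X[0])):
        # Dropping the largest value keeps both sides of the split non-empty.
        for t in sorted(set(row[f] for row in X))[:-1]:
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            score = (len(left) * gini([y[i] for i in left]) +
                     len(right) * gini([y[i] for i in right])) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # all rows identical, no split possible
        return Counter(y).most_common(1)[0][0]
    _, f, t, left, right = best
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left], depth + 1, max_depth),
            build_tree([X[i] for i in right], [y[i] for i in right], depth + 1, max_depth))

def predict(tree, row):
    # Internal nodes are tuples (feature, threshold, low branch, high branch).
    while isinstance(tree, tuple):
        f, t, low, high = tree
        tree = low if row[f] <= t else high
    return tree

def make_data(n, rng):
    # Assumed generator: two uniform features decide the subclass.
    X, fine = [], []
    for _ in range(n):
        brake, engine = rng.random(), rng.random()
        X.append([brake, engine])
        fine.append("faulty brakes" if brake > 0.7
                    else "faulty engine" if engine > 0.7
                    else "car OK")
    return X, fine

def accuracy(tree, X, y, mapping=None):
    hits = 0
    for row, target in zip(X, y):
        pred = predict(tree, row)
        if mapping is not None:
            pred = mapping[pred]  # collapse subclass to superclass
        hits += pred == target
    return hits / len(y)

rng = random.Random(0)
X_train, fine_train = make_data(150, rng)
X_test, fine_test = make_data(150, rng)
coarse_train = [SUPERCLASS[c] for c in fine_train]
coarse_test = [SUPERCLASS[c] for c in fine_test]

tree_coarse = build_tree(X_train, coarse_train)  # trained on two classes
tree_fine = build_tree(X_train, fine_train)      # trained on subclasses

acc_coarse = accuracy(tree_coarse, X_test, coarse_test)
acc_fine = accuracy(tree_fine, X_test, coarse_test, SUPERCLASS)
print("two-class labels:", acc_coarse)
print("subclass labels, mapped back:", acc_fine)
```

Both trees are scored on the same two-class test labels, so any difference comes only from the labels seen during training. A single seed on one generator says little either way; a systematic study in the spirit of the abstract would vary the generator, noise level, and sample size.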

Cite

APA

Hoffmann, A., Kwok, R., & Compton, P. (2001). Using subclasses to improve classification learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2167, pp. 203–213). Springer Verlag. https://doi.org/10.1007/3-540-44795-4_18
