Matching Data Mining Algorithm Suitability to Data Characteristics Using a Self-Organizing Map

Kate A. Smith; Frederick Woo; Vic Ciesielski; Remzi Ibrahim

Book Chapter

Physica-Verlag HD (2002), pp. 169-179

DOI: 10.1007/978-3-7908-1782-9_13

Abstract

The vast range of data mining algorithms available for learning classification problems has encouraged a trial-and-error approach to finding the best model. This problem is exacerbated by the fact that little is known about which techniques are suited to which types of problems. This paper provides some insights into the data characteristics that suit particular data mining algorithms. Our approach consists of four main stages. First, the performance of six leading data mining algorithms is examined across a collection of 57 well-known classification problems from the machine learning literature. Secondly, a collection of statistics that describe each of the 57 problems in terms of data complexity is collated. Thirdly, a self-organising map (SOM) is used to cluster the 57 problems based on these measures of complexity. Each cluster represents a group of classification problems with similar data characteristics. The performance of each data mining algorithm within each cluster is then examined in the final stage to provide both quantitative and qualitative insights into which techniques perform best on certain problem types.
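
The four stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the six algorithms, the complexity statistics, or the SOM configuration, so the algorithm names (`algo_a`, `algo_b`), the two meta-features, the 1x2 grid, the training schedule, and every numeric value below are invented placeholders. The sketch only shows the general shape of the idea: describe each problem by a feature vector, cluster the problems with a small SOM, then average each algorithm's accuracy within each cluster.

```python
import math
import random


def min_max_scale(rows):
    """Scale every feature column to [0, 1]; constant columns become 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, grid_rows, grid_cols, epochs=100, lr0=0.5, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight_vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(grid_rows)
        for c in range(grid_cols)
    }
    sigma0 = max(grid_rows, grid_cols) / 2.0
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x = data[i]
            decay = 1.0 - step / total_steps      # linear decay (an assumption)
            lr = lr0 * decay
            sigma = max(sigma0 * decay, 0.5)      # keep the neighbourhood non-zero
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


def mean_accuracy_by_cluster(clusters, accuracies):
    """Average each algorithm's accuracy over the problems in each cluster."""
    sums, counts = {}, {}
    for node, acc in zip(clusters, accuracies):
        counts[node] = counts.get(node, 0) + 1
        bucket = sums.setdefault(node, {})
        for algo, value in acc.items():
            bucket[algo] = bucket.get(algo, 0.0) + value
    return {
        node: {algo: s / counts[node] for algo, s in bucket.items()}
        for node, bucket in sums.items()
    }


# Invented placeholder values, NOT data or results from the chapter.
# Each row is one hypothetical problem: [number of instances, number of attributes].
meta_features = [
    [150, 4], [200, 5], [180, 4],
    [5000, 60], [4800, 55], [5200, 58],
]
# Hypothetical per-problem accuracies for two hypothetical algorithms.
accuracies = [
    {"algo_a": 0.90, "algo_b": 0.80},
    {"algo_a": 0.88, "algo_b": 0.82},
    {"algo_a": 0.91, "algo_b": 0.79},
    {"algo_a": 0.70, "algo_b": 0.85},
    {"algo_a": 0.72, "algo_b": 0.86},
    {"algo_a": 0.69, "algo_b": 0.84},
]

scaled = min_max_scale(meta_features)
som = train_som(scaled, grid_rows=1, grid_cols=2)
clusters = [best_matching_unit(som, x) for x in scaled]

for node, means in sorted(mean_accuracy_by_cluster(clusters, accuracies).items()):
    print(node, means)
```

Scaling the features first matters because the SOM compares problems by Euclidean distance, so an unscaled large-valued feature would otherwise dominate the clustering. A real study would use a larger grid and many more descriptive statistics per problem.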

Citation (APA)

Smith, K. A., Woo, F., Ciesielski, V., & Ibrahim, R. (2002). Matching Data Mining Algorithm Suitability to Data Characteristics Using a Self-Organizing Map. In Hybrid Information Systems (pp. 169–179). Physica-Verlag HD. https://doi.org/10.1007/978-3-7908-1782-9_13
