Matching Data Mining Algorithm Suitability to Data Characteristics Using a Self-Organizing Map

  • Smith K
  • Woo F
  • Ciesielski V
  • et al.

Abstract

The vast range of data mining algorithms available for learning classification problems has encouraged a trial-and-error approach to finding the best model. This problem is exacerbated by the fact that little is known about which techniques are suited to which types of problems. This paper provides some insights into the data characteristics that suit particular data mining algorithms. Our approach consists of four main stages. First, the performance of six leading data mining algorithms is examined across a collection of 57 well-known classification problems from the machine learning literature. Secondly, a collection of statistics that describe each of the 57 problems in terms of data complexity is collated. Thirdly, a self-organising map (SOM) is used to cluster the 57 problems based on these measures of complexity. Each cluster represents a group of classification problems with similar data characteristics. The performance of each data mining algorithm within each cluster is then examined in the final stage to provide both quantitative and qualitative insights into which techniques perform best on certain problem types.
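
The four stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the six algorithms, the complexity measures, or the SOM configuration, so the 3x3 map, the five measures, the placeholder names `algo_0` to `algo_5`, and all data below are assumptions made purely for illustration. The data is randomly generated and carries no experimental meaning.

```python
import numpy as np


def train_som(data, rows=3, cols=3, epochs=200, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows*cols, n_features)."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(rows * cols, data.shape[1]))
    # Grid coordinates of each node, used by the Gaussian neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                 # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.1     # shrinking neighbourhood radius
        for x in data[rng.permutation(len(data))]:
            # Best-matching unit: the node whose weight vector is closest to x.
            bmu = np.argmin(((weights - x) ** 2).sum(axis=1))
            dist2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-dist2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
    return weights


def assign_clusters(data, weights):
    """Map each problem to the index of its best-matching SOM node."""
    d = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)


def best_algorithm_per_cluster(clusters, accuracy, names):
    """For each occupied node, name the algorithm with the highest mean accuracy."""
    result = {}
    for c in np.unique(clusters):
        mean_acc = accuracy[clusters == c].mean(axis=0)
        result[int(c)] = names[int(mean_acc.argmax())]
    return result


# Synthetic stand-ins (assumptions): 57 problems, 5 complexity measures, 6 algorithms.
rng = np.random.default_rng(42)
n_problems, n_measures = 57, 5
features = rng.normal(size=(n_problems, n_measures))
features = (features - features.mean(axis=0)) / features.std(axis=0)  # standardise
names = [f"algo_{i}" for i in range(6)]          # hypothetical algorithm labels
accuracy = rng.uniform(0.5, 1.0, size=(n_problems, len(names)))  # stage 1 stand-in

weights = train_som(features)                    # stage 3: fit the SOM
clusters = assign_clusters(features, weights)    # stage 3: cluster the problems
summary = best_algorithm_per_cluster(clusters, accuracy, names)  # stage 4
```

Here `summary` maps each occupied SOM node to the placeholder algorithm with the highest mean accuracy among the problems assigned to it. With real data, `features` would hold the collated complexity statistics and `accuracy` the measured performance of each algorithm on each problem.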

Cite

Smith, K. A., Woo, F., Ciesielski, V., & Ibrahim, R. (2002). Matching Data Mining Algorithm Suitability to Data Characteristics Using a Self-Organizing Map. In Hybrid Information Systems (pp. 169–179). Physica-Verlag HD. https://doi.org/10.1007/978-3-7908-1782-9_13
