Data mining is the semi-automatic discovery of patterns, associations, changes, anomalies, and statistically significant structures and events in data. Traditional data analysis is assumption-driven in the sense that a hypothesis is formed and validated against the data. Data mining, in contrast, is data-driven in the sense that patterns are automatically extracted from data. The goal of this tutorial is to provide an introduction to data mining techniques. The focus will be on methods appropriate for mining massive datasets using techniques from scalable and high-performance computing. The techniques covered include association rules, sequence mining, decision tree classification, and clustering. Some aspects of preprocessing and postprocessing are also covered. The problem of predicting contact maps for protein sequences is used as a detailed case study. The material presented here is compiled by LW based on the original tutorial slides of MJZ at the 2002 Post-Genome Knowledge Discovery Programme
SAKURAI, S. (2009). Data Mining Techniques. Journal of Japan Society for Fuzzy Theory and Intelligent Informatics, 21(3), 348–357. https://doi.org/10.3156/jsoft.21.3_348