Seamless integration of data mining with DBMS and applications

Hongjun Lu

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2001), Vol. 2035, p. 3

DOI: 10.1007/3-540-45357-1_3

Abstract

Data mining has been widely recognized as a powerful tool for exploring added value from data accumulated in the daily operations of an organization. A large number of data mining algorithms have been developed during the past decade. Those algorithms can be roughly divided into two groups. The first group of techniques, such as classification, clustering, prediction and deviation analysis, has been studied for a long time in machine learning, statistics, and other fields. The second group of techniques, such as association rule mining, mining in spatial-temporal databases and mining from the Web, addresses problems related to large amounts of data. Most classical algorithms in the first group assume that the data to be mined is somehow available in memory. Although initial effort in data mining has concentrated on making those algorithms scalable with respect to large volumes of data, most of those scalable algorithms, even those developed by database researchers, are still stand-alone. It is often assumed that data is available in the desired form, without considering the fact that most organizations store their data in databases managed by database management systems (DBMS). As such, most data mining algorithms can only be loosely coupled with the data infrastructures of organizations and are difficult to infuse into existing mission-critical applications. Seamlessly integrating data mining techniques with database applications and database management systems remains an open problem. In this paper, we propose to tackle the problem of seamless integration of data mining with DBMS and applications from three directions. First, with the recent development of database technology, most database management systems have extended their functionality in data analysis. Such capability should be fully explored to develop DBMS-aware data mining algorithms. Ideally, data mining algorithms can be fully implemented using DBMS-supported functions so that they become database applications themselves. Second, the major difficulties in integrating data mining with applications are algorithm selection and parameter setting. Reducing or eliminating mining parameters as much as possible and developing automatic or semi-automatic mining algorithm selection techniques will greatly increase the application friendliness of data mining systems. Lastly, standardizing the interface among databases, data mining algorithms and applications can also facilitate the integration to a certain extent.
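
As one illustration of the first direction in the abstract (a mining step expressed entirely through DBMS-supported functions), the sketch below counts frequent item pairs with a single SQL query. It is not taken from the paper: the `baskets` table, its columns, the sample rows, and the `frequent_pairs` helper are all assumptions made for this example, and SQLite merely stands in for a full DBMS.

```python
import sqlite3


def frequent_pairs(conn, min_support):
    """Return (item_a, item_b, support) for pairs of items that appear
    together in at least `min_support` transactions.

    The counting is done inside the database by a self-join and GROUP BY;
    the application only passes the threshold and reads the result.
    """
    query = """
        SELECT a.item, b.item, COUNT(*) AS support
        FROM baskets AS a
        JOIN baskets AS b
          ON a.tid = b.tid AND a.item < b.item  -- each unordered pair once
        GROUP BY a.item, b.item
        HAVING COUNT(*) >= ?
        ORDER BY support DESC, a.item, b.item
    """
    return conn.execute(query, (min_support,)).fetchall()


# Hypothetical sample data: one row per (transaction id, item).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE baskets (tid INTEGER, item TEXT, PRIMARY KEY (tid, item))"
)
conn.executemany(
    "INSERT INTO baskets VALUES (?, ?)",
    [
        (1, "bread"), (1, "milk"),
        (2, "bread"), (2, "milk"), (2, "eggs"),
        (3, "milk"), (3, "eggs"),
        (4, "bread"), (4, "milk"),
    ],
)

pairs = frequent_pairs(conn, min_support=2)
for item_a, item_b, support in pairs:
    print(item_a, item_b, support)
```

Because the data never leaves the database, the only mining parameter the application supplies here is the support threshold, which also touches on the abstract's second point about keeping parameters to a minimum.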

Cite

Lu, H. (2001). Seamless integration of data mining with DBMS and applications. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2035, p. 3). Springer Verlag. https://doi.org/10.1007/3-540-45357-1_3
