Data Mining in Integrated Data Access and Data Analysis Systems

Ruixin Yang; Menas Kafatos; Kwang-Su Yang; X. Sean Wang

Book Chapter

DOI: 10.1007/978-1-4615-1733-7_11

Abstract

The rapid growth in the volume of scientific data sets has led to distributed data information systems for Earth system science. Such a system should help users locate data sets, provide preliminary research results quickly, and deliver data at users' request. At George Mason University, we have been developing a data information system with both search and analysis components. The system supports three phases of data access: phase one for metadata search, phase two for on-line data analysis, and phase three for data ordering. For large volumes of data, searching on metadata alone is not adequate, because scientists often need to search for data based on actual data values. This is a particular kind of data mining, one that searches for data sets based on data content. In this chapter, we first describe the system architecture. We then develop the concept of a data pyramid model and propose a histogram clustering technique for content-based searches, and we use the model and technique to answer content-based queries approximately but efficiently. Finally, we describe our prototypes, which integrate content-based searches into a data information system.
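The abstract does not spell out the data pyramid model or the histogram clustering technique, so the following Python sketch is only an illustration of the general idea of answering a content-based query approximately from precomputed histograms. Everything in it is an assumption rather than the chapter's method: the function names `build_pyramid` and `approx_fraction`, the fixed square blocks, the 2x2 merging between levels, and the uniform-within-bin estimate are all hypothetical, and the clustering step itself is not shown.

```python
import numpy as np


def build_pyramid(grid, block, bins, levels):
    """Summarize a 2-D grid as per-block histograms at several resolutions.

    Hypothetical layout: level 0 holds one histogram per block x block tile;
    each coarser level sums the histograms of 2x2 groups of the level below.
    """
    edges = np.linspace(grid.min(), grid.max(), bins + 1)
    rows, cols = grid.shape[0] // block, grid.shape[1] // block
    base = np.zeros((rows, cols, bins), dtype=np.int64)
    for i in range(rows):
        for j in range(cols):
            tile = grid[i * block:(i + 1) * block, j * block:(j + 1) * block]
            base[i, j] = np.histogram(tile, bins=edges)[0]

    pyramid = [base]
    for _ in range(levels - 1):
        h = pyramid[-1]
        r, c = h.shape[0] // 2, h.shape[1] // 2
        if r == 0 or c == 0:
            break  # cannot coarsen any further
        # Sum each 2x2 group of neighbouring histograms into one.
        coarse = h[:2 * r, :2 * c].reshape(r, 2, c, 2, bins).sum(axis=(1, 3))
        pyramid.append(coarse)
    return edges, pyramid


def approx_fraction(edges, hists, lo, hi):
    """Estimate, per block, the fraction of values falling in [lo, hi].

    Only the histograms are read, not the raw data. Values are assumed to be
    spread uniformly inside each bin, so the answer is approximate.
    """
    left, right = edges[:-1], edges[1:]
    overlap = np.clip(np.minimum(right, hi) - np.maximum(left, lo), 0, None)
    weight = overlap / (right - left)
    counts = np.maximum(hists.sum(axis=-1), 1)  # avoid division by zero
    return (hists * weight).sum(axis=-1) / counts
```

Under these assumptions, a query such as "which regions have at least half of their values between `lo` and `hi`" could be screened by calling `approx_fraction` on a coarse level first and then descending only into the blocks that pass, which is one way a pyramid can trade accuracy for speed.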

Citation (APA)

Yang, R., Kafatos, M., Yang, K.-S., & Wang, X. S. (2001). Data Mining in Integrated Data Access and Data Analysis Systems (pp. 183–199). https://doi.org/10.1007/978-1-4615-1733-7_11
