Data Mining in Integrated Data Access and Data Analysis Systems

  • Yang R
  • Kafatos M
  • Yang K
  • et al.

Abstract

The rapid increase in the volume of scientific data sets has led to the development of distributed data information systems for Earth system science. Such a system should help users locate data sets, provide preliminary research results quickly, and support data delivery at users' request. At George Mason University, we have been developing a data information system with both search and analysis components. The system supports three phases of data access: phase one for meta-data search, phase two for on-line data analysis, and phase three for data ordering. For large volumes of data, searching on meta-data alone is not adequate. Scientists often need to search for data based on actual data values. This is a particular kind of data mining, one that searches for data sets based on data content. In this chapter, we first describe the system architecture. We then develop the concept of a data pyramid model and propose a histogram clustering technique for content-based searches. We use the model and the related technique to answer content-based queries approximately but efficiently. We also describe our prototypes, which integrate content-based searches into a data information system.
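To make the idea of answering a content-based query "approximately but efficiently" more concrete, the following is a minimal sketch and not the chapter's method. It assumes each data region is summarized ahead of time by a fixed-bin histogram of its values, and that a query asks which regions have at least a given fraction of values inside a range. The data pyramid model and the histogram clustering step are not reproduced here; the names `build_histogram`, `estimate_fraction`, and `search`, the bin edges, and the sample values are all hypothetical.

```python
from bisect import bisect_right


def build_histogram(values, edges):
    """Count values per bin. Bins are [edges[i], edges[i+1]); the last bin
    also includes its right edge. Values outside the edges are ignored."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue
        i = min(bisect_right(edges, v) - 1, len(counts) - 1)
        counts[i] += 1
    return counts


def estimate_fraction(counts, edges, lo, hi):
    """Estimate the fraction of values in [lo, hi] from the histogram alone,
    assuming (as a simplification) values are spread uniformly in each bin."""
    total = sum(counts)
    if total == 0:
        return 0.0
    estimate = 0.0
    for i, count in enumerate(counts):
        left, right = edges[i], edges[i + 1]
        overlap = max(0.0, min(hi, right) - max(lo, left))
        estimate += count * overlap / (right - left)
    return estimate / total


def search(regions, edges, lo, hi, min_fraction):
    """Return names of regions whose estimated fraction of values in
    [lo, hi] is at least min_fraction, without touching the raw data."""
    return [
        name
        for name, counts in regions.items()
        if estimate_fraction(counts, edges, lo, hi) >= min_fraction
    ]


# Hypothetical summaries for two regions (values are illustrative only).
edges = [0, 10, 20, 30, 40]
regions = {
    "A": build_histogram([5, 12, 25, 26, 35], edges),
    "B": build_histogram([1, 2, 3, 4, 38], edges),
}
matches = search(regions, edges, lo=20, hi=40, min_fraction=0.5)
```

The query is answered from the stored histograms only, so the cost depends on the number of regions and bins rather than on the size of the underlying data. The answer is approximate whenever the query range cuts through a bin.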

Cite

Yang, R., Kafatos, M., Yang, K.-S., & Wang, X. S. (2001). Data Mining in Integrated Data Access and Data Analysis Systems (pp. 183–199). https://doi.org/10.1007/978-1-4615-1733-7_11
