Data quality management in a database cluster with lazy replication

  • Pape C
  • Gancarski S
  • Valduriez P

Abstract

We consider the use of a database cluster with lazy replication. In this context, controlling the quality of replicated data according to users' requirements is important for improving performance. However, existing approaches are limited to a particular aspect of data quality. In this paper, we propose a general model of data quality that distinguishes between the "freshness" and the "validity" of data. Data quality is expressed through measures of divergence from data of perfect quality. Users can thus specify the minimum level of quality required for their queries, and this information can be exploited to optimize query load balancing. We implemented our approach in our Refresco prototype. The results show that freshness control can significantly increase query throughput. They also show a significant improvement when freshness requirements are specified at the relation level rather than at the database level.

Author-supplied keywords

  • database cluster
  • databases
  • electronic information resources
  • freshness
  • information storage & retrieval systems
  • middleware
  • quality
  • query (information retrieval system)
  • replication
  • validity

Authors

  • Cecile Le Pape

  • Stephane Gancarski

  • Patrick Valduriez
