Heuristic measures of interestingness

Robert J. Hilderman; Howard J. Hamilton

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (1999) 1704 232-241

DOI: 10.1007/978-3-540-48247-5_25

Abstract

The tuples in a generalized relation (i.e., a summary generated from a database) are unique and can therefore be considered a population whose structure can be described by some probability distribution. In this paper, we present and empirically compare sixteen heuristic measures that evaluate the structure of a summary and assign it a single real-valued index representing its interestingness relative to other summaries generated from the same database. The heuristics are based upon well-known measures of diversity, dispersion, dominance, and inequality used in several areas of the physical, social, ecological, management, information, and computer sciences. Their use for ranking summaries generated from databases is a new application area. All sixteen heuristics rank less complex summaries (i.e., those with few tuples and/or few non-ANY attributes) as most interesting. We demonstrate that, for sample data sets, the orders in which some of the measures rank summaries are highly correlated.
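
To make the idea of a single real-valued index concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not name the sixteen measures, so the sketch assumes two well-known diversity measures, Shannon entropy and the Simpson index, as stand-ins. The summaries, their tuple counts, and the function names are hypothetical, and the sort direction used for ranking is chosen only for illustration.

```python
import math


def proportions(counts):
    """Convert per-tuple counts of a summary into a probability distribution."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return [c / total for c in counts]


def shannon_entropy(counts):
    """Shannon entropy (in bits) of the tuple distribution."""
    return -sum(p * math.log2(p) for p in proportions(counts) if p > 0)


def simpson_index(counts):
    """Simpson index: the sum of squared tuple proportions."""
    return sum(p * p for p in proportions(counts))


def rank(summaries, measure, reverse):
    """Return summary names ordered by the index that `measure` assigns."""
    return sorted(summaries, key=lambda name: measure(summaries[name]),
                  reverse=reverse)


# Hypothetical summaries: each maps to the counts of its (unique) tuples.
summaries = {
    "A": [50, 50],
    "B": [25, 25, 25, 25],
    "C": [97, 1, 1, 1],
}

# Assumed sort directions, for illustration only.
by_simpson = rank(summaries, simpson_index, reverse=True)
by_shannon = rank(summaries, shannon_entropy, reverse=False)
print(by_simpson, by_shannon)
```

On this toy data the two measures produce the same ordering, which is the kind of agreement between rankings that the abstract reports for some of the measures on sample data sets.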

Citation (APA)

Hilderman, R. J., & Hamilton, H. J. (1999). Heuristic measures of interestingness. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1704, pp. 232–241). Springer Verlag. https://doi.org/10.1007/978-3-540-48247-5_25