Procedure to select the best dataset for a task

Andrew U. Frank; Eva Grum; Bérengère Vasseur

Journal Article

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2004) 3234 81-93

DOI: 10.1007/978-3-540-30231-5_6

17 Citations

21 Readers

Abstract

This paper models the decision process of selecting, from several datasets, the one most suitable for a task. It shows how metadata describing the quality of each dataset and descriptions of the task are used to make this decision. A simple comparison of task requirements with available data quality is supplemented with general, common-sense knowledge about the effects of errors, lack of precision in the data, and the dilution of quality over time. The procedure consists of two steps: first, compute the data quality, taking into account the time elapsed since data collection; second, assess the utility of the available data for the decision. A practical example assessing the suitability of two datasets for two different tasks is computed and leads to the intuitively expected result. © Springer-Verlag 2004.
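
The two steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' model, whose formulas the abstract does not give. The linear growth of error with age, the single positional-error quality measure, the utility formula, and all names and numbers (`Dataset`, `Task`, `drift_m_per_year`, `old_survey`, `recent_map`) are assumptions made here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    positional_error_m: float  # error at collection time (hypothetical metadata field)
    age_years: float           # time elapsed since data collection
    drift_m_per_year: float    # ASSUMPTION: error grows linearly with age


@dataclass
class Task:
    name: str
    max_error_m: float         # largest error the task can tolerate (hypothetical)


def current_error(ds: Dataset) -> float:
    """Step 1: estimate present-day error from collection-time error and age."""
    return ds.positional_error_m + ds.drift_m_per_year * ds.age_years


def utility(ds: Dataset, task: Task) -> float:
    """Step 2: score in [0, 1]; 0 means the task's tolerance is used up or exceeded."""
    return max(0.0, 1.0 - current_error(ds) / task.max_error_m)


def best_dataset(datasets: list[Dataset], task: Task) -> Dataset:
    """Pick the dataset with the highest utility for the given task."""
    return max(datasets, key=lambda ds: utility(ds, task))


# Invented example data: an old, precise survey and a newer, coarser map.
old_survey = Dataset("old_survey", positional_error_m=0.5, age_years=10, drift_m_per_year=0.2)
recent_map = Dataset("recent_map", positional_error_m=2.0, age_years=1, drift_m_per_year=0.2)
navigation = Task("navigation", max_error_m=5.0)

if __name__ == "__main__":
    choice = best_dataset([old_survey, recent_map], navigation)
    print(choice.name, round(utility(choice, navigation), 2))
```

With these invented numbers, the older survey was more precise when collected, but its assumed loss of quality over ten years leaves it with a larger present-day error than the newer map, so the newer map is selected.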

Cite (APA)

Frank, A. U., Grum, E., & Vasseur, B. (2004). Procedure to select the best dataset for a task. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3234, 81–93. https://doi.org/10.1007/978-3-540-30231-5_6
