A data quality assessment model and its application to cybersecurity data sources

Abstract

The proliferation of large storage systems such as data lakes and big data platforms means that companies and public institutions need to evaluate the quality of the data they collect, in order to ensure that the decisions they take are the most adequate. In some cases this requires assessing the institution itself, or the data sources that provide the data. Current methods to evaluate data quality focus primarily on traditional storage systems such as relational databases. In this work, we present a multidimensional data quality evaluation model. We propose a set of data quality dimensions and present an assessment methodology for each of them. The quality of each data source is computed by a mathematical formula that yields a quality score, which lets us rank the data sources. We also present a software tool that automatically performs the presented quality evaluation. This tool is flexible enough to be adapted to different datastore systems. In particular, the model is applied to a real datastore of cybersecurity events with data collected from 27 different data sources, which obtained quality values between −0.125 and 0.719.
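
The abstract does not reproduce the scoring formula, the dimensions, or their weights. As a purely illustrative sketch of the general idea (per-dimension scores aggregated into one score per source, then ranked), the following assumes a weighted mean. The dimension names, the weights, the per-dimension range of [-1, 1], and the functions `quality_score` and `rank_sources` are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical dimensions and weights; the abstract does not list the real ones.
WEIGHTS: Dict[str, float] = {
    "completeness": 0.5,
    "accuracy": 0.25,
    "timeliness": 0.25,
}


def quality_score(dimension_scores: Dict[str, float],
                  weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted mean of per-dimension scores (an assumed aggregation,
    not the formula from the paper)."""
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[d] * dimension_scores[d] for d in weights)
    return weighted_sum / total_weight


def rank_sources(sources: Dict[str, Dict[str, float]]) -> List[Tuple[str, float]]:
    """Return (source name, score) pairs ordered from highest to lowest score."""
    scored = [(name, quality_score(dims)) for name, dims in sources.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up per-dimension scores in [-1, 1], for illustration only.
    example = {
        "source_a": {"completeness": 1.0, "accuracy": 0.5, "timeliness": 0.5},
        "source_b": {"completeness": -1.0, "accuracy": 0.5, "timeliness": 0.0},
    }
    for name, score in rank_sources(example):
        print(name, score)
```

Allowing negative per-dimension scores is one simple way for an aggregate to fall below zero, as the reported lower bound of −0.125 does. Whether the paper's formula works this way is not stated in the abstract.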

Cite

DeCastro-García, N., & Pinto, E. (2021). A data quality assessment model and its application to cybersecurity data sources. In Advances in Intelligent Systems and Computing (Vol. 1267 AISC, pp. 263–272). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-57805-3_25
