Biodiversity big-data (BBD) has the potential to answer some unresolved questions at spatial and taxonomic scales that were previously inaccessible. However, BBDs contain serious errors and biases. Therefore, any study that uses BBD should ask whether data quality is sufficient to provide a reliable answer to the research question. We propose that the question of data quality and the research question can be addressed simultaneously, by binding data-cleaning to data analysis. The change in signal between the pre- and post-cleaning phases, in addition to the signal itself, can be used to evaluate the findings, their implications, and their robustness. This approach includes five steps: 1. Downloading raw occurrence data from a BBD. 2. Analyzing the data, through statistical and/or simulation modeling, to answer the research question, using the raw data after only the necessary basic cleaning. This step is similar to common practice.
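The idea of comparing the signal before and after cleaning can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the authors' workflow: the occurrence records are invented, species richness stands in for whatever signal a study actually measures, and the cleaning rules (`basic_clean`, `further_clean`) are hypothetical examples of basic and stricter cleaning.

```python
# Hypothetical occurrence records: (species, latitude, longitude).
RAW_RECORDS = [
    ("Lacerta agilis", 52.1, 13.4),
    ("Lacerta agilis", 52.1, 13.4),    # exact duplicate
    ("Vipera berus", 60.2, 24.9),
    ("Vipera berus", None, None),      # missing coordinates
    ("Anguis fragilis", 95.0, 10.0),   # impossible latitude
    ("Natrix natrix", 48.8, 2.3),
]


def basic_clean(records):
    """Drop records without coordinates (assumed 'necessary basic cleaning')."""
    return [r for r in records if r[1] is not None and r[2] is not None]


def further_clean(records):
    """Drop impossible coordinates and exact duplicates (illustrative rules)."""
    seen = set()
    kept = []
    for rec in records:
        _, lat, lon = rec
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            continue
        if rec in seen:
            continue
        seen.add(rec)
        kept.append(rec)
    return kept


def species_richness(records):
    """Signal used in this sketch: number of distinct species."""
    return len({species for species, _, _ in records})


def signal_change(records):
    """Return the signal before and after further cleaning, and the change."""
    pre = basic_clean(records)
    post = further_clean(pre)
    signal_pre = species_richness(pre)
    signal_post = species_richness(post)
    return signal_pre, signal_post, signal_post - signal_pre


if __name__ == "__main__":
    pre, post, delta = signal_change(RAW_RECORDS)
    print(f"richness before cleaning: {pre}")
    print(f"richness after cleaning:  {post}")
    print(f"change in signal:         {delta}")
```

In this toy case the richness drops by one species once the record with an impossible latitude is removed, which suggests that part of the pre-cleaning signal depended on a data error. A change of zero would instead suggest that the finding is robust to these particular cleaning rules.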
Gueta, T., & Carmel, Y. (2017). Integrating data-cleaning with data analysis to enhance usability of biodiversity big-data. Proceedings of TDWG, 1, e20244. https://doi.org/10.3897/tdwgproceedings.1.20244