Towards a generic infrastructure for sustainable management of quality controlled primary data
Primary data in scientific research is currently collected in numerous repositories. Frequently, these repositories have not been designed to support the long-term evolution of data, processes, and tools. Furthermore, many repositories have been set up for the specific needs of a single research project and are no longer maintained once the project ends. Finally, quality control and data provenance are not addressed to a sufficient extent. Based on experience gained in a joint project with biologists in the domain of biodiversity informatics, we propose a generic infrastructure for sustainable management of quality controlled primary data. The infrastructure encompasses both project and institutional repositories and provides a process for migrating project data into institutional repositories. Evolution and adaptability are supported through a generic approach with respect to the underlying data schemas, processes, and tools. Specific emphasis is placed on quality assurance and data provenance.