Lessons Learned From Building a Data Platform for Longitudinal, Analytical Use Cases and Scaling to 77 German Hospitals: Implementation Report

0Citations
Citations of this article
15Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Background: Increasing adoption of electronic health records (EHRs) enables research on real-world data. In Germany, this has been limited to university hospitals, and data from acute care hospitals below the university level are lacking. To address this, we used established design patterns to build a data platform that aggregates and standardizes pseudonymized EHR data with patients’ consent. Objective: We report on the design and implementation of the research platform, as well as patient participation and lessons learned during the scaling of the platform, to incorporate real-world data (with participant consent) from 77 hospitals into a unified data lake. Methods: Due to variations in EHR adoption, IT infrastructure, software vendors, interface availability, and regulatory requirements, we used an agile development cycle that involves constant, incremental standardization of data. We implemented a layered lambda infrastructure built on Apache Hadoop. Decentralized connectors ensured data minimization and pseudonymization. Implementation (Results): We successfully scaled our data model both vertically and horizontally in 77 hospitals. However, we encountered issues during the scaling of real-time data pipelines and IHE (Integrating the Healthcare Enterprise) interfaces. During the first 2 years, patients were asked to consent to secondary data use 1,475,244 times during inpatient admission. We registered 1,023,633 broad instances of consent (consent rate 70.2%). Conclusions: Patients are generally willing to provide consent for secondary use of their data, but obtaining consent requires considerable effort. Building a research data platform is not an end goal, but rather a necessary step in collecting and standardizing longitudinal data to enable research on real-world data. Through the combination of agile development, phased rollouts, and very high levels of automation, we have been able to achieve fast turnaround times for incorporating user feedback and are constantly improving data quality and standardization.

Cite

CITATION STYLE

APA

Bockhacker, M., Martens, P., von Münchow, C., Löser, S., Günther, R., Kuhlen, R., … Ortleb, S. (2025). Lessons Learned From Building a Data Platform for Longitudinal, Analytical Use Cases and Scaling to 77 German Hospitals: Implementation Report. JMIR Medical Informatics, 13. https://doi.org/10.2196/69853

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free