Practical Lessons from Generating Synthetic Healthcare Data with Bayesian Networks

Abstract

Healthcare data holds huge societal and monetary value. It contains information about how disease manifests within populations over time, and could therefore be used to improve public health dramatically. For the growing AI-in-health industry, this data offers huge potential for generating markets for new healthcare technologies. However, primary care data is extremely sensitive: it contains highly personal information about individuals. As a result, many countries are reluctant to release this resource. This paper explores some key issues in the use of synthetic data as a substitute for real primary care data: handling the complexities of real-world data to transparently capture realistic distributions and relationships, modelling time, and minimising the matching of real patients to synthetic datapoints. We show that, if the correct modelling approaches are used, transparency and trust can be ensured in the underlying distributions and relationships of the resulting synthetic datasets. Moreover, these datasets offer a strong level of privacy through lower risks of identifying real patients.

Cite

de Benedetti, J., Oues, N., Wang, Z., Myles, P., & Tucker, A. (2020). Practical Lessons from Generating Synthetic Healthcare Data with Bayesian Networks. In Communications in Computer and Information Science (Vol. 1323, pp. 38–47). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-65965-3_3
