Getting access to real medical data for research is notoriously difficult. Even when data exist, they are usually incomplete and subject to confidentiality and privacy restrictions. Synthetic data (SD) are the best replacement for real data, but they must be verifiably realistic. There has been little or no investigation into systematically achieving realism in SD. This work investigates this problem and contributes the ATEN framework, which incorporates three component approaches: (1) THOTH for synthetic data generation (SDG); (2) RA for characterising realism in SD; and (3) HORUS for validating realism in SD. The framework was found promising after its use in generating the realistic synthetic electronic health record (RS-EHR) for labour and birth. The framework is significant for guaranteeing realism in SDG projects. Future efforts will focus on further validation of ATEN in a controlled multi-stream SDG process.
McLachlan, S., Dube, K., Gallagher, T., Simmonds, J. A., & Fenton, N. (2019). Realistic Synthetic Data Generation: The ATEN Framework. In Communications in Computer and Information Science (Vol. 1024, pp. 497–523). Springer Verlag. https://doi.org/10.1007/978-3-030-29196-9_25