Implicit bias of encoded variables: Frameworks for addressing structured bias in EHR–GWAS data

Hillary R. Dueñas; Carina Seah; Jessica S. Johnson; Laura M. Huckins

Article (Open Access)

Human Molecular Genetics

DOI: 10.1093/hmg/ddaa192

13 Citations

30 Readers

Abstract

The ‘discovery’ stage of genome-wide association studies required amassing large, homogeneous cohorts. To attain clinically useful insights, we must now consider the presentation of disease within our clinics and, by extension, within our medical records. Large-scale use of electronic health record (EHR) data can help to understand phenotypes in a scalable manner, incorporating lifelong and whole-phenome context. However, extending analyses to incorporate EHR and biobank-based data will require careful consideration of phenotype definition. Judgements and clinical decisions that occur ‘outside’ the system inevitably contain some degree of bias and become encoded in EHR data. Any algorithmic approach to phenotypic characterization that assumes unbiased variables will generate compounded, biased conclusions. Here, we discuss and illustrate potential biases inherent within EHR analyses and how these may be compounded across time, and we suggest frameworks for large-scale phenotypic analysis to minimize and uncover encoded bias.

Citation (APA)

Dueñas, H. R., Seah, C., Johnson, J. S., & Huckins, L. M. (2020, September 15). Implicit bias of encoded variables: Frameworks for addressing structured bias in EHR–GWAS data. Human Molecular Genetics. Oxford University Press. https://doi.org/10.1093/hmg/ddaa192
