CAGI4 SickKids clinical genomes challenge: A pipeline for identifying pathogenic variants

Abstract

Compared with earlier, more restricted sequencing technologies, whole-genome sequencing offers the possibility of finding all causative variants in rare disease, but data quality issues and an overwhelming level of background variants complicate the analysis. The CAGI4 SickKids clinical genome challenge provided an opportunity to assess the landscape of variants found in a difficult set of 25 unsolved rare disease cases. To address the challenge, we developed a three-stage pipeline: first carefully analyzing data quality, then classifying high-quality gene-specific variants into seven categories, and finally examining each candidate variant for compatibility with the often complex phenotypes of these patients for final prioritization. Variants consistent with the phenotypes were found in 24 of the 25 cases, and in a number of these, variants in multiple genes were prioritized. Data quality analysis suggests that some of the selected variants are likely incorrect calls, complicating interpretation. The data providers followed up on three suggested variants with Sanger sequencing, and in one case, a prioritized variant was confirmed as likely causative by the referring physician, providing a diagnosis in a previously intractable case.

Cite

Pal, L. R., Kundu, K., Yin, Y., & Moult, J. (2017). CAGI4 SickKids clinical genomes challenge: A pipeline for identifying pathogenic variants. Human Mutation, 38(9), 1169–1181. https://doi.org/10.1002/humu.23257
