Summary: Despite its great capability to detect rare variant associations, next-generation sequencing is still prohibitively expensive when applied to large samples. In case-control studies, it is thus appealing to sequence only a subset of cases to discover variants, and then to genotype the identified variants in controls and the remaining cases, under the reasonable assumption that causal variants are usually enriched among cases. However, this approach leads to inflated type-I error if analyzed naively for rare variant association. Several methods have been proposed in the recent literature to control type-I error at the cost of either excluding some sequenced cases or correcting the genotypes of discovered rare variants. All of these approaches thus suffer some loss of information and are consequently underpowered. We propose a novel method (BETASEQ), which corrects the inflation of type-I error by supplementing pseudo-variants while keeping the original sequence and genotype data intact. Extensive simulations and real data analysis demonstrate that, in most practical situations, BETASEQ leads to higher testing power than existing approaches with guaranteed (controlled or conservative) type-I error.
Availability and implementation: BETASEQ and associated R files, including documentation and examples, are available at http://www.unc.edu/∼yunmli/betaseq
Contact: yunli@med.unc.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
© 2013 The Author.
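The inflation described in the summary can be illustrated with a small null simulation. The sketch below is not BETASEQ and is not taken from the paper. It assumes a toy model in which every individual carries each rare variant independently with the same probability, and it uses a deliberately simple burden comparison (a z-statistic on total carrier counts, which is only reasonable for equal group sizes). The function name, the parameter values and the test statistic are all hypothetical choices made for illustration.

```python
import math
import random


def simulate_rejection_rate(discover_in, n_cases=300, n_controls=300, n_seq=60,
                            n_sites=40, carrier_prob=0.004, n_reps=200, seed=1):
    """Empirical type-I error of a naive burden comparison under the null.

    discover_in: "sequenced_cases" keeps only sites seen in the first n_seq
    cases (the partial-sequencing design); "everyone" keeps any site seen in
    any sample (an unbiased reference design).
    All parameters are illustrative assumptions, not values from the paper.
    """
    if discover_in not in ("sequenced_cases", "everyone"):
        raise ValueError("discover_in must be 'sequenced_cases' or 'everyone'")
    if n_cases != n_controls:
        raise ValueError("this simplified z-statistic assumes equal group sizes")

    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_reps):
        case_total = 0
        control_total = 0
        for _ in range(n_sites):
            # Null model: cases and controls share the same carrier probability.
            cases = [rng.random() < carrier_prob for _ in range(n_cases)]
            controls = [rng.random() < carrier_prob for _ in range(n_controls)]
            if discover_in == "sequenced_cases":
                discovered = any(cases[:n_seq])
            else:
                discovered = any(cases) or any(controls)
            if discovered:
                # Only discovered sites are genotyped and enter the burden count.
                case_total += sum(cases)
                control_total += sum(controls)
        total = case_total + control_total
        if total == 0:
            continue  # nothing discovered, so nothing to test
        # With equal group sizes, carrier events split roughly 50/50 under the null.
        z = (case_total - control_total) / math.sqrt(total)
        if abs(z) > 1.96:  # two-sided test at a nominal 5% level
            rejections += 1
    return rejections / n_reps


if __name__ == "__main__":
    print("discovery in sequenced cases only:",
          simulate_rejection_rate("sequenced_cases"))
    print("discovery in all samples:",
          simulate_rejection_rate("everyone"))
```

In this toy setting, discovery restricted to sequenced cases guarantees that every retained site has at least one case carrier, so the naive comparison rejects far more often than the nominal 5% even though no variant is associated with disease. Discovery in all samples stays close to the nominal level.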
CITATION STYLE
Yan, S., & Li, Y. (2014). BETASEQ: A powerful novel method to control type-I error inflation in partially sequenced data for rare variant association testing. Bioinformatics, 30(4), 480–487. https://doi.org/10.1093/bioinformatics/btt719