Abstract
Objective: Simulated samples have been widely used in the development of efficient statistical methods identifying genetic variants that predispose to human genetic diseases. Although it is well known that natural selection has a strong influence on the number and diversity of rare genetic variations in human populations, existing simulation methods are limited in their ability to simulate multi-locus selection models with realistic distributions of the random fitness effects of newly arising mutants. Methods: We developed a computer program to simulate large populations of gene sequences using a forward-time simulation approach. This program is capable of simulating several multi-locus fitness schemes with arbitrary diploid single-locus selection models with random or locus-specific fitness effects. Arbitrary quantitative trait or disease models can be applied to the simulated populations from which individual- or family-based samples can be drawn and analyzed. Results: Using realistic demographic and natural selection models estimated from empirical sequence data, datasets simulated using our method differ significantly in the number and diversity of rare variants from datasets simulated using existing methods that ignore natural selection. Our program thus provides a useful tool to simulate datasets with realistic distributions of rare genetic variants for the study of genetic diseases caused by such variants. Copyright © 2011 S. Karger AG.
Author supplied keywords
Cite
CITATION STYLE
Peng, B., & Liu, X. (2011). Simulating sequences of the human genome with rare variants. Human Heredity, 70(4), 287–291. https://doi.org/10.1159/000323316
Register to see more suggestions
Mendeley helps you to discover research relevant for your work.