Study of large and highly stratified population datasets by combining iterative pruning principal component analysis and structure

Tulaya Limpiti; Apichart Intarapanich; Anunchai Assawamakin; Philip J Shaw; Pongsakorn Wangkumhang; Jittima Piriyapongsa; Chumpol Ngamphiw; Sissades Tongsima

Journal Article, Open Access

BMC Bioinformatics (2011) 12(1)

DOI: 10.1186/1471-2105-12-255

Abstract

The ever-increasing size of population genetic datasets poses great challenges for population structure analysis. The Tracy-Widom (TW) statistical test is widely used for detecting structure. However, it has not been adequately investigated whether the TW statistic is susceptible to type I error, especially in large, complex datasets. Non-parametric methods based on Principal Component Analysis (PCA) that rely on the TW test have been developed for resolving structure. Although PCA-based methods can resolve structure, they cannot infer ancestry. Model-based methods are still needed for ancestry analysis, but they are not suitable for large datasets. We propose a new structure analysis framework for large datasets. It includes a new heuristic for detecting structure and incorporates the structure patterns inferred by a PCA method to complement STRUCTURE analysis.

Cite

Limpiti, T., Intarapanich, A., Assawamakin, A., Shaw, P. J., Wangkumhang, P., Piriyapongsa, J., … Tongsima, S. (2011). Study of large and highly stratified population datasets by combining iterative pruning principal component analysis and structure. BMC Bioinformatics, 12(1). https://doi.org/10.1186/1471-2105-12-255
