Abstract
Based on case studies, we discuss the extent to which genome-wide association studies (GWAS) are affected by outlier plants, i.e. those deviating from the expected distribution on a multi-criteria basis. Using a raw dataset consisting of daily measurements of leaf area, biomass, and plant height for thousands of plants, we tested three cleaning methods for their effects on genetic analyses. Omitting cleaning resulted in the highest number of dubious quantitative trait loci, especially at loci with highly unbalanced allelic frequencies. A trade-off was identified between the risk of false positives (with no cleaning and/or a low threshold for minor allele frequency) and the risk of missing interesting rare alleles. Cleaning can lower the risk of the latter by making it possible to choose a lower minor allele frequency threshold in GWAS.
Cite
Alvarez Prado, S., Sanchez, I., Cabrera-Bosquet, L., Grau, A., Welcker, C., Tardieu, F., & Hilgert, N. (2019, August 1). To clean or not to clean phenotypic datasets for outlier plants in genetic analyses? Journal of Experimental Botany. Oxford University Press. https://doi.org/10.1093/jxb/erz191