To clean or not to clean phenotypic datasets for outlier plants in genetic analyses?

This article is free to access.

Abstract

Based on case studies, we discuss the extent to which genome-wide association studies (GWAS) are affected by outlier plants, i.e. those deviating from the expected distribution on a multi-criteria basis. Using a raw dataset consisting of daily measurements of leaf area, biomass, and plant height for thousands of plants, we tested three different cleaning methods for their effects on genetic analyses. No-cleaning resulted in the highest number of dubious quantitative trait loci, especially at loci with highly unbalanced allelic frequencies. A trade-off was identified between the risk of false positives (with no-cleaning and/or a low threshold for minor allele frequency) and the risk of missing interesting rare alleles. Cleaning can lower the risk of the latter by making it possible to choose a lower minor allele frequency threshold in GWAS.

Alvarez Prado, S., Sanchez, I., Cabrera-Bosquet, L., Grau, A., Welcker, C., Tardieu, F., & Hilgert, N. (2019, August 1). To clean or not to clean phenotypic datasets for outlier plants in genetic analyses? Journal of Experimental Botany. Oxford University Press. https://doi.org/10.1093/jxb/erz191
