Compression distance can discriminate animals by genetic profile, build relationship matrices and estimate breeding values



Abstract

Background: Genetic relatedness is currently estimated by a combination of traditional pedigree-based approaches (i.e. numerator relationship matrices, NRM) and, given the recent availability of molecular information, using marker genotypes (via genomic relationship matrices, GRM). To date, GRM are computed by genome-wide pair-wise SNP (single nucleotide polymorphism) correlations.

Results: We describe a new estimate of genetic relatedness using the concept of normalised compression distance (NCD) that is borrowed from Information Theory. Analogous to GRM, the resultant compression relationship matrix (CRM) exploits numerical patterns in genome-wide allele order and proportion, which are known to vary systematically with relatedness. We explored properties of the CRM in two industry cattle datasets by analysing the genetic basis of yearling weight, a phenotype of moderate heritability. In both Brahman (Bos indicus) and Tropical Composite (Bos taurus by Bos indicus) populations, the clustering inferred by NCD was comparable to that based on SNP correlations using standard principal component analysis approaches. One of the versions of the CRM modestly increased the amount of explained genetic variance, slightly reduced the 'missing heritability' and tended to improve the prediction accuracy of breeding values in both populations when compared to both NRM and GRM. Finally, a sliding window-based application of the compression approach on these populations identified genomic regions influenced by introgression of taurine haplotypes.

Conclusions: For these two bovine populations, CRM reduced the missing heritability and increased the amount of explained genetic variation for a moderately heritable complex trait. Given that NCD can sensitively discriminate closely related individuals, we foresee CRM having possible value for estimating breeding values in highly inbred populations.
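As an illustration of the NCD concept named in the abstract, the sketch below computes the standard normalised compression distance, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(.) is a compressed length. It is a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not say which compressor or genotype encoding was used, so zlib, the 0/1/2 genotype strings, the toy animals and the helper names (compressed_size, ncd, ncd_matrix, mutate) are all hypothetical choices made for this example.

```python
import random
import zlib
from typing import List


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalised compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)  # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)


def ncd_matrix(genotypes: List[str]) -> List[List[float]]:
    """Pairwise NCD for a list of genotype strings (one string per animal)."""
    encoded = [g.encode("ascii") for g in genotypes]
    return [[ncd(a, b) for b in encoded] for a in encoded]


def mutate(genotype: str, n_changes: int, rng: random.Random) -> str:
    """Copy a genotype string and change n_changes randomly chosen sites."""
    sites = list(genotype)
    for pos in rng.sample(range(len(sites)), n_changes):
        sites[pos] = rng.choice([a for a in "012" if a != sites[pos]])
    return "".join(sites)


# Hypothetical toy data: genotypes coded as 0/1/2 allele counts per SNP.
rng = random.Random(0)
animal_a = "".join(rng.choice("012") for _ in range(2000))
animal_b = mutate(animal_a, 40, rng)  # close relative: 2% of sites differ
animal_c = "".join(rng.choice("012") for _ in range(2000))  # unrelated

matrix = ncd_matrix([animal_a, animal_b, animal_c])
for row in matrix:
    print(["%.3f" % value for value in row])
```

In this toy setting the near-copy (animal_b) should sit much closer to animal_a than the independently generated animal_c does, which is the behaviour the abstract relies on when it says NCD can discriminate closely related individuals. With a real compressor the matrix is only approximately symmetric, and its diagonal is small but not exactly zero.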

Figures

  • Fig. 1 Given a genotype file (a) and a plausible pedigree (b), one can compute an NRM (c) and a GRM (d). One can also compute an NCD matrix (e) which in turn can be transformed into CRM1 (f) and CRM2 (g) given two different distance to similarity transformations. A sliding window‑based version of the CE analysis (h) can be used to generate a correlation matrix which underpins the computation of CRM3 (i)
  • Table 1 Estimates of variance components for BB cattle: comparison of estimates based on pedigree (NRM), normalized compression distance (CRM1 and CRM2) and genomic relationships (GRM)
  • Table 2 Estimates of variance components for TC cattle: Comparison between pedigree (NRM), normalized compression distance (CRM1 and CRM2) and genomic relationship (GRM)
  • Table 3 Accuracy of estimates of breeding values from a model with a single random additive effect derived using different relationship matrices
  • Fig. 2 Comparison of CEh using different genotyping platforms. A comparison of CEh for BB (top panel) and TC (bottom panel) cows genotyped using both the HD chip (red dots) with 750 K SNPs and the new 71 K Indicus SNP chip (black dots). Each point represents a single animal
  • Table 4 Summary statistics for BB cows compared using NRM, GRM and NCD
  • Table 5 Summary statistics for TC cows compared using NRM, GRM and NCD
  • Table 6 Summary statistics for self–self pairs in both populations using NRM, GRM, CRM1, CRM2 and CRM3


Cite

Hudson, N. J., Porto-Neto, L., Kijas, J. W., & Reverter, A. (2015). Compression distance can discriminate animals by genetic profile, build relationship matrices and estimate breeding values. Genetics Selection Evolution, 47(1). https://doi.org/10.1186/s12711-015-0158-9
