Genomic prediction based on selected variants from imputed whole-genome sequence data in Australian sheep populations

Published on 2019-12-06T04:44:57Z (GMT) by
Abstract Background Whole-genome sequence (WGS) data could contain information on genetic variants at or in high linkage disequilibrium with causative mutations that underlie the genetic variation of polygenic traits. Thus far, genomic prediction accuracy has shown limited increase when using such information in dairy cattle studies, in which one or few breeds with limited diversity predominate. The objective of our study was to evaluate the accuracy of genomic prediction in a multi-breed Australian sheep population of relatively less related target individuals, when using information on imputed WGS genotypes. Methods Between 9626 and 26,657 animals with phenotypes were available for nine economically important sheep production traits and all had WGS imputed genotypes. About 30% of the data were used to discover predictive single nucleotide polymorphism (SNPs) based on a genome-wide association study (GWAS) and the remaining data were used for training and validation of genomic prediction. Prediction accuracy using selected variants from imputed sequence data was compared to that using a standard array of 50k SNP genotypes, thereby comparing genomic best linear prediction (GBLUP) and Bayesian methods (BayesR/BayesRC). Accuracy of genomic prediction was evaluated in two independent populations that were each lowly related to the training set, one being purebred Merino and the other crossbred Border Leicester x Merino sheep. Results A substantial improvement in prediction accuracy was observed when selected sequence variants were fitted alongside 50k genotypes as a separate variance component in GBLUP (2GBLUP) or in Bayesian analysis as a separate category of SNPs (BayesRC). From an average accuracy of 0.27 in both validation sets for the 50k array, the average absolute increase in accuracy across traits with 2GBLUP was 0.083 and 0.073 for purebred and crossbred animals, respectively, whereas with BayesRC it was 0.102 and 0.087. The average gain in accuracy was smaller when selected sequence variants were treated in the same category as 50k SNPs. Very little improvement over 50k prediction was observed when using all WGS variants. Conclusions Accuracy of genomic prediction in diverse sheep populations increased substantially by using variants selected from whole-genome sequence data based on an independent multi-breed GWAS, when compared to genomic prediction using standard 50K genotypes.

Cite this collection

Moghaddar, Nasir; Khansefid, Majid; van der Werf, Julius; Bolormaa, Sunduimijid; Duijvesteijn, Naomi; Clark, Samuel; et al. (2019): Genomic prediction based on selected variants from imputed whole-genome sequence data in Australian sheep populations. figshare. Collection. https://doi.org/10.6084/m9.figshare.c.4772336.v1