Veterinary Science

Prediction of breeding values for group-recorded traits including genomic information and an individually recorded correlated trait

X. Ma, O. F. Christensen, et al.

Xiang Ma and colleagues examine how genomic information and correlated traits can improve the accuracy of genetic evaluations for group-recorded traits in pigs. The study shows that group records are informative for traits that are hard to measure individually, and that group size and the relationships among group members both affect breeding value accuracy.

Introduction

The study addresses how to improve the accuracy of estimated breeding values (EBVs) when only group-level phenotypes are available for traits that are difficult or costly to record on individuals (e.g., feed intake, egg production). Prior work indicates that group records can approximate individual-based evaluations under certain conditions, but loss of information is expected relative to individual records. The authors hypothesize that incorporating genomic information (via GBLUP/ssGBLUP) will increase the accuracy of EBVs for group-recorded traits, potentially more than for individual records due to capturing Mendelian sampling. They further hypothesize that including a genetically correlated trait with individual records in a bivariate model will substantially improve EBV accuracy for the group-recorded trait. The objectives are to quantify: (1) the effect of different proportions of genotyped animals (0%, 30%, 100%) on EBV accuracy using group records; and (2) the gain from including an individually recorded correlated trait in bivariate analyses. The work is motivated by practical breeding scenarios where some economically important traits are hard to measure individually, but correlated traits are readily available.

Literature Review

Previous studies comparing full-sib group records to individual records in fish and laying hens found negligible differences in variance components and consistent breeding value rankings (Nurgiartiningsih et al. 2004; Simianer and Gjerde 1991). Olson et al. (2006) proposed using pooled records for predicting individual BVs and showed selection based on group records can be effective, especially with small group sizes. Su et al. (2018) developed methods handling multiple fixed and random effects (litter, pen) for varying group sizes; variance components matched those from individual records but with larger standard errors, and EBV accuracy from group size 12 reached about 70% of individual-record accuracy. Genomic prediction (Meuwissen et al. 2001) and ssGBLUP (Aguilar et al. 2010; Christensen and Lund 2010; Legarra et al. 2009) have improved EBV accuracy for individual-recorded traits by leveraging markers and pedigrees. However, the use of genomic information specifically for group-recorded traits had not been investigated prior to this study.

Methodology

Design: A pig nucleus population was simulated with QMSim to evaluate variance component estimation and EBV prediction from group versus individual records under varying genotyping proportions and models (PBLUP, GBLUP, ssGBLUP) in univariate and bivariate settings.

Population and pedigree: Historical population of 400 unrelated animals mated randomly for 300 generations. Base of recent population formed by 30 sires × 200 dams producing 1,200 offspring. Recent population: 8 non-overlapping generations; each generation had 30 sires and 600 dams producing 600 litters (litter size 6, 3 males and 3 females). The last four generations (generations 5–8; 14,400 individuals) were used for analysis; the pedigree was traced to generation 0 (total 29,430 individuals). The last generation served as validation under two conditions: Valid_R (records retained for candidates) and Valid_nR (records removed for candidates).

Genome simulation: 18 chromosomes, each 100 cM. Per chromosome: 3,100 SNP markers and 50 QTL, all biallelic with random initial allele frequencies; mutation rate 2.5e-5 in the historical period. In the recent population, loci with MAF ≥ 0.01 were retained, yielding 43,638 markers and 708 segregating QTL. QTL effects for the two traits were sampled from a bivariate normal distribution with genetic correlation 0.8 (reflecting feed intake and daily gain correlations in pigs). True breeding values (a1 for trait 1, a2 for trait 2) were defined by summing QTL effects and scaled to the target variances.

Traits and variance structure (designed values): Trait 1 $h^2$ = 0.30, Trait 2 $h^2$ = 0.25. Variance components (trait 1, trait 2; correlation): pen (10, 40; r = 0.3), litter (10, 40; r = 0.3), additive genetic (30, 100; genetic r = 0.8), residual (50, 220; r = 0.5). Pen, litter, and residual effects were sampled from bivariate normal distributions with these (co)variances.

Group-record scenarios (trait 1): A group record is defined as the sum of the individual records within a pen; pens were constructed within generation.
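The designed variance structure and the group-sum definition above can be checked with a few lines of NumPy. This is a minimal sketch, not the study's QMSim pipeline: the overall mean of 100, the toy sizes (24 pigs, pens of 12), and the independent draws used for the records are assumptions made purely for illustration.

```python
import numpy as np

# Designed values from the text: (variance trait 1, variance trait 2, correlation).
components = {
    "pen":      (10.0, 40.0, 0.3),
    "litter":   (10.0, 40.0, 0.3),
    "additive": (30.0, 100.0, 0.8),
    "residual": (50.0, 220.0, 0.5),
}

def covariance_matrix(var1, var2, corr):
    """2x2 (co)variance matrix implied by two variances and a correlation."""
    cov12 = corr * np.sqrt(var1 * var2)
    return np.array([[var1, cov12], [cov12, var2]])

cov = {name: covariance_matrix(*vals) for name, vals in components.items()}
phenotypic = sum(cov.values())                  # total (co)variance matrix
h2 = np.diag(cov["additive"]) / np.diag(phenotypic)
print("heritabilities (trait 1, trait 2):", h2)

# Group record = sum of the individual trait-1 records within a pen.
# Assumptions for illustration only: mean 100, 24 pigs, pens of 12, and
# independent records (pen and litter structure is ignored in this draw).
rng = np.random.default_rng(1)
n_pigs, pen_size = 24, 12
pen_id = np.repeat(np.arange(n_pigs // pen_size), pen_size)
records = rng.normal(100.0, np.sqrt(phenotypic[0, 0]), size=n_pigs)
group_sum = np.bincount(pen_id, weights=records)   # one sum per pen
group_n = np.bincount(pen_id)                      # pigs contributing per pen
```

The heritabilities recovered here are the additive variances divided by the sums of the four components (100 for trait 1 and 400 for trait 2), which matches the designed values of 0.30 and 0.25.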
Three scenarios with variable group sizes were simulated (20% random deletion to create heterogeneity):
• S12L2x3: average 9.6 pigs/pen (range 4–12). Each litter was split into two sublitters of 3 that were distributed to two pens; each pen contained four random sublitters.
• S12Lran: average 9.6 pigs/pen (range 4–12). Pigs were assigned to pens at random, up to 12 pigs per pen.
• S24L2x3: average 19.2 pigs/pen (range 12–24). As S12L2x3, but each pen included eight random sublitters.

Genotyping scenarios:
• Genotype_0: no animals genotyped (PBLUP).
• Genotype_30: 30% genotyped (ssGBLUP): all breeding animals (~16.5%) plus 13.5% randomly selected non-breeders.
• Genotype_100: all animals genotyped (GBLUP).

For GBLUP, the genomic relationship matrix G was built per VanRaden (2008) with allele frequencies from the genotyped animals. For ssGBLUP, the H matrix combines pedigree and genomic information with $G_w = (1 - w)G + wA_{11}$ (w = 0.05) and a scale adjustment per Christensen et al. (2012).

Models:

Univariate model for individual records of trait 1: $y = 1\mu + Z_l l + Z_c c + Z_a a + e$, with random litter effects $l \sim N(0, \sigma_l^2 I)$, pen effects $c \sim N(0, \sigma_c^2 I)$, additive genetic effects $a \sim N(0, \sigma_a^2 \Omega)$, and residuals $e \sim N(0, \sigma_e^2 I)$, where $\Omega$ = A (PBLUP), G (GBLUP), or H (ssGBLUP).
Group-record model (trait 1): $Ty = T1\mu + TZ_l l + TZ_c c + TZ_a a + Te$, where T aggregates individual contributions into group sums.
Bivariate model: joint analysis of trait 1 (group or individual records) and trait 2 (individual records), including the same random effects across traits with appropriate (co)variance structures and $\Omega$ as above. Owing to a software limitation, the residual covariance between group-recorded trait 1 and individually recorded trait 2 was set to zero.

Estimation and evaluation: Variance components were estimated by AI-REML (DMU v5.4). For EBV prediction, the true variances were used (no re-estimation) to reduce computation. Each scenario was analyzed in 50 independent replicates; means and SDs are reported. Accuracy was defined as the correlation between predicted and true BV, and bias as the regression coefficient of true BV on predicted BV. Accuracies and biases were computed for all validation animals, for Group I (genotyped animals under Genotype_30), and for Group II (non-genotyped animals under Genotype_30).
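To make the group-sum operator T and the accuracy and bias definitions concrete, the following sketch predicts breeding values from pen sums in a toy data set and then evaluates them. It is an illustration under stated assumptions, not the DMU analysis used in the study: the sizes (100 litters of 6, pens of 12), the overall mean, random pen assignment, the simplified relationship matrix (full sibs only, parents ignored), and the dense NumPy solve are all choices made here for brevity.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy design (hypothetical sizes): full-sib litters of 6, pens of 12.
n_litters, litter_size, pen_size = 100, 6, 12
n = n_litters * litter_size
var_l, var_c, var_a, var_e = 10.0, 10.0, 30.0, 50.0   # trait-1 design values
mu = 100.0                                            # assumed overall mean

litter = np.repeat(np.arange(n_litters), litter_size)
pen = rng.permutation(n) // pen_size                  # random pens of exactly 12
n_pens = pen.max() + 1

# Incidence matrices for litter (Zl) and pen (Zc); Za is the identity here.
Zl = np.eye(n_litters)[litter]
Zc = np.eye(n_pens)[pen]

# Simplified relationship matrix (assumption): full sibs 0.5, all others 0.
A = 0.5 * (Zl @ Zl.T) + 0.5 * np.eye(n)

# Simulate true breeding values and individual records.
a = np.linalg.cholesky(A) @ rng.normal(0.0, np.sqrt(var_a), n)
y = (mu + Zl @ rng.normal(0.0, np.sqrt(var_l), n_litters)
        + Zc @ rng.normal(0.0, np.sqrt(var_c), n_pens)
        + a + rng.normal(0.0, np.sqrt(var_e), n))

# T sums individual records within pens: one row per pen, one column per pig.
T = Zc.T
y_group = T @ y

# Covariance of individual records, then of the group sums: Var(Ty) = T V T'.
V = var_l * Zl @ Zl.T + var_c * Zc @ Zc.T + var_a * A + var_e * np.eye(n)
V_group = T @ V @ T.T

# GLS estimate of the mean, then BLUP of breeding values from group sums.
x = T @ np.ones(n)                                    # T1: pigs per pen
Vinv_x = np.linalg.solve(V_group, x)
mu_hat = (Vinv_x @ y_group) / (Vinv_x @ x)
a_hat = var_a * A @ T.T @ np.linalg.solve(V_group, y_group - x * mu_hat)

# Accuracy: correlation between predicted and true breeding values.
accuracy = np.corrcoef(a_hat, a)[0, 1]
# Bias: slope of the regression of true BV on predicted BV.
bias = np.cov(a, a_hat)[0, 1] / np.var(a_hat, ddof=1)
print(f"accuracy={accuracy:.3f}, regression of true on predicted={bias:.3f}")
```

Replacing A with G or H in the same expressions gives the GBLUP and ssGBLUP analogues. A single small replicate like this one is noisy, which is one reason the study reports means over 50 replicates.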

Key Findings

Unbiased predictions: Across univariate and bivariate analyses, regression coefficients of true BV on predicted BV were approximately 1 for all scenarios, indicating negligible bias for both group and individual records.
Variance component estimation: Using group records increased the standard deviations of estimated variance components compared to individual records, reflecting information loss with aggregation. In scenario S12L2x3 (trait 1), estimates from group records were broadly consistent with those from individual records, but with larger SDs. In bivariate models with trait 1 as group-recorded, setting residual covariance to zero led to overestimation of pen covariance and slight overestimation of other covariances relative to the true values.
Effect of genomic information (univariate trait 1): Adding genotypes increased EBV accuracy for both group and individual records.
• Moving from 0% to 30% genotyped increased accuracy by ~1–3 percentage points (Valid_R) and ~2–3 points (Valid_nR) when considering all validation animals.
• Moving from 0% to 100% genotyped increased accuracy by ~5–9 points (Valid_R) and ~6–11 points (Valid_nR) for all validation animals.
• Under Genotype_30, accuracies were higher in Group I (genotyped) than in Group II (non-genotyped). For Group I, 0%→30% genotyping increased accuracy by ~4–6 points (Valid_R) and ~4–8 points (Valid_nR); 0%→100% by ~5–9 (Valid_R) and ~5–11 (Valid_nR). For Group II, 0%→30% increased accuracy by ~1–2 (Valid_R) and ~1–3 (Valid_nR); 0%→100% by ~5–9 (Valid_R) and ~6–11 (Valid_nR).
Group structure and size effects: Random pen assignment (S12Lran) reduced accuracy to about 86–88% of the S12L2x3 accuracy because relationships within groups were weaker. Increasing the average group size to 19.2 pigs (S24L2x3) reduced accuracy to about 76–80% of the S12L2x3 accuracy.
Efficiency of group vs individual records: In S12L2x3, EBV accuracy from group records was 71–73% (Valid_R) and 58–66% (Valid_nR) of individual-record accuracy. In S12Lran: 62–63% (Valid_R) and 49–52% (Valid_nR). In S24L2x3: 54–58% (Valid_R) and 40–48% (Valid_nR). There was a tendency for these percentages to decrease with higher genotyping proportion under Valid_nR in S12L2x3 and S24L2x3.
Bivariate advantage: Including an individually recorded correlated trait (trait 2) in a bivariate model made the variance estimates for the group-recorded trait more precise: SDs were substantially smaller for the pen, litter, and residual variances and slightly smaller for the additive genetic variance. Even with the residual covariance set to zero, EBV accuracy for trait 1 from group records came closer to that from individual records under the bivariate analysis than under the univariate analysis.

Discussion

The results confirm the hypotheses. First, genomic information consistently increases EBV accuracy for group-recorded traits, mirroring established gains for individual-recorded traits, and providing benefits to both genotyped and non-genotyped animals through improved relationships in ssGBLUP. Second, leveraging a strongly correlated, individually recorded trait in a bivariate framework markedly enhances information content for the group-recorded trait, tightening estimates of environmental variances and improving EBV accuracy. The detrimental effects of larger group sizes and random group composition highlight the importance of designing pens with closer genetic relationships and maintaining smaller groups to maximize usable information from group sums. The approximately unbiased regression coefficients indicate that the modeling framework (including group-sum operator T and mixed models with A/G/H) yields well-calibrated EBVs despite aggregation. Practical implications are that breeding programs targeting traits that are difficult to measure individually (e.g., feed intake) can effectively use group records, particularly when augmented by genotyping (even at moderate coverage) and by including correlated traits with individual measurements.
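The genomic relationships behind these gains follow the construction described in the Methodology: a VanRaden (2008) G matrix with allele frequencies taken from the genotyped animals, blended with pedigree relationships using w = 0.05. The sketch below illustrates that construction on random toy genotypes. The function names, the sizes (20 animals, 500 markers), and the identity matrix standing in for $A_{11}$ are assumptions for illustration, and the scale adjustment of Christensen et al. (2012) is not included.

```python
import numpy as np

def vanraden_g(genotypes):
    """Genomic relationship matrix from an (animals x markers) matrix coded 0/1/2."""
    p = genotypes.mean(axis=0) / 2.0           # allele frequencies from these animals
    centered = genotypes - 2.0 * p             # subtract the expected genotype 2p
    scale = 2.0 * np.sum(p * (1.0 - p))        # sum of marker variances
    return centered @ centered.T / scale

def blend(G, A11, w=0.05):
    """Blended matrix Gw = (1 - w) G + w A11."""
    return (1.0 - w) * G + w * A11

# Toy genotypes (hypothetical sizes; markers drawn independently).
rng = np.random.default_rng(0)
n_animals, n_markers = 20, 500
freq = rng.uniform(0.1, 0.9, n_markers)
M = rng.binomial(2, freq, size=(n_animals, n_markers)).astype(float)

G = vanraden_g(M)
A11 = np.eye(n_animals)                        # assumption: unrelated, non-inbred animals
Gw = blend(G, A11)
print("smallest eigenvalue of G :", np.linalg.eigvalsh(G).min())
print("smallest eigenvalue of Gw:", np.linalg.eigvalsh(Gw).min())
```

Because the markers are centered with frequencies computed from the same animals, each row of G sums to zero and G is singular; blending with $A_{11}$ makes $G_w$ positive definite, so it can be inverted when H is built.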

Conclusion

Group records are a viable source of information for genetic evaluation of traits that are difficult or costly to record individually. EBV accuracy from group records increases substantially with the inclusion of genomic information, with further gains when all animals are genotyped. Incorporating an individually recorded, genetically correlated trait in a bivariate analysis greatly improves genetic evaluations for the group-recorded trait. Accuracy is higher with smaller group sizes and when penmates are more closely related; random assignment and larger groups reduce accuracy. Recommendations are to: (1) use genomic data (GBLUP/ssGBLUP) whenever possible; (2) include correlated individual-recorded traits in bivariate models; (3) design pens with relatively small sizes and maintain closer relationships among penmates to maximize information. Future work could extend software to model residual covariance between group and individual records and validate these findings with real populations and additional group structures.

Limitations

Simulation-based study: Results depend on the simulated pig population structure, LD, and specified (co)variances; real populations may differ.
Bivariate residual covariance constraint: Software limitations forced residual covariance between group-recorded trait 1 and individually recorded trait 2 to zero, which led to overestimation of pen covariance; this may affect variance component accuracy and potentially EBV accuracy.
Group records entail higher uncertainty: Variance component estimates from group records had substantially larger standard deviations than from individual records, indicating reduced precision.
Group size heterogeneity: While 20% random deletions mimicked variable group sizes, other real-world causes of variation (management, mortality patterns) were not explicitly modeled.
No re-estimation of variances for prediction: EBV predictions used true variances to reduce computation time, which may overstate performance compared to practical scenarios where variances are estimated from data.
