Introduction
Animal models are crucial for biological research, particularly in understanding human diseases and traits. Rodents are commonly used, but their physiological differences from humans limit their effectiveness. Primates offer better models but are expensive and ethically problematic. Livestock species, especially pigs, share much of their physiology and anatomy with humans, making them attractive alternatives. However, generating livestock models of human functional variants through genome editing is costly and time-consuming. This study explores an alternative: leveraging naturally occurring livestock models of human functional variants. The researchers hypothesized that a significant number of human variants would have orthologues in livestock species and that the effects of these variants would often be conserved. This approach avoids the cost and ethical issues of generating genome-edited animals and could accelerate research on how human genetic variation affects disease and other phenotypes.
Literature Review
The literature highlights the limitations of existing animal models, such as rodents and primates, in accurately reflecting human biology and disease mechanisms. While livestock, particularly pigs, are increasingly recognized as superior models due to physiological similarities, the creation of transgenic livestock models carrying specific human variants remains costly and time-consuming. Previous studies have shown isolated examples of functional variants conserved across mammalian species, suggesting the potential for a large-scale investigation into naturally occurring orthologues. This research builds on these findings by conducting a genome-wide analysis to assess the extent of this conservation and the potential for utilizing such naturally occurring models.
Methodology
The study used previously published and filtered genotype data for five mammalian species: humans (1000 Genomes Project), cattle, pigs, dogs, and water buffalo. The researchers compared human SNPs to orthologous positions in the genomes of the other four species, identifying variants with identical allele changes. The relationship between sample size and variant overlap was explored by downsampling the livestock cohorts. Machine learning models (Random Forest, XGBoost, CatBoost) were trained on genomic features (sequence conservation, position properties, VEP annotations, sequence context) to predict the likelihood of a human variant having a livestock orthologue. The models were evaluated using receiver operating characteristic (ROC) curves and metrics such as AUC, accuracy, and F1 score. To investigate the conservation of effects, the researchers examined orthologues of human ClinVar variants (pathogenic or likely pathogenic) and of fine-mapped SNPs from the UK Biobank, focusing on their presence and impact on phenotypes. They also analyzed conserved regulatory variants by comparing human GTEx data with the cattleGTEx project. Enformer, a deep learning model that predicts gene expression and chromatin features from DNA sequence, was used to investigate discrepancies between predicted and observed effects. Statistical methods included the chi-squared test, the two-sample Kolmogorov–Smirnov test, and Fisher's exact test.
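The orthologue comparison described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the variant tuples, the `liftover` dictionary (human position mapped to livestock position and strand), and the function name `find_shared_variants` are all hypothetical, and "identical allele change" is assumed here to mean the same reference and alternate alleles after accounting for strand.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation: (chromosome, position, ref allele, alt allele)
Variant = Tuple[str, int, str, str]
# Hypothetical lift-over table: human (chrom, pos) -> livestock (chrom, pos, strand)
Liftover = Dict[Tuple[str, int], Tuple[str, int, str]]

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def find_shared_variants(
    human_snps: List[Variant],
    livestock_snps: List[Variant],
    liftover: Liftover,
) -> Tuple[List[Variant], List[Variant]]:
    """Return (human SNPs with a SNP at the orthologous position,
    the subset whose allele change is also identical)."""
    # Index livestock SNPs by position so each lookup is O(1).
    livestock_by_pos: Dict[Tuple[str, int], Set[Tuple[str, str]]] = {}
    for chrom, pos, ref, alt in livestock_snps:
        livestock_by_pos.setdefault((chrom, pos), set()).add((ref, alt))

    positional: List[Variant] = []
    identical: List[Variant] = []
    for variant in human_snps:
        chrom, pos, ref, alt = variant
        target = liftover.get((chrom, pos))
        if target is None:
            continue  # no orthologous position in the livestock genome
        l_chrom, l_pos, strand = target
        alleles = livestock_by_pos.get((l_chrom, l_pos))
        if not alleles:
            continue  # orthologous position exists but is not variable
        positional.append(variant)
        # Assumption: on the minus strand, alleles are complemented
        # before comparison.
        if strand == "-":
            ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
        if (ref, alt) in alleles:
            identical.append(variant)
    return positional, identical
```

In practice the lift-over table would come from a whole-genome alignment tool rather than a hand-built dictionary, and multi-allelic sites and ancestral-allele polarity would need more careful handling than this sketch provides.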
Key Findings
The researchers identified orthologues of over 1.6 million human variants in at least one of the four non-human species examined. A substantial proportion (55-56%) of these shared variants showed the exact same allele change. Machine learning models predicted the presence of livestock orthologues with moderate accuracy (CatBoost achieved an AUC of 0.69, where 0.5 corresponds to chance), indicating that genomic features carry some predictive power for cross-species variant conservation. Pathogenic or likely pathogenic ClinVar variants were significantly less likely to have livestock orthologues than the background rate, consistent with purifying selection. However, when orthologues of ClinVar variants were found, the impact on proteins (missense changes) was often conserved (80% for cattle, similar for pigs). Some phenotypes showed preferential sharing of variants, including biotinidase deficiency and neurofibromatosis. Analysis of UK Biobank fine-mapped SNPs revealed that human height variants were disproportionately likely to have cattle orthologues. Regulatory variants showed significant conservation, and in many cases the direction of effect (the allele associated with increased or decreased expression) was conserved between humans and cattle. The Enformer model helped explain discrepancies by identifying cases where isoform differences between species could alter a variant's impact. The study emphasizes that the overlap is not random and reflects a complex interplay of mutation rates, selection pressures, and functional significance.
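To put the reported AUC of 0.69 in context, the sketch below computes ROC AUC directly from its pairwise definition: the probability that a randomly chosen positive example (here, a human variant with a livestock orthologue) receives a higher score than a randomly chosen negative one, with ties counted as half. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC via pairwise comparison of positive and negative scores.

    labels: 1 for positives (e.g. variant has an orthologue), 0 otherwise.
    scores: model scores, higher meaning more likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy example: three of the four positive/negative pairs are ranked correctly.
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))  # 0.75
```

Under this reading, an AUC of 0.69 means the model ranks a true orthologue-bearing variant above a non-shared variant in roughly 69% of such pairs. The quadratic loop is only suitable for small inputs; library implementations use a sort-based method instead.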
Discussion
This study demonstrates the significant potential of naturally occurring livestock models for studying human genetic variation. The high frequency of conserved functional variants across species offers a cost-effective and ethical approach to understanding the impact of human genetic variation on phenotypes. The preferential sharing of variants associated with certain phenotypes suggests that selection pressures shape genetic architecture in both humans and livestock. The machine learning models provide a tool for prioritizing human variants that are more likely to have a livestock orthologue. This research contributes to both human and animal genetics by providing insights into disease mechanisms, evolutionary processes, and potential targets for livestock improvement programs. Future studies could include more livestock species, investigate specific disease pathways in more detail using these models, and refine the prediction models to improve accuracy.
Conclusion
This study reveals more than 1.6 million human variant orthologues in domesticated mammals, including hundreds linked to human diseases and traits. Machine learning models predicted, with moderate accuracy, which human variants are likely to have livestock orthologues, supporting the value of these naturally occurring models. The approach enables cost-effective and ethical research, advancing our understanding of human genetics and providing insights for livestock breeding. Future work should investigate more livestock species and refine prediction methods.
Limitations
The study's findings are limited to the specific livestock species and datasets used. The accuracy of orthologue identification depends on the quality of genome annotations and lift-over processes. Some discrepancies between predicted and observed effects may be due to differences in gene regulation and isoform usage across species, and the power to detect regulatory variants may also be affected by differences in sample sizes and tissue availability across species. Some rare variants with potentially important effects could be missed due to the limitations of current genome-wide association study designs.