Diversity analysis of 80,000 wheat accessions reveals

Introduction

Wheat, the world's most widely grown crop, provides essential protein and calories for a significant portion of the global population. Its annual production surpasses 600 million tons, contributing approximately 20% of global protein and calorie intake and a substantial portion of essential micronutrients. While wheat cultivars have shown remarkable yield increases and adaptation to diverse climates over the past 10,000 years since domestication, this success has come at the cost of reduced genetic diversity within the elite gene pool. This limited diversity hinders the development of new varieties needed to sustainably meet the demands of a growing population facing climate change and various biotic and abiotic stresses. Wheat germplasm banks hold a vast collection of approximately 560,000 accessions, including crop wild relatives (CWR) and landraces, representing a reservoir of untapped genetic diversity crucial for overcoming these challenges. CWR and landraces possess evolved mechanisms for resilience in challenging environments, but their adaptive potential is largely unexplored due to a lack of understanding of their genetic makeup. Efficient identification of advantageous genetic variants and methods to avoid linkage drag when introducing these variants into elite germplasm are major challenges. Advances in genomics and molecular technologies, however, offer the potential to efficiently characterize genetic diversity in large germplasm collections and employ gene editing techniques to circumvent linkage drag. The International Maize and Wheat Improvement Center (CIMMYT) germplasm bank, one of the world's largest, distributes vast quantities of wheat seeds annually, and this study leverages the Seeds of Discovery (SeeD) initiative, which has characterized nearly 80,000 accessions from CIMMYT and ICARDA's collections, to comprehensively analyze this genetic resource.

Literature Review

The existing literature highlights the urgent need for increased genetic diversity in wheat breeding to address challenges posed by climate change and growing population demands. Studies have documented the reduction in genetic diversity in elite wheat varieties due to past breeding practices, emphasizing the importance of exploring diverse germplasm sources such as landraces and crop wild relatives. Previous research has utilized various genomic technologies to characterize wheat genetic diversity, but none on the scale of this study. The use of high-throughput genotyping methods has been shown to be effective in identifying core subsets of accessions representing the genetic diversity of larger collections, facilitating more efficient phenotyping and breeding efforts. Methods for incorporating desirable traits from CWR and landraces into elite germplasm, addressing the issue of linkage drag, have also been subjects of significant research. Finally, the importance of large-scale genomic data in genomic prediction and marker-assisted selection for improving wheat traits has been widely recognized in the field.

Methodology

This study analyzed the genetic diversity of 79,191 wheat accessions from the CIMMYT and ICARDA germplasm banks, encompassing domesticated hexaploid and tetraploid wheats and crop wild relatives (CWR). High-throughput genotyping was performed using DArTseq™ technology, generating over 300,000 high-quality SNP and SilicoDArT markers. These markers were aligned to three reference maps: the IWGSC RefSeq v1.0 genome assembly, the durum wheat genome assembly (cv. Svevo), and a DArT genetic map. On average, 72% of the markers were uniquely placed on these maps, with 50% linked to genes. Data cleaning involved filtering markers based on missing rate and minor allele frequency. The hexaploid, tetraploid, and CWR groups were analyzed independently. Hierarchical clustering grouped accessions by genetic similarity, using modified Rogers' distance (MRD) for SNPs and Jaccard distance for SilicoDArT markers. Admixture analysis was performed on a subset of markers to complement the clustering results. Multidimensional scaling (MDS) plots visualized the genetic relationships among accessions. Analysis of molecular variance (AMOVA) and FST values were calculated to identify genomic regions under selection and to assess the genetic differentiation among clusters. Core subsets were generated to capture the genetic diversity of the complete collections. A genome-wide association study (GWAS) was performed on a subset of samples phenotyped for grain protein content (GPC) and sedimentation values to identify loci associated with these traits. The DArT genetic map, together with visualization tools such as CurlyWhirly, was used to support the analysis and visualization of the large datasets. The DArTseq methodology, while not dependent on a reference genome, presents some challenges compared with alternative methods such as microarrays, but was considered the most appropriate for the goals of this study given the nature of the material. The proprietary nature of the DArTsoft14 software limits how fully certain parameters of the marker-calling pipeline can be described, but access is available through direct contact with the company.
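The filtering and distance steps described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the genotype encoding (per-accession allele frequency of 0, 0.5, or 1 for SNPs, and 0/1 presence/absence for SilicoDArT markers, with `None` for missing calls), and the default thresholds are all assumptions chosen for illustration, since the summary does not state the cut-offs used. For biallelic markers the modified Rogers' distance reduces to the square root of the mean squared difference in allele frequency across markers, which is what the sketch computes.

```python
from math import sqrt

def filter_markers(genotypes, max_missing=0.5, min_maf=0.01):
    """Return indices of markers passing missing-rate and MAF filters.

    genotypes: list of accessions, each a list of allele frequencies
    (0.0, 0.5, 1.0) or None for a missing call. Thresholds are
    hypothetical defaults, not values taken from the study.
    """
    n_accessions = len(genotypes)
    keep = []
    for j in range(len(genotypes[0])):
        calls = [row[j] for row in genotypes if row[j] is not None]
        missing_rate = 1 - len(calls) / n_accessions
        if not calls or missing_rate > max_missing:
            continue
        freq = sum(calls) / len(calls)
        if min(freq, 1 - freq) >= min_maf:
            keep.append(j)
    return keep

def modified_rogers_distance(a, b):
    """Modified Rogers' distance for biallelic markers.

    a, b: allele-frequency vectors; markers missing in either
    accession are skipped.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("nan")
    return sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def jaccard_distance(a, b):
    """Jaccard distance for presence/absence (0/1) markers."""
    both = either = 0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue
        if x == 1 and y == 1:
            both += 1
        if x == 1 or y == 1:
            either += 1
    return 1 - both / either if either else 0.0
```

A pairwise matrix built from either function could then be passed to any standard hierarchical clustering or MDS routine. At the scale of tens of thousands of accessions, a vectorized implementation would be needed in practice.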

Key Findings

The analysis of 56,342 hexaploid accessions revealed that a substantial portion of the genetic diversity present in landraces has not been utilized in modern breeding programs. Admixture analysis identified eight distinct groups, including traditional and modern landraces, synthetic materials, and elite germplasm. The largest genetic distance was observed between elite germplasm and synthetic derivatives, primarily due to alleles introduced from *Ae. tauschii*. AMOVA and FST analyses identified genomic regions associated with key agronomic traits, as well as regions reflecting the history of modern wheat breeding. The analysis of 18,946 tetraploid accessions showed that a large proportion of the genetic diversity is captured by elite durum accessions, with the exception of a distinct group of Ethiopian landraces representing largely unexplored diversity. The analysis of 3,903 CWR accessions accurately grouped accessions by genome constitution and identified subspecies ploidy levels. Core subsets containing 20% of the total accessions effectively captured the genetic diversity of each group. The analysis also revealed misclassifications in passport data for some accessions, highlighting the value of genomic profiling for germplasm curation. GWAS analyses revealed genomic regions associated with grain protein content (GPC) and sedimentation values, providing potential targets for marker-assisted selection. The identification of previously uncharacterized QTL for GPC provides new targets for breeding to improve this important trait.
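The summary does not say how the 20% core subsets were selected. As one possible illustration only, the sketch below uses a greedy farthest-point (maximin) heuristic on a precomputed pairwise distance matrix. The function name `greedy_core_subset` and the `fraction` parameter are hypothetical, and the technique is a stand-in rather than the method used in the study.

```python
def greedy_core_subset(dist, fraction=0.2):
    """Pick a core subset by greedy farthest-point (maximin) selection.

    dist: symmetric pairwise distance matrix as a list of lists.
    fraction: share of accessions to keep (0.2 mirrors the 20% core
    size reported in the text; the algorithm itself is an assumption).
    Returns the chosen accession indices in selection order.
    """
    n = len(dist)
    k = max(1, round(n * fraction))
    # Seed with the accession that is, on average, farthest from all others.
    first = max(range(n), key=lambda i: sum(dist[i]))
    chosen = [first]
    # nearest[i] = distance from accession i to its closest chosen accession.
    nearest = list(dist[first])
    while len(chosen) < k:
        candidates = (i for i in range(n) if i not in chosen)
        nxt = max(candidates, key=lambda i: nearest[i])
        chosen.append(nxt)
        nearest = [min(nearest[i], dist[nxt][i]) for i in range(n)]
    return chosen
```

For example, with five accessions placed at positions 0, 1, 2, 10, and 11 on a line and `fraction=0.4`, the heuristic keeps the two extremes, which spans the range of the collection with the fewest entries.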

Discussion

This study represents the most extensive genetic diversity analysis conducted for any agricultural crop to date, providing an unprecedented resource for wheat improvement. The findings highlight the untapped potential of landraces and CWR for enhancing wheat resilience to climate change and biotic/abiotic stresses. The identification of genomic regions under selection offers valuable insights into the history of wheat breeding and provides specific targets for future breeding efforts. The availability of the large-scale genotypic data and analysis tools allows for targeted exploration of specific genes or chromosome regions, facilitating the identification of germplasm conserving allelic diversity absent in current breeding programs. The significant genetic distinction of the Ethiopian tetraploid landraces underlines the importance of exploring geographically diverse germplasm collections. The identified QTLs for GPC and SDS sedimentation provide concrete targets for marker-assisted selection in breeding programs. The identification of misclassified accessions emphasizes the need for ongoing curation and validation of germplasm collections using genomic technologies.

Conclusion

This study delivers a comprehensive analysis of genetic diversity in nearly 80,000 wheat accessions, revealing untapped potential in landraces and CWR for enhancing wheat improvement. The large-scale genotypic data, analysis tools, and key findings provide valuable resources for the wheat research community to discover and utilize functional diversity in future breeding programs, addressing challenges like climate change and food security. Future research could focus on further exploring the identified genomic regions under selection, performing deeper phenotypic analyses on unexplored landraces, and utilizing the genotypic data for advanced genomic selection strategies.

Limitations

While this study represents an unprecedented scale of analysis, some limitations should be noted. The use of DArTseq technology, while offering high-throughput and unbiased assessment of genetic diversity, may not capture all types of genetic variation. The analysis relied heavily on passport data, which may contain inaccuracies, as highlighted by the identification of misclassified accessions. Further, the GWAS analysis was limited by the size of the phenotyped sample. Future studies could benefit from integrating other types of genomic data, improved passport data validation, and larger phenotyping efforts for a more comprehensive understanding.
