Introduction
Cotton, a widely cultivated fiber crop, is crucial to the textile industry. *Gossypium hirsutum* accounts for over 90% of global cotton production, with thousands of improved varieties contributing to yield increases. However, breeding efforts to synergistically improve yield, quality, and disease resistance are hampered by a limited understanding of the genomic basis of these complex traits. High-quality genome assemblies of modern *G. hirsutum* varieties are essential for both breeding and biological research, yet genomic information on recently developed cotton varieties has been scarce, leaving the genomic diversification during modern breeding unclear. *Gossypium barbadense*, contributing approximately 10% of global production, is known for its superior fiber quality. Transferring beneficial traits from *G. barbadense* to *G. hirsutum* is a promising breeding strategy; however, the genomic variations between *G. barbadense* and modern *G. hirsutum* are not well-defined. While the identification of single nucleotide polymorphisms (SNPs) has advanced our understanding of the genetic basis of cotton traits, the widespread genomic structural variations (insertions, deletions, inversions, and translocations) present a significant challenge. These structural variations can lead to missing or variant sequences not present in the majority of the population, highlighting the need to explore them for effective crop improvement. The genetic effects of these structural variations on agronomic traits remain largely unknown, emphasizing the importance of this research.
Literature Review
Previous research has focused on identifying SNPs associated with cotton traits, leading to advancements in our understanding of the genetic basis of fiber quality and yield. Studies have utilized genomic data from various cotton species, including *G. hirsutum*, *G. barbadense*, and their diploid ancestors, to identify key genomic regions and genes affecting agronomic traits. These studies have provided valuable insights but have often overlooked the contribution of structural variations. High-quality genome assemblies of some cotton cultivars have been published, providing a foundation for further research, but a comprehensive analysis integrating structural variation data with phenotypic information is lacking. The need for high-quality reference genomes for modern cotton cultivars, combined with large-scale resequencing data, is crucial for unlocking the full potential of cotton genomics in crop improvement. The current study directly addresses these gaps by generating high-quality genome assemblies of modern cultivars and performing comprehensive analysis of both SNPs and structural variations.
Methodology
This study generated high-quality reference genomes and annotations for modern *G. hirsutum* cv. NDM8 and *G. barbadense* acc. Pima90, selected for their importance in cotton research and breeding. NDM8, widely grown in China, exhibits high yield and disease resistance. Pima90 is a valuable genetic resource that has been used extensively in molecular breeding. In addition, 1,081 worldwide *G. hirsutum* accessions were resequenced. The genomes were assembled using PacBio SMRT sequencing, Illumina paired-end data, 10x Genomics linked reads, and Hi-C data. Assembly quality was assessed using several metrics, including contig and scaffold N50 sizes, BUSCO scores, and mapping ratios. Gene prediction combined homology-based, ab initio, and RNA-seq-based approaches. Repetitive sequences were annotated to identify LTR retrotransposons. Structural variations were identified by aligning Pima90 to the NDM8 genome and by analyzing the 1,081 *G. hirsutum* accessions. A genome-wide association study (GWAS) was performed on seven key agronomic traits using phenotypic data from multiple environments: fiber length (FL), fiber strength (FS), micronaire value (M), boll weight (BW), lint percentage (LP), seed index (SI), and Verticillium wilt resistance. The GWAS integrated both SNPs and structural variations and used BLUP values to account for environmental effects. Expression of selected genes was analyzed using qRT-PCR, and functional validation was performed using virus-induced gene silencing (VIGS) in cotton and overexpression in *Arabidopsis*.
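The N50 metric mentioned above can be illustrated with a small sketch. N50 is the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the toy lengths below are illustrative assumptions; they are not data or code from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are sorted from longest to shortest and accumulated until
    the running total reaches at least half of the summed length; the
    length of the sequence that crosses that threshold is the N50.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths (arbitrary units), not values from the study.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

A higher N50 means that half of the assembly sits in longer pieces, which is why it is commonly reported alongside completeness measures such as BUSCO scores.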
Key Findings
The study assembled high-quality genomes for NDM8 (2.29 Gb) and Pima90 (2.21 Gb). The assembled genomes showed high completeness and accuracy, with high contig and scaffold N50 values and a high percentage of anchored sequences. A total of 80,124 and 79,613 protein-coding gene models were predicted in NDM8 and Pima90, respectively. The D-subgenome showed a higher density of structural variations than the A-subgenome, suggesting stronger selection pressure on the D-subgenome during cotton domestication and breeding. A large number of structural variations were identified between Pima90 and NDM8, highlighting significant genomic diversification between the two species. Within *G. hirsutum*, 76,568 structural variations were identified in NDM8 relative to the genetic standard line TM-1. A GWAS of the seven traits identified 446 structural variations significantly associated with them. Structural variations associated with fiber quality and Verticillium wilt resistance were predominantly found in the D-subgenome, while those related to yield were primarily located in the A-subgenome. Specifically, a 2-bp deletion in the sucrose synthase (Sus) gene in Pima90 was associated with superior fiber quality. Additionally, a 1-bp deletion in the cinnamoyl-CoA reductase (CCR) gene in TM-1 was associated with susceptibility to Verticillium wilt, and functional validation demonstrated the role of GhNCS (a pathogenesis-related 10/Bet v1 family member) in Verticillium wilt resistance.
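The summary does not say where in each gene the 2-bp and 1-bp deletions fall or what their molecular effect is. As general background only, a deletion whose length is not a multiple of three would shift the reading frame if it occurred inside a coding sequence. The sketch below illustrates that arithmetic on a made-up sequence; the helper names and the sequence are hypothetical and are not taken from Sus, CCR, or any cotton gene.

```python
def shifts_frame(deletion_length):
    """True if a deletion of this length would alter the reading frame."""
    return deletion_length % 3 != 0


def delete(seq, start, length):
    """Return seq with `length` bases removed beginning at index `start`."""
    return seq[:start] + seq[start + length:]


def codons(seq):
    """Split seq into complete codons from index 0, dropping a partial tail."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical toy coding sequence, for illustration only.
reference = "ATGGCTGAAACCTTGGGATAA"

for size in (1, 2, 3):
    edited = delete(reference, 3, size)
    print(size, shifts_frame(size), codons(edited))
```

In this toy case the 3-bp deletion removes one codon and leaves the downstream codons unchanged, whereas the 1-bp and 2-bp deletions change every downstream codon.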
Discussion
The findings of this study significantly advance our understanding of the role of structural variations in cotton genome evolution and their contribution to agronomic traits. The high-quality genome assemblies provide valuable resources for future cotton research and breeding. The observed higher density of structural variations in the D-subgenome highlights its importance in shaping key traits. The GWAS results demonstrate the significant impact of structural variations on both fiber quality and yield traits, offering new targets for marker-assisted selection. The functional validation of genes associated with structural variations confirms the reliability of the GWAS findings and provides further insights into the mechanisms underlying these traits. This research underscores the need to consider both SNPs and structural variations in cotton breeding programs to achieve more significant improvements in cotton production. The discovery of GhNCS as a key gene influencing Verticillium wilt resistance opens new avenues for developing resistant cotton varieties.
Conclusion
This study successfully generated high-quality genome assemblies for two modern cotton cultivars and resequenced 1,081 *G. hirsutum* accessions. The analysis revealed significant structural variation contributing to agronomic traits, particularly highlighting the D-subgenome's role in fiber quality and disease resistance. The identification of key genes, such as GhNCS, associated with Verticillium wilt resistance offers promising targets for breeding strategies. Future research should focus on functional characterization of more structural variation-associated genes and the development of advanced breeding tools leveraging this genomic information for improved cotton production.
Limitations
While this study provides a comprehensive analysis of genomic variations and their association with agronomic traits, some limitations exist. The GWAS was based on a specific set of accessions and environments, and the results may not be fully generalizable to all cotton varieties. The functional validation was primarily focused on a few candidate genes, requiring further research to fully elucidate the role of other identified variations. Furthermore, the study did not account for complex interactions among different genes and environmental factors that influence cotton traits.