Introduction
Cytoplasmic male sterility (CMS) is a maternally inherited trait that results from mitochondrial gene rearrangements and from interactions between mitochondrial and nuclear genes. Its discovery in rice (*Oryza sativa*) made the three-line hybrid breeding system possible, which exploits heterosis to raise yields. Three main CMS types are used: Wild Abortive (CMS-WA), HongLian (CMS-HL), and Boro II (CMS-BT), with CMS-WA predominant in China (1983–2018). Because the three-line system depends on restoring fertility in hybrids derived from CMS lines, understanding how restorer-of-fertility (*Rf*) genes evolved and were selected is important for optimizing hybrid rice breeding.
The *Oryza* genus, which comprises 25 wild and 2 cultivated species (*O. sativa* and *O. glaberrima*), has diversified significantly over 15 million years. Asian cultivated rice is further divided into the *indica* and *japonica* subspecies. CMS-WA lines were developed by introgressing CMS cytoplasm from *O. rufipogon* into *indica* cultivars. The CMS-WA gene, *WA352*, encodes a cytotoxic mitochondrial protein that causes cell death in the anther tapetum and pollen abortion. Evolutionary analysis indicates that *WA352* arose from non-functional CMS-like protogenes through mitochondrial genome rearrangements and selection in *O. rufipogon*.
A major CMS-WA restorer gene, *Rf4*, has been cloned; it encodes a pentatricopeptide repeat (PPR) protein that mediates degradation of *WA352* mRNA. Several *Rf4* variants are known, but the evolutionary trajectory of *Rf4* and its co-evolution with *WA352* remain unclear. Most plant *Rf* genes encode P-type PPR proteins, which act as sequence-specific RNA-binding proteins that counteract newly emerged mitochondrial CMS genes. PPR genes frequently occur in clusters, which may expand by tandem duplication. A PPR gene cluster on rice chromosome 10 contains at least ten genes, including *Rf4*, *Rf1a/b*, *Rf5*, and *Rf19*, which restore fertility for different CMS types. The *Rf4* locus has evolved rapidly through sequence duplication and allelic variation, generating both functional and non-functional variants.
The role of copy number variation (CNV) at the *Rf4* locus, however, remains unexplored. CNV is a significant source of genetic variation in plants and contributes to adaptive traits, typically by changing gene dosage and therefore expression and phenotype. This study investigates the origin and evolution of *Rf4*, identifies diverse structural variations (SV) and CNV at the locus, and tests whether CNV-mediated dosage of *Rf* genes affects fertility restoration.
Literature Review
Previous studies identified five *Rf4* variants (*Rf4", Rf4', rf4aus, rf4i, rf4j*) in rice. *Rf4* encodes a PPR protein that restores fertility by mediating *WA352* mRNA degradation. However, the evolutionary history of *Rf4* in the *Oryza* genus and its co-evolution with *WA352* remained unclear. Research on plant *Rf* genes indicates that they frequently encode P-type PPR proteins, which are rapidly evolving and often clustered. These clusters arise from tandem duplication, expanding the PPR gene family. A rice chromosome 10 cluster includes *Rf4* and other restorer genes for different CMS types. The *Rf4* locus exhibits rapid evolution via sequence duplication and allelic variation, resulting in functional and non-functional variants. Copy number variation (CNV) is known to be a significant driver of phenotypic variation in plants, affecting gene expression and contributing to adaptive traits. The impact of CNV on *Rf* gene function and fertility restoration was previously unknown.
Methodology
This study employed a multi-faceted approach combining genomic analysis, genetic engineering, and molecular marker development. Initially, the researchers re-analyzed publicly available genome sequences of the *Rf4* locus from several rice varieties (ZS97B, Nipponbare, MH63, SH498) to identify structural variations (SV) and copy number variations (CNV). Focusing on a PPR cluster containing the *Rf4* locus, they identified three PPR genes in fertility-restoring lines and compared them with those in lines lacking this ability, revealing differences in gene content and structure. They further expanded their analysis using PCR amplification and sequencing to examine SV and CNV in 311 rice cultivars. This involved designing site-specific primers based on single nucleotide polymorphisms (SNPs) to target specific *Rf4* variants (*Rf4a, Rf4b, rf4i*). The resulting haplotypes were classified based on the combination of Copy-a and Copy-b variants.
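The haplotype classification described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the `Haplotype` type, the `is_functional` helper, and the rule that capitalized `Rf4` names mark restoring variants while lowercase `rf4` names mark non-restoring ones are assumptions made for the example, and the accession calls are placeholders rather than data from the study.

```python
from collections import Counter
from typing import NamedTuple, Optional


def is_functional(variant: str) -> bool:
    """Assumed naming rule: 'Rf4...' restores fertility, 'rf4...' does not."""
    return variant.startswith("Rf4")


class Haplotype(NamedTuple):
    """Variants observed at the two copy positions of the locus."""
    copy_a: Optional[str]  # variant at Copy-a, or None if the copy is absent
    copy_b: Optional[str]  # variant at Copy-b, or None if the copy is absent

    @property
    def label(self) -> str:
        # Join the two calls into one haplotype name; '-' marks a missing copy.
        return "/".join(v if v is not None else "-" for v in self)

    @property
    def functional_copies(self) -> int:
        # Count copies that are present and carry a restoring variant.
        return sum(1 for v in self if v is not None and is_functional(v))


# Placeholder calls for three hypothetical accessions (not real data).
example_calls = [
    Haplotype("Rf4a", "Rf4b"),
    Haplotype("Rf4a", None),
    Haplotype("rf4i", None),
]

# Tally how often each haplotype label occurs in the panel.
haplotype_counts = Counter(h.label for h in example_calls)
```

Keeping the functional-copy count on the haplotype itself makes it straightforward to group accessions by dosage later, which is the comparison the study goes on to make.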
To understand the evolutionary history of *Rf4* within the *Oryza* genus, the researchers extended their analysis to wild rice and landrace accessions, identifying additional *Rf4* variants and haplotypes. A phylogenetic analysis was conducted using the nucleotide sequences of *Rf4* and of its homologs in other Poaceae species, which were identified by BLAST searches. To verify the function of the identified *rf4* haplotypes, they used a functional complementation assay. Binary vectors carrying different *rf4* variants (*rf4a*, *rf4b*, *rf4j*, *rf4aus*, *Rf4M*) under the control of the native *Rf4* promoter were constructed and introduced into the CMS-WA line Jin23A. Fertility of the resulting transgenic lines was assessed by pollen viability and seed-setting rate, and pollen morphology was examined by microscopy.
The study then explored the relationship between CNV at the *Rf4* locus and fertility restoration. Near-isogenic lines with varying *Rf4* copy numbers were produced via crosses and CRISPR/Cas9 editing, enabling examination of the dosage effect on fertility restoration. Pollen viability, anther phenotype, seed setting rates, and gene expression levels (*Rf4*, *WA352*) were measured in these lines to investigate the correlation between gene copy number and restoration efficiency. The expression levels were determined via qRT-PCR, using suitable reference genes.
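Relative expression from qRT-PCR data is commonly computed with the 2^(-ΔΔCt) method. The summary does not state which quantification method the authors used, so the sketch below is an assumption offered for illustration. It also assumes near-100% amplification efficiency for both primer pairs, and the function name, parameter names, and Ct values are hypothetical rather than taken from the study.

```python
def relative_expression(
    ct_target: float,
    ct_reference: float,
    ct_target_calibrator: float,
    ct_reference_calibrator: float,
) -> float:
    """Fold change of a target gene versus a calibrator sample (2^-ddCt).

    Each Ct is first normalized to the reference gene measured in the same
    sample; the calibrator's normalized value is then subtracted.
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Invented Ct values: the target crosses threshold one cycle earlier in the
# sample than in the calibrator, with identical reference-gene Ct values.
fold_change = relative_expression(
    ct_target=24.0,
    ct_reference=18.0,
    ct_target_calibrator=25.0,
    ct_reference_calibrator=18.0,
)
```

With these invented inputs the sample's ΔCt is one cycle lower than the calibrator's, so the function returns a two-fold increase. In a dosage comparison, a one-copy line could serve as the calibrator and a two-copy line as the sample.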
Finally, a set of PCR-based molecular markers was developed for efficient genotyping of the *Rf4* locus in hybrid rice breeding programs. These markers were based on SNPs distinguishing different *Rf4* variants, enabling the identification of functional and non-functional alleles and CNVs. The utility and accuracy of these markers were confirmed by testing them on a panel of 304 Asian cultivated rice germplasms.
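Turning allele-specific marker results into a genotype call could look roughly like the sketch below. The marker names (`mk_Rf4a`, `mk_Rf4b`, `mk_rf4i`), the marker-to-variant map, and the calling rules are hypothetical; the summary does not describe the real markers in this detail. The sketch also assumes homozygous lines and treats each detected `Rf4`-prefixed variant as one functional copy.

```python
from typing import Dict, List

# Hypothetical marker names mapped to the variant each one is assumed to detect.
MARKER_TARGETS: Dict[str, str] = {
    "mk_Rf4a": "Rf4a",
    "mk_Rf4b": "Rf4b",
    "mk_rf4i": "rf4i",
}


def detected_variants(bands: Dict[str, bool]) -> List[str]:
    """Return the variants whose marker produced a band (True = band present)."""
    unknown = sorted(set(bands) - set(MARKER_TARGETS))
    if unknown:
        raise ValueError(f"unknown marker(s): {', '.join(unknown)}")
    return sorted(MARKER_TARGETS[m] for m, present in bands.items() if present)


def call_genotype(bands: Dict[str, bool]) -> str:
    """Classify a line by the number of functional variants detected."""
    variants = detected_variants(bands)
    if not variants:
        return "no call"
    # Assumed naming rule: capitalized 'Rf4' variants are functional.
    n_functional = sum(v.startswith("Rf4") for v in variants)
    if n_functional == 0:
        return "non-functional"
    if n_functional == 1:
        return "one-copy functional"
    return "two-copy functional"


# Placeholder result for one hypothetical line: both functional markers amplify.
example_call = call_genotype({"mk_Rf4a": True, "mk_Rf4b": True, "mk_rf4i": False})
```

Rejecting unknown marker names up front keeps a mislabeled gel lane from being silently scored as a missing band.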
Key Findings
This research revealed that the *Rf4* locus, which harbors a major fertility restorer gene in rice, exhibits significant copy number variation (CNV) and structural variation (SV) across rice varieties and wild species. Phylogenetic analysis traced the origin of *Rf4* to the oldest wild rice species, *O. meyeriana*, suggesting that the gene emerged early in the genus. Over time, the *Rf4* locus expanded in wild rice through sequence variation, duplication, and recombination, resulting in 69 haplotypes, with 61 new variants identified in the study. Eight of these haplotypes were enriched in modern rice cultivars, consistent with human selection during domestication.
Crucially, the study found that varieties possessing two copies of the functional *Rf4* exhibited significantly stronger fertility restoration ability compared to those with one copy. This two-copy phenotype was consistently linked to higher pollen viability, improved anther morphology, and a higher seed setting rate. Gene expression analysis confirmed a correlation between *Rf4* copy number and expression levels; higher *Rf4* copy number corresponded to enhanced *Rf4* expression and reduced expression of the CMS gene *WA352*, demonstrating a dosage-dependent effect on fertility restoration. In contrast, the *rf4i* variant, a non-functional version, was preferentially selected for use in breeding current CMS-WA lines.
Functional characterization of *Rf4* variants revealed that 14 specific amino acid substitutions within three PPR motifs (PPR13-15) were critical for the functionalization of the *Rf4* protein. These amino acid changes differentiate functional *Rf4* alleles from non-functional *rf4* alleles. This study also found that the mitochondrial sterility gene *WA352c* co-evolved with *Rf4*, with some combinations being more prevalent in natural populations and others favored during artificial selection. Specifically, the *rf4i* variant combined with *WA352c* dominated in current CMS-WA lines, pointing to a selective sweep during the breeding process.
Finally, the researchers developed a set of eight PCR-based markers for efficient genotyping of the *Rf4* locus. These markers effectively distinguished different *Rf4* haplotypes, including the detection of CNVs, assisting in the selection of superior restorer lines for hybrid rice breeding. The markers differentiated one-copy and two-copy functional *Rf4* haplotypes as well as non-functional haplotypes such as *rf4i*, providing powerful tools for future breeding programs.
Discussion
The findings of this study significantly advance our understanding of the evolution and domestication of the rice *Rf4* gene, a key component of the CMS/restorer system for hybrid rice breeding. The demonstration of a strong dosage effect associated with *Rf4* copy number has direct implications for breeding strategies. The development of efficient molecular markers will allow for rapid screening and selection of superior restorer lines, accelerating the breeding process. The identification of key amino acid substitutions differentiating functional and non-functional *Rf4* alleles provides insights into the molecular mechanisms underlying fertility restoration. The co-evolutionary analysis of *Rf4* and *WA352c* reveals the complex interplay between mitochondrial and nuclear genomes in shaping CMS systems. The identification of the *rf4i* allele as the predominant form in current CMS-WA lines highlights the impact of human selection on the genetic architecture of this important breeding system. Future research could focus on investigating the detailed molecular mechanisms of *Rf4* action, exploring the potential for gene stacking of multiple *Rf* genes for broader compatibility, and conducting genome-wide association studies to further explore the genetic basis of fertility restoration and other agronomic traits related to *Rf4*.
Conclusion
This study demonstrates the crucial role of copy number variation at the *Rf4* locus in determining fertility restoration capacity in rice. The development of molecular markers will significantly accelerate the breeding of superior restorer lines. The findings reveal a complex interplay between natural and human selection in shaping the genetic architecture of the rice CMS/restorer system. Future work could investigate the mechanisms underlying the dosage effect and explore the potential of pyramiding multiple restorer genes for broader hybrid rice combinations.
Limitations
While this study provides comprehensive insights into the *Rf4* locus, certain limitations exist. The study focused primarily on Asian cultivated rice, limiting the generalizability to other rice subspecies or species. Further investigation is needed to explore the impact of other genetic and environmental factors on *Rf4* expression and function. The functional complementation assays used a limited number of transgenic lines; a larger-scale study would further strengthen the conclusions. The phylogenetic analysis was based on a limited number of *Rf4* homologs and could be expanded to include more diverse plant species.