Copy number variation of the restorer *Rf4* underlies human selection of three-line hybrid rice breeding

Z. Zhao, Z. Ding, et al.

This research reveals that the Rf4 locus, a vital fertility restorer for hybrid rice breeding, emerged before the CMS-WA gene in wild rice. By identifying 69 haplotypes across the Oryza genus and showing how Rf4 was selected in modern cultivars, the study enhances our understanding of CMS/Rf systems and opens new avenues for crop breeding. Conducted by Zhe Zhao, Zhi Ding, Jingjing Huang, Hengjun Meng, Zixu Zhang, Xin Gou, Huiwu Tang, Xianrong Xie, Jingyao Ping, Fangming Xiao, Yao-Guang Liu, Yongyao Xie, and Letian Chen.
Introduction

Cytoplasmic male sterility (CMS) is a maternally inherited trait arising from mitochondrial genome rearrangements and mitochondrial–nuclear interactions and is widely used to produce hybrid rice via the three-line system. Among CMS systems in rice, Wild Abortive (CMS-WA) is the most extensively used in China. Fertility restoration depends on nuclear restorer (Rf) genes. Rf4, encoding a P-type pentatricopeptide repeat (PPR) protein that mediates degradation of the CMS-WA mitochondrial mRNA WA352, is a major restorer for CMS-WA, whereas another restorer Rf3 remains uncloned. Despite cloning of WA352 (WA352c) and Rf4, the evolutionary trajectory of Rf4 within the Oryza genus, its structural variation including potential copy number variation (CNV), and how Rf4 co-evolved with the newly emerged WA352c under natural and human selection have been unclear. Given that CNV can drive dosage effects on gene expression and phenotype in crops, the study aims to determine the origin and evolution of the Rf4 locus, assess structural and copy number variation across Oryza, test whether Rf4 CNV confers a dosage-dependent restoration of CMS-WA fertility, and elucidate how Rf4 haplotypes were selected during the development of three-line hybrid rice.

Literature Review

Prior work established CMS types used in rice hybrid breeding (WA, HL, BT) and identified Rf genes, many of which encode rapidly evolving, clustered P-type PPR proteins acting as sequence-specific RNA-binding factors to suppress CMS gene function. In rice, a PPR-rich locus on chromosome 10 harbors several Rf genes (Rf4 for CMS-WA, Rf1a/Rf5 and Rf1b for CMS-BT/HL, Rf19 for CMS-FA). PPR clusters often arise via tandem duplication, leading to allelic diversification. Copy number variation is a common genetic source influencing dosage-sensitive traits in plants, with examples including GL7 affecting grain size in rice, TdDof controlling wheat stem solidity, and structural variation at WUS and Fascicled ear1 regulating maize inflorescence stem cells. In sugar beet, dosage of a semi-dominant Rf1 allele modulates fertility restoration. However, whether CNV at Rf loci impacts fertility restoration in crops had not been demonstrated prior to this study.

Methodology
  • Comparative genomics of the Rf4 locus: Public reference genomes (ZS97B maintainer with rf4i, Nipponbare japonica with rf4j, MH63 and SH498 indica restorers) were re-analyzed, focusing on a PPR cluster encompassing Rf4, to identify structural variation (SV) and copy number variation (CNV). Regions flanking the Rf4 copies were compared for sequence identity.
  • Haplotype discovery in cultivars: Designed site-specific PCR primers based on SNPs to genotype 311 rice cultivars for Copy-a/Copy-b variants (Rf4a/Rf4b) and rf4i; PCR amplicons were sequenced to define seven variants and eight haplotypes (H1–H8) based on combinations at Copy-a and Copy-b.
  • Expansion to wild Oryza and landraces: Amplified and sequenced Rf4/rf4 fragments across diverse Oryza species spanning the GG to AA genomes to catalog additional variants and haplotypes; a total of 61 additional variants, along with many one-copy and two-copy haplotypes, were identified.
  • Cross-species homology and phylogeny: BLAST searches in Poaceae identified homologs; Setaria viridis LOC117839145 served as outgroup. Multiple sequence alignment (MAFFT) and maximum likelihood phylogenetic analysis (MEGA X) delineated clades and inferred relationships among variants and species lineages.
  • Protein sequence comparison: Compared amino acid sequences of RF4 and rf4 variants, focusing on PPR motifs PPR13–PPR15 to identify substitutions associated with functionality.
  • Functional complementation assays: Constructed binary vectors expressing Rf4 or rf4 variants (rf4a, rf4b, rf4j, rf4aus) under the native Rf4 promoter; transformed into CMS-WA line Jin23A. Evaluated pollen viability (I2-KI staining) and seed setting in T0 transgenics to assess restoration capability.
  • CNV dosage experiments via genetics and genome editing: Generated F1 plants with differing endogenous Rf4 copy number by crossing Jin23A with near-isogenic lines ZSRf41 (H7, one-copy Rf4) and ZSRF4M (H1, two-copy Rf4). Created CRISPR/Cas9 knockouts of Rf4a and Rf4b in ZSRF4M to reduce copy number; crossed mutants to wild type to produce mF1 with defined copy dosage. Measured pollen viability and seed set.
  • Transgene dosage series: Produced Jin23A lines homozygous for an Rf4 transgene (two-copy) or hemizygous (one-copy) and crossed to ZSRf41 to generate additional dosage combinations; assessed fertility phenotypes.
  • Gene expression analyses: Performed qRT-PCR on anthers at the microspore mother cell stage to quantify Rf4 and WA352c transcript levels; UFC1 and atp6 served as internal controls for nuclear and mitochondrial targets, respectively.
  • Application analysis: Compiled planting area data (China, 1983–2018) for hybrid rice varieties using restorer lines with one- vs two-copy Rf4 haplotypes to evaluate breeding impact.
  • Marker development: Designed and validated eight PCR markers (copy-specific and variant-specific) to genotype Copy-a, Copy-b, Rf4, rf4a, rf4a/b, rf4i, rf4aus, and rf4j; tested them on 304 O. sativa accessions to provide a rapid toolkit for marker-assisted selection (MAS).
  • Growth and phenotyping: All materials grown at South China Agricultural University fields; standardized microscopy and field phenotyping protocols were followed.
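
The dosage crosses described above can be reasoned about with simple bookkeeping of functional copies. The following is a minimal sketch, not the authors' analysis code. It assumes the naming convention used in this summary (capitalized `Rf4...` variants are functional, lowercase `rf4...` variants are not), assumes H1 carries Rf4a and Rf4b, and assumes the CMS-WA line contributes a single non-functional rf4i copy per haplotype. The tuple representation and function names are hypothetical.

```python
from typing import Tuple

# A haplotype is modelled as the variant names found at its occupied copy sites.
# This representation is an illustrative assumption, not the paper's data format.
Haplotype = Tuple[str, ...]

H1: Haplotype = ("Rf4a", "Rf4b")   # two-copy Rf4 haplotype (e.g. ZSRF4M)
H7: Haplotype = ("Rf4a", "rf4b")   # one functional copy (e.g. ZSRf41)
CMS_WA: Haplotype = ("rf4i",)      # assumed single non-functional copy (Jin23A)


def functional_copies(haplotype: Haplotype) -> int:
    """Count variants whose name starts with capitalized 'Rf4' (functional)."""
    return sum(1 for variant in haplotype if variant.startswith("Rf4"))


def diploid_dosage(maternal: Haplotype, paternal: Haplotype) -> int:
    """Total functional Rf4 copies in a diploid plant carrying both haplotypes."""
    return functional_copies(maternal) + functional_copies(paternal)


if __name__ == "__main__":
    print("F1 Jin23A x ZSRf41:", diploid_dosage(CMS_WA, H7))
    print("F1 Jin23A x ZSRF4M:", diploid_dosage(CMS_WA, H1))
    print("ZSRF4M (H1/H1):", diploid_dosage(H1, H1))
```

Under these assumptions the two F1 crosses carry one and two functional copies respectively, and a plant homozygous for H1 carries four, which matches the copy numbers quoted in the findings below.
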
Key Findings
  • Origin and diversity of the Rf4 locus:
    • The Rf4 locus predates the CMS-WA gene WA352c and first appears as an ancestral single-copy variant (Anc-Rf4) at the Copy-a site in Oryza meyeriana (GG genome).
    • Across Oryza, the locus underwent extensive sequence diversification, duplication, and recombination, yielding 69 haplotypes spanning wild species, landraces, and cultivars. Eight haplotypes (H1–H8) are enriched in modern cultivars.
    • Certain species lacked detectable Rf4/rf4 at the surveyed sites (O. longistaminata, O. barthii, O. glaberrima), suggesting lineage-specific loss or bottlenecks.
  • Functional status of haplotypes:
    • All rf4 variants tested (rf4a, rf4b, rf4j, rf4aus) failed to restore fertility in CMS-WA Jin23A transgenics, whereas Rf4 partially restored fertility, confirming that these rf4 variants are non-functional for CMS-WA restoration.
    • A shared set of 14 amino acid substitutions within PPR13–PPR15 distinguishes non-functional rf4 proteins from functional RF4, implicating these motifs in restoration activity.
  • CNV-mediated dosage effect:
    • Endogenous copy number: F1 Jin23A×ZSRf41 (one-copy Rf4) showed ~71% pollen viability and ~34% seed set, whereas Jin23A×ZSRF4M (two-copy Rf4) showed ~88% pollen viability and ~52% seed set, demonstrating higher restoration with two copies.
    • Genome editing dosage: Reducing copy number via CRISPR in ZSRF4M lowered fertility; mF1 plants with two copies (H1h1) had ~85% pollen viability and ~48% seed set, versus ~92% and ~72% for wild-type ZSRF4M carrying four copies.
    • Transgene dosage: One-copy transgenic Jin23A/Rf4− lines had ~73% pollen viability and ~36% spikelet fertility; two-copy lines (Jin23A/Rf4Rf4 and Jin23A/Rf4Rf4×ZSRf41) showed ~87–90% pollen viability and ~51–53% spikelet fertility.
    • Rf4 transcript abundance scaled with copy number, while WA352c transcript levels inversely correlated, supporting a dosage-dependent post-transcriptional suppression mechanism.
  • Breeding impact and selection:
    • Hybrid rice varieties using restorer lines with two-copy Rf4 (H1) covered a larger total planting area (135,998,667 ha) than those with one-copy Rf4 haplotypes (H6, H7, H8; 78,015,332 ha) in China.
    • Two-copy Rf4 restorer lines (e.g., MH63, Ce64-7) underpin the most widely planted hybrids, reflecting breeder selection for stronger restoration.
  • Co-evolution with WA352c:
    • In O. rufipogon, functional WA352c coexisted with Rf4 haplotypes H7 (Rf4a-rf4b), H14 (rf4a-like), and H28 (Rf4a''-rf4b-like). The original WA352c-associated non-restored cytoplasm likely carried rf4a-like.
    • During CMS-WA line development, rf4a-like was replaced by rf4i through backcrossing with indica maintainer lines, yielding the current CMS-WA (WA352c/rf4i rf4i) system paired with restorer lines carrying Rf4.
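
The figures reported above can be collected into a small table to make the dosage trend and the planting-area comparison explicit. This is a minimal sketch that only re-tabulates the approximate values quoted in this summary. The record layout and function name are hypothetical, the two-copy transgene entry uses the midpoints of the reported ranges as a labeled assumption, and spikelet fertility is treated as equivalent to seed set for the transgenic lines.

```python
# Approximate values transcribed from the Key Findings ("~" in the source).
# Record: (label, functional Rf4 copies, pollen viability %, seed set %)
observations = [
    ("endogenous F1 Jin23A x ZSRf41", 1, 71.0, 34.0),
    ("endogenous F1 Jin23A x ZSRF4M", 2, 88.0, 52.0),
    ("CRISPR mF1 (H1h1)", 2, 85.0, 48.0),
    ("wild-type ZSRF4M", 4, 92.0, 72.0),
    ("transgene, one copy", 1, 73.0, 36.0),
    # Assumption: midpoints of the reported ranges (~87-90% and ~51-53%).
    ("transgene, two copies", 2, 88.5, 52.0),
]


def mean_by_copies(rows):
    """Average pollen viability and seed set for each copy number."""
    groups = {}
    for _label, copies, pollen, seed in rows:
        groups.setdefault(copies, []).append((pollen, seed))
    return {
        copies: (
            sum(p for p, _ in values) / len(values),
            sum(s for _, s in values) / len(values),
        )
        for copies, values in sorted(groups.items())
    }


means = mean_by_copies(observations)

# Cumulative planting areas (ha) in China, as quoted in the text.
two_copy_area_ha = 135_998_667   # hybrids using H1 restorer lines
one_copy_area_ha = 78_015_332    # hybrids using H6, H7, H8 restorer lines
two_copy_share = two_copy_area_ha / (two_copy_area_ha + one_copy_area_ha)
area_ratio = two_copy_area_ha / one_copy_area_ha

if __name__ == "__main__":
    for copies, (pollen, seed) in means.items():
        print(f"{copies} copies: pollen {pollen:.1f}%, seed set {seed:.1f}%")
    print(f"two-copy share of planting area: {two_copy_share:.1%}")
    print(f"two-copy / one-copy area ratio: {area_ratio:.2f}")
```

The averages are descriptive only: they pool different genetic backgrounds and carry no statistical weight, but they show both fertility measures rising with copy number and two-copy restorers accounting for roughly 64% of the combined planting area.
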
Discussion

This study resolves major unknowns about the Rf4 locus by demonstrating that it is evolutionarily older than the CMS-WA gene WA352c and exhibits extensive structural and copy number diversification across Oryza. Functionally, only Rf4 variants restore CMS-WA fertility, while multiple rf4 variants are non-functional, with critical amino acid changes in PPR13–PPR15 likely determining activity. The key advance is the demonstration that CNV at Rf4 confers a clear dosage effect on fertility restoration: increasing copy number elevates Rf4 transcript levels, enhances pollen viability and seed set, and concomitantly reduces WA352c transcripts. These findings explain the strong human selection for the two-copy Rf4 haplotype (H1) in three-line hybrid breeding, reflected by its predominance in high-deployment restorer lines. The evolutionary and breeding model suggests that WA352c arose in O. rufipogon and was initially combined with the non-restoring rf4a-like variant; subsequent breeding introduced rf4i into CMS-WA lines and selected restorer haplotypes with higher Rf4 dosage. The work refines our understanding of CMS/Rf co-evolution and underscores CNV as a pivotal mechanism shaping restorer efficacy and breeding outcomes.

Conclusion

The Rf4 locus originated before WA352c and diversified via sequence and copy number changes to yield numerous haplotypes across Oryza, with eight enriched in modern cultivars. Functional Rf4 variants restore CMS-WA by reducing WA352c transcripts, whereas rf4 variants are non-functional, likely due to specific amino acid substitutions in PPR13–PPR15. Crucially, Rf4 copy number drives a dosage-dependent increase in restoration capacity, explaining the widespread breeder selection of two-copy Rf4 restorer lines and their larger planting areas in hybrid rice production. Practical outputs include a validated set of PCR markers enabling rapid genotyping of Rf4 CNV and haplotypes to accelerate marker-assisted selection of strong restorers and accurate selection of CMS/maintainer lines. Future work should elucidate the mechanistic role of the PPR13–PPR15 residues in target recognition and cleavage, explore pyramiding of multiple Rf genes within the chromosome 10 cluster to broaden hybrid combinations, and survey broader germplasm to identify additional functional variants for other CMS systems.

Limitations
  • Mechanistic details of how the 14 amino acid substitutions in PPR13–PPR15 abolish restoration activity remain unresolved and require biochemical/structural studies.
  • The survey did not detect Rf4 haplotypes in several species (O. longistaminata, O. barthii, O. glaberrima); whether this reflects sampling limits, gene loss, or detection challenges is uncertain.
  • The co-evolutionary model linking WA352c and Rf4 haplotypes is inferential based on current sampling and would benefit from broader population genomics.
  • Another CMS-WA restorer, Rf3, remains uncloned; interactions between Rf3 and Rf4 dosage were not addressed.