Genomic analyses provide insights into spinach domestication and the genetic basis of agronomic traits

X. Cai, X. Sun, et al.

This study by Xiaofeng Cai and colleagues presents a chromosome-scale reference genome assembly of spinach and highlights extensive genome rearrangements and the effects of artificial selection on leaf traits, bolting, and flowering. Its analyses of genetic diversity and candidate genes for agronomic traits offer valuable resources for spinach breeding.

Introduction
The study addresses the need for an accurate, chromosome-scale reference genome and comprehensive population genomic resources to advance spinach breeding and genetics. Despite previous QTL and marker discoveries for traits like leaf morphology, bolting/flowering, and nutritional quality, spinach improvement still requires better genomic tools, particularly to reduce oxalate content and enhance disease resistance. Prior spinach genome assemblies were fragmented and largely limited to transcribed regions. The researchers aimed to produce a high-quality genome assembly, reconstruct Chenopodiaceae ancestral karyotypes to understand spinach genome evolution, generate a dense variation map from resequencing diverse accessions, perform GWAS on key agronomic traits, and identify domestication sweeps shaping cultivated spinach.
Literature Review
Previous work identified QTLs and markers for spinach agronomic traits (leaf morphology, bolting/flowering, nutritional quality) and developed draft genome assemblies using short and long reads (Sp75, Spov3, SOL_r1.1). However, assemblies were fragmented and variation discovery was limited to transcriptomes, constraining genetic mapping and gene discovery. Wild Spinacia species (S. turkestanica as the likely progenitor and S. tetrandra) provide genetic diversity; introgression of NBS-LRR genes has been used for downy mildew resistance. There remained a need for a reference-grade genome to improve comparative genomics, GWAS, and cloning of trait genes.
Methodology
Plant material: An inbred line, Monoe-Viroflay (a highly homozygous derivative of Viroflay), was used for the reference genome. Plants were grown under controlled greenhouse conditions.
Sequencing and assembly: The authors generated 110 Gb of PacBio CLR reads (~118×) and 102 Gb of Illumina paired-end reads (~109×). PacBio reads were corrected and assembled with Canu v1.7.1, and the assembly was polished with Arrow (SMRT Link 5.1) and then with Pilon v1.22 using the Illumina reads. Contamination screening removed organellar and microbial contigs. Chicago and Hi-C libraries were sequenced on an Illumina HiSeq 4000 and used with HiRise for scaffolding; misassemblies were curated using Hi-C signals and genetic maps.
Assembly assessment: Merqury was used for consensus quality (QV) and completeness, LAI for repeat assembly quality, and BUSCO for gene completeness; mapping rates were assessed with genomic and RNA-Seq reads.
Annotation: Repeats were detected with MITE-Hunter, LTRharvest, and RepeatModeler and masked with RepeatMasker. Genes were predicted with MAKER-P, integrating ab initio predictions (SNAP, AUGUSTUS, GeneMark-ES), homology evidence (quinoa, sugar beet, Sp75, and Arabidopsis proteins), and transcript evidence (RNA-Seq assembled by StringTie; CDS by PASA).
Comparative genomics: The ancestral Chenopodiaceae karyotype was reconstructed using sugar beet, garden orache, quinoa, and amaranth (outgroup). Synteny was detected with MCScanX, blocks were filtered by mean Ks, and multi-genome blocks were integrated with MGRA to infer nine protochromosomes. Gene family evolution was analyzed with OrthoMCL, MAFFT, trimAl, IQ-TREE, MCMCTree, and CAFE.
Population resequencing: 305 accessions (295 S. oleracea, 7 S. turkestanica, 3 S. tetrandra) were sequenced (~5.5 Tb; mean depth 15.9×; 98.2% genome coverage). Reads were trimmed with Trimmomatic and aligned to Monoe-Viroflay with BWA-MEM. Variants were called with GATK v4.1 and hard-filtered, and bi-allelic SNPs were retained; SVs were detected with smoove (LUMPY + SVTyper).
Population genetics: A phylogeny was built from 549,814 filtered SNPs (VCF2Dis NJ tree). PCA (EIGENSOFT) and STRUCTURE analyses used SNPs at fourfold degenerate sites, excluding S. tetrandra-specific variants. LD decay was estimated with PopLDdecay. Nucleotide diversity (π) and FST were calculated.
Phenotyping and GWAS: Field trials (Shanghai Normal University, winter 2019) followed a randomized complete block design (3 replicates, 10 plants per accession). Measured 20 traits: plant architecture (3), leaf traits (8), petiole traits (8), bolting, flowering, sex type, downy mildew (DM) resistance/incidence, soluble oxalate. DM was assessed by inoculation with Peronospora farinosa f. sp. spinaciae, and a disease index was computed. Soluble oxalate was quantified colorimetrically. The GWAS used 5,511,663 SNPs (MAF ≥ 0.01, missing ≤ 50%) and a linear mixed model in EMMAX with Balding–Nichols kinship; significance was set by modified Bonferroni thresholds (α=0.05 and α=1). Associated regions were defined as ±50 kb around significant SNPs, and overlapping windows were merged.
Selective sweeps: S. oleracea was compared with S. turkestanica using XP-CLR, FST, and π ratio in sliding windows (10 kb window, 1 kb step). Regions in the top 1% for any two of the three methods were called sweeps. Co-localization with QTL/GWAS signals was then examined.
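The GWAS post-processing described above (significance thresholds, then ±50 kb regions with overlapping windows merged) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: it assumes the modified Bonferroni threshold is α divided by the number of tested SNPs, that positions belong to a single chromosome, and the function names `bonferroni_threshold` and `associated_regions` are hypothetical. The lead SNP position in the usage line is also hypothetical, chosen as the midpoint of one reported Chr1 interval.

```python
from typing import List, Tuple

N_SNPS = 5_511_663  # SNPs used in the GWAS, as reported in the summary
FLANK = 50_000      # +/- 50 kb around each significant SNP


def bonferroni_threshold(alpha: float, n_tests: int = N_SNPS) -> float:
    """Raw P-value cutoff, ASSUMING threshold = alpha / number of tests."""
    return alpha / n_tests


def associated_regions(positions: List[int],
                       flank: int = FLANK) -> List[Tuple[int, int]]:
    """Build +/- flank windows around SNP positions on ONE chromosome
    and merge windows that overlap."""
    regions: List[Tuple[int, int]] = []
    for pos in sorted(positions):
        start, end = max(1, pos - flank), pos + flank
        if regions and start <= regions[-1][1]:
            # Overlaps the previous window: extend it instead of adding a new one.
            regions[-1] = (regions[-1][0], max(regions[-1][1], end))
        else:
            regions.append((start, end))
    return regions


if __name__ == "__main__":
    print(bonferroni_threshold(0.05), bonferroni_threshold(1.0))
    # Hypothetical lead SNP at the midpoint of a reported Chr1 interval.
    print(associated_regions([62_006_104]))
```

Under these assumptions, a single significant SNP yields a 100 kb region, which matches the width of the Chr1 organ-size intervals reported in the key findings.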
Key Findings
- High-quality reference genome: 894.3 Mb total size in 307 contigs; contig N50 = 23.8 Mb; 98.3% of contigs anchored and ordered onto 6 chromosomes. Consensus QV = 46.2 (~99.998% base accuracy); completeness 99.4% (Merqury). LAI = 20.32; BUSCO complete genes = 97.2%. Repeat content 69.9% (624.8 Mb), mostly LTR retrotransposons (Copia, Gypsy). Predicted 28,964 protein-coding genes; 98.6% supported by RNA-Seq or homologs.
- Chromosome evolution: Reconstructed ancestral Chenopodiaceae karyotype (9 protochromosomes). Spinach shows extensive rearrangements (216 translocations, 80 inversions) and chromosome number reduction to 6, correlating with high repeat content (~70%) and recent LTR bursts. Evidence of contracted gene families (e.g., auxin-responsive, terpene synthases) and expanded families (sugar transporters, F-box proteins, TFs). Fewer NBS-LRR R genes (115; ~half in clusters), suggesting secondary loss.
- Variation map: From 305 accessions, identified 17,760,485 SNPs and 68,328 SVs. About 49.9% of SNPs are genic/proximal; 2.91% have medium/high effect. SNP sharing: 97.5% between S. oleracea and S. turkestanica versus 56.5% between S. oleracea and S. tetrandra, indicating that S. tetrandra is more distant from cultivated spinach.
- Population genetics: Nucleotide diversity π: S. oleracea = 1.33×10⁻³; S. turkestanica = 1.52×10⁻³. Very low FST (0.03) between cultivated spinach and S. turkestanica, indicating a weak domestication bottleneck. Clear substructure within S. oleracea (Asia vs Europe): Group I (Asia) π = 1.54×10⁻³; Group II (Europe) π = 1.23×10⁻³; FST(Asia vs Europe) = 0.06. LD decays rapidly to r²=0.2 within ~600 bp, consistent with outcrossing.
- GWAS across 20 traits: 372 significant signals (α=0.05) for 12 traits; 34 signals for remaining 8 traits.
  • Downy mildew resistance/incidence: Major associated region on Chr3 (approx. 0.024–1.60 Mb) overlapping Pfs-1 marker and fine-mapped interval; contains NBS-LRR clusters and candidate gene SOV3g001250; nearby RAR1 ortholog and MKK7 receptor kinase. Minor peaks include Chr4 promoter variant in WSD6 (SOV4g053700).
  • Sex determination: Strong association on Chr4 (92–103 Mb), consistent with Y-linked male-specific region and recombination suppression due to a large inversion.
  • Plant type: Top signals on Chr2 (~85 Mb; SOV2g021360, unknown function) and Chr3 (~58 Mb; DEAD-box RNA helicase SOV3g025280). Additional regions harbor AVP1 (SOV4g047390) and TB1 homolog (SOV5g010230).
  • Organ size: Two Chr1 regions: 61,956,104–62,056,104 bp (leaf length/width, plant width; includes FAR1-related SOV1g011880) and 72,249,831–72,349,831 bp (plant height, petiole length). Another FAR1-related gene (SOV6g004340) near petiole width lead SNP.
  • Leaf surface texture: Region near end of Chr6 includes β-tubulin (SOV6g040410), implicated in microtubule-mediated leaf flattening.
  • Leaf base shape: Chr3 (~30 Mb) near genes SOV3g017870 (early-responsive to dehydration) and SOV3g017860 (unknown).
  • Bolting/flowering: Multiple regions including MADS-box TFs SOV6g023690 and SOV4g008150.
  • Soluble oxalate content: Chr5 (915,444–1,022,853 bp) with 15 genes; candidates include SOV5g000760 (heavy metal transport/detoxification) and SOV5g000810 (ZIP metal ion transporter), suggesting modulation of calcium/ion homeostasis could reduce soluble oxalate.
- Domestication sweeps: Identified 996 sweeps spanning 17.6 Mb containing 748 genes. Many co-localize with GWAS/QTL for flowering, bolting, plant type, leaf texture/base shape, petiole color/width. Wild S. turkestanica has smooth leaves; cultivated Asia group includes smooth/wrinkled, Europe group mostly wrinkled, indicating selection for wrinkled leaves. The selected region for leaf surface texture overlaps candidate cutin biosynthesis genes GPAT4 (SOV2g031630) and CYP86A8/LCR (SOV2g031660).
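Two of the assembly statistics in the key findings can be checked with simple arithmetic. The sketch below assumes that the consensus QV is Phred-scaled, so per-base accuracy is 1 − 10^(−QV/10), and that the repeat percentage is repeat length divided by total assembly size. The function names are hypothetical and are not part of Merqury or any other tool named in the study.

```python
def qv_to_accuracy(qv: float) -> float:
    """Per-base accuracy for a Phred-scaled consensus quality value:
    accuracy = 1 - 10^(-QV / 10)."""
    return 1.0 - 10.0 ** (-qv / 10.0)


def repeat_fraction(repeat_mb: float, assembly_mb: float) -> float:
    """Share of the assembly annotated as repeats (both lengths in Mb)."""
    return repeat_mb / assembly_mb


if __name__ == "__main__":
    # Values taken from the summary: QV = 46.2, 624.8 Mb repeats, 894.3 Mb assembly.
    print(f"base accuracy: {100 * qv_to_accuracy(46.2):.3f}%")
    print(f"repeat content: {100 * repeat_fraction(624.8, 894.3):.1f}%")
```

With the reported inputs, both results round to the figures quoted in the findings (~99.998% base accuracy and 69.9% repeat content).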
Discussion
The high-quality, chromosome-scale spinach assembly overcomes prior limitations associated with heterozygosity and repetitive content, enabling robust comparative and population genomics. Spinach exhibits pronounced genome rearrangements and reduced chromosome number relative to the reconstructed Chenopodiaceae ancestor, plausibly linked to extensive transposable elements. Population analyses indicate a weak domestication bottleneck and high diversity with geographic differentiation between Asian and European cultivated groups. Rapid LD decay enhances GWAS resolution, which uncovered major and minor loci for key agronomic traits. Notably, a strong Chr3 locus with NBS-LRR candidates underlies downy mildew resistance, while additional minor loci suggest multifaceted resistance mechanisms. Associations for sex determination, plant architecture, organ size, leaf traits, flowering/bolting, and oxalate content highlight candidate genes and pathways. The overlap between domestication sweeps and trait-associated regions underscores the role of human selection in shaping phenotypes, such as wrinkled leaves linked to cutin biosynthesis genes. Collectively, these findings clarify spinach evolution and domestication and provide actionable targets for breeding and marker-assisted selection.
Conclusion
This work delivers a high-accuracy, chromosome-scale reference genome for spinach and a comprehensive genomic variation map from 305 accessions. It reconstructs Chenopodiaceae ancestral chromosomes, revealing extensive spinach-specific genome rearrangements, and dissects the genetic architecture of 20 agronomic traits via GWAS, identifying key candidate genes and loci. Nearly one thousand domestication sweeps overlapping important traits demonstrate human selection’s impact on spinach phenotypes. These genomic resources and insights will accelerate functional studies and facilitate marker-assisted and genomics-enabled breeding of spinach. Future research should validate candidate genes, elucidate mechanisms underlying complex traits (e.g., DM resistance networks, cuticle-mediated leaf texture), and leverage wild diversity for trait improvement, including reduced soluble oxalate.
Limitations
While GWAS identified numerous candidate regions, functional validation is needed to confirm causative genes (e.g., NBS-LRR SOV3g001250 for DM resistance; GPAT4/CYP86A8 for leaf texture; MADS-box genes for flowering). The soluble oxalate candidates (SOV5g000760, SOV5g000810) are hypothesized based on ion homeostasis and require experimental validation to establish causality and mechanistic links to calcium levels. Some trait associations have broad intervals or multiple candidates due to complex LD or polygenicity. Downy mildew resistance appears multifactorial; race-specific effects were not exhaustively explored. Population sampling, while broad, includes relatively few S. turkestanica and S. tetrandra accessions, potentially limiting resolution of wild diversity.