Fonio millet genome unlocks African orphan crop diversity for agriculture in a changing climate
M. Abrouk, H. I. Ahmed, et al.
Discover the potential of fonio millet, an orphan African cereal crop essential for sustainable agriculture in hot and dry environments. Researchers, including Michael Abrouk and Hanin Ibrahim Ahmed from KAUST, have unveiled high-quality genomic resources and genetic insights to enhance fonio breeding and adaptation to climate change.
Introduction
The study addresses how to unlock the genetic potential of the African orphan cereal fonio (Digitaria exilis) to diversify agriculture under climate change. Major cereals (wheat, maize, rice) dominate global calories but have limited genetic diversity for extreme environments, whereas orphan crops and wild relatives often thrive in marginal conditions. Fonio is a fast-maturing, drought-tolerant West African C4 cereal that is locally important but retains semi-domesticated traits (seed shattering, lodging, low yield). The research aims to generate comprehensive genomic resources for fonio, namely a chromosome-scale reference genome and resequencing data from a broad diversity panel, in order to illuminate its genome architecture, diversity, and domestication history, and to identify genetic targets for rapid improvement (e.g., seed size, shattering) for dryland agriculture.
Literature Review
Background literature highlights the need to broaden agrobiodiversity and leverage genomic technologies, including de novo domestication and genome editing, to improve orphan crops. Many staple cereals originated in humid regions and may lack diversity for harsh climates, while orphan crops can be adapted to extreme conditions. Fonio’s likely wild progenitor is the tetraploid weed D. longiflora, and domestication may have occurred >5000 years ago in Mali’s Inner Niger Delta. Prior work underscored fonio’s rapid maturation and drought adaptation but also its undomesticated traits. Advances in plant genome assembly and population genomics provide opportunities to dissect polyploid genomes and identify domestication genes, as shown in related grasses and other crops.
Methodology
- Reference genome assembly: Selected accession CM05836 (Mali, dry region). Estimated genome size 893 Mb/1C by flow cytometry. Sequenced with Illumina paired-end (321×), mate-pair (241×), and 10x Genomics linked reads (84×). Assembled with DeNovoMAGIC3; scaffolded using Hi-C (122×) and Bionano optical mapping to produce 18 chromosome-scale pseudomolecules totaling ~655.7 Mb (assembly length 716.47 Mb), with BUSCO completeness 96.1%.
- Cytogenetics and subgenome assignment: Designed oligo painting FISH probes per pseudomolecule; validated 18 chromosomes and differentiation of homoeologs. Identified centromeric 314 bp tandem repeat. Re-assembled with TRITEX for collinearity confirmation. Used full-length LTR retrotransposon (fl-LTR-RT) family structure to distinguish two subgenomes (A, B) based on subgenome-specific TE clusters; dated TE bursts ~1.56 and ~1.14 MYA.
- Gene annotation and comparative genomics: Masked 34.1% repeats; annotated genes with MAKER using RNA-seq from four tissues and protein homology, yielding 59,844 protein-coding genes. Performed synteny analyses with grasses (e.g., Setaria italica) and computed Ks to date subgenome divergence (~3 MYA) and species divergences. Assessed subgenome dominance via gene counts and expression (no dominance detected).
- Diversity panel resequencing: Resequenced 166 D. exilis accessions from West Africa and 17 D. longiflora from Central/East/West Africa (avg 45× and 20× coverage, respectively). Mapped to CM05836; performed GATK-based variant calling and stringent filtering to retain 11,046,501 high-quality bi-allelic SNPs. Annotated variants with snpEff.
- Population genetics: PCA and ancestry estimation (sNMF, K=2–10) on SNPs; LD decay with PopLDdecay; geographic kriging of PCA axis; spatial distributions of private SNPs. Correlated genetic structure with climate (WorldClim), geography, and ethnolinguistic data (Ethnologue; passport data) using Pearson correlations, Mantel tests, and ANCOVA.
- Environmental and sociocultural associations: GWAS for mean temperature of wettest quarter and mean precipitation of wettest quarter using EMMA, MLM (GAPIT), and LFMM2 with FDR control; defined QTL windows ±50 kb; GO enrichment. Logistic regression associations for ethnic/linguistic groups using PLINK with PCs as covariates.
- Demography: smc++ inference of effective population size histories for D. exilis groups (K=6) and D. longiflora (generation time 1 year; μ=6.5×10⁻⁹).
- Selection scans: Identified signatures of selection using three approaches: composite likelihood ratio (SweeD), nucleotide diversity ratio (π D. longiflora/D. exilis) in 50 kb sliding windows (10 kb step), and FST in 50 kb windows. Cross-referenced top 1% regions with orthologs of 34 known domestication genes.
- Candidate gene analyses and phenotyping: Investigated grain-size gene GS5 orthologs (DeGS5-3A/3B) and shattering gene Sh1 orthologs (DeSh1-9A/9B). Validated a ~60 kb deletion at DeSh1-9A by read-depth, PCR, and Sanger sequencing. Quantified seed shattering by mechanical shaking of panicles; compared accessions with and without the deletion via two-way ANOVA.
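The nucleotide diversity ratio scan described in the selection-scan step can be illustrated with a minimal Python sketch. It assumes per-site π values are already available for each species as position-to-value dictionaries; the function names, the toy input layout, and the handling of windows with zero diversity in the crop are illustrative assumptions, not the pipeline used in the study. Only the 50 kb window, the 10 kb step, and the top 1% cutoff come from the description above.

```python
import math
from typing import Dict, List, Tuple

WINDOW = 50_000  # 50 kb window, as in the description above
STEP = 10_000    # 10 kb step


def windows(chrom_length: int, window: int = WINDOW,
            step: int = STEP) -> List[Tuple[int, int]]:
    """Return half-open [start, end) sliding windows along one chromosome."""
    last_start = max(chrom_length - window, 0)
    return [(s, min(s + window, chrom_length))
            for s in range(0, last_start + 1, step)]


def pi_ratio_scan(pi_wild: Dict[int, float], pi_crop: Dict[int, float],
                  chrom_length: int) -> List[Tuple[int, int, float]]:
    """Per window, divide mean per-bp pi in the wild sample by that in the crop.

    High ratios mark regions where the crop lost diversity relative to the
    wild relative. Windows with no diversity in either sample are skipped;
    windows with diversity only in the wild sample get an infinite ratio
    (an assumption made for this sketch).
    """
    results = []
    for start, end in windows(chrom_length):
        span = end - start
        wild = sum(v for p, v in pi_wild.items() if start <= p < end) / span
        crop = sum(v for p, v in pi_crop.items() if start <= p < end) / span
        if wild == 0 and crop == 0:
            continue
        ratio = math.inf if crop == 0 else wild / crop
        results.append((start, end, ratio))
    return results


def top_fraction(scan: List[Tuple[int, int, float]],
                 fraction: float = 0.01) -> List[Tuple[int, int, float]]:
    """Return the highest-scoring windows (top 1% by default, at least one)."""
    ranked = sorted(scan, key=lambda w: w[2], reverse=True)
    keep = max(1, math.ceil(len(ranked) * fraction))
    return ranked[:keep]
```

In practice, overlapping top windows would be merged into candidate regions before being compared against known domestication-gene orthologs; that merging step is omitted here for brevity.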
Key Findings
- High-quality reference genome: 716.47 Mb total, with ~91.5% (655.72 Mb) in 18 pseudomolecules; 60.75 Mb unanchored; BUSCO complete genes 96.1%. Annotated 59,844 protein-coding genes; 74.3% expressed in surveyed tissues.
- Allotetraploid architecture: Two homoeologous sets of nine chromosomes; subgenomes A and B differentiated using subgenome-specific fl-LTR-RT families. TE bursts dated to ~1.56 and ~1.14 MYA; subgenome divergence ~3 MYA, indicating recent allotetraploidy. No subgenome dominance in gene retention or expression.
- Resequencing and variation: 11,046,501 high-quality SNPs retained. Mean nucleotide diversity π: 6.19×10⁻⁴ in D. exilis vs 3.68×10⁻³ in D. longiflora. LD decay slower in D. exilis (r²≈0.20 at 70 kb) than D. longiflora (r²≈0.16 at 70 kb). SNPs broadly distributed; ~30% gene-proximal; ~6.2% exonic (≈51.6% nonsynonymous, including 6727 predicted LOF).
- Population structure and drivers: Clear separation between cultivated D. exilis and wild D. longiflora; D. longiflora split into three groups. Within D. exilis, distinct clusters, notably southern Togo and a Guinea group; admixture increases in central West Africa. Genetic structure significantly correlates with climate (temperature, precipitation), geography (latitude, longitude, altitude), and ethnolinguistic factors (Mantel tests, ANCOVA), with effects of ethnicity persisting after controlling for environment.
- Environmental and sociocultural associations: Climate GWAS identified 38 loci (temperature) and 179 loci (precipitation), enriched for hormone metabolism, carbohydrate metabolism, development, and growth processes. Ethnic-group associations identified 55 SNPs (Bambara) and 227 SNPs (Fula); notable peaks on 2A, 2B, 6A near a WSD1 homolog implicated in drought-associated wax biosynthesis.
- Demography: D. exilis effective population size declined starting >10,000 years ago, bottlenecking ~2000–1000 years ago, followed by ~100-fold expansion, consistent with domestication and later cultivation spread. No analogous bottleneck signal in D. longiflora.
- Selection and domestication genes: Three complementary scans detected multiple candidate regions (CLR: 78; π ratio: 311; FST: 208). Most canonical cereal domestication genes show no strong selection in fonio, indicating underexploited targets. Two notable loci:
  - Grain size GS5: Strong selective sweep at DeGS5-3A on chromosome 3A with near-complete loss of diversity; DeGS5-3A expressed in grain, DeGS5-3B not detected. Consistent with selection for wider/heavier grains in D. exilis vs D. longiflora.
  - Shattering gene Sh1: ~60 kb deletion removing DeSh1-9A on chromosome 9A found in ~37% of D. exilis accessions; DeSh1-9B intact. Accessions with the deletion show a modest but significant reduction in seed shattering of 7 percentage points (mean 45% vs 52%; two-way ANOVA p=0.008; n=3 panicles per accession across 43 deletion carriers and 39 non-carriers). The deletion is geographically widespread, suggesting an ancient origin or broad diffusion.
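Several of the figures reported above can be cross-checked with simple arithmetic. The short Python sketch below recomputes the anchored fraction of the assembly, the ratio of nucleotide diversity between the wild and cultivated species, and the shattering difference expressed both in percentage points and in relative terms. The input values are taken from the findings above; the helper names are illustrative.

```python
def percent(part: float, whole: float) -> float:
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


def reduction(baseline: float, treated: float) -> tuple:
    """Return (absolute difference, relative reduction in percent)."""
    absolute = baseline - treated
    return absolute, percent(absolute, baseline)


# Assembly statistics (Mb), as reported above.
ASSEMBLY_MB = 716.47
ANCHORED_MB = 655.72
anchored_pct = percent(ANCHORED_MB, ASSEMBLY_MB)
unanchored_mb = ASSEMBLY_MB - ANCHORED_MB

# Mean nucleotide diversity, as reported above.
PI_EXILIS = 6.19e-4
PI_LONGIFLORA = 3.68e-3
pi_fold = PI_LONGIFLORA / PI_EXILIS

# Mean seed shattering (%) without and with the DeSh1-9A deletion.
SHATTER_INTACT = 52.0
SHATTER_DELETION = 45.0
shatter_points, shatter_relative = reduction(SHATTER_INTACT, SHATTER_DELETION)

if __name__ == "__main__":
    print(f"Anchored in pseudomolecules: {anchored_pct:.1f}%")
    print(f"Unanchored sequence: {unanchored_mb:.2f} Mb")
    print(f"Wild/cultivated diversity ratio: {pi_fold:.1f}x")
    print(f"Shattering: {shatter_points:.0f} points lower, "
          f"{shatter_relative:.1f}% relative reduction")
```

The distinction in the last line matters when comparing effect sizes: a drop from 52% to 45% is 7 percentage points, which corresponds to a relative reduction of roughly 13%.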
Discussion
The study delivers foundational genomic resources that clarify fonio’s recent allotetraploid origin, genome organization, and patterns of diversity, providing a platform for accelerated improvement of this climate-resilient orphan crop. Population analyses demonstrate that climatic, geographic, and ethnolinguistic factors have jointly shaped fonio diversity and structure, implying that both environmental adaptation and human cultural practices influenced selection and seed exchange. Demographic reconstruction reveals a domestication-associated bottleneck, milder than in some other crops, followed by rapid expansion in the last millennium; the mild bottleneck preserved substantial standing variation. Selection scans implicate DeGS5-3A in grain-size increases during domestication and reveal a partial sweep at DeSh1-9A conferring a quantitative reduction in shattering; most classic domestication loci remain weakly selected in fonio, presenting actionable targets. Together, these findings directly address the goal of identifying genetic levers to enhance agronomic traits (reduced shattering, increased grain size) while maintaining adaptation to dry, nutrient-poor soils. The resources and targets identified can inform marker-assisted selection and genome editing strategies to convert fonio’s semi-domesticated phenotype into improved cultivars suited to hot, dry environments.
Conclusion
This work establishes a chromosome-scale reference genome and comprehensive population genomic dataset for fonio, elucidating its allotetraploid structure, domestication history, and the environmental and sociocultural factors shaping its diversity. Key domestication-related targets were identified, including a strong sweep at DeGS5-3A for grain size and a widespread deletion at DeSh1-9A that modestly reduces shattering. The absence of strong selection at many canonical domestication genes suggests clear opportunities for rapid trait improvement through breeding and genome editing, such as targeting DeSh1-9B (especially in accessions already lacking DeSh1-9A) to further reduce seed shattering, and exploring genes controlling plant architecture and lodging. Future research should expand phenotyping across environments, functionally validate additional candidate loci from climate and selection scans, and integrate genomic selection to accelerate breeding of high-yielding, climate-resilient fonio.
Limitations
- Lack of a genetic linkage map for D. exilis required alternative validation (Hi-C, optical mapping, FISH) and may limit fine-scale recombination-based inferences.
- Diploid progenitors of the subgenomes are not identified, complicating subgenome disentanglement; TE-based assignment was used as a proxy.
- The origin of the DeSh1-9A deletion (standing variation vs post-domestication mutation) remains unresolved; it was not detected in the resequenced D. longiflora panel.
- Selection scans rely on window-based statistics and reference wild samples; incomplete sampling of wild diversity and demographic confounding may affect detection power and false positives/negatives.
- Environmental and ethnolinguistic association tests, while controlled for structure, can be influenced by residual confounding and heterogeneous sampling densities across regions.
- Phenotyping of shattering used a standardized shaking assay that captures relative differences but may not fully represent field shattering dynamics.