
Genome-wide association study uncovers genomic regions associated with grain iron, zinc and protein content in pearl millet
M. Pujar, S. Gangaprasad, et al.
This genome-wide association study by Mahesh Pujar and colleagues reports marker-trait associations for grain iron, zinc, and protein content in pearl millet, documenting substantial genetic variation that can support biofortification breeding.
Introduction
Pearl millet is a climate-resilient cereal cultivated on over 31 million hectares, predominantly in arid and semi-arid regions of Asia and Africa. It is naturally nutritious, being relatively rich in grain iron (Fe), zinc (Zn), and protein, which are critical for human health. Micronutrient and protein deficiencies contribute to widespread malnutrition (“hidden hunger”), particularly affecting women and children. Biofortification—genetically enhancing grain Fe, Zn, and protein—offers a sustainable strategy to address these deficiencies. However, these nutritional traits are complex, influenced by multiple genes and genotype-by-environment interactions, making conventional breeding time-consuming and costly. With advances in high-throughput genotyping and the availability of the pearl millet reference genome, genome-wide association studies (GWAS) can identify SNP markers and candidate genes associated with Fe, Zn, and protein content. The present study aims to assess phenotypic variability and perform GWAS in a diverse panel of 281 inbred lines to identify genomic regions and candidate genes controlling grain Fe, Zn, and protein content for use in genomics-assisted biofortification.
Literature Review
Earlier QTL mapping (linkage analysis) in pearl millet identified genomic regions for Fe and Zn, but resolution was limited by the small number of recombination events and by reliance on SSR and DArT markers. GWAS has successfully identified loci controlling grain Fe and Zn in maize, rice, and wheat, and loci for protein content in wheat and maize. In pearl millet, earlier studies reported genomic regions on linkage groups corresponding to the current chromosomes (e.g., Pgl05 and Pgl07 for Fe; Pgl01, Pgl04, Pgl05, Pgl06, and Pgl07 for Zn). The rapid LD decay reported in pearl millet suggests that GWAS can achieve high mapping resolution. Despite this progress, prior studies rarely reached the gene level, and no GWAS for grain protein content in pearl millet had been reported before this work.
Methodology
Plant materials: The GWAS panel comprised 281 inbred lines developed at ICRISAT (112 restorers, 110 seed parents, 32 advanced progenies from breeding populations/composites, and 27 germplasm derivatives) that differ in Fe, Zn, and agronomic traits.
Field trials: Trials were conducted at ICRISAT, Hyderabad, India, in two contrasting environments (rainy season 2017 and summer 2018) using an alpha lattice design with three replications. Each plot consisted of two 2 m rows, spaced 75 cm apart in the rainy season and 60 cm apart in summer. Standard agronomic practices were followed, including fertilization with DAP and urea and irrigation to avoid moisture stress.
Sampling and phenotyping: Five representative plants were sampled per plot; main panicles were harvested at physiological maturity, dried, and hand-threshed with precautions against Fe contamination. Grain Fe and Zn were quantified at Flinders University by ICP-OES after acid digestion (HNO₃/H₂O₂) following standardized protocols. Grain protein content was measured by near-infrared spectroscopy (NIRS) at ICRISAT. Soil Fe and Zn in the top 30 cm were estimated as DTPA-extractable fractions by AAS.
DNA extraction and genotyping: Genomic DNA was extracted from 30-day-old seedlings. DArTseq genotyping at Diversity Arrays Technology on an Illumina HiSeq 2000 generated 87,748 SNPs, with the primary and secondary DArT pipelines used for quality control and SNP calling.
SNP filtering: In TASSEL v5.3.1, SNPs with >30% missing data or MAF <10% were removed, leaving 58,719 high-quality SNPs with known physical positions.
Statistical analyses: ANOVA across seasons used random-effects models in SAS, giving estimates of means, coefficient of variation (CV), and heritability (H²); Pearson correlations among traits were computed in R. Population structure was inferred with ADMIXTURE for K = 1–10 subpopulations, with the optimal K chosen by minimum cross-validation error, and the kinship matrix (the K term in the Q+K model) was computed in TASSEL. Linkage disequilibrium (LD) was assessed in TASSEL as pairwise R² between adjacent SNPs; LD decay was plotted as R² against physical distance, and the decay distance was read at the R² = 0.2 threshold.
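The SNP filtering step can be illustrated with a short Python sketch. It is not the TASSEL implementation: the genotype coding (0, 1, or 2 copies of the alternate allele, `None` for a missing call), the function names, and the toy panel are all assumptions made for illustration, as is the choice to drop a SNP when it fails either criterion.

```python
from typing import Dict, List, Optional, Sequence

# Assumed coding: 0/1/2 copies of the alternate allele; None marks a missing call.
Genotype = Optional[int]


def missing_rate(calls: Sequence[Genotype]) -> float:
    """Fraction of lines with no genotype call at this SNP."""
    return sum(c is None for c in calls) / len(calls)


def minor_allele_frequency(calls: Sequence[Genotype]) -> float:
    """Minor allele frequency computed from the non-missing calls only."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1.0 - alt_freq)


def keep_snp(calls: Sequence[Genotype],
             max_missing: float = 0.30,
             min_maf: float = 0.10) -> bool:
    """Keep a SNP only if it passes both the missing-data and the MAF filter."""
    return (missing_rate(calls) <= max_missing
            and minor_allele_frequency(calls) >= min_maf)


# Hypothetical toy panel of 10 lines and 3 SNPs (not data from the study).
toy_panel: Dict[str, List[Genotype]] = {
    "snp_ok": [0, 2, 2, 0, 2, 0, 0, 2, None, 0],               # 10% missing, MAF ~0.44
    "snp_sparse": [0, None, None, None, 2, None, 0, 2, 0, 2],  # 40% missing
    "snp_rare": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],                # MAF 0.05
}

kept = [name for name, calls in toy_panel.items() if keep_snp(calls)]
print(kept)
```

In this toy panel only `snp_ok` survives: `snp_sparse` exceeds the 30% missing-data limit and `snp_rare` falls below the 10% MAF cutoff.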
GWAS: Association analyses were run in TASSEL using a GLM (Q) and an MLM (Q+K with P3D and optimum compression), where Q is the population-structure matrix and K the kinship matrix. Because the GLM showed genomic inflation, the MLM results were prioritized. The significance threshold was −log10(p) ≥ 3 (p ≤ 0.001). Q-Q and Manhattan plots were generated with the R package CMplot.
Candidate gene identification: Physical positions of significant SNPs were compared against the pearl millet reference genome annotation to infer gene functions and potential roles in Fe, Zn, and protein metabolism.
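The significance rule and the idea of genomic inflation can be sketched in Python. The summary does not say how inflation was quantified; the genomic inflation factor λ used below (median observed 1-df chi-square statistic divided by its expected median under the null) is a common diagnostic and is an assumption here, as are the function names and the synthetic p-values.

```python
import math
from statistics import NormalDist, median
from typing import Iterable

_STD_NORMAL = NormalDist()


def neg_log10(p: float) -> float:
    """Convert a p-value to the -log10 scale used on Manhattan plots."""
    return -math.log10(p)


def is_significant(p: float, threshold: float = 3.0) -> bool:
    """Threshold from the text: -log10(p) >= 3, equivalent to p <= 0.001."""
    return neg_log10(p) >= threshold


def genomic_inflation(p_values: Iterable[float]) -> float:
    """Lambda = median observed chi-square (1 df) / median expected under the null.

    A two-sided p-value p corresponds to a 1-df chi-square of z**2,
    where z is the standard normal quantile at p / 2.
    """
    observed = [_STD_NORMAL.inv_cdf(p / 2.0) ** 2 for p in p_values]
    expected_median = _STD_NORMAL.inv_cdf(0.25) ** 2  # about 0.455
    return median(observed) / expected_median


# Synthetic p-values (not study data): an evenly spaced "null" set and an
# artificially inflated set obtained by squaring every null p-value.
null_p = [(i - 0.5) / 999 for i in range(1, 1000)]
inflated_p = [p ** 2 for p in null_p]

print(is_significant(1.79e-5), is_significant(0.01))
print(round(genomic_inflation(null_p), 3), genomic_inflation(inflated_p) > 1.0)
```

A λ close to 1 indicates that the p-value distribution matches the null expectation; values well above 1, as in the inflated set, are the kind of signal that would favor a model with a kinship term.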
Key Findings
- Phenotypic variation and heritability (281 lines, pooled across two seasons): Fe: mean 74 mg kg⁻¹ (range 32–120), H² ≈ 93%; Zn: mean 46 mg kg⁻¹ (range 19–87), H² ≈ 90%; protein content (PC): mean 11% (range 8–16%), H² ≈ 96%. Coefficient of variation: Fe 8.24%, Zn 9.45%, PC 5.65%. G×E was significant for all traits.
- Trait correlations: Fe–Zn r = 0.77 (P < 0.01); PC with Fe r = 0.38; PC with Zn r = 0.44 (both P < 0.01).
- Genotyping and population structure: 58,719 high-quality SNPs after filtering; six genetic subgroups (K = 6) identified by ADMIXTURE (lowest cross-validation error ~0.659).
- Linkage disequilibrium: Average genome-wide R² = 0.116; LD decay was rapid, with an average decay distance of ~2.9 kb at R² = 0.2 (shortest ~0.2 kb on chromosome 1; longest ~9 kb on chromosome 6).
- GWAS results: 78 significant MTAs in total (−log10(p) ≥ 3): 18 for Fe, 43 for Zn, 17 for PC. Chromosome distribution: 16 on chr 5; 14 each on chrs 4 and 7; 13 on chr 1; 10 on chr 2; 3 on chr 3.
- Top MTAs and PVE: Fe: Pgl05_135500493 (−log10(p) = 4.75; p = 1.79×10⁻⁵) explaining 8.23% PVE; other Fe-associated loci on Pgl01, Pgl02, Pgl04, Pgl05, Pgl06, Pgl07. Zn: Pgl07_101483782 (−log10(p) = 4.65; p = 2.24×10⁻⁵) explaining 8.00% PVE, with additional signals on chrs 1–7. PC: Pgl06_71295563 (−log10(p) = 3.46; p = 3.46×10⁻⁴) explaining ~5.86% PVE; 17 MTAs across chrs 1, 2, 4, 5, 6, 7.
- Co-localized SNPs for Fe and Zn: Four SNPs were common to Fe and Zn: Pgl04_64673688, Pgl05_135500493, Pgl05_144482656, Pgl07_101483782 (on chromosomes 4, 5, and 7).
- Candidate genes: Significant SNPs were linked to genes implicated in nutrient metabolism and transport, including late embryogenesis abundant protein (LEA), MYB transcription factors (SANT/Myb domain), pentatricopeptide repeat (PPR), zinc finger, ankyrin repeat, leucine-rich repeat, oligopeptide transporter, bZIP, protein kinases, glycosyl transferases, HSP70, and iron ion binding proteins.
- Concordance with prior studies: Several MTAs co-localize with previously reported QTLs/associations for Fe and Zn in pearl millet (e.g., on Pgl05, Pgl07, Pgl01, Pgl04), supporting the robustness of the identified regions.
- Breeding implications: The positive Fe–Zn correlation and shared MTAs suggest potential for simultaneous improvement; the identified SNPs provide diagnostic markers for marker-assisted selection.
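The four SNPs shared by Fe and Zn can be grouped by chromosome with a small Python sketch. It assumes that marker IDs follow the pattern `Pgl` + two-digit chromosome + `_` + physical position in base pairs; this convention is inferred from the IDs and chromosome assignments listed above rather than stated in the summary, and the helper names are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Marker(NamedTuple):
    chromosome: int
    position: int  # assumed to be the physical position in base pairs


# Assumed ID convention: "Pgl" + two-digit chromosome + "_" + position.
_MARKER_RE = re.compile(r"^Pgl(\d{2})_(\d+)$")


def parse_marker(marker_id: str) -> Marker:
    """Split a marker ID such as 'Pgl05_135500493' into chromosome and position."""
    match = _MARKER_RE.match(marker_id)
    if match is None:
        raise ValueError(f"unrecognized marker ID: {marker_id!r}")
    return Marker(chromosome=int(match.group(1)), position=int(match.group(2)))


# SNPs reported in the text as common to Fe and Zn.
shared_fe_zn = [
    "Pgl04_64673688",
    "Pgl05_135500493",
    "Pgl05_144482656",
    "Pgl07_101483782",
]

positions_by_chromosome: Dict[int, List[int]] = defaultdict(list)
for marker_id in shared_fe_zn:
    marker = parse_marker(marker_id)
    positions_by_chromosome[marker.chromosome].append(marker.position)

for chromosome in sorted(positions_by_chromosome):
    print(f"chr {chromosome}: {positions_by_chromosome[chromosome]}")
```

Grouping markers this way makes it easy to see that two of the four shared SNPs lie on chromosome 5, roughly 9 Mb apart, so they mark separate positions rather than one tight cluster.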
Discussion
The study demonstrates substantial genetic variability and high heritability for grain Fe, Zn, and protein content in a diverse pearl millet panel, supporting the feasibility of biofortification. Rapid LD decay and dense genome-wide SNP coverage permitted high-resolution GWAS, which identified 78 MTAs across the three traits. The MLM controlled for population structure and kinship, reducing genomic inflation relative to the GLM. The strong positive correlation between Fe and Zn, together with four co-localized MTAs on chromosomes 4, 5, and 7, implies shared physiological pathways for uptake, transport, and grain deposition. These results align with earlier QTL and association reports, strengthening the evidence for key genomic regions (notably on Pgl05 and Pgl07) that influence micronutrient accumulation. PC showed moderate positive correlations with Fe and Zn but shared no MTAs with them, which suggests partially independent genetic control; even so, breeding for high Fe and Zn may raise PC at the same time. Candidate gene annotations (LEA, MYB, PPR, transporters) are consistent with known roles in iron/zinc homeostasis and stress responses and offer targets for functional validation. Overall, the findings provide markers for genomics-assisted breeding, and the co-localized MTAs enable simultaneous selection for improved Fe and Zn.
Conclusion
This GWAS identified genomic regions and candidate genes associated with grain iron, zinc, and protein content in pearl millet. Key MTAs include Pgl05_135500493 and Pgl05_144482656 for Fe; Pgl07_101483782, Pgl07_101483780, and Pgl07_147179490 for Zn; and Pgl06_71295563 for protein content. Four SNPs were associated with both Fe and Zn, supporting shared genetic control and enabling simultaneous improvement. The results corroborate earlier QTL/association findings and advance to near-gene resolution, providing diagnostic markers for biofortification breeding. Eleven elite inbred lines combining ≥80 mg kg⁻¹ Fe, >60 mg kg⁻¹ Zn, and >13% protein were identified as donor sources. Future work should include validation of MTAs across diverse genetic backgrounds and environments, fine mapping, functional characterization of candidate genes, and deployment of markers in marker-assisted backcrossing, marker-assisted recurrent selection, and genomic selection to accelerate the development of biofortified hybrids and varieties.
Limitations
- Statistical stringency: None of the MTAs met Bonferroni-corrected thresholds due to the large number of markers; a −log10(p) ≥ 3 threshold was used, which may include some false positives.
- Model trade-offs: While MLM reduced false positives compared to GLM, it may overcorrect for structure/kinship, potentially increasing false negatives.
- Effect sizes: Individual SNPs explained modest PVE (≈5–8%), indicating polygenic control and the need for genomic selection or marker combinations.
- Environmental influences: Significant G×E interactions and known soil micronutrient effects on grain Fe/Zn warrant multi-environment validation.
- Operational challenges: Potential epigenetic effects and sample handling contamination risks can influence micronutrient measurements; careful standardization is required.
- Generalizability: Findings need validation in independent populations and breeding germplasm to ensure broad applicability.
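The gap between the study threshold and a Bonferroni correction can be made concrete with a short Python calculation. The family-wise error rate of 0.05 is an assumption (the summary does not state the level used), and the number of tests is taken to be the 58,719 filtered SNPs; the top MTA scores are the values reported in the key findings.

```python
import math

N_SNPS = 58_719   # filtered SNPs reported in the text, used as the number of tests
ALPHA = 0.05      # assumed family-wise error rate; not stated in the summary
STUDY_THRESHOLD = 3.0  # -log10(p) threshold used in the study


def bonferroni_neg_log10(n_tests: int, alpha: float = ALPHA) -> float:
    """-log10 of the Bonferroni-corrected p-value cutoff alpha / n_tests."""
    return -math.log10(alpha / n_tests)


# Top MTA per trait with its reported -log10(p).
top_mtas = {
    "Fe Pgl05_135500493": 4.75,
    "Zn Pgl07_101483782": 4.65,
    "PC Pgl06_71295563": 3.46,
}

cutoff = bonferroni_neg_log10(N_SNPS)
print(f"Bonferroni cutoff: -log10(p) >= {cutoff:.2f}")
for label, score in top_mtas.items():
    print(f"{label}: study threshold {score >= STUDY_THRESHOLD}, "
          f"Bonferroni {score >= cutoff}")
```

Under these assumptions the Bonferroni cutoff is about 6.07 on the −log10 scale, so even the strongest reported association (4.75) falls short of it while clearing the study threshold of 3.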