State of ex situ conservation of landrace groups of 25 major crops

J. Ramirez-Villegas, C. K. Khoury, et al.

This research by Julian Ramirez-Villegas and colleagues models the potential distributions of landrace groups of 25 major crops. It reveals significant gaps in ex situ conservation for a number of crops and emphasizes the urgency of targeted collecting in key regions.

Introduction

The paper asks how comprehensively the landrace diversity of major crops is conserved ex situ in genebanks. Landraces (traditional, locally adapted cultivated populations managed by Indigenous and farming communities) provide agroecological services and key genetic resources for breeding and for understanding domestication. Although decades of collecting have assembled approximately three million landrace samples in international, regional, national and subnational genebanks, it has remained unclear whether landrace diversity is comprehensively represented. International targets such as the Convention on Biological Diversity Aichi Target 13 and Sustainable Development Goals Target 2.5 emphasize filling such conservation gaps. To respond, the authors model the distributions of eco-geographically distinguishable landrace groups for 25 staple crops within their regions of diversity, then compare the geographic and ecological coverage of genebank holdings with the predicted distributions to quantify current representation and identify spatial gaps that require further collecting.

Literature Review

The study builds on extensive prior work recognizing centers of origin and regions of diversity of crops and the importance of landraces as repositories of genetic variation. It references decades of efforts to collect and conserve landraces in genebanks, and the urgency created by socio-economic and environmental change that affects on-farm diversity. It situates the work within international policy frameworks (CBD Aichi targets and SDGs) and contrasts progress on landrace conservation with that of crop wild relatives (CWR), noting previous global gap analyses for CWR. The authors also draw on the literature to delineate regions of diversity for each crop and to identify and test infraspecific groupings (races, genepools, genetic clusters, etc.) for modelling.

Methodology

Scope and study areas: The analysis includes 25 cereal, pulse, and starchy root/tuber/fruit crops whose genetic resources are managed by CGIAR centers or CePaCT. Landrace distributions and conservation gaps were assessed within primary (and, for some crops, secondary) regions of diversity, identified via literature review and expert confirmation.

Occurrence data: A dataset of 93,269 landrace occurrences was compiled (61.9% pre-assigned to groups, the remainder inferred). Ex situ records were sourced from Genesys and FAO WIEWS, and directly from international genebanks (AfricaRice, Bioversity/MGIS/ITC, CePaCT, CIAT, CIMMYT, CIP, ICARDA, ICRISAT, IITA, IRRI), USDA GRIN-Global, CONABIO, and GBIF. GBIF records marked 'living specimen' were treated as ex situ accessions; other GBIF records were treated as reference sightings. Duplicates were removed, erroneous coordinates were corrected or omitted, and the records were clipped to the study areas.

Predictor variables: Fifty spatial predictors at 2.5-arc-minute resolution (WGS84) were compiled, including 39 climate variables (WorldClim v2 and ENVIREM), elevation (SRTM), two evolutionary history proxies (distance to pre-1500 human settlements and to wild progenitors), and eight socioeconomic variables (population density, distance to navigable rivers, irrigation extent, travel-time accessibility, ethnic/cultural group distributions, and crop harvested area, production, and yield from SPAM). Predictor datasets are provided in Supplementary Dataset 3.

Landrace group classification: For each crop, infraspecific landrace groups (races, genepools, genetic clusters, geographic/environmental groupings) were compiled from the literature. Four machine learning classifiers (random forest, support vector machine, k-nearest neighbours, and artificial neural networks) were trained on occurrences and predictors, and their predictions were ensembled by taking the mode, that is, the most frequently predicted group. Classifications were evaluated over fifteen cross-validation rounds, each with an 80/20 train/test split; a classification was accepted if the average accuracy per group was at least 80%. For records missing a group assignment, the trained classifiers predicted group membership.
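
The ensembling and acceptance steps can be illustrated with a minimal sketch. It assumes the four classifiers have already produced one predicted group per test record; the group names, the predictions, the tie-breaking rule, and the reading of the 80% criterion as "every group reaches 80%" are all illustrative assumptions, not details taken from the study.

```python
from collections import Counter

def mode_ensemble(predictions_per_model):
    """Combine per-model predictions by taking the most common label per record.

    Ties go to the label predicted by the earliest-listed model (an assumption;
    the source does not say how ties were resolved).
    """
    n_records = len(predictions_per_model[0])
    if any(len(p) != n_records for p in predictions_per_model):
        raise ValueError("all models must predict the same number of records")
    ensembled = []
    for labels in zip(*predictions_per_model):
        counts = Counter(labels)
        top = max(counts.values())
        # First label, in model order, that reaches the top vote count.
        ensembled.append(next(lab for lab in labels if counts[lab] == top))
    return ensembled

def per_group_accuracy(true_labels, predicted_labels):
    """Fraction of records of each true group that were assigned correctly."""
    totals, hits = Counter(), Counter()
    for t, p in zip(true_labels, predicted_labels):
        totals[t] += 1
        hits[t] += int(t == p)
    return {group: hits[group] / totals[group] for group in totals}

# Hypothetical predictions from four classifiers for five test records.
rf = ["A", "B", "C", "A", "B"]
svm = ["A", "B", "C", "C", "B"]
knn = ["A", "C", "C", "A", "B"]
ann = ["B", "B", "C", "A", "A"]
truth = ["A", "B", "C", "A", "B"]

ensemble = mode_ensemble([rf, svm, knn, ann])
accuracy = per_group_accuracy(truth, ensemble)
# Assumed acceptance rule for a single round: every group reaches 80%.
accepted = all(a >= 0.80 for a in accuracy.values())
print(ensemble, accuracy, accepted)
```

In this toy case each individual classifier makes one error, but the errors fall on different records, so the mode recovers the true group everywhere.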

Distribution modelling: For each landrace group, MaxEnt (maxnet R package) was used to model probability of occurrence. Predictors were selected per group via PCA (retaining variables contributing ≥15% to the first component) and VIF (discarding VIF > 10) to reduce collinearity. Pseudo-absences (background points) were drawn within the same ecological land units, in potentially suitable areas per an SVM classifier, and >5 km from any occurrence, with a ratio of 10 pseudo-absences per unique occurrence. Five-fold cross-validation (80/20) evaluated AUC, sensitivity, specificity, and Cohen’s kappa; the median across folds formed the prediction. Pixels above the threshold maximizing sensitivity + specificity were considered presence.
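
The presence threshold described above can be sketched in a few lines. This is an illustration only: the scores and labels are made-up values, and treating a pixel as present when its score is greater than or equal to the threshold is an assumption about the convention used.

```python
def best_threshold(scores, labels):
    """Return (threshold, sensitivity + specificity) for the best threshold.

    scores: predicted probabilities of occurrence.
    labels: 1 for an occurrence, 0 for a pseudo-absence.
    A record counts as predicted presence when score >= threshold.
    Among equally good thresholds, the lowest one is kept.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need both occurrences and pseudo-absences")
    best_t, best_sum = None, -1.0
    for t in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        total = true_pos / positives + true_neg / negatives
        if total > best_sum:
            best_t, best_sum = t, total
    return best_t, best_sum

# Hypothetical test-fold predictions: four occurrences, four pseudo-absences.
scores = [0.10, 0.20, 0.35, 0.40, 0.65, 0.70, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

threshold, sens_plus_spec = best_threshold(scores, labels)
print(threshold, sens_plus_spec)
```

Here a threshold of 0.40 captures all four occurrences while misclassifying one pseudo-absence, giving sensitivity 1.0 and specificity 0.75.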

Ex situ conservation gap analysis: Three pixel-level gap scores were computed across each group’s modelled distribution:

  • Connectivity gap (S_con): Using Delaunay triangulation of the three nearest genebank accession locations to each pixel, normalized by distances to triangle centroid and vertices; high values indicate sparse sampling/connectivity.
  • Accessibility gap (S_acc): Travel time from each pixel to the nearest genebank accession occurrence (via a friction surface), normalized by maximum travel time; high values indicate poor accessibility to sampled sites.
  • Environmental gap (S_env): Mahalanobis distance in predictor space from each pixel to the closest genebank accession environment using Ward hierarchical clustering; high values indicate novel environmental conditions not yet sampled.
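
As a rough illustration of the environmental gap idea, the sketch below computes a normalized Mahalanobis distance from each pixel to its environmentally closest accession. It is heavily simplified: it uses only two predictors, estimates the covariance from the pixels themselves, and omits the Ward clustering step entirely. All values and function names are hypothetical.

```python
import math

def covariance_2d(points):
    """Sample variances and covariance of a list of (x, y) pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return sxx, sxy, syy

def mahalanobis_2d(a, b, cov):
    """Mahalanobis distance between two 2-D points for covariance cov."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = a[0] - b[0], a[1] - b[1]
    # d^2 = [dx dy] * inverse(cov) * [dx dy]^T, written out for a 2x2 matrix.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

def environmental_gap_scores(pixels, accessions):
    """Distance to the closest accession environment, scaled to 0..1."""
    cov = covariance_2d(pixels)
    nearest = [min(mahalanobis_2d(p, a, cov) for a in accessions) for p in pixels]
    largest = max(nearest)
    return [d / largest if largest > 0 else 0.0 for d in nearest]

# Hypothetical pixels as (mean temperature in deg C, annual rainfall in mm).
pixels = [(20, 800), (22, 900), (24, 1000), (30, 400), (21, 850)]
# Hypothetical environments where genebank accessions were collected.
accessions = [(20, 800), (24, 1000)]

s_env = environmental_gap_scores(pixels, accessions)
print([round(s, 3) for s in s_env])
```

The hot, dry pixel (30, 400) receives the highest score because no accession comes from a similar environment, while pixels that coincide with accession environments score zero.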

Thresholding and confidence: For each score, a threshold was derived by cross-validation on five synthetic gaps (created by removing existing occurrences within random 100-km-radius areas), chosen to maximize predictive performance (AUC, sensitivity, specificity). Pixels above the threshold in exactly one score were flagged as low-confidence gaps (value 1), in two scores as medium-confidence gaps (2), and in all three as high-confidence gaps (3). The complement of the area flagged as a gap by any score gives the minimum estimate of current representation; the complement of the high-confidence gap area alone gives the maximum estimate.
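
The confidence levels and the two representation bounds reduce to simple counting, sketched below. The pixel scores and the thresholds of 0.6 are invented for illustration; in the study the thresholds come from the synthetic-gap cross-validation.

```python
def gap_confidence(s_con, s_acc, s_env, thresholds):
    """Number of gap scores above their threshold: 0 (no gap) to 3 (high confidence)."""
    t_con, t_acc, t_env = thresholds
    return int(s_con > t_con) + int(s_acc > t_acc) + int(s_env > t_env)

def representation_bounds(confidences):
    """Minimum and maximum percentage of the modelled distribution represented.

    Minimum: pixels flagged by no score at all.
    Maximum: every pixel except high-confidence gaps (flagged by all three scores).
    """
    n = len(confidences)
    any_gap = sum(1 for c in confidences if c >= 1)
    high_gap = sum(1 for c in confidences if c == 3)
    return 100.0 * (n - any_gap) / n, 100.0 * (n - high_gap) / n

# Hypothetical (S_con, S_acc, S_env) values for five pixels of one landrace group.
pixel_scores = [
    (0.90, 0.80, 0.95),
    (0.20, 0.10, 0.30),
    (0.70, 0.20, 0.10),
    (0.10, 0.10, 0.10),
    (0.80, 0.90, 0.20),
]
thresholds = (0.6, 0.6, 0.6)  # assumed values, one per score

confidences = [gap_confidence(*scores, thresholds) for scores in pixel_scores]
minimum, maximum = representation_bounds(confidences)
mean_estimate = (minimum + maximum) / 2
print(confidences, minimum, maximum, mean_estimate)
```

In this toy example three of five pixels are flagged by at least one score and one pixel by all three, so representation lies between 40% and 80%, with a mean of 60%.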

Aggregation: Group-level models and gap maps were combined into crop-level summaries (summing group pixels) to avoid bias toward crops with more groups. For maize and yam, geographic differentiations were combined. Results include spatial hotspots and representation metrics at crop and group levels.

Key Findings

  • Overall representation: On average, 63% ± 12.6% of landrace group distributions are currently represented ex situ in genebanks, indicating moderately comprehensive conservation with substantial variation among crops.
  • Most comprehensively represented (mean of min/max estimates): Breadfruit 81.6%, bananas and plantains 81.5%, lentils 78.3%, common beans 77.4%, chickpeas 75.8%, barley 75.5%, bread wheat 71.3%.
  • Largest conservation gaps (lower representation): Pearl millet 32.7%, yams 43.0%, finger millet 45.4%, groundnut 46.5%, potatoes 50.3%, peas 52.4%.
  • Potential coverage bounds: Some crops show high maximum potential representation (>90%): breadfruit, lentil, banana and plantain, grasspea, chickpea. Minimum coverage warns of extensive gaps for pearl millet (15.2%), groundnut (22.6%), finger millet (25.3%), peas (28.1%), yams (29.0%).
  • Crop-type comparisons: Average representation did not differ significantly among cereals (59.9%), pulses (64.6%), and starchy roots/tubers/fruits (64.9%) (P = 0.69). Mean minimum estimates were 45.0%, 45.6%, and 50.4%; mean maximum estimates 74.8%, 83.6%, and 79.3%, respectively. Conservation scores were not correlated with crop importance to global food supply, production, and trade (r = 0.064).
  • Within-crop variation: Notable differences among landrace groups within crops; e.g., barley with covered grains ~89.1% conserved vs. naked (hull-less) barley ~31.3%. Similar variability was observed in Asian rice, finger millet, potato, sorghum, and yam, while groups within cassava, chickpea, common bean, cowpea, groundnut, lentil, maize, pea, pearl millet, African rice, sweetpotato, and bread wheat were more similar.
  • Example gaps: High-confidence gaps for all five major sorghum races occur in sub-Saharan Africa, especially Central, West, and Southern Africa, including Madagascar.
  • Hotspot regions for further collecting: South Asia; the Mediterranean and West Asia; Mesoamerica; West, East, and Southern Africa; the Andean mountains; and Central to East Asia. Uncollected landrace groups of up to nine crops may co-occur in single 2.5-arc-minute cells in India and Morocco, and of up to eight crops in Algeria, Greece, Iran, Mexico, Pakistan, Sierra Leone, and Turkey.
  • Distribution richness: Landrace group diversity across crops is highest in East and Southern Africa, South and Central Asia, the Mediterranean and West Asia, West Africa, the Andes, and Mesoamerica; up to 12 of the 25 crops may co-occur in single cells in Bangladesh, Ethiopia, India, Nepal, and Pakistan.

Discussion

The findings indicate that ex situ representation of landrace groups for 25 major crops is generally substantial, reflecting decades of national and international collecting and conservation, and appears more advanced than protection for crop wild relatives. However, significant geographic and environmental gaps persist for many crops and landrace groups. The spatially explicit gap maps enable prioritization of collecting by crop, group, and region to improve representation, with attention to threats from economic, agricultural, demographic, environmental, climatic, and political changes. Given recent progress on access and benefit-sharing frameworks and international collaboration, filling identified gaps appears feasible, supporting achievement of CBD and SDG targets for ex situ conservation. The authors argue that periodic, holistic gap analyses—beyond simple accession counts—can serve as improved indicators of conservation status, and demonstrate the utility of linking infraspecific classifications and diverse predictors to model landrace distributions and guide action.

Conclusion

This study provides a comprehensive, quantitative, and spatial assessment of the current state of ex situ conservation for landrace groups across 25 major crops, showing moderate overall representation with clear crop- and region-specific gaps. The methodology and results offer actionable guidance to prioritize further collecting efforts to close gaps, making high coverage of landrace diversity in genebanks an attainable goal aligned with international biodiversity and sustainable development targets. Future work should expand the approach to additional crop categories (fruits, vegetables, nuts), extend analyses beyond historical regions of diversity to uncover novel variation, and incorporate assessments of in situ (on-farm) conservation. Integrating these results with parallel analyses of crop wild relatives and other culturally and economically important plants can deliver a more complete understanding of global crop diversity distributions and conservation needs.

Limitations

Key limitations include: (1) Data and classification constraints: occurrence and infraspecific grouping information may be incomplete, uneven, or of variable quality; many national and subnational collections are underrepresented in major databases; and locality and characterization data are often incomplete, so true ex situ representation may be underestimated. (2) Predictor and modelling limitations: models rely on the environmental and socioeconomic predictors available at 2.5-arc-minute resolution; fine-scale abiotic and biotic factors (e.g., local irrigation, soil traits, pests/pathogens, pollinators), farm practices, seed systems, and recent losses may be insufficiently captured. (3) Proxy for genetic diversity: geographic and environmental surrogates may not fully reflect genetic variation; species differences in reproductive biology and propagation, and the heterogeneous nature of landraces, can affect how well samples represent diversity; large sample sizes may be needed to capture rare alleles. (4) Assumptions about sampling: the presence of any accession at a site is assumed to indicate adequate sampling of the targeted landrace group, which may miss finer distinctions; ongoing evolution and the emergence of new landrace variation mean that even previously collected areas may warrant resampling. These issues underscore the need for field reconnaissance, partnerships with Indigenous and farming communities, adherence to access and benefit-sharing policies, and adequate genebank capacity and logistics for effective long-term conservation.
