Domestication of the Amazonian fruit tree cupuaçu may have stretched over the past 8000 years

M. Colli-Silva, J. E. Richardson, et al.

Genomic analysis indicates that cupuaçu, an Amazonian fruit tree, was domesticated from its wild relative cupuí. This study by Matheus Colli-Silva, James E. Richardson, Eduardo G. Neves, Jennifer Watling, Antonio Figueira, and José Rubens Pirani traces a domestication history that began roughly 5000–8000 years ago and was shaped by both ancient and modern influences.

Introduction
The study investigates the domestication history of cupuaçu (Theobroma grandiflorum), an economically and culturally important Amazonian fruit tree closely related to cacao. Amazonia harbors exceptional biodiversity and a long history of indigenous plant management, with multiple species domesticated over the past ~12,000 years. Despite its importance, cupuaçu has been relatively understudied. Its closest wild relative is cupuí (T. subincanum); the two have overlapping distributions, and both natural and artificial hybrids are known. Prior work characterized cupuaçu as incipiently domesticated and reported limited genetic diversity, suggesting possible domestication effects. The research questions are whether cupuaçu represents a domesticated form derived from cupuí, where and when domestication occurred, and how genetic diversity patterns reflect pre-Columbian and modern processes. To address these, the authors use population genomic data (RAD-seq) from four Brazilian Amazon sites to infer genetic relationships, diversity, selection, biogeography, and demographic history, placing the results in the broader context of Amazonian domestication and human occupation.
Literature Review
The paper situates cupuaçu within Amazonia’s broader record of plant use and domestication, citing species with varying domestication intensity such as açaí (Euterpe oleracea), guaraná (Paullinia cupana), cacao (Theobroma cacao), pineapple (Ananas comosus), cassava (Manihot esculenta), and peach palm (Bactris gasipaes). Recent genomic studies have clarified domestication histories, for example in T. cacao, which underwent strong domestication ~3600 yBP with evidence of use ~5200 yBP. Cupuaçu and cupuí are sister taxa in phylogenies and share similar distributions, with cupuí extending further west. Reported hybrids reflect their genetic proximity. The concept of "incipient domestication" and the "cost-of-domestication" hypothesis provide theoretical frameworks, along with expectations from domestication demography (e.g., bottlenecks, reduced diversity, fewer regions under selection with larger effect). The authors reference archaeological evidence for early Holocene plant cultivation and the development of Amazonian dark earths, framing expectations for a protracted domestication process influenced by indigenous management and later modern dispersal.
Methodology
Sampling and data generation: Population genomic RAD-seq data were generated from cupuaçu (Theobroma grandiflorum) and its wild relative cupuí (T. subincanum) across four Amazonian locations in Brazil: Xapuri-Acre (ACRE), Balbina-Amazonas (BALB), Tapajós-Pará (PARA), and São Gabriel da Cachoeira-Amazonas (SGCA). Sequencing reads underwent quality control and SNP calling to produce variant call format (VCF) files.
Analytical framework:
(1) Population structure and diversity: The authors computed genetic distances, built a UPGMA tree and a haplotype network, and conducted principal component analysis (PCA) to evaluate clustering by taxon and location. STRUCTURE analyses inferred ancestry components with independent runs (10,000 repetitions, 1000 burn-in), visualized and evaluated with pophelper to identify the optimal K (K=2 expected for the species partition; K=3 also examined). Expected heterozygosity (HE) was calculated for ancestry groups.
(2) Phylogeny and biogeography: Bayesian phylogenetic inference was performed using genome-wide SNPs (MrBayes). Ancestral area reconstruction used BioGeoBEARS with the four study locations to infer historical geographic ranges and the potential origin.
(3) Selection and cost-of-domestication: Coding regions were annotated and dN/dS ratios computed across chromosomes. Selective sweep scans based on linkage disequilibrium used iHS and XP-EHH (rehh) to detect partial/incomplete sweeps. Candidate genomic regions under selection were identified after remapping genes to a reference genome, and counts of candidate loci increasing in frequency were compared between taxa.
(4) Demographic inference: A folded site-frequency spectrum (SFS) for four populations (one per location) was estimated using easySFS/dadi. Stairway Plot 2 inferred changes in effective population size (Ne) through time using a mutation rate µ=3.1×10^-9 per bp per generation and a generation time of 3 years for T. grandiflorum, simulating 100 independent samples and four breaking points.
Data availability: Raw FASTQ reads (NCBI SRA PRJNA940113) and VCF files (EVA PRJEB61195) are publicly available.
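Two of the summary statistics named above, expected heterozygosity and the folded site-frequency spectrum, can be illustrated on a toy genotype matrix. The sketch below is not the authors' pipeline (which relied on easySFS/dadi and other tools). The function names, the toy data, the small-sample correction n/(n-1), and the assumption of biallelic SNPs with no missing calls are all illustrative choices.

```python
from collections import Counter


def expected_heterozygosity(genotypes):
    """Mean expected heterozygosity across biallelic loci.

    genotypes: list of loci; each locus is a list with one entry per
    diploid individual giving its alternate-allele count (0, 1 or 2).
    Assumes no missing data (an assumption of this sketch).
    """
    per_locus = []
    for locus in genotypes:
        n = 2 * len(locus)            # number of sampled allele copies
        p = sum(locus) / n            # alternate-allele frequency
        # 2p(1-p) with the usual n/(n-1) small-sample correction
        per_locus.append(n / (n - 1) * 2 * p * (1 - p))
    return sum(per_locus) / len(per_locus)


def folded_sfs(genotypes):
    """Folded SFS: entry i counts loci whose minor allele appears i times."""
    n = 2 * len(genotypes[0])
    minor_counts = Counter(
        min(sum(locus), n - sum(locus)) for locus in genotypes
    )
    return [minor_counts.get(i, 0) for i in range(n // 2 + 1)]


# Hypothetical data: 4 loci scored in 4 diploid individuals.
toy = [
    [0, 0, 1, 2],
    [0, 0, 0, 0],
    [2, 2, 2, 1],
    [1, 1, 0, 0],
]

if __name__ == "__main__":
    print("HE:", round(expected_heterozygosity(toy), 4))
    print("folded SFS:", folded_sfs(toy))
```

Folding uses the minor-allele count, so it does not require knowing which allele is ancestral. That is presumably why a folded spectrum suits a dataset without a polarizing outgroup.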
Key Findings
- Cupuaçu (T. grandiflorum) is genetically supported as a domesticated derivative of cupuí (T. subincanum), with clustering of cupuaçu samples distinct from cupuí across UPGMA, haplotype network, PCA, and STRUCTURE analyses.
- STRUCTURE indicated K=2 as expected for species-level partitioning and K=3 as the optimal clustering in this dataset; cupuaçu shows low admixture from cupuí and reduced genetic diversity.
- The expected heterozygosity (HE) for the cupuaçu ancestry group was low (HE ≈ 0.06), consistent with domestication bottlenecks and inbreeding.
- Biogeographic reconstruction places the origin of domesticated cupuaçu in Northwestern Amazonia, likely the Middle–Upper Rio Negro region (São Gabriel da Cachoeira/Balbina area), with subsequent human-mediated dispersal across the basin.
- Demographic inference (Stairway Plot 2) identified two major bottlenecks: (1) an initial mid-Holocene bottleneck dating to approximately 5347–7943 years before present, marking initial domestication; (2) a second, modern-era bottleneck around ~169 years before present, coinciding with intensified introduction and range expansion.
- Signals consistent with the "cost-of-domestication" were observed: cupuaçu exhibits an excess of nonsynonymous changes relative to synonymous ones among candidate loci under selection compared to cupuí.
- Fewer genomic regions appear under selection in cupuaçu relative to cupuí (candidate loci increasing in frequency: 6471 in cupuaçu vs 10,445 in cupuí), aligning with expectations that domestication targets fewer loci of larger effect, with many hitchhiking variants.
- Across locations, cupuaçu samples are not strongly genetically differentiated, supporting a single initial domestication followed by dispersal rather than multiple independent domestications.
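Demographic methods of this kind work in generations and convert to calendar years through an assumed generation time, so the reported bottleneck dates can be translated back into tree generations using the 3-year generation time stated in the Methodology. A minimal arithmetic sketch follows; the helper name and dictionary labels are hypothetical.

```python
# Generation time stated in the Methodology for T. grandiflorum.
GENERATION_TIME_YEARS = 3


def years_to_generations(years, generation_time=GENERATION_TIME_YEARS):
    """Convert a date in years before present to a number of generations."""
    return years / generation_time


# Bottleneck dates reported in the Key Findings (years before present).
bottlenecks_ybp = {
    "mid-Holocene, lower bound": 5347,
    "mid-Holocene, upper bound": 7943,
    "modern era": 169,
}

if __name__ == "__main__":
    for label, years in bottlenecks_ybp.items():
        print(f"{label}: ~{years_to_generations(years):.0f} generations")
```

Under that assumption, the modern bottleneck lies only a few dozen tree generations back, while the initial one falls roughly 1,800 to 2,600 generations in the past.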
Discussion
The findings directly address the study’s hypotheses, showing that cultivated cupuaçu emerged from cupuí and underwent at least two domestication phases. Genomic structure and reduced diversity in cupuaçu, together with low heterozygosity and limited introgression from cupuí, indicate a domestication bottleneck and subsequent human management. Biogeographic reconstruction and the earliest diverging cupuaçu lineages point to an origin in the Middle–Upper Rio Negro of Northwestern Amazonia. Demographic trajectories suggest a protracted domestication: an initial mid-Holocene domestication consistent with early indigenous plant management, followed by a modern phase with intensified dispersal and cultivation, likely reflecting market-driven expansion over the last two centuries. Evidence of increased nonsynonymous mutation load in cupuaçu supports the cost-of-domestication hypothesis, while the smaller number of candidate selected regions in cupuaçu versus cupuí matches theoretical expectations of selection on fewer loci with larger effects and hitchhiking. The results align with archaeobotanical records documenting early Holocene human occupation, plant use, and creation of Amazonian dark earths, integrating genomic signatures with cultural and environmental histories. While single-origin signatures are strongest, the authors acknowledge that protracted domestication from multiple localities can mimic monophyly, so broader geographic sampling remains important for testing alternative scenarios.
Conclusion
This study demonstrates that cupuaçu is a domesticated form derived from cupuí, with an origin likely in the Middle–Upper Rio Negro region and a domestication history spanning the mid-Holocene to the modern era. Two demographic bottlenecks delineate phases of domestication: an initial event ~5.3–7.9 kya and a second ~169 years ago associated with expansion and intensified cultivation. Genomic evidence of reduced diversity, limited introgression, and increased nonsynonymous load in cupuaçu supports domestication-related genetic consequences, while the smaller number of candidate selected loci is consistent with selection on traits of larger effect. These results contribute to the growing recognition of Amazonia as a center of early plant domestication and highlight the combined influence of pre-Columbian and modern processes on current genetic diversity and crop distributions. Future work should expand geographic sampling across the Amazon Basin and adjacent regions to refine the domestication center and test for potential multi-local protracted domestication, integrate high-resolution genomic approaches to pinpoint causal variants under selection, and link genomic signatures to phenotypic traits and archaeobotanical records.
Limitations
- Geographic sampling limitations prevent pinpointing the precise locality of initial domestication; inference is strongest for the Middle–Upper Rio Negro but cannot exclude nearby areas or multi-local protracted domestication.
- Demographic inferences depend on assumed parameters (mutation rate and generation time), which, if inaccurate, could shift timing estimates.
- RAD-seq and candidate sweep approaches may miss some selected regions or causal variants; many identified loci likely hitchhike rather than being directly selected.
- Limited differentiation among sampled cultivated populations constrains the ability to detect fine-scale structure or multiple domestication events.
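The dependence of the timing estimates on assumed parameters can be made concrete. In SFS-based demographic inference, time is typically estimated in mutational units, so a date in years scales inversely with the mutation rate and linearly with the generation time. The sketch below rescales the reported dates under alternative parameter values. It assumes this simple proportionality holds, and the alternative values are hypothetical, chosen only for illustration rather than taken from the study.

```python
# Parameter values reported in the Methodology.
MU_ASSUMED = 3.1e-9   # mutations per bp per generation
GEN_ASSUMED = 3.0     # years per generation


def rescale_years(years, mu_alt=MU_ASSUMED, gen_alt=GEN_ASSUMED):
    """Rescale a date (in years) to alternative mutation rate / generation time.

    Assumes years ~ (1 / mu) * generation_time, i.e. the inferred time in
    mutational units is held fixed and only the conversion changes.
    """
    return years * (MU_ASSUMED / mu_alt) * (gen_alt / GEN_ASSUMED)


if __name__ == "__main__":
    reported = [5347, 7943, 169]  # years before present, from the Key Findings
    # Hypothetical alternatives, for illustration only.
    for years in reported:
        longer_gen = rescale_years(years, gen_alt=5.0)
        faster_mu = rescale_years(years, mu_alt=6.2e-9)
        print(f"{years} yBP -> {longer_gen:.0f} (5-year generations), "
              f"{faster_mu:.0f} (doubled mutation rate)")
```

For example, a 5-year generation time would push the lower bound of the first bottleneck from 5347 to about 8900 years before present, which shows how sensitive the mid-Holocene placement is to these inputs.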