Ancient genomes from northern China suggest links between subsistence changes and human migration

C. Ning, T. Li, et al.

Discover the genetic histories of ancient communities in Northern China, where millet farming laid the foundation for civilization. With research conducted by a team from Jilin University and the Max Planck Institute, delve into how these societies evolved and interacted over time, revealing fascinating links between agricultural practices and human migration.
Introduction

China is among the earliest independent centers of cereal domestication, with rice agriculture arising in the Yangtze basin and millet agriculture in northern China. Northern China spans multiple river systems, notably the Central Plain of the Yellow River (YR) and the West Liao River (WLR) region. Both the WLR and the lower YR hosted early cultivation of foxtail and broomcorn millets from at least 6000 BCE, and complex Middle Neolithic societies relying substantially on millet farming (Hongshan in the WLR; Yangshao in the YR) emerged by ~4000 BCE. Archaeological evidence points to population growth, cultural innovation, and potential dispersals of Sino-Tibetan (from the YR) and Transeurasian (from the WLR) languages. Compared to the YR, reliance on crops in the WLR fluctuated with climate and cultural shifts: millet use increased across the Xinglongwa, Hongshan, and Lower Xiajiadian cultures and was partially replaced by pastoralism in the Upper Xiajiadian. The extent to which human migrations mediated these subsistence changes, and how interactions among Amur River (AR), WLR, and YR populations shaped the dispersal of millet farming, remained unclear because ancient genomic data were limited. This study addresses these questions by analyzing 55 ancient genomes spanning the AR, WLR, YR, and intermediate regions across ~6000 years, to reconstruct population structure, migration, and admixture, and to relate them to subsistence transitions.

Literature Review

Prior archaeobotanical and isotopic studies established early millet domestication and use in northern China and identified WLR and lower YR as key centers for early cultivation. Archaeological research documents Middle Neolithic complex societies (Hongshan and Yangshao) with substantial millet reliance and notable ceremonial constructions (e.g., Niuheliang). Paleobotanical and isotopic data indicate temporal shifts in WLR subsistence, with increasing millet reliance up to Lower Xiajiadian and later incorporation of pastoralism in Upper Xiajiadian, potentially tied to climate change. Linguistic studies suggest a northern origin for Sino-Tibetan (possibly linked to Yangshao) and associate WLR with the Transeurasian language family; contact and borrowing between these linguistic groups intensified from the Bronze Age. However, before this study, ancient genomes from these regions were sparse, leaving prehistoric migrations, contacts, and their impacts on present populations poorly understood.

Methodology

Samples: 107 ancient individuals from 19 archaeological sites across the Amur River (AR; 3 sites, 5525 BCE–250 CE), West Liao River (WLR; 4 sites, 3694–350 BCE), Yellow River (YR; 10 sites, 3550–50 BCE), and intermediate regions in Shaanxi and Inner Mongolia (2 sites) were initially screened. Fifty-five individuals with sufficient DNA preservation were sequenced to 0.03×–7.53× autosomal coverage.

Laboratory procedures: Ancient DNA was extracted from teeth and petrous bones in clean facilities using established protocols. Libraries were double-stranded, dual-indexed, and sequenced on Illumina HiSeq X10 (150 bp paired-end). DNA damage patterns were assessed; terminal bases were soft-masked for C→T/G→A artifacts in genotyping.

Data processing: Reads were adapter-trimmed (AdapterRemoval v2.2.0) and aligned to hs37d5 (BWA v0.7.12), and PCR duplicates were removed (DeDup v0.12.2). Pseudohaploid genotypes were called by random allele sampling on two SNP panels: HumanOrigins (593,124 SNPs) and 1240k-Illumina (249,162 SNPs), using trimmed BAMs for transition SNPs.

Authentication and QC: Postmortem damage profiles (mapDamage v2.0.6) and contamination estimates were generated (mtDNA: Schmutzi v1.5; nuclear in males: ANGSD v0.910). All samples showed characteristic aDNA damage; mtDNA contamination was <4% for all samples, and nuclear X-chromosome contamination was <5% for all males except one low-coverage outlier. Molecular sexing was based on X/Y to autosome coverage ratios. mtDNA and Y haplogroups were assigned (HaploGrep2; ISOGG markers).

Reference datasets: Present-day genomes from the HumanOrigins and 1240k-Illumina panels, augmented with the Simons Genome Diversity Panel and published ancient genomes, were used for comparative analyses. A subset with many Tibetans/Sherpa enabled Sino-Tibetan analyses.

Population genetic analyses: PCA (smartpca v16000; lsqproject/shrinkmode on) projected ancient individuals onto components computed from present-day Eurasian or East Asian panels. Unsupervised ADMIXTURE (v1.3.0) followed LD pruning (PLINK --indep-pairwise 200 25 0.2; MAF>1%). Genetic relatedness was assessed using pairwise mismatch rate and lcMLkin; first-degree relatives were removed from group analyses. Outgroup f-statistics (qp3Pop/qpDstat in ADMIXTOOLS) quantified affinities. Admixture modeling used qpWave/qpAdm with diverse outgroups (Mbuti, Natufian, Onge, Iran_N, Villabruna, Mixe, Ami, Nganasan, Itelmen), with alternative outgroup sets excluding Nganasan/Itelmen where appropriate. A genetic continuity test assessed whether ancient AR groups could be direct ancestors of present-day AR populations by likelihood ratio testing on read/allele count data.

Grouping: Individuals were grouped by date, geography, archaeological context, and genetic profile (e.g., AR_EN, WLR_MN, YR_LN).
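
The pseudohaploid calling and pairwise mismatch rate mentioned above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the (ref_reads, alt_reads) input format, and the handling of uncovered sites are assumptions made for illustration only. The idea is that, at each SNP, one read is drawn at random and its allele becomes the individual's single "haploid" call; the mismatch rate between two individuals is then the fraction of jointly covered SNPs where their calls differ.

```python
import random


def pseudohaploid_call(allele_counts, rng):
    """Draw one allele per SNP at random, weighted by read counts.

    allele_counts: list of (ref_reads, alt_reads) tuples (hypothetical input format).
    Returns a list with 0 (reference), 1 (alternative), or None (no coverage).
    """
    calls = []
    for ref_reads, alt_reads in allele_counts:
        total = ref_reads + alt_reads
        if total == 0:
            calls.append(None)  # site not covered in this individual
        else:
            # Picking one read uniformly at random is the same as picking
            # the alternative allele with probability alt_reads / total.
            calls.append(1 if rng.randrange(total) < alt_reads else 0)
    return calls


def pairwise_mismatch_rate(calls_a, calls_b):
    """Fraction of jointly covered SNPs at which two pseudohaploid calls differ."""
    overlap = 0
    mismatches = 0
    for a, b in zip(calls_a, calls_b):
        if a is None or b is None:
            continue  # skip SNPs missing in either individual
        overlap += 1
        if a != b:
            mismatches += 1
    if overlap == 0:
        return None  # no shared SNPs, rate undefined
    return mismatches / overlap


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed only so the sketch is reproducible
    ind_a = pseudohaploid_call([(3, 0), (1, 1), (0, 2), (0, 0)], rng)
    ind_b = pseudohaploid_call([(2, 0), (0, 4), (1, 3), (5, 0)], rng)
    print(ind_a, ind_b, pairwise_mismatch_rate(ind_a, ind_b))
```

In relatedness screening, pairs of close relatives are expected to show a lower mismatch rate than unrelated pairs from the same population; the thresholds used for that decision are not given in this summary and are not assumed here.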

Key Findings
  • Dataset: 55 ancient genomes from northern China dated 7500–1700 BP across AR, WLR, YR, and intermediate regions; autosomal coverage 0.03×–7.53×; contamination low.
  • Population structure: Ancient individuals fall within East Asians on PCA but form three clusters aligning with geography: AR (top, Tungusic-related), YR (bottom), and WLR intermediate. ADMIXTURE reveals three ancestral components shared across groups with river-basin-specific profiles.
  • AR genetic stability: Early Neolithic AR hunter-gatherers (AR_EN), Iron Age AR_Xianbei_IA, and an AR-like WLR_BA_o individual cluster tightly with present-day AR (Tungusic-speaking) populations, showing similar ADMIXTURE profiles. Outgroup-f4 and qpWave analyses support close, largely cladal relationships among ancient and present-day AR groups, though strict continuity is rejected, implying internal stratification and gene flow within AR-related gene pools.
  • YR temporal change: Central Plain YR groups share a distinct profile but exhibit a monotonic increase in affinity to present-day southern Chinese and Southeast Asians (SC-SEA) from Middle Neolithic Yangshao (YR_MN) to Late Neolithic Longshan (YR_LN), evidenced by positive f4(YR_LN, YR_MN; SC-SEA, Mbuti) (e.g., Z=+3.7 with Ami). No significant change from YR_LN to later YR_LBIA. Present-day Han show additional SC-SEA affinity versus ancient YR (max |Z|=10.3), and Naxi show smaller but significant differences (max Z≤4.0), indicating continued exogenous southern-related input after the Neolithic, potentially linked to rice-farming expansions.
  • Geographic spread of YR-related ancestry: Middle Neolithic Inner Mongolia (Miaozigou_MN) and Late Neolithic Shaanxi (Shimao_LN) are genetically similar to ancient YR groups. Upper Yellow River Qijia (Upper_YR_LN) also aligns closely. qpAdm models these as mixtures of YR farmers and AR hunter-gatherers with major YR ancestry (~80%); Upper_YR_IA shows even higher YR contribution (compatible with ~95%–100% YR ancestry).
  • Sino-Tibetan connections: Tibetans are modeled as mixtures of Sherpa and Upper_YR_LN (other sources also plausible), supporting a local northern source for previously reported admixture. Naxi and Yi are compatible with YR_MN-like ancestry; Lahu, Tujia, and Han show greater SC-SEA-related influence.
  • WLR correlated gene-subsistence changes:
      • Middle Neolithic: WLR populations are admixed between AR and YR. WLR_MN has ~39.8±5.7% AR-related ancestry; a nearby HMMH_MN individual has ~75.1±8.9% AR-related ancestry. A sharp spatial transition from predominantly YR-like to AR-like ancestry occurs within ~600 km when including Miaozigou_MN.
      • Late Neolithic (Lower Xiajiadian): WLR_LN shifts toward YR, overlapping the ancient YR cluster and showing reduced Siberian affinity; qpAdm estimates major YR contribution (≈74%–88% depending on secondary source), consistent with intensified farming and northward YR-related influx.
      • Bronze Age (Upper Xiajiadian): WLR_BA changes directionally toward AR; one individual (WLR_BA_o) is indistinguishable from ancient AR and shows extra affinity to later AR (Xianbei_IA) and present-day Tungusic speakers; the other WLR_BA individuals are modeled as WLR_LN mixed with WLR_BA_o (21±7% AR-like contribution). This aligns with archaeological evidence for a shift toward pastoralism under drier climate conditions and suggests an influx of AR-related pastoralists.
  • Overall: AR shows long-term genetic stability with limited food production; YR and WLR show substantial temporal changes linked to subsistence shifts (millet intensification and later pastoralism) and migrations.
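
Several findings above rest on f4 statistics reported with Z-scores, such as f4(YR_LN, YR_MN; SC-SEA, Mbuti). As a rough illustration of how such a statistic and its Z-score can be obtained, the sketch below computes f4(A, B; C, D) as the mean over SNPs of (pA - pB)(pC - pD) and estimates a standard error with a delete-one block jackknife. It is a simplification and not the ADMIXTOOLS implementation: the function names are hypothetical, blocks are equal-sized contiguous runs of SNPs rather than genetic-distance windows, and the jackknife is unweighted.

```python
import math


def f4(pa, pb, pc, pd):
    """f4(A, B; C, D): mean over SNPs of (pA - pB) * (pC - pD).

    Each argument is a list of allele frequencies for one population.
    """
    terms = [(a - b) * (c - d) for a, b, c, d in zip(pa, pb, pc, pd)]
    return sum(terms) / len(terms)


def f4_block_jackknife(pa, pb, pc, pd, n_blocks):
    """Return (estimate, standard_error, z) using a delete-one block jackknife.

    Assumption: SNPs are ordered along the genome and split into n_blocks
    contiguous, roughly equal-sized blocks.
    """
    if n_blocks < 2:
        raise ValueError("need at least two blocks")
    terms = [(a - b) * (c - d) for a, b, c, d in zip(pa, pb, pc, pd)]
    n = len(terms)
    total = sum(terms)
    estimate = total / n

    # Block boundaries computed with integer arithmetic.
    bounds = [(i * n) // n_blocks for i in range(n_blocks + 1)]
    leave_one_out = []
    for i in range(n_blocks):
        block = terms[bounds[i]:bounds[i + 1]]
        # Recompute the statistic with this block of SNPs removed.
        leave_one_out.append((total - sum(block)) / (n - len(block)))

    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (x - mean_loo) ** 2 for x in leave_one_out
    )
    se = math.sqrt(variance)
    z = estimate / se if se > 0 else float("nan")
    return estimate, se, z
```

Under this definition, a significantly positive f4(YR_LN, YR_MN; SC-SEA, Mbuti) means that YR_LN shares more alleles with the SC-SEA population than YR_MN does, which is how the increase in southern affinity is read in the text.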
Discussion

The findings directly address the study’s goal of linking subsistence changes to human migration in northern China. In the WLR, genetic shifts track subsistence transitions: increased YR affinity in the Late Neolithic corresponds to intensified millet farming, while increased AR affinity in the Bronze Age coincides with the adoption of a pastoral economy, consistent with climatic aridification. This indicates that demic processes accompanied cultural-economic changes. In the YR, a rising SC-SEA affinity from Middle to Late Neolithic parallels the intensification and northward spread of rice farming, implying migration-driven diffusion. The broad geographic spread of YR-like ancestry into adjacent regions (Inner Mongolia, Shaanxi, upper YR) and admixture models for Sino-Tibetan speakers (e.g., Tibetans as mixtures with Upper_YR_LN) align with hypotheses of a northern origin of Sino-Tibetan languages and the role of the northeastern Tibetan Plateau in plateau occupation. AR populations demonstrate relative genetic continuity despite some internal stratification, reflecting stability in subsistence strategies (hunting, fishing, limited cultivation and animal husbandry). Together, the data illuminate complex, region-specific demographic dynamics where environmental shifts and economic strategies correlate with gene flow and population structure, and inform debates on archaeolinguistic dispersals (Sino-Tibetan and Transeurasian).

Conclusion

This study presents 55 ancient genomes spanning six millennia from the AR, WLR, and YR basins, providing the first genomic time series for key early complex societies (Yangshao and Hongshan). It reveals contrasting demographic trajectories: relative stability in AR populations versus substantial temporal changes in YR and WLR populations that correlate with subsistence shifts. WLR experienced increased YR-related ancestry with intensified millet farming and increased AR-related ancestry with the rise of pastoralism in the Bronze Age. YR populations show increasing southern (SC-SEA) affinity from Middle to Late Neolithic, likely linked to rice-farming expansions. These results support a tight coupling between subsistence strategies and migration, with implications for archaeolinguistic scenarios involving Sino-Tibetan and Transeurasian families. Future research should target unsampled candidate source populations, especially Neolithic groups from Shandong and the Lower Yangtze River regions (potential sources of rice-farming ancestry), and expand temporal and geographic coverage to refine fine-scale demographic histories and test gene–culture–language co-evolution across China.

Limitations
  • Source sampling gaps: No ancient genomes from key candidate source regions for rice-farming ancestry into the Central Plain (e.g., Shandong, Lower Yangtze), limiting resolution of YR southern-related gene flow origins.
  • Temporal and spatial coverage: Limited Bronze/Iron Age WLR sampling prevents robust testing of persistent spatial heterogeneity and fine-scale demographic patterns.
  • Data resolution: Some alternative qpAdm models remain marginally plausible due to genetic resolution limits; one low-coverage AR_IA individual shows elevated contamination, affecting precise placement.
  • Continuity: Strict genetic continuity between ancient and present-day AR groups is rejected, indicating unmodeled internal structure and subsequent gene flow.
  • Coverage variation: Low coverage for some individuals reduces power for detailed haplotype-based analyses and fine-grained inference.