
Global musical diversity is largely independent of linguistic and genetic histories

S. Passmore, A. L. C. Wood, et al.

This study, conducted by Sam Passmore, Anna L. C. Wood, Chiara Barbieri, Dor Shilton, Hideo Daikoku, Quentin D. Atkinson, and Patrick E. Savage, finds that musical traditions are largely independent of linguistic and genetic histories across the globe. Drawing on an extensive dataset, the findings prompt a reconsideration of what shapes our musical identities.

Introduction

The study investigates whether cultural domains, specifically music, track human demographic and linguistic histories. Inspired by Darwin’s proposal of parallels between biological and cultural evolution, prior research has compared genetic, linguistic, and archaeological evidence, finding partial correspondences but also frequent mismatches. Music, a universal cultural trait, has been proposed to preserve historical patterns, possibly more conservatively than language. The authors outline three competing predictions: that music correlates with genes due to shared migration/evolutionary processes; that music correlates with language due to shared vocal and interactional transmission; or that music is largely independent due to differing evolutionary tempos, fabrics, and modes. Previous tests were largely regional with mixed results, and global comparisons were limited by data availability. With new global datasets for music (Cantometrics/Global Jukebox), genetics (GeLaTo), and language (global phylogeny), the study aims to quantify global musical diversity, assess its geographical/historical structure, and test its correspondence to genetic and linguistic histories.

Literature Review

Regional studies reported significant correlations between musical and genetic diversity in Taiwan, sub-Saharan Africa, and Eurasia, but not in Northeast Asia. Global analyses were impeded by data limitations until recent releases of large-scale repositories. The Global Jukebox Cantometrics dataset offers extensive coded musical features enabling cross-cultural comparisons and has shown alignment with human perceptual similarity judgments. Linguistic and genetic comparisons have shown tighter genetic relationships within language families but also notable mismatches. Critiques emphasize that basic vocabulary phylogenies capture limited dimensions of cultural history, and that cultural domains may move independently across groups, suggesting complex relationships among cultural traits. This study builds on these insights by conducting the first direct global comparison across matched music, language, and genetics.

Methodology

Data: Music from the Global Jukebox Cantometrics dataset with 5776 coded songs; primary analyses restricted to societies with at least two songs (5242 songs from 719 societies). Robustness samples: societies with ≥10 songs (3063 songs, 222 societies) and a Standard Cross-Cultural Sample (SCCS) subset (742 songs, 110 societies). Genetic data from GeLaTo (Human Origins SNP chip) and linguistic relationships from a global Bayesian language phylogeny.
Matched dataset: 121 societies paired across domains (abstract reports 981 songs and 1296 genetic profiles; languages n=121). Pairing used glottocodes and proxy matches when necessary.
Feature construction: Selected 24 of 37 Cantometric variables (removing redundancies). Built five latent musical dimensions modeled on Lomax’s factors: Articulation, Tension, Ornamentation, Rhythm, and Dynamics, allowing inter-factor correlations and select residual covariances. Goodness-of-fit assessed via RMSEA, SRMR, CFI. Sensitivity analyses excluded variables with low inter-rater reliability (Cohen’s kappa <0.4). An aggregate musical similarity metric was also constructed over all variables.
Between/within-group structure: AMOVA on 5131 songs from 636 societies with linked language families to partition variance into within-society and between-society (within macro-groups) components, and to compare musical fixation with genetic structure.
Spatial structure: Geographic autocorrelation assessed by comparing pairwise similarity within 500-km distance classes up to 20,000 km using Haversine distances. Autocorrelation computed for musical latent variables and aggregate measure, and for genetic (FST) and linguistic (patristic) distances.
Tree-likeness: Delta scores (distance from four-point condition) computed for random samples of 50 societies each in Africa, Europe, and Oceania due to computational limits, using Euclidean distances on musical scores.
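The tree-likeness step can be illustrated with a small sketch. It assumes the standard quartet-based definition of the delta score: for every set of four societies, the three pairwise-sum pairings are sorted and the gap between the two largest is divided by the gap between the largest and smallest. A perfectly tree-like (additive) distance matrix satisfies the four-point condition and scores 0; scores approach 1 as the data become less tree-like. The function names (quartet_delta, mean_delta, euclidean_matrix) and the toy points are hypothetical and are not taken from the authors' code.

```python
import math
from itertools import combinations


def euclidean_matrix(points):
    """Pairwise Euclidean distances between score vectors (one per society)."""
    return [[math.dist(p, q) for q in points] for p in points]


def quartet_delta(d, i, j, k, l):
    """Delta for one quartet: 0 when the four-point condition holds exactly."""
    sums = sorted(
        (d[i][j] + d[k][l], d[i][k] + d[j][l], d[i][l] + d[j][k]),
        reverse=True,
    )
    largest, middle, smallest = sums
    if largest == smallest:
        # All three pairings tie (star-like quartet); treated as tree-like.
        return 0.0
    return (largest - middle) / (largest - smallest)


def mean_delta(d):
    """Average quartet delta over every 4-subset of the distance matrix d."""
    quartets = list(combinations(range(len(d)), 4))
    if not quartets:
        raise ValueError("need at least four items")
    return sum(quartet_delta(d, *q) for q in quartets) / len(quartets)


# Toy input (assumed, not real data): four societies scored on two dimensions.
toy_scores = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
toy_delta = mean_delta(euclidean_matrix(toy_scores))
```

The number of quartets grows with the fourth power of the sample size, which is consistent with the source's note that samples were limited to 50 societies per region for computational reasons.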
Cross-domain relationships: Partial redundancy analysis (RDA) and partial Mantel tests assessed correlations between musical distances (PhiST matrices for each dimension and aggregate) and genetic (FST) or linguistic (patristic) distances while controlling for geography or the other domain. Distance matrices reduced via PCoA retaining axes explaining >10% variance; adjusted R² reported. Robustness checks repeated across ≥10-song and SCCS samples and within regions (Africa, Europe, Southeast Asia).
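A partial Mantel test of the kind described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses Pearson correlation on the upper triangles of three square distance matrices, removes the effect of the control matrix with the usual partial-correlation formula, and obtains a one-sided p-value by permuting the rows and columns of the first matrix together. The function names and the permutation count are hypothetical.

```python
import math
import random


def upper_triangle(m):
    """Flatten the strict upper triangle of a square matrix into a list."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_r(x, y, z):
    """Correlation of x and y after controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def partial_mantel(a, b, control, permutations=999, seed=0):
    """Return (partial r, one-sided permutation p) for matrices a and b,
    controlling for the matrix `control` (e.g. geographic distance)."""
    rng = random.Random(seed)
    n = len(a)
    y, z = upper_triangle(b), upper_triangle(control)
    observed = partial_r(upper_triangle(a), y, z)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        # Permute society labels: rows and columns move together.
        shuffled = [[a[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if partial_r(upper_triangle(shuffled), y, z) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)
```

In this sketch a would hold musical distances, b genetic or linguistic distances, and control the geographic distances; swapping the roles of b and control gives the test that controls for the other domain instead of geography.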

Key Findings

Five latent musical dimensions fit well across datasets: two-or-more-song sample (RMSEA=0.06 [90% CI: 0.056–0.068], SRMR=0.05, CFI=0.93); ≥10-song sample (RMSEA=0.06 [0.059–0.061], SRMR=0.06, CFI=0.94); SCCS (RMSEA=0.06 [0.059–0.067], SRMR=0.05, CFI=0.94). Sensitivity analyses showed high stability (r≥0.97) except for Tension, which warrants caution.
Music varies within and between societies: AMOVA indicates within-society diversity explains 54–72% of variance; between-society/within-macrogroup explains 29–43%. Musical fixation differences between populations are 10–40% higher than genetic differences, though more variable.
Spatial autocorrelation: Musical similarity shows significant geographic structure up to ~4000 km on average (r≈0.17), less than language (~5000 km, r≈0.24) and genes (~5500 km, r≈0.63).
Tree-like structure: Musical delta scores range ~0.25–0.40 across regions/variables (most ~0.31–0.34), between values reported for Indo-European lexicon (0.21) and Polynesian languages (0.41), and more tree-like on average than Austronesian lexical (0.38) and structural (0.44) data.
Global cross-domain correlations are weak and inconsistent: In the two-or-more-song sample, partial Mantel and RDA show low associations after controlling for geography or the other domain (e.g., Music–Genes controlling geography: Mantel r=0.15, p<0.001; RDA adj. R²=0.09; Music–Language controlling geography: r=0.18, p<0.001; RDA adj. R²=0.10). Most adjusted R² values are <0.10; SCCS shows even weaker patterns (83% of tests with adj. R²<0.05).
Regional exceptions: Within Africa, music correlates more with linguistic distance (e.g., Articulation adj. R² up to ~0.33). In Europe and especially Southeast Asia, music correlates more with geographic distance (up to ~0.23 and ~0.50, respectively), though small samples warrant caution.
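The distance classes behind the spatial autocorrelation figures above can be sketched as follows, assuming a spherical Earth of radius 6371 km and half-open 500-km bins. The sketch only computes Haversine distances and averages a pairwise similarity value within each bin; the source does not specify the exact autocorrelation statistic, so the averaging step is a stand-in, and all function names are hypothetical.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to guard against rounding slightly above 1 for antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def distance_class(km, width_km=500.0):
    """Index of the distance class: 0 for [0, 500), 1 for [500, 1000), ..."""
    return int(km // width_km)


def mean_similarity_by_class(coords, similarity, width_km=500.0, max_km=20000.0):
    """Average pairwise similarity within each distance class.

    coords is a list of (lat, lon) pairs; similarity is a square matrix
    indexed in the same order. Pairs farther apart than max_km are skipped.
    """
    totals = {}
    for i, j in combinations(range(len(coords)), 2):
        km = haversine_km(*coords[i], *coords[j])
        if km > max_km:
            continue
        cls = distance_class(km, width_km)
        running_sum, count = totals.get(cls, (0.0, 0))
        totals[cls] = (running_sum + similarity[i][j], count + 1)
    return {cls: s / n for cls, (s, n) in sorted(totals.items())}
```

Plotting the per-class averages against distance would show the range over which similarity stays above its long-distance baseline, which is the kind of quantity summarised by the ~4000 km, ~5000 km, and ~5500 km figures reported for music, language, and genes.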

Discussion

Findings indicate that while musical styles exhibit geographic and historical structure and retain a relatively tree-like signal consistent with vertical transmission, global musical similarities are largely independent of linguistic and genetic histories. This independence persists despite methodological compatibility and is corroborated by stronger local patterns in regions where prior studies suggested links (e.g., sub-Saharan Africa, Southeast Asia). The authors propose that historical borrowing and diffusion, such as along the Eurasian Silk Routes, allow musical traditions to cross linguistic and genetic boundaries, producing distinct topographies. Differences in evolutionary tempo and mechanisms (neutral drift versus functional coevolution) may further decouple music from language and genes. Cross-domain correlations are weaker than correlations with geography, which suggests that coevolutionary phylogenetic processes alone cannot explain the patterns; instead, music appears to follow partially independent historical pathways. Future work should model mechanisms of horizontal/vertical transmission in music and directly compare music with social structure, language, genes, and geography to clarify causal processes.

Conclusion

This study presents the first global, matched comparison of musical, genetic, and linguistic diversity, identifying five robust axes of musical style and demonstrating that global musical histories are only weakly related to linguistic and genetic histories, with some regional exceptions. The work expands the evidentiary base for reconstructing human cultural history by showing that music encodes largely independent historical information. Publicly available data and code provide a foundation for further investigations. Future research should integrate broader linguistic features, direct acoustic comparisons of song and speech, and explicit models of musical transmission to disentangle neutral versus functional mechanisms and to integrate musical evidence into holistic narratives of human cultural evolution.

Limitations

Matched sample size is limited (121 societies), non-random, and underrepresents some Indigenous populations due to historical/ethical constraints.
Cantometrics reduces complex musical diversity to 37 variables with variable inter-rater reliability and universality; some features lack validation against culture-bearers’ perspectives.
Linguistic phylogenies based on basic vocabulary capture limited aspects of language evolution; genetic platforms may introduce ascertainment bias.
Potential sampling mismatches in time and population between datasets; SCCS design reduces autocorrelation, limiting detectable cross-domain signals.
Smaller regional samples (Africa, Europe, Southeast Asia) introduce uncertainty in regional RDA estimates and may be affected by heterogeneous cultural/language-family compositions.

