Cross-national analyses require additional controls to account for the non-independence of nations

Psychology

S. Claessens, T. Kyritsis, et al.

This paper highlights a critical issue in cross-national research within economics and psychology: treating nations as statistically independent, when they are not, can inflate false positive rates. Conducted by Scott Claessens, Thanos Kyritsis, and Quentin D. Atkinson, the study urges social scientists to address this oversight to improve the accuracy of their conclusions.

Introduction
The authors address the problem that nations are not statistically independent units due to spatial proximity and shared cultural ancestry, which violates regression assumptions and can inflate false positive rates in cross-national studies. They highlight the widespread use of bivariate correlations and regressions on national-level data to study economic development, cultural norms and values, and human psychology, and note established concepts of spatial and cultural phylogenetic non-independence. Although fields like ecology and anthropology have long accounted for such dependencies, economics and psychology often do not. The study aims to: (1) quantify spatial and cultural signals in widely used national-level variables (economic development measures and cultural values); (2) review highly cited cross-national papers to assess how often non-independence is controlled; (3) use simulations to evaluate the efficacy of common control methods; and (4) reanalyse prior influential cross-national associations while explicitly modeling spatial and cultural dependencies.
Literature Review
The authors searched Web of Science (articles published up to 2018; results exported September 27, 2021) for articles combining economic development or values with cross-national, cross-cultural, or cross-country terms. They then retained the 100 papers with the most citations per year (50 on economic development, 50 on cultural values) that reported at least one national-level analysis. They exhaustively coded 4,308 cross-national analyses for whether and how each attempted to control for non-independence (e.g., regional fixed effects, geographic controls such as latitude, proxies for shared cultural history, or other methods). Results: only 42% of economic development articles and 8% of cultural values articles contained at least one analysis accounting for non-independence (95% bootstrap CIs reported). At the analysis level, only an estimated 5% (95% CrI [2%, 13%]) of economic development analyses and 1% (95% CrI [0%, 2%]) of cultural values analyses controlled for non-independence. The most common controls were regional fixed effects, latitude, and proxies for shared cultural history. Logistic regressions suggested that higher-impact journals were, if anything, less likely to include controls (negative coefficients with wide credible intervals), and there was no clear time trend in the adoption of controls over 1993–2018.
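The article-level bootstrap intervals mentioned above can be illustrated with a minimal sketch. It assumes the counts implied by the reported percentages (21 of 50 economic development articles, 4 of 50 cultural values articles) and a simple percentile bootstrap. The function name, the number of resamples, and the seed are illustrative choices, not details taken from the paper.

```python
import numpy as np


def bootstrap_proportion_ci(successes, n, n_boot=10_000, level=0.95, seed=0):
    """Percentile bootstrap CI for a proportion of binary article-level codes."""
    rng = np.random.default_rng(seed)
    # 1 = article has at least one analysis controlling for non-independence.
    codes = np.zeros(n, dtype=int)
    codes[:successes] = 1
    # Resample articles with replacement and recompute the proportion each time.
    resamples = rng.choice(codes, size=(n_boot, n), replace=True)
    proportions = resamples.mean(axis=1)
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(proportions, [alpha, 1.0 - alpha])
    return successes / n, float(lower), float(upper)


# Counts below are inferred from the reported 42% and 8% of 50 articles each.
econ = bootstrap_proportion_ci(21, 50)
values = bootstrap_proportion_ci(4, 50)
print("economic development:", econ)
print("cultural values:", values)
```

Each call returns the observed proportion followed by the lower and upper bounds of the interval. The authors' exact resampling scheme may differ from this sketch.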
Methodology
The study comprised four components.
(1) Quantifying geographic and cultural phylogenetic signal: The authors assembled longitudinal national datasets for the Human Development Index (1990–2019; n=189 nations), GDP per capita (1960–2021; n=209), annual GDP per capita growth (1961–2021; n=208), and Gini inequality (1967–2021; n=167). Cultural values included World Values Survey traditional vs. secular and survival vs. self-expression values (1981–2019; n=116 nations), and cross-sectional cultural tightness (n=57) and individualism (n=97). They constructed two proximity matrices for 269 nations: geographic proximity (1 minus scaled log geodesic distances between capital cities) and linguistic proximity (a weighted average of inverse phylogenetic distances between all language pairs spoken across each pair of nations, using Glottolog trees and Ethnologue speaker shares). Bayesian multilevel models (brms) allowed nation random intercepts to covary according to both proximity matrices simultaneously; each signal estimate was the proportion of national-level variance explained by the corresponding matrix.
(2) Literature review: Two Web of Science searches identified candidate papers (economic development and cultural values). Inclusion required original empirical work with at least one national-level analysis. The authors coded every main-text cross-national analysis for controls addressing non-independence and for model characteristics, and computed article- and analysis-level proportions with bootstrap intervals. Bayesian logistic models tested associations with journal impact factor (log-transformed) and publication year (modeled with splines).
(3) Simulation study: For 236 synthetic nations, they simulated predictor and outcome variables with specified levels of spatial or cultural autocorrelation (λ for the outcome and ρ for the predictor, each set to 0.2, 0.5, or 0.8), using covariance proportional to the geographic or linguistic proximity matrices. The true causal relation r was set to 0 (to estimate false positive rates) and later to 0.1, 0.3, and 0.5 (to estimate power). For each parameter combination they generated 100 datasets and fitted 11 models: naive regression; controls for latitude and longitude; continent fixed effects; language-family fixed effects; a control for the mean predictor within a 2000-km radius; Conley standard errors based on geographic or genetic distances; a Bayesian Gaussian process (GP) over latitude and longitude; Bayesian random effects with linguistic covariance; and a combined spatial GP plus linguistic covariance model. False positive rates (FPRs) were the proportion of 95% CIs/CrIs excluding zero when r=0; power was the proportion excluding zero when r>0.
(4) Reanalyses: From the review, they selected 12 previously significant cross-national associations with available data (six on economic development and six on cultural values), pre-registered the set, and fitted four models for each: naive; spatial GP; linguistic covariance; and both. Most were bivariate; two included additional covariates or multilevel structure. All Bayesian models used brms; convergence diagnostics were satisfactory (R-hat < 1.1); Gaussian process approximations were used where needed.
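The geographic proximity matrix described in component (1), 1 minus scaled log geodesic distance between capital cities, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code. Distances use the haversine formula on a spherical Earth, the log is taken as log(1 + km) so that zero distance maps to zero, and scaling divides by the largest log distance. The four capitals and their approximate coordinates are illustrative only.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation


def haversine_km(lat, lon):
    """Pairwise great-circle distances (km) for points given in degrees."""
    lat = np.radians(np.asarray(lat, dtype=float))
    lon = np.radians(np.asarray(lon, dtype=float))
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


def geographic_proximity(lat, lon):
    """Proximity = 1 - scaled log distance (the scaling choices are assumptions)."""
    dist = haversine_km(lat, lon)
    log_dist = np.log1p(dist)            # assumption: log(1 + km), so log(0) is avoided
    scaled = log_dist / log_dist.max()   # assumption: scale to [0, 1] by the maximum
    return 1.0 - scaled


# Approximate capital coordinates (latitude, longitude), for illustration only.
capitals = {
    "Wellington": (-41.29, 174.78),
    "Canberra": (-35.28, 149.13),
    "London": (51.51, -0.13),
    "Paris": (48.86, 2.35),
}
names = list(capitals)
lat = [capitals[name][0] for name in names]
lon = [capitals[name][1] for name in names]
P = geographic_proximity(lat, lon)
print(names)
print(np.round(P, 2))
```

The result is a symmetric matrix with ones on the diagonal, in which nearby capitals (London and Paris) score higher than distant ones (London and Wellington). A matrix built this way is not guaranteed to be positive definite, so using it as a covariance matrix in a model may require an additional check or adjustment.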
Key Findings
- National variables exhibit substantial non-independence: Geographic and/or cultural phylogenetic proximity frequently explained over half of national-level variance for HDI, GDP per capita, Gini, and several cultural values. Geographic signal was strong for economic variables and traditional values; cultural phylogenetic signal was strong for most variables except GDP per capita growth (equivocal evidence).
- Literature review: Only 42% of economic development articles and 8% of cultural values articles contained at least one analysis addressing non-independence. At the analysis level, just 5% (economic) and 1% (cultural) of analyses used such controls. Higher journal impact factor was not associated with more controls; no clear temporal trend was observed.
- Simulations (spatial autocorrelation): With strong spatial autocorrelation (0.8) in both predictor and outcome, naive regressions yielded FPRs up to 77%. Common controls like latitude/longitude or language family fixed effects left FPRs above 50%; Conley SEs remained above ~40% under strong spatial autocorrelation. Continent fixed effects reduced FPR to about 35%, and controlling for the mean of the predictor within 2000-km radii reduced FPR to ~6% but at the cost of reduced power (below 80% for moderate true effects under strong autocorrelation). Bayesian spatial GP reduced FPRs to ~15% (moderate autocorrelation) and ~23% (strong), while maintaining ≥80% power for moderate and large effects; adding linguistic covariance performed equally well.
- Simulations (cultural phylogenetic autocorrelation): Fixed-effects methods (including language family) did not reduce FPRs sufficiently (e.g., language family FE ~32% FPR under strong cultural autocorrelation). Models with random effects covarying by linguistic proximity eliminated false positives across all autocorrelation levels and were the only methods with ≥80% power to detect large effects (r=0.5). Spatial GP alone continued to yield false positives in this setting; combining spatial and linguistic covariance also eliminated false positives.
- Reanalyses of 12 published associations: Effect sizes typically attenuated after controlling for non-independence, sometimes by half. Overall, 6 of 12 associations had 95% credible intervals including zero after controls. Among economic development analyses, 4/6 lost significance when controlling for spatial non-independence; among cultural values analyses, 2/6 lost significance when controlling for cultural phylogenetic non-independence. Patterns of attenuation corresponded with estimated strengths of spatial and cultural signals in the outcomes.
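The inflation of false positives under naive regression can be illustrated with a small simulation in the spirit of the study's design. It is a sketch under stated assumptions, not a reproduction of the authors' analysis. Nations are hypothetical points on a unit square, the spatial covariance is an exponential kernel (swapped in for the paper's proximity matrices because it is guaranteed positive definite), and significance uses a normal-approximation cutoff of 1.96. The function name, the length scale, and the number of simulated datasets are illustrative.

```python
import numpy as np


def simulate_naive_fpr(lam, rho, n_nations=236, n_sims=200,
                       length_scale=0.5, seed=1):
    """False positive rate of a naive bivariate regression when the true effect is 0.

    lam: spatial autocorrelation of the outcome; rho: of the predictor.
    """
    rng = np.random.default_rng(seed)
    # Hypothetical nation locations on a unit square (not real coordinates).
    coords = rng.uniform(size=(n_nations, 2))
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    kernel = np.exp(-dists / length_scale)  # assumed exponential spatial kernel
    identity = np.eye(n_nations)
    # Mix spatially structured and independent variance for each variable.
    chol_y = np.linalg.cholesky(lam * kernel + (1.0 - lam) * identity)
    chol_x = np.linalg.cholesky(rho * kernel + (1.0 - rho) * identity)

    false_positives = 0
    for _ in range(n_sims):
        # Predictor and outcome are generated independently, so r = 0 by design.
        x = chol_x @ rng.standard_normal(n_nations)
        y = chol_y @ rng.standard_normal(n_nations)
        r = np.corrcoef(x, y)[0, 1]
        # The t statistic for a simple regression slope equals that for Pearson's r.
        t_stat = r * np.sqrt((n_nations - 2) / (1.0 - r ** 2))
        false_positives += int(abs(t_stat) > 1.96)  # normal approximation to t
    return false_positives / n_sims


fpr_independent = simulate_naive_fpr(lam=0.0, rho=0.0)
fpr_autocorrelated = simulate_naive_fpr(lam=0.8, rho=0.8)
print("naive FPR, independent nations:", fpr_independent)
print("naive FPR, strong spatial autocorrelation:", fpr_autocorrelated)
```

With independent nations the naive rate should sit near the nominal 5%, while strong autocorrelation in both variables pushes it well above that. The exact values depend on the assumed kernel and length scale, so they are not expected to match the figures reported in the study.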
Discussion
The study demonstrates that many national-level variables in economics and psychology are strongly structured by spatial proximity and cultural ancestry, making naive cross-national regressions prone to spurious findings. A review shows that controls for non-independence are rarely applied, and when they are, they often rely on insufficient methods (e.g., latitude controls, regional or language family fixed effects, Conley standard errors). Simulations clarify why: such controls do not model covariance induced by proximity and therefore leave substantial residual structure, yielding inflated false positive rates. Random-effects approaches that explicitly model covariance matrices—spatial Gaussian processes over coordinates and linguistic covariance from language phylogenies—substantially reduce false positives while retaining power, and linguistic covariance is crucial when cultural phylogenetic non-independence is present. Reanalyses of prominent findings illustrate practical consequences: half of examined associations become compatible with no effect once non-independence is modeled. The authors discuss potential reasons for the underuse of appropriate methods, including incentives favoring significant results, limited awareness outside fields accustomed to spatial/phylogenetic data, and perceived complexity. They recommend clearer causal modeling, broader adoption of accessible tools (e.g., brms/Stan), and attention to multiple dependency sources. The findings argue for caution in interpreting published cross-national associations and for routine modeling of non-independence to ensure valid inference.
Conclusion
Cross-national analyses frequently violate independence assumptions due to spatial and cultural linkages between nations, which can inflate false positive rates. This work quantifies strong geographic and cultural signals in key national-level variables, documents the rarity and insufficiency of common controls in the literature, demonstrates via simulation that fixed-effects and post hoc SE corrections do not adequately curb false positives, and shows through reanalyses that many reported associations do not persist after proper controls. The authors advocate for widespread adoption of models that explicitly encode covariance induced by geographic and cultural proximity (e.g., spatial Gaussian processes and linguistic covariance random effects). Future research should: expand reanalyses across more domains; explore alternative dependency structures (e.g., conditional autoregressive models, GAMs), additional cultural distance metrics (cultural FST, genetic or religious phylogenies), and modern interconnection networks (migration, flights, social media); and refine guidance for selecting appropriate controls based on explicit causal models.
Limitations
- The literature review was not a formal systematic review (PRISMA not followed) and focused on highly cited papers, which may bias estimates.
- Reanalyses primarily targeted initial bivariate specifications; comprehensive multivariate reanalyses were not conducted, so claims in original papers are not definitively refuted.
- The simulation focused on spatial and linguistic (cultural phylogenetic) non-independence and did not model other dependencies (e.g., trade, migration, information networks, colonial histories).
- Linguistic proximity was used as the primary operationalization of cultural ancestry; other measures might yield different results.
- Some models assumed approximate normality of residuals without formal testing.
- The number of reanalysed cases was small (n=12) and constrained by data availability; attenuation-effect relationships with signal strength were suggestive but uncertain.
- Methods like 2000-km mean control reduced false positives but at the cost of statistical power, highlighting trade-offs not fully explored across all scenarios.