A global perspective on social stratification in science

Sociology

A. Akbaritabar, A. F. C. Torres, et al.

This research by Aliakbar Akbaritabar, Andrés Felipe Castro Torres, and Vincent Larivière examines social stratification among scientists worldwide. Analyzing the careers of 8.2 million scientists, the study finds a starkly stratified structure within academic communities and sheds light on productivity, impact, and mobility in ways that could reshape our understanding of collaboration in academia.

Introduction

The study investigates how inequality and social stratification manifest across the global scientific workforce. It addresses the research question of whether, and how, multiple facets of academic performance—productivity, collaboration, mobility, and visibility—co-vary to generate stratified structures among scientists across fields and ages. The context is a highly competitive academic system often justified as meritocratic and assessed using bibliometrics, typically in isolation. The purpose is to provide a multidimensional, structural assessment of stratification across disciplines, highlighting interrelations among indicators and age effects, and to supply country-level measures for future research. The importance lies in moving beyond single-indicator gaps (e.g., publications, citations) to understand cumulative advantage, feedbacks, and potential mechanisms sustaining inequalities in science.

Literature Review

Prior work documents growth in coauthorship, geographic mobility, and publications, alongside increased concentration of success indicators among a minority. Team science has risen, but individual productivity rates may not have increased. Citation distributions are highly skewed and increasingly concentrated. Large-scale Scopus evidence indicates: 33% of scholars publish only one paper; median authors per paper is two; ~27.2% of publications are single-authored; >75% are single-country; 87.5% of authors remain affiliated with a single country and 73.5% with a single subnational region across careers; 36.8% publish in only one year. These patterns suggest that observed increases in collaboration, mobility, productivity, and impact may be driven by a small subset of scholars, underscoring the need for a multidimensional analysis of interrelated indicators and potential Matthew effects linking collaboration, mobility, productivity, and citations.

Methodology

Data: 28.5 million articles and reviews from Scopus (1996–2021) involving 8.2 million disambiguated authors (Scopus Author IDs; precision 98.3%, recall 90.6%). Author-affiliation entities disambiguated via ROR API and geocoded to subnational units. Excluded 41,278 authors (0.5%) for missing metadata.

Fields: six OECD macro fields (Agricultural Sciences; Natural Sciences; Humanities; Medical and Health Sciences; Engineering and Technology; Social Sciences), assigning each author to the field with highest publication share.

Indicators: 12 author-level, career-long bibliometric measures across four domains:
  • Collaboration/internationalization: number of coauthored papers; average coauthors per paper; number of international coauthored publications; number of national coauthored publications.
  • Mobility: number of international affiliation moves; number of national moves; number of affiliated organizations.
  • Visibility/impact: total citations; average citations per paper.
  • Productivity: fractional publications; total publications; first-author publications.
Most indicators standardized by academic age (years since first publication); averages (coauthors per paper and citations per paper) normalized by number of papers.

Categorization: All highly skewed indicators binned into as many categories as possible ensuring ≥2% frequency in each, capturing non-linearities and retaining outliers by grouping extremes; resulting categories range from 3 (e.g., international moves in Agricultural Sciences) to 10 (e.g., total citations in Natural and Medical/Health Sciences).

Multivariate analysis: For each field, Multiple Correspondence Analysis (MCA) on the 12 categorized indicators; focus on first three factorial axes (largest eigenvalues). Notably, the first axis correlates strongly with academic age despite age standardization.
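The age standardization and categorization steps described above can be illustrated with a minimal sketch. This is not the authors' code: the function names (`per_academic_year`, `quantile_edges`, `bin_indicator`), the use of simple division for age standardization, the quantile-based cut points, and the cap of 10 bins are all assumptions made for illustration. The only constraint taken from the summary is that every category must hold at least 2% of authors.

```python
from bisect import bisect_right
from collections import Counter


def per_academic_year(total, academic_age):
    """Standardize a career total by academic age (assumption: simple division)."""
    return total / max(academic_age, 1)


def quantile_edges(values, k):
    """Distinct cut points, taken from the data, for up to k quantile groups."""
    ordered = sorted(values)
    n = len(ordered)
    cuts = {ordered[(i * n) // k] for i in range(1, k)}
    # Drop cuts equal to the minimum so the lowest bin is never empty.
    return sorted(c for c in cuts if c > ordered[0])


def bin_indicator(values, min_share=0.02, max_bins=10):
    """Try progressively fewer quantile groups until every bin holds
    at least `min_share` of the observations (heuristic, hypothetical)."""
    if not values:
        raise ValueError("values must be non-empty")
    n = len(values)
    for k in range(max_bins, 1, -1):
        edges = quantile_edges(values, k)
        # Bin index = number of cut points that are <= the value,
        # so extreme values are grouped into the top bin, not dropped.
        labels = [bisect_right(edges, v) for v in values]
        counts = Counter(labels)
        if len(counts) >= 2 and min(counts.values()) / n >= min_share:
            return labels, edges
    return [0] * n, []  # indicator too concentrated to split


# Toy, made-up data: a heavily skewed indicator with a few extreme values.
citations_per_year = [0] * 50 + [1] * 30 + [2] * 15 + [50] * 5
labels, edges = bin_indicator(citations_per_year)
print(edges, sorted(Counter(labels).items()))
```

In this toy case ties collapse the ten requested quantile groups into three categories, and the five extreme values are kept by merging them into the top category rather than being discarded.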

Clustering: To account for age effects, conduct hierarchical clustering (Ward) with K-means consolidation separately within six academic-age groups (1; 2–5; 6–9; 10–14; 15–20; 21–25 years since first publication) across six fields, selecting six clusters per analysis (bottom, low, mid-low, mid-high, high, top) based on variance ratio.

Network and communities: Build a global bipartite coauthorship network across 8.2 million authors; identify the giant component; detect collaboration communities using Constant Potts Model (CPM) adapted to bipartite networks across 18 resolution parameters.

Robustness: compare with NetworKit algorithms (default, parallel Louvain, parallel Label Propagation) and with a one-mode projection analyzed via Leiden; results qualitatively robust.

Assess within-community distributions of bibliometric classes and age groups and compute standardized entropy to evaluate heterogeneity.
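The standardized entropy used to assess community heterogeneity can be sketched as follows. The summary does not give the exact formula, so this assumes the common definition: Shannon entropy of the class shares within a community, divided by the logarithm of the number of possible classes, so that 0 means a single class and 1 means a perfectly even mix. The function name and the `CLASSES` constant are hypothetical.

```python
import math
from collections import Counter

# The six bibliometric classes named in the clustering step.
CLASSES = ("bottom", "low", "mid-low", "mid-high", "high", "top")


def standardized_entropy(labels, categories=CLASSES):
    """Shannon entropy of the label shares, scaled to [0, 1] by log(K).

    Assumption: normalization by the number of *possible* categories K,
    not by the number of categories observed in the community.
    """
    n = len(labels)
    if n == 0 or len(categories) < 2:
        raise ValueError("need at least one label and two categories")
    shares = [count / n for count in Counter(labels).values()]
    entropy = sum(-p * math.log(p) for p in shares)
    return entropy / math.log(len(categories))


# Toy communities (made-up memberships, not data from the study).
mixed = list(CLASSES) * 5        # every class equally represented
segregated = ["top"] * 30        # a single class only
print(standardized_entropy(mixed), standardized_entropy(segregated))
```

Under this definition, the high-entropy communities reported in the findings would score close to 1, while a community drawn from a single class would score 0.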

Key Findings
  • MCA structure: Two dominant dimensions organize scholars: (1) “Academic age, number of organizations, and individual productivity” (first axis), showing a clear positive age gradient even after age-standardization and emphasizing cumulative advantage; (2) “Total productivity, visibility, and collaborations” (second axis), where publications, citations, and coauthorship dominate. Mobility indicators contribute little to the first three axes, consistent with low mobility shares (~8% international moves; ~12% national moves).
  • Stratification across fields and ages: A consistent stratified hierarchy emerges in all six macro fields and across age groups. Top class is a minority ranging from 6% (Humanities) to 19% (Natural Sciences); bottom class ranges from 22% (Natural Sciences) to 32% (Engineering & Technology). Middle and bottom classes cluster toward the lower-left in MCA space, indicating uniformly lower performance across indicators.
  • Age dynamics: Stratification is most pronounced among authors with an academic age of 21–25 years, suggesting cumulative effects over careers. The youngest groups (academic age of 1 year and 2–5 years) exhibit a more pyramidal shape with very small top shares.
  • Output contributions and the 20/80 rule: Among 15–20 year academic-age scholars, the top classes contribute the most to international publications, national publications, coauthored papers, and total citations but fall short of producing 80% of outputs—contradicting the 20/80 rule across the 10 indicators analyzed. Bottom classes (≈ one quarter of authors) contribute under 5% in 7 of 10 indicators; exceptions are mobility measures (number of organizations, national moves, international moves), where bottom and top classes contribute similarly (and in Humanities, the bottom contributes more), indicating mobility relates to both success and precariousness.
  • Collaboration communities: Analysis of 19,970 collaboration communities with at least 20 authors (covering ~99% of authors and ~42.7% of communities at one CPM resolution) shows high-entropy compositions by both bibliometric class and age. Communities mix authors from all classes and age groups, implying extensive inter-class and inter-age collaboration. These patterns are robust across multiple community detection algorithms and resolution settings.
  • Participation and exits: At least 25% of authors in communities publish only one paper, highlighting high churn and the steep stratification of sustained participation.
  • Gender robustness: Disaggregations by gender do not alter the reported stratification patterns.
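The comparison against the 20/80 rule in the findings above amounts to setting each class's share of authors against its share of an output indicator. A minimal sketch follows, assuming per-author records of (class label, indicator value); the function names and the toy numbers are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def class_shares(records):
    """Map each class to (share of authors, share of total output)."""
    authors = defaultdict(int)
    output = defaultdict(float)
    for label, value in records:
        authors[label] += 1
        output[label] += value
    n_authors = sum(authors.values())
    total_output = sum(output.values())
    return {
        label: (
            authors[label] / n_authors,
            output[label] / total_output if total_output else 0.0,
        )
        for label in authors
    }


def satisfies_20_80(shares, label, author_cap=0.2, output_floor=0.8):
    """True if the class holds at most 20% of authors and at least 80% of output."""
    author_share, output_share = shares[label]
    return author_share <= author_cap and output_share >= output_floor


# Toy, made-up records: 2 "top", 5 "mid", and 3 "bottom" authors.
records = (
    [("top", 30.0)] * 2
    + [("mid", 7.0)] * 5
    + [("bottom", 2.0), ("bottom", 2.0), ("bottom", 1.0)]
)
shares = class_shares(records)
print(shares["top"], satisfies_20_80(shares, "top"))
```

In this toy example the top class holds 20% of authors and 60% of output: the largest contribution of any class, yet short of the 80% threshold, which is the shape of result the findings describe.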
Discussion

Findings reveal a robust, multidimensional stratification in global science mirroring stratification by academic age. Despite age-standardization, the first axis’s age gradient underscores cumulative advantage: with continued engagement in publication systems, scholars’ positions in the bibliometric hierarchy become clearer and more differentiated. Top classes achieve their status through high total productivity, visibility, and collaborative reach, benefiting disproportionately from collaborations across classes and ages. However, mobility behaves differently: similar contributions by top and bottom classes suggest it can signal both opportunity and precariousness, potentially destabilizing networks while also expanding them. The invalidity of the 20/80 rule under a multivariate framework highlights that dominance in one metric does not translate uniformly across others. The inter-class and inter-age mixing within collaboration communities indicates that stratification is not due to segregated networks but persists despite cross-class collaboration, implying structural mechanisms (e.g., resource concentration, performance-based incentives) may reinforce disparities. The results advocate for assessment practices that account for academic age and multiple performance dimensions rather than single indicators.

Conclusion

The paper contributes a global, multivariate, and career-long perspective on stratification in science, showing a consistent hierarchical structure across fields and ages and challenging the 20/80 narrative across bibliometric indicators. It identifies two principal axes structuring performance and demonstrates that top classes are small, with bottom classes contributing marginally to most outputs while mobility remains an exception. Collaboration communities are heterogeneous by class and age, indicating that stratification persists despite extensive inter-class collaboration. The authors release aggregated, country-level data to enable further research. Future directions include causal analyses of mechanisms (resource access, funding systems, labor support), incorporation of institutional prestige and policy environments, contract types and positions, and intersectional inequalities across contexts and cohorts, building on the multivariate, age-aware framework.

Limitations

The study is descriptive and does not establish causality; thus, it cannot determine whether stratification reflects unequal access to resources (e.g., assistants, junior collaborators) or other mechanisms. It lacks data on researchers’ contracts, positions, and institutional prestige, as well as national policies, all of which may shape opportunities and outputs. Mobility and other indicators rely on Scopus author IDs and ROR affiliation disambiguation with known (though high) precision/recall, and on publications indexed 1996–2021, which may omit outputs outside Scopus. Age effects persist despite standardization, and the analysis infers structure from bibliometric profiles rather than direct measures of inequality or resources.
