
Growth rates of modern science: a latent piecewise growth curve approach to model publication numbers from established and new literature databases
L. Bornmann, R. Haunschild, et al.
This study by Lutz Bornmann, Robin Haunschild, and Rüdiger Mutz examines scientific growth from the inception of the modern science system to the present day, estimating an overall annual growth rate of 4.10% and a doubling time of 17.3 years.
Introduction
The study investigates how modern science grows over time and whether this growth is best represented by a single exponential trend or by multiple historical segments with different growth rates. Motivated by classic and contemporary science of science literature (e.g., Price’s exponential growth, Kuhn’s paradigm shifts), the authors aim to produce precise estimates of growth rates over several centuries using multiple large bibliographic databases (Dimensions, Microsoft Academic, Web of Science, Scopus). They also assess whether growth patterns differ between broad fields (Life Sciences; Physical and Technical Sciences) and compare scientific growth with economic growth in the UK. The purpose is to provide a robust characterization of scientific growth dynamics, test alternative growth functions (unrestricted vs. logistic), and contextualize growth segments with historical economic and political periods.
Literature Review
Prior work commonly reports exponential growth of science with doubling periods around 15 years (e.g., de Solla Price; Fortunato et al.). Earlier analyses often relied on single databases or proxies such as cited references, which can better reflect early periods but omit uncited works (Bornmann & Mutz, 2015). Bibliometric databases differ in coverage and selectivity (Web of Science, Scopus, Dimensions, Microsoft Academic), potentially biasing early-period estimates. Theoretical links between scientometrics and econometrics suggest similar exponential laws of growth (Price, 1986), and prior research has shown associations between economic development and scientific output. However, precise, long-horizon, multi-database estimates of scientific growth rates and segmented growth epochs have been limited.
Methodology
Data sources: Four multidisciplinary bibliographic databases were used:
- Web of Science (coverage back to 1900; multiple indices; no document type restriction; query py=1900–2018; retrieved 30 Aug 2019)
- Scopus (coverage back to 1861; query PUBYEAR AFT 1800; retrieved 30 Aug 2019)
- Microsoft Academic (snapshot 11 Jan 2019; coverage back to 1800; five document types plus many untyped items; patents excluded; processed locally in PostgreSQL)
- Dimensions (Publications sub-database; snapshot 26 Sep 2019; broad publication types; strong book coverage; processed locally in PostgreSQL)
Field classification: Two broad fields (Physical and Technical Sciences; Life Sciences including Health Sciences) were defined by mapping each database's subject scheme: Web of Science subject categories, Scopus subject areas, Microsoft Academic top-level fields, and Dimensions ANZSRC Fields of Research codes.
Economic data: UK nominal GDP series (NGDPMPUKA) 1770–2016 from FRED; UK publication counts from Dimensions (1788–2016).
Data preparation: Annual publication counts were transformed to cumulative counts over time, and analyses were conducted on the log-transformed cumulative counts. The first five years of each time series were discarded to remove artifacts, so the start years became 1670 (Dimensions), 1805 (Microsoft Academic), 1905 (Web of Science), and 1866 (Scopus); all series run to 2018.
Statistical modeling:
- Growth functions: Unrestricted exponential growth (log-linear form) vs. logistic (restricted) growth.
- Segmented (piecewise) regression: Time-segmented models with unknown breakpoints that are estimated from the data, allowing a different growth rate per segment.
- Multi-database integration via latent piecewise growth curve modeling with multiple imputation for incomplete early-year coverage.
Missing data handling: Multiple imputation (five imputations) under a Missing at Random (MAR) assumption using MCMC, leveraging information from all time series to impute missing years (notably pre-1900). Parameters from each imputed dataset's segmented model were pooled with Rubin's rules, which combine within- and between-imputation variances.
Model comparison: Models varied by growth function (exponential vs. logistic), mixed vs. fixed effects, covariance components (intercept–slope), and number of segments. Schwarz's BIC was used for model selection; mean square error (MSE) and BIC from MSE were also examined for the UK analyses. Autocorrelation and heteroskedasticity were deemed negligible given R² > 0.99.
Software: SAS (PROC NLMIXED, NLIN, MI, MIANALYZE). Convergence was aided by scaling random effects when needed.
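The data preparation and segmented-regression steps described above can be illustrated with a small sketch. This is not the authors' SAS mixed-model implementation: it is a simplified fixed-effects version in Python with NumPy, using ordinary least squares, a single breakpoint found by grid search, and a BIC computed from the residual sum of squares. The series is synthetic (made up for illustration), and all function and variable names are assumptions.

```python
import numpy as np

def design_matrix(years, breakpoints=()):
    """Intercept, linear time trend, and one hinge term per breakpoint."""
    t = (years - years[0]).astype(float)
    hinges = [np.clip(years - bp, 0, None).astype(float) for bp in breakpoints]
    return np.column_stack([np.ones_like(t), t, *hinges])

def fit(years, y, breakpoints=()):
    """Ordinary least squares; returns (coefficients, residual sum of squares)."""
    X = design_matrix(years, breakpoints)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return beta, rss

def bic(rss, n, k):
    """BIC for Gaussian errors, up to an additive constant."""
    return n * np.log(rss / n) + k * np.log(n)

# Synthetic (hypothetical) data: the cumulative stock grows 3% per year
# through 1950 and 6% per year afterwards, with small noise on annual counts.
years = np.arange(1900, 2019)
rates = np.where(years[1:] <= 1950, 0.03, 0.06)
stock = 1000.0 * np.concatenate(([1.0], np.cumprod(1.0 + rates)))
rng = np.random.default_rng(0)
annual = np.diff(stock, prepend=0.0) * np.exp(rng.normal(0.0, 0.02, years.size))

# Data preparation as described: cumulative counts, then log transform.
log_cum = np.log(np.cumsum(annual))

# Single exponential trend (log-linear) as the baseline.
beta_single, rss_single = fit(years, log_cum)

# Two-segment model: try each candidate breakpoint, keep the best fit.
candidates = years[5:-5]
rss_by_bp = [fit(years, log_cum, (bp,))[1] for bp in candidates]
best_bp = int(candidates[int(np.argmin(rss_by_bp))])
beta_seg, rss_seg = fit(years, log_cum, (best_bp,))

# Parameter counts: intercept + slope vs. intercept + slope + slope change + breakpoint.
n = years.size
bic_single = bic(rss_single, n, k=2)
bic_segmented = bic(rss_seg, n, k=4)

# Slopes on the log scale convert to annual growth rates in percent.
rate_before = (np.exp(beta_seg[1]) - 1) * 100
rate_after = (np.exp(beta_seg[1] + beta_seg[2]) - 1) * 100
print(f"breakpoint: {best_bp}, rates: {rate_before:.2f}% then {rate_after:.2f}%")
print(f"BIC single: {bic_single:.1f}, BIC segmented: {bic_segmented:.1f}")
```

A lower BIC for the segmented fit indicates that the extra parameters are justified by the improvement in fit. Extending the sketch to more segments would mean searching over combinations of breakpoints, which the study handles with nonlinear estimation rather than a grid.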
Key Findings
- Overall unrestricted exponential growth across databases yields an annual growth rate of 4.10% with a doubling time of 17.3 years. Logistic (restricted) growth was rejected in favor of exponential models.
- Segmented models fit substantially better than single-trend models. For all publications and Life Sciences, a four-segment model fit best; for Physical and Technical Sciences, a five-segment model fit best (BIC-based).
- Four historical segments for all publications:
  1) 1675–1809: 2.87% growth; doubling 24.5 years.
  2) 1815–1882 (Industrial Revolution): 5.62%; doubling 12.6 years.
  3) 1881–1952 (economic crises, World Wars): 3.78%; doubling 18.7 years.
  4) 1952–2018 (post-war): 5.08%; doubling 14.0 years.
- Broad fields: Life Sciences: overall growth 5.07%; doubling 14.0 years. Physical and Technical Sciences: overall growth 5.51%; doubling 12.9 years. Post-1945 segment: P&T 5.99% (doubling 11.9 years) vs. Life Sciences 4.79% (doubling 14.8 years).
- Model fit: Mixed-effects segmented models with intercept–slope covariance in early segment improved BIC; residual structures could be treated as identity (R² > 0.99).
- UK comparison (Dimensions data): Science growth (1780 onward) averaged 4.97% annually; doubling 14.3 years; eight growth segments identified. GDP (nominal) averaged 3.05%; doubling 23.1 years; seven segments identified. Scientific growth exceeded GDP growth on average. Temporal coupling observed in specific epochs (e.g., pre-industrialization late 18th–early 19th century, early industrialization 1840s, and strong post-WWII phases), alongside WWII slowdowns in science (e.g., 2.62% in 1940–1948).
- Sensitivity to document types in Microsoft Academic: Including items without document type (excluding patents) produced small differences; an additional WWII segment emerged for all documents (1940–1945).
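The growth rate and doubling time pairs reported above appear consistent with the standard compound-growth relation, doubling time = ln 2 / ln(1 + r), where r is the annual growth rate as a fraction. The summary does not state which formula the authors used, so the following is a minimal sketch under that assumption; the function name is illustrative.

```python
import math

def doubling_time(annual_growth_pct: float) -> float:
    """Years needed to double at a constant compound annual growth rate (percent)."""
    return math.log(2) / math.log(1 + annual_growth_pct / 100)

# Rates taken from the findings listed above.
reported = [
    ("overall, all databases", 4.10),
    ("first segment, 1675-1809", 2.87),
    ("UK nominal GDP", 3.05),
]
for label, rate in reported:
    print(f"{label}: {rate:.2f}% per year, doubling in about {doubling_time(rate):.1f} years")
```

Under this relation, a one percentage point difference in growth rate matters a great deal over long horizons: the gap between 2.87% and 5.62% roughly halves the doubling time.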
Discussion
The findings confirm that while an overall exponential growth pattern characterizes modern science, growth rates are not constant; instead, distinct historical segments align with major economic and political epochs such as industrialization and the World Wars. The multi-database approach strengthens validity by showing highly consistent patterns across different coverage profiles. Logistic (capacity-limited) growth was not supported within the observed horizons. Field-level analyses indicate broadly similar growth dynamics in Life Sciences and Physical and Technical Sciences, with slightly higher post-war growth in P&T. The UK case study shows scientific growth outpacing nominal GDP growth and exhibiting more granular segmentations, with periods of coupling between economic and scientific expansion. The results imply that systemic factors (e.g., resource allocation during wars, post-war investment, industrialization) modulate growth rates around a long-run exponential trajectory. They also suggest that increases in publication output are likely driven by an expanding research workforce rather than higher per-capita productivity, consistent with prior evidence.
Conclusion
This study provides precise, long-horizon estimates of scientific growth by integrating four major bibliographic databases within a latent piecewise growth curve framework with multiple imputation. Key contributions include: robust confirmation of long-run exponential growth (overall 4.10% per year; doubling ~17.3 years), identification of historically interpretable growth segments (four to five segments across datasets and fields), and comparative insights showing UK scientific growth exceeding nominal GDP growth and the global average. Differences between Life Sciences and Physical and Technical Sciences are modest, with slightly higher post-1945 growth in P&T.
Future research should:
- Investigate causal drivers of segment-specific growth rate changes (e.g., policy shifts, funding, technological shocks).
- Examine growth using monodisciplinary databases (e.g., Chemical Abstracts, MEDLINE) to validate and refine field-specific patterns.
- Incorporate real (inflation-adjusted) economic measures and additional national contexts.
- Explore researcher population dynamics and collaboration trends to disentangle workforce size versus individual productivity effects.
Limitations
- Publication counts as a proxy for scientific growth face known issues: least publishable unit practices, disciplinary and journal quality variances, and unequal document types, potentially biasing comparisons.
- Growth interpreted as increases in publication numbers may not map directly to increases in actionable or durable knowledge.
- Early-period database coverage is incomplete and heterogeneous; missing values required multiple imputation under a MAR assumption that cannot be empirically verified.
- Microsoft Academic may be biased toward items with a digital footprint; database selection and coverage strategies differ across sources.
- UK GDP series used nominal values (not inflation-adjusted), limiting direct comparability with scientific growth.
- The study does not empirically identify mechanisms behind segment-specific growth rate shifts; historical interpretations remain associative.