Introduction
The growth of science is a central theme in science of science studies. Early research indicated exponential growth in scientific literature, with a doubling time around 15 years. This rapid growth suggests that the bulk of knowledge remains at the cutting edge. However, accurate estimation of growth rates requires reliable, comprehensive data spanning a long period. Previous studies, often relying on databases like Web of Science, had limitations in their coverage of early modern science. This study addresses these limitations by incorporating data from newer databases, Dimensions and Microsoft Academic, which offer broader historical coverage, along with data from Web of Science and Scopus. The researchers aim to provide a more precise and historically informed assessment of scientific growth, considering both overall publications and those within specific fields (Physical and Technical Sciences and Life Sciences). Furthermore, they analyze the relationship between scientific growth and economic growth in the UK, leveraging historical economic data. The assumption is that while a strong science system is crucial for national wealth, economic growth is a necessary input for a thriving science system. The UK is chosen for this comparison due to the availability of long-term economic and scientific publication data.
Literature Review
Existing literature supports exponential growth in scientific publications. Derek John de Solla Price's work is foundational, suggesting that growth is proportional to the existing body of knowledge. While this exponential growth has been empirically observed, precise estimation of growth rates based on comprehensive and reliable data was lacking. Prior studies using Web of Science data, while valuable, faced limitations in covering early periods of modern science. The introduction of newer databases like Dimensions and Microsoft Academic provides an opportunity for improved accuracy and extended historical analysis. Cited references data has also been used, though it is acknowledged to be a less-than-ideal proxy as it does not capture non-cited publications. The study builds upon previous work, aiming to overcome these limitations and offer a more robust estimation of scientific growth. The literature review also highlights the theoretical link between scientific and economic growth, based on previous scientometric research.
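Price's proportionality argument can be written out as a short derivation. The symbols below are introduced here for illustration and are not taken from the study: N(t) is the number of publications at time t, N_0 the initial number, and r a constant continuous growth rate.

```latex
\frac{dN}{dt} = r\,N(t)
\quad\Longrightarrow\quad
N(t) = N_0\,e^{rt}
\quad\Longrightarrow\quad
\ln N(t) = \ln N_0 + r\,t
```

On a logarithmic scale, exponential growth is therefore a straight line with slope r, so a change in growth rate shows up as a change in slope.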
Methodology
The study uses data from four major bibliographic databases: Web of Science, Scopus, Dimensions, and Microsoft Academic. Data were collected for total publications as well as for two broad scientific fields: Physical and Technical Sciences and Life Sciences. These specific fields were chosen because data on publications serve as reliable proxies for research activity within them. For the comparison of scientific and economic growth, the researchers used data from the UK, obtaining economic data (nominal GDP) from the Federal Reserve Bank of St. Louis (FRED). The GDP data span 1770 to 2016, while the publication counts cover varying historical ranges across databases. Data cleaning involved removing the first five years from each time series to avoid artifacts. The core analytical approach uses a latent piecewise growth curve model, which addresses the challenges posed by the varying time intervals and volumes of data across databases. The model treats incomplete time series as a missing data problem, handled with a multiple imputation procedure (specifically, a Markov Chain Monte Carlo procedure). Each missing value was imputed five times, yielding five imputed datasets that account for the uncertainty inherent in imputation. Segmented regression is applied within each imputed dataset, considering unrestricted exponential growth and, in some cases, logistic growth. The final results combine the findings across the five imputed datasets. The models are compared using Schwarz's Bayesian Information Criterion (BIC), with the lowest-BIC model selected as the best fit to the data. Statistical analyses were performed using SAS software.
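The comparison between unrestricted exponential growth and segmented growth can be illustrated with a minimal Python sketch. It is not the study's SAS implementation: it uses a single made-up series, plain least squares on log counts instead of a latent piecewise growth curve model, no multiple imputation, and a grid search for a single knot. The function names, the synthetic data, and the way parameters are counted for BIC are all assumptions made for illustration.

```python
import numpy as np

def fit_piecewise(years, log_counts, knots):
    """Least-squares fit of a continuous piecewise-linear trend to log counts."""
    t = years - years[0]  # shift the time axis for numerical stability
    hinges = [np.maximum(years - k, 0.0) for k in knots]  # slope changes at knots
    X = np.column_stack([np.ones_like(t), t] + hinges)
    coef, *_ = np.linalg.lstsq(X, log_counts, rcond=None)
    resid = log_counts - X @ coef
    rss = float(resid @ resid)
    n_params = X.shape[1] + len(knots)  # coefficients plus knot locations (assumed count)
    return coef, rss, n_params

def bic(rss, n_obs, n_params):
    """BIC for a Gaussian error model, up to an additive constant."""
    return n_obs * np.log(rss / n_obs) + n_params * np.log(n_obs)

# Synthetic (made-up) series: log-scale slope 0.02 until 1950, 0.05 afterwards.
rng = np.random.default_rng(0)
years = np.arange(1900, 2000, dtype=float)
true_log = 5.0 + 0.02 * (years - 1900) + 0.03 * np.maximum(years - 1950, 0.0)
log_counts = true_log + rng.normal(0.0, 0.05, size=years.size)

# Model A: one exponential over the whole period (no knots).
_, rss_exp, k_exp = fit_piecewise(years, log_counts, [])
bic_exp = bic(rss_exp, years.size, k_exp)

# Model B: two segments, knot chosen by grid search over candidate years.
candidates = np.arange(1910, 1991, dtype=float)
fits = [fit_piecewise(years, log_counts, [k]) for k in candidates]
best = int(np.argmin([f[1] for f in fits]))
best_knot = candidates[best]
coef_seg, rss_seg, k_seg = fits[best]
bic_seg = bic(rss_seg, years.size, k_seg)

# Slopes on the log scale; exp(slope) - 1 is the implied annual growth rate.
slopes = np.cumsum(coef_seg[1:])
print(f"BIC exponential: {bic_exp:.1f}, BIC segmented: {bic_seg:.1f}")
print(f"estimated knot: {best_knot:.0f}, annual rates: {np.exp(slopes) - 1}")
```

The hinge terms keep the fitted curve continuous at each knot, so each coefficient after the first slope is a change in growth rate rather than a separate line. Extending the sketch to four or five segments would mean searching over several knots at once.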
Key Findings
The unrestricted growth model showed an overall annual growth rate of 4.10% for all publications, with a doubling time of 17.3 years. However, model comparison using BIC showed that segmented regression models (with four or five segments) fit the data better, as indicated by their lower BIC values. These segments align with historical periods: 1) Emergence of modern physics and pre-industrialization (1675–1809) with a moderate growth rate; 2) Industrial Revolution (1815–1882) with strong growth; 3) Economic crises and World Wars (1881–1952) with slower growth; and 4) Post-war period (1952–2018) with strong growth. Similar patterns were observed when Life Sciences and Physical and Technical Sciences were analyzed separately. In the post-World War II period, Physical and Technical Sciences demonstrated slightly higher growth than Life Sciences. A comparison of scientific and economic growth in the UK, using data from Dimensions and FRED, showed that the annual growth rate of publications (4.97%) exceeded the GDP growth rate (3.05%), although both were characterized by distinct periods of growth and decline. The economic data used are nominal, not inflation-adjusted. The UK data showed even more distinct segments than the global data, with eight segments fitting better than a continuous growth model. Analyzing all publications in Microsoft Academic, including those with and without known document types, revealed only small differences in results, suggesting that the findings are robust. The analysis also found a negative covariance between the intercept and slope for the first segment across all databases, meaning that a higher initial publication volume is associated with a lower growth rate in the initial period, and vice versa.
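The link between an annual growth rate and a doubling time can be checked with a few lines of Python. This is a minimal sketch that assumes the reported percentages are annually compounded rates; the helper name `doubling_time` is hypothetical. Under that assumption, the 4.10% rate gives a doubling time of roughly 17.3 years, in line with the figure reported above.

```python
import math

def doubling_time(annual_rate_percent):
    """Years needed to double at a constant, annually compounded growth rate."""
    return math.log(2) / math.log(1 + annual_rate_percent / 100)

# Growth rates as reported in the findings above.
reported_rates = {
    "all publications (unrestricted model)": 4.10,
    "UK publications": 4.97,
    "UK nominal GDP": 3.05,
}

for label, rate in reported_rates.items():
    years = doubling_time(rate)
    print(f"{label}: {rate:.2f}% per year -> doubling in {years:.1f} years")
```

Because the doubling time depends on the logarithm of the growth factor, the gap between the UK publication and GDP growth rates translates into a noticeably shorter doubling time for publications.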
Discussion
The findings suggest that scientific growth is not a uniform, continuous process. The segmented growth models highlight the influence of historical events, economic development, and socio-political shifts on the rate of scientific publication. The study successfully employs a sophisticated statistical model to integrate data from multiple sources, addressing challenges associated with incomplete and varying datasets. The consistently high R-squared values across models indicate a strong fit to the data, lending confidence to the results. While publication counts are a useful proxy for scientific output, the study acknowledges limitations around quality, disciplinary variations, and the interpretation of “growth” in terms of actionable knowledge.
Conclusion
This study provides a robust, historically informed analysis of scientific growth using data from multiple databases. The results demonstrate that segmented growth models outperform simple exponential models, revealing the influence of historical contexts on scientific development. Future research could explore the reasons behind the variations in growth rates across different segments and fields. Expanding the analysis to include mono-disciplinary databases would provide further insights. Further analyses using inflation-adjusted GDP could improve the comparability of scientific and economic growth rates.
Limitations
The study acknowledges limitations inherent in using publication counts as the sole measure of scientific growth. Variations in publication quality and disciplinary differences are not fully addressed. Furthermore, the interpretation of “growth” in purely quantitative terms may not fully capture the complexity of scientific progress. The economic data used was not inflation-adjusted, potentially affecting the comparison of scientific and economic growth rates.