
A mathematical model for the process of accumulation of scientific knowledge in the early modern period
M. Zamani, H. El-hajj, et al.
This study by Maryam Zamani, Hassan El-Hajj, Malte Vogl, Holger Kantz, and Matteo Valleriani presents a mathematical model of how scientific knowledge accumulated in the early modern period. Using the Sphaera corpus of over 350 astronomy textbook editions, it shows how economic constraints shaped the dissemination of scientific knowledge.
Introduction
The study addresses how scientific knowledge circulated and accumulated in early modern Europe, focusing on university astronomy textbooks (15th–17th centuries). Drawing on knowledge economy and evolutionary epistemology, the authors argue that accumulation depends on circulation and on consensus around core knowledge. Building on Stephen Cole’s distinction between core and research-frontier knowledge, they operationalize core knowledge as widely taught textbook content and link rising homogenization to increased circulation. The research question is how to quantify and model the dynamics of knowledge circulation (and hence accumulation) within the Sphaera corpus, and which mechanisms and phases drive these dynamics. The paper uses epidemic-inspired diffusion models (SI and Bass) to capture the non-linear adoption of text-parts across editions, hypothesizing that external factors (captured by the Bass model’s external-influence parameter α) and institutional-economic contexts significantly shape diffusion.
Literature Review
The paper situates itself within literature on the knowledge economy and modernization (Mokyr; Castells; Marginson), evolutionary epistemology (Popper), and sociology of science hierarchies and consensus (Cole; Kuhn). It builds on prior historical network analyses of the Sphaera corpus showing homogenization and epistemic community formation (Valleriani et al. 2019, 2022; Zamani et al. 2020). For modeling, it leverages analogies between information diffusion and epidemics (Goffman; Bettencourt et al.; Pastor-Satorras et al.) and innovation adoption (Bass), as well as works on multiplex networks, layer interactions, and entanglement (Renoust; Škrlj & Renoust). It also draws on book history and early modern print markets (Gingerich; Gehl; Maclean) and Jesuit educational networks (Grendler).
Methodology
Data: The Sphaera corpus comprises 359 editions of astronomy textbooks (1472–1650), representing roughly 350,000 copies at an average print run of 1,000. One copy per edition was analyzed (~76,000 pages). The corpus centers on works related to Sacrobosco’s Tractatus de sphaera; to avoid bias, the Sacrobosco core text-part and links based solely on it were removed after network construction.

Network structure: A directed multiplex network with four semantic layers links editions via text-part relations, with direction determined by publication year. The layers are SOP (Same Original Part): same original text-part; SAP (Same Adaptation Part): same adaptation/commentary/translation; TSOP (Translated Same Original Part): different translations of the same original part; ASAP (Annotated Same Adaptation Part): different commentaries on the same adaptation. Nodes and links per layer: SOP 300 nodes/4658 links; SAP 202/2005; TSOP 22/18; ASAP 40/90.

Layer relations: A Layer Interaction Network (LIN) reduces each layer to a node, with weighted links representing overlapping edges. The aggregated graph has E = 4819 unique links. About 97% of aggregated links are in SOP, while TSOP accounts for ~0.37%. Entanglement (from the leading eigenvector of the overlapping frequency matrix) is γ = 0.91 for SOP, γ = 0.39 for SAP, γ = 0 for TSOP, and γ = 0.02 for ASAP, indicating SOP’s dominant coupling.

Diffusion models: Editions are treated as individuals and text-parts as transmissible units. At time t, editions that have been published and contain the text-parts are infectious; later editions are susceptible; there is no recovery. SI model: di/dt = β i (1−i), whose solution is a logistic sigmoid with rate β. Bass model: di/dt = (α + β i)(1−i), where α is the external influence (advertising/exogenous) and β the internal influence (word-of-mouth/endogenous). Special cases: α = 0 reduces to SI; β = 0 yields exponential saturation; α < 0 indicates resistance to external influence.

Fitting procedure: For each layer, i(t) is the fraction of editions published by time t. Curves were fit using (1) a generic sigmoid and (2) the analytical SI and Bass solutions. Goodness of fit was measured by R².

Component analysis: Layers SOP and SAP were aggregated (focusing on original texts and commentaries) to form a dense graph with 35 components; the three largest (pink/giant, green, blue) were analyzed chronologically and fit with the Bass model.

Network reduction to infer historically plausible links: Recognizing that the maximal-connection assumption over-densifies the network, the authors constructed additional layers based on geography, format, economic conditions, and social/economic awareness, then compared network-driven diffusion curves to the empirical s-curve of cumulative editions by year. Economic layers: EC1 (same format; printers/publishers alive contemporaneously), EC2 (same place; different printers/publishers alive contemporaneously), EC3 (same semantic knowledge; same edition type, one of OT, COT, COMP, COT+COMP, ADAPT; different printers/publishers alive contemporaneously). Awareness layers: AW1 (shared semantic knowledge by different authors alive contemporaneously), AW2 (shared paratexts; different printers/publishers alive contemporaneously), AW3 (similar printer/publisher fingerprint across different printers/publishers alive contemporaneously). Convergence to the empirical s-curve was assessed via R² over time intervals.
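The SI and Bass equations stated above both have closed-form solutions, which is what makes fitting "analytical" curves possible. The following is a minimal sketch, not the authors' code: the function names are hypothetical, the closed forms are derived here by separating variables, and measuring t in years since the first edition is an assumption. A Runge-Kutta integration of the same ODE is included only as a cross-check of the closed form, and `r_squared` is the usual coefficient of determination.

```python
import math


def si_fraction(t, beta, i0):
    """Closed-form solution of di/dt = beta * i * (1 - i) with i(0) = i0."""
    growth = i0 * math.exp(beta * t)
    return growth / (growth + 1.0 - i0)


def bass_fraction(t, alpha, beta, i0):
    """Closed-form solution of di/dt = (alpha + beta * i) * (1 - i), i(0) = i0.

    Separating variables gives (alpha + beta*i) / (1 - i) = K * exp((alpha + beta) * t),
    with K fixed by the initial condition. Assumes alpha + beta != 0 and i0 < 1.
    """
    k = (alpha + beta * i0) / (1.0 - i0)
    e = k * math.exp((alpha + beta) * t)
    return (e - alpha) / (e + beta)


def bass_rk4(t_end, alpha, beta, i0, steps=2000):
    """Integrate the Bass ODE with classical RK4 (numerical cross-check only)."""
    def rate(i):
        return (alpha + beta * i) * (1.0 - i)

    h = t_end / steps
    i = i0
    for _ in range(steps):
        k1 = rate(i)
        k2 = rate(i + 0.5 * h * k1)
        k3 = rate(i + 0.5 * h * k2)
        k4 = rate(i + h * k3)
        i += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return i


def r_squared(observed, predicted):
    """Coefficient of determination; assumes `observed` is not constant."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # SOP parameters as reported in this summary; the time unit (years) is an assumption.
    alpha, beta, i0 = 0.0004, 0.049, 0.0033
    for t in (0, 50, 100, 150):
        exact = bass_fraction(t, alpha, beta, i0)
        numeric = bass_rk4(t, alpha, beta, i0)
        print(f"t={t:3d}  closed-form={exact:.4f}  rk4={numeric:.4f}")
```

Setting alpha to 0 in `bass_fraction` reproduces `si_fraction`, which mirrors the special case noted above. In practice the parameters would be estimated by passing such a function to a least-squares routine, which is omitted here to keep the sketch dependency-free.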
Key Findings
Diffusion patterns by layer: SOP and SAP exhibit classic s-curves; TSOP starts late with sparse growth; ASAP begins later with a rapid rise, then slows abruptly.

Model fit: The Bass model fits better than SI (for SOP, R² = 0.98 for SI versus 0.99 for Bass). The SOP sigmoid fit suggested β ≈ 0.047. Bass fit parameters by layer (95% CI):
SOP: β = 0.049 ± 0.0007; α = 0.0004 ± 0.00002; i0 = 0.0033.
SAP: β = 0.037 ± 0.0005; α = 0.0023 ± 0.0001; i0 = 0.0049.
TSOP: β = 0.031 ± 0.0008; α = 0.0051 ± 0.0002; i0 = 0.045.
ASAP: β = 0.069 ± 0.004; α = 0.0042 ± 0.0007; i0 = 0.025.

Multiplex coupling: SOP dominates (γ = 0.91) and contains ~97% of aggregated links; TSOP shows no coupling (γ = 0) and holds only ~0.37% of links.

Component-level diffusion (SOP+SAP aggregated): Three major components represent distinct phases. Green (1478–1538; second largest) shows near-linear diffusion with higher β; pink/giant (1531–1629; largest) accelerates then slows; blue (1570–1618; third largest) shows a slow start, fast middle, and slow end. Bass fit parameters (95% CI):
Pink: β = 0.044 ± 0.0019; α = 0.0122 ± 0.0005; i0 = 0.010; R² = 0.99.
Green: β = 0.071 ± 0.0084; α = 0.0072 ± 0.0014; i0 = 0.030; R² = 0.99.
Blue: β = 0.194 ± 0.014; α = −0.0091 ± 0.0011; i0 = 0.05; R² = 0.98, indicating strong endogenous diffusion with resistance to external influence.

Historical interpretation: The higher α in TSOP aligns with vernacular translations reaching beyond universities; SOP’s lower α and relatively higher β suggest diffusion driven by endogenous academic mechanisms. ASAP’s high β indicates that large, integrative, review-like works contributed significantly to circulation.

Network reduction and plausibility: Diffusion in the constructed networks was compared to the empirical s-curve. Geographic same-city and same-format links alone offer modest improvement. Among the economic layers, EC2 (same place; contemporaneous, different producers, shared semantics) aligns best (R² ≈ 0.93), suggesting that local markets mattered when producers were active at the same time. The awareness layers show time-structured convergence: AW1 and AW2 match better before 1530, whereas AW3 (fingerprint similarity across contemporaneous producers) aligns best after 1530 (R² ≈ 0.98), indicating that imitation in content and layout drove diffusion under economic/material constraints.
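The overlap percentages quoted above appear consistent with simply dividing each layer's link count by the number of unique aggregated links. The short check below uses only the counts reported in this summary; the function names are hypothetical, and reading the percentages this way is an assumption rather than a statement of how the authors computed them.

```python
# Link counts per layer and the number of unique aggregated links, as reported.
LAYER_LINKS = {"SOP": 4658, "SAP": 2005, "TSOP": 18, "ASAP": 90}
AGGREGATED_UNIQUE_LINKS = 4819


def share_of_aggregate(layer):
    """Fraction of the unique aggregated links that appear in `layer`.

    Assumes every link of a layer is also one of the aggregated links.
    """
    return LAYER_LINKS[layer] / AGGREGATED_UNIQUE_LINKS


def repeated_link_occurrences():
    """Per-layer link counts summed, minus unique links.

    A positive value means some links occur in more than one layer.
    """
    return sum(LAYER_LINKS.values()) - AGGREGATED_UNIQUE_LINKS


if __name__ == "__main__":
    for layer in LAYER_LINKS:
        print(f"{layer}: {share_of_aggregate(layer):.2%} of aggregated links")
    print("repeated occurrences across layers:", repeated_link_occurrences())
```

Under this reading SOP comes out at roughly 96.7% and TSOP at roughly 0.37%, matching the rounded figures in the text. The shares add up to more than 100% because the same link can occur in several layers, which is exactly the overlap the Layer Interaction Network is meant to capture.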
Discussion
Modeling knowledge diffusion via SI/Bass on a semantic multiplex network reveals that the accumulation of early modern astronomical core knowledge was propelled by circulation dynamics exhibiting non-linear s-curves, with the Bass model’s external influence term critical for capturing early uptake patterns. The dominance of SOP and its strong endogenous diffusion (higher β, lower α) indicate internal academic propagation, while TSOP’s higher α reflects exogenous channels through vernacular translations. Component analysis shows two main historical phases: a pre-1530 phase and a post-1530 phase, with the latter encompassing rapid growth and widespread homogenization. The negative α in the blue component suggests strong within-community reinforcement and resistance to outside influences, consistent with Jesuit institutional contexts. Crucially, convergence analyses demonstrate that economic and institutional production contexts, especially post-1530 imitation of successful editions’ content and layout (AW3), best explain the observed diffusion, with geographic proximity playing a secondary, conditional role (EC2) when producers were active contemporaneously. These findings connect higher circulation rates to homogenization and consensus-building in core knowledge, aligning with theories that link consensus to accelerated accumulation. The approach also offers a way to prune overly dense semantic graphs by favoring links consistent with economic-awareness patterns.
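A negative α does not by itself stop diffusion, and a short worked inequality makes the condition explicit. The sketch below follows directly from the Bass equation di/dt = (α + β i)(1−i), evaluated at t = 0 with the blue-component values reported in this summary. It assumes β > 0 and i0 < 1, and the reported i0 = 0.05 may be rounded, so the margin should be read as approximate.

```latex
\[
\left.\frac{di}{dt}\right|_{t=0} = (\alpha + \beta\, i_0)(1 - i_0) > 0
\quad\Longleftrightarrow\quad
i_0 > -\frac{\alpha}{\beta}
\]
\[
\text{Blue component: } -\frac{\alpha}{\beta} = \frac{0.0091}{0.194} \approx 0.0469 < 0.05 = i_0,
\qquad
\left.\frac{di}{dt}\right|_{t=0} \approx (0.0006)(0.95) = 0.00057
\]
```

On these numbers the initial share sits only just above the threshold, so growth starts positive but very slow and speeds up only once the endogenous term β i dominates. This is in line with the slow start and fast middle described for the blue component.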
Conclusion
The paper introduces a quantitative framework that models the accumulation of scientific knowledge by tracing the diffusion of semantic units across early modern astronomy textbooks. Using a multiplex semantic network and fitting SI and Bass models, the authors identify two major phases of circulation, demonstrate the superior fit of the Bass model, and reveal that institutional-economic factors—particularly imitation in content and layout after the 1530s—were primary drivers of diffusion and homogenization of core knowledge. The methodology enables reduction of ambiguous, maximally connected semantic networks by privileging historically plausible links informed by economic and social awareness data. Future research will integrate semantic and awareness layers into a coupled multiplex framework, explore predictive links between communication channels and scientific developments, and test whether similar dynamics and phase transitions characterize other disciplines and epochs.
Limitations
The corpus focuses on astronomy textbooks tied to Sacrobosco’s tradition, which may limit generalizability; although links based solely on the core Sacrobosco text-part were removed, residual bias may persist. The assumption of one representative copy per edition abstracts from intra-edition variation. The network’s initial maximal-connection assumption likely introduces artificial links, necessitating post hoc pruning guided by auxiliary economic and awareness layers. External influence estimates are inferred from diffusion fits and not directly observed. Geographical and economic metadata may be incomplete or uneven across time and regions. The approach captures diffusion of identifiable text-parts but may miss subtler forms of knowledge transfer not codified as explicit semantic relations.