Introduction
International inequalities in scientific knowledge production hinder the generation and flow of knowledge. While journal prestige and research quality influence visibility, national scientific infrastructures and reputations also play a significant role, affecting funding, research scale, and even library subscription practices. Research from wealthier nations, which have superior resources and established reputations, tends to receive more citations than comparable research from less-resourced countries, particularly those in the Global South. This paper addresses the difficulty of systematically studying these inequalities by introducing a novel framework called 'citational lensing'. The framework uses the relationship between citations and textual similarity to identify countries that receive disproportionately many or few citations relative to their research content.
Literature Review
The study builds upon existing research using citation networks and text analysis to understand the flow of ideas in science. Prior studies have noted a misalignment between citation networks and textual similarity between scientific fields, with some suggesting that models of scientific diffusion should incorporate both. However, this misalignment has not been systematically studied in the context of international scientific inequalities. This work addresses this gap by extending the analysis to the international level, arguing that these misalignments are not merely a methodological concern but carry practical significance, reflecting various factors including overall research quality and national reputations.
Methodology
The authors represent international science as a multiplex network with three layers: (1) *L*<sub>citation</sub>, the citation network between countries; (2) *L*<sub>text</sub>, the textual similarity of research output between countries; and (3) *L*<sub>distortion</sub> (the 'citational well'), the difference between citation flow and textual similarity, constructed by subtracting the textual similarity network from the citation network. The data come from the Microsoft Academic Graph (MAG) and cover nearly 20 million papers across 150 fields from 1980 to 2012, including citations and abstract texts. Textual similarity is estimated with labeled latent Dirichlet allocation (LDA), a supervised topic model that takes the nationalities of the authors into account to capture how ideas and concepts are associated with specific countries, yielding a national research signature for each country. Kullback-Leibler divergence (KLD) then measures the information loss between these national research signatures. The authors use a semi-partialing quadratic assignment procedure (QAP) to model the relationship between citation networks and textual similarity. Countries are further classified into 'core' and 'periphery' groups based on their scientific prominence, and the analysis controls for factors such as citation inflation and journal selection. The results are presented in plots, maps, and tables that track changes over time and across research areas.
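The construction of the distortion layer can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy numbers, the use of negative KLD as a similarity score, and the z-scoring of both layers before subtraction are all assumptions made here so that the two layers are on a comparable scale. The topic distributions stand in for the national research signatures that a labeled LDA model would produce.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions, smoothed to avoid log(0)."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

def zscore_offdiag(m):
    """Standardize off-diagonal entries of a square matrix; diagonal stays 0."""
    m = np.asarray(m, dtype=float)
    mask = ~np.eye(m.shape[0], dtype=bool)
    vals = m[mask]
    out = np.zeros_like(m)
    out[mask] = (vals - vals.mean()) / vals.std()
    return out

def distortion_layer(citations, signatures):
    """Citation layer minus text layer, both standardized (an assumption).

    citations[i, j]: citations from country i to country j (assumed orientation).
    signatures[i]:   topic distribution for country i (hypothetical stand-in
                     for a labeled-LDA national research signature).
    """
    n = len(signatures)
    kld = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                kld[i, j] = kl_divergence(signatures[i], signatures[j])
    # Assumption: lower divergence means higher similarity, so negate KLD.
    text_similarity = -kld
    return zscore_offdiag(citations) - zscore_offdiag(text_similarity)

# Toy data for three hypothetical countries (not from the paper).
signatures = np.array([[0.7, 0.2, 0.1],
                       [0.6, 0.3, 0.1],
                       [0.1, 0.2, 0.7]])
citations = np.array([[0, 50, 5],
                      [40, 0, 2],
                      [30, 10, 0]])
distortion = distortion_layer(citations, signatures)
```

Under these assumptions, a positive entry `distortion[i, j]` means country `i` cites country `j` more than their textual similarity alone would suggest, and a negative entry means less.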
Key Findings
The analysis reveals a growing gap between core and periphery countries in citational distortion. Core countries (primarily Western Europe, East Asia, and the United States) consistently receive more citations than would be expected given their research content, while peripheral countries are significantly undercited. This gap widens over time, particularly in the physical and mathematical sciences. Although the number of countries participating in global science increases, the distribution of citations remains highly stratified. The United States consistently shows the highest positive distortion (overcitation), followed by other leading scientific nations. China shows a notable rise in citational distortion over time, moving from undercitation to overcitation. Apart from such major players, most countries remain consistently undercited or consistently overcited over time, indicating a relatively stable international scientific hierarchy. The authors also examine citational distortion across scientific fields and transnational regions, highlighting variations in its magnitude. The rankings and patterns of overcited and undercited countries are broadly consistent across fields. The findings are robust to different methods of accounting for citation inflation, journal selection, and the language of the abstracts analyzed.
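One way to read the core-periphery gap is as a difference in the average distortion each group receives. The sketch below is an illustration under stated assumptions, not the authors' estimator: the function names, the column-mean aggregation, the orientation of the matrix (rows cite, columns are cited), and the toy values are all hypothetical.

```python
import numpy as np

def received_distortion(distortion):
    """Mean distortion each country receives from all others (diagonal excluded)."""
    d = np.asarray(distortion, dtype=float)
    n = d.shape[0]
    mask = ~np.eye(n, dtype=bool)
    return (d * mask).sum(axis=0) / (n - 1)

def core_periphery_gap(distortion, is_core):
    """Average received distortion of core countries minus that of periphery countries."""
    recv = received_distortion(distortion)
    is_core = np.asarray(is_core, dtype=bool)
    return float(recv[is_core].mean() - recv[~is_core].mean())

# Toy distortion matrix for four hypothetical countries (not from the paper).
# Rows are citing countries, columns are cited countries (assumed orientation).
toy = np.array([[0.0, 1.0, -0.5, -1.0],
                [1.5, 0.0, -0.5, -1.0],
                [1.0, 0.5, 0.0, -0.5],
                [0.5, 0.5, -1.0, 0.0]])
is_core = [True, True, False, False]

recv = received_distortion(toy)
gap = core_periphery_gap(toy, is_core)
```

In this toy example the gap is positive because the first two countries receive positive distortion on average and the last two receive negative distortion. Computing the same quantity for each year of a real distortion layer would show whether the gap widens over time.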
Discussion
The findings confirm and expand upon previous research on citation inequality in science. The rise in China's citational distortion aligns with other bibliometric data showing its growing scientific influence. However, the results for European countries do not fully align with prior work on citation inequality among elite researchers. The most significant limitation lies in the measurement of textual similarity, which is inherently noisy. Although the findings provide insight into the prominence and recognition of research, methodological refinements would yield greater precision. Despite this limitation, the framework is adaptable: the method can be extended to analyze other aspects of international scientific communication or applied to other actors such as journals or universities.
Conclusion
Citational lensing offers a valuable tool for understanding international inequalities in scientific knowledge production. The framework identifies countries with successful scientific enterprises, as measured by citation counts, and highlights the growing gap between core and periphery nations. This underscores the need for policies to address citation distortion and promote inclusivity in global science. Future research should focus on refining the methodology and investigating the specific factors that contribute to the observed citation distortions.
Limitations
The main limitation is the inherent noise in measuring textual similarity between countries. While the study controls for several factors, other variables such as research quality, funding levels, and national reputations might influence the results and require further investigation. The use of English-only abstracts might also limit the representation of non-English-speaking countries. The relatively stable hierarchy over time warrants further investigation to determine whether it reflects a truly inert system or the slow pace of systemic change in science.