In the race for knowledge, is human capital the most essential element?

L. Sinay, R. W. Carter, et al.

This research challenges the belief that human capital is the primary driver of scientific progress. Conducted by Laura Sinay, Rodney William Carter, and Maria Cristina Fogliatti de Sinay, it reveals biases in language, gender, and funding that skew researcher visibility, urging a rethink of algorithmic fairness in science.

Introduction

Every year Clarivate Analytics, the company that manages Web of Science, lists highly cited researchers based on an analysis of its database of published peer-reviewed articles. Early in the 2019 report, the company asked who would contest that, in the race for knowledge, human capital is most essential. It amplified this by stating that talent (including intelligence, creativity, ambition, and social competence) outpaces other capacities such as access to funding and facilities. This claim contradicts previous findings suggesting that other elements reflected in the algorithms of automated search engines, such as gender, language, and funding, might significantly restrict the development of scientific knowledge. Using a randomly selected group of scholars listed in the Clarivate Analytics database for 2018, the study explored whether other factors might be as important as, or more important than, human capital in the race for knowledge development. The authors hypothesized that if human capacity and talent are the most essential criteria for knowledge development, then the prominent scholars who emerge from databases should be evenly distributed across gender, level of country development, access to funding, and the languages spoken in the countries where scholars are affiliated.

Literature Review

The paper reviews several elements beyond talent that may shape scientific development:

  • Gender: Numerous studies report that female scholars’ influence is negatively affected by stereotypes, family commitments, implicit favoritism in academic decisions, and lower research productivity relative to men; globally, under 30% of scholars are women (Ceci and Williams, 2011; Moss-Racusin et al., 2012; Nielsen, 2016; Cooper et al., 2019; UNESCO, 2019; Mairesse and Pezzoni, 2015; Mayer and Rathmann, 2018).
  • Language: English dominance in science disadvantages non-fluent scholars; reviewers often focus on English quality over scientific content (American Society for Cell Biology, 2012; Drubin and Kellogg, 2017; Amano et al., 2016).
  • Funding: Access to and scale of funding influence research productivity, with effects varying by funding amount and structure (Jacob and Lefgren, 2011; Vlăsceanu and Hâncean, 2015; Cattaneo et al., 2016; Hottenrott and Lawson, 2017; Kem, 2010; Rosenbloom et al., 2015).
  • Search engine algorithms and bibliometrics: Originating from Garfield’s work, citation-based algorithms underpin major scholarly databases and influence journal purchasing, publication strategies, evaluation, and funding decisions. Assumptions such as Bradford’s law bias discovery toward a core of high-impact, predominantly English-language journals, excluding quality work in other languages and in moderate/low-impact venues, especially in arts and social sciences (Garfield, 1956, 1965, 1970, 2007; Clarivate Analytics, 2018b, 2019b, 2020a,b; Elsevier, 2019; Google Scholar, 2019; Hicks et al., 2015; Bol et al., 2018; Sinay et al., 2019a).
  • Productivity metrics: Emphasis on publication counts in high-impact journals disadvantages scholars from less developed countries (greater teaching loads) and women (lower average publication rates), and ignores implausibly high output by some top scholars; large coauthorship fosters cross-citations (Boyer, 1990; American Society for Cell Biology, 2012; Mairesse and Pezzoni, 2015; Nielsen, 2016; Mayer and Rathmann, 2018; Noorden and Chawla, 2019; Sinay et al., 2019a).
  • Citations as success: Citation counts ignore negative citations and may reinforce dominant paradigms, especially with massive coauthorship enabling cross-citation (Garfield, 1970; Clarivate Analytics, 2020b; Sinay et al., 2019a; Agnieszka et al., 2019).
  • Merton’s norms and bias: Although Merton’s norms claim personal attributes are irrelevant, scholars’ perspectives influence research problems, methods and values; algorithms lack buffering against personal characteristics, perpetuating bias (Merton, 1942; Merton and Garfield, 1979; Brightman, 1939; Ihde, 2002; Angermuller, 2017; Clarivate Analytics, 2020b; Sinay et al., 2019a).

The literature thus suggests gender, language, funding, and algorithmic design significantly affect who advances in the race for knowledge, contrasting with the view that talent alone is decisive.

Methodology

Design and data source: A random sample was drawn from Clarivate Analytics’ Highly Cited Researchers (HCR) 2018 database, which includes 3539 authors across 22 fields. A random number generator selected three letters (A, K, O). Within each field, the first three scholars whose last names began with each of these letters were chosen. If first names were unavailable (preventing gender identification), the next listed scholar was selected. Where fewer than three scholars per letter existed, subsequent scholars were used to complete the sample. This yielded 198 scholars (22 fields × 3 letters × 3 scholars), representing a random selection rather than the absolute top of the HCR list.

Variables examined: gender; access to funding (sponsorship); journal of publication; number of co-authors; language (English commonality in country of primary affiliation); and country human development level.

Data collection: Country of primary affiliation from HCR (self-specified) was matched to the UNDP Human Development Report 2019 to classify development level (low, medium, high, very high). English commonality was defined as countries where at least 30% of the population speaks English, based on the CIA World Factbook and, as needed, Wikipedia. Gender was identified using Google Photos. UNESCO statistics provided country-level gender distribution of researchers. For each selected scholar, the most recent highly cited publication (flagged in Web of Science as being in the top 1% by field and year) up to December 2018 was retrieved. From these publications, the number of citations (collected August 2019), number of co-authors, sponsors (as disclosed), and journal were recorded. If funding was not disclosed, the research was considered unsponsored.

Analysis: Descriptive statistics (average, min, max, standard deviation, median, mode) summarized distributions for the variables. Outliers were identified as values greater than three standard deviations above the mean, and some analyses reported results with and without such outliers.
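The sampling rule described under "Design and data source" can be sketched in a few lines of Python. This is a minimal illustration, not the authors' actual procedure: the function name `sample_field`, the tuple layout, and the example roster are hypothetical, and the sketch assumes each field's list is ordered alphabetically by last name and that "subsequent scholars" means the names that follow next in that order.

```python
LETTERS = ("A", "K", "O")  # letters drawn by the random number generator in the study
PER_LETTER = 3             # scholars taken per letter within each field


def sample_field(scholars, letters=LETTERS, per_letter=PER_LETTER):
    """Pick scholars for one field (hypothetical helper).

    `scholars` is a list of (last_name, first_name) tuples, assumed to be
    sorted alphabetically by last name. `first_name` is None when no first
    name is listed, in which case the scholar is skipped.
    """
    picked = []
    for letter in letters:
        # First scholar whose last name sorts at or after the letter.
        start = next(
            (i for i, (last, _) in enumerate(scholars) if last.upper() >= letter),
            None,
        )
        if start is None:
            continue
        taken = 0
        # Walk forward so that later names fill any shortfall (assumption).
        for last, first in scholars[start:]:
            if taken == per_letter:
                break
            if first is None or (last, first) in picked:
                continue
            picked.append((last, first))
            taken += 1
    return picked


# Hypothetical roster for a single field, for illustration only.
roster = [
    ("Adams", "Ann"), ("Allen", None), ("Alvarez", "Luis"), ("Baker", "Tom"),
    ("Kato", "Yuki"), ("Kim", "Min"), ("Klein", "Eva"), ("Kumar", "Raj"),
    ("Olsen", "Per"), ("Ono", "Aiko"), ("Park", "Jin"),
]
picked = sample_field(roster)
print(len(picked), picked)
```

In this made-up roster, the scholar without a first name is skipped and the next listed name completes the group for the letter A. Applied to 22 fields, three letters, and three scholars per letter, the rule gives the 198 scholars reported in the study.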

Key Findings

Sample composition and affiliations (N=198):

  • 49.5% of scholars are affiliated with institutions in the USA; the remaining 100 scholars span 29 countries, leaving 83% of UN-recognized countries unrepresented.
  • 98% of scholars are primarily affiliated with very-high Human Development countries; none are primarily affiliated with low-HDI countries; one is affiliated with a medium-HDI country (also co-affiliated with a high-HDI country).

Gender:

  • 83% of the sample are male. For the subset of 21 countries with UNESCO gender data, 80% of sampled top scholars are male versus a country average of 67% male researchers (SD=8; min 51; max 85), indicating additional barriers for women to reach highly cited status.
  • Among women in the sample (N=33), all are affiliated with very-high HDI countries where English is commonly spoken; none are affiliated with Latin American or African countries; only two are affiliated with Asian countries (both South Korea).

Language:

  • 93% of scholars are affiliated with countries where at least 30% of the population speaks English. Countries with lower English prevalence (e.g., Spain, Japan, China, Turkey, Brazil) have fewer affiliated top scholars in the sample than English-dominant countries like the UK, underscoring English fluency advantages.

Authorship patterns:

  • Authors per highly cited paper range from 1 to 2834. Mean = 56.3 (SD=229; median=11; mode=4). Excluding outliers (>3 SD; N=196) reduces mean to 36.5 (SD=84). High coauthorship indicates collaborative, often large-team science, facilitating publication volume and cross-citation.

Funding:

  • 90% of authors identified sponsors. Among 23 authors who did not report sponsorship, only one is not affiliated with a very-high HDI country. Interpreting non-disclosure as no sponsorship yields mean sponsors per paper = 7.26 (SD=22.2; median=1; mode=1); excluding outliers (N=196) mean = 5.08 (SD=5.3).

Journals and language of publication:

  • The 198 papers appeared in 130 journals; all papers and journals were in English. Average publications per journal = 1.5 (SD=1.55; median=1; mode=1). Excluding outliers (N=152), mean = 1.23 (SD=0.57).
  • Concentration: Nature and its family accounted for 17.6% (35 papers), Science 5.1% (10 papers), and other prominent families (e.g., IEEE, Lancet, NEJM, JAMA) contributed substantial shares. Nearly half (45%) of papers were published within just 10 journal families, despite an estimated 40,000 active scholarly journals, indicating strong venue concentration and English-language dominance.

Synthesis of seven salient observations:

  1. No women in the sample from Latin America, Africa, Asia (except two from South Korea), or Oceania.
  2. No affiliations in low-HDI countries; only one in a medium-HDI country.
  3. No papers or journals in languages other than English.
  4. At least 90% of outputs had sponsorship.
  5. 94% of research developed in institutions in very-high HDI countries.
  6. 83% of scholars are males in very-high HDI, English-speaking countries.
  7. Almost half of the most visible researchers are affiliated with US institutions.

Overall, findings challenge the claim that talent alone drives scientific advancement; structural factors (language, gender, funding, and facilities) strongly influence who is recognized as highly cited.

Discussion

The study’s findings directly challenge Clarivate Analytics’ assertion that talent is the most essential element in the race for knowledge. The pronounced skew toward male scholars affiliated with very-high HDI, English-speaking countries, the English-only publication pattern, heavy reliance on sponsorship, and concentration in a narrow set of elite journals suggest that systemic factors and algorithmic practices privilege particular scholar profiles. In the absence of evidence linking talent to gender, national development level, or language, the observed disparities point to structural biases rather than differences in innate ability. These biases matter because they can narrow the scope of scientific inquiry, potentially excluding pressing local perspectives (e.g., African agricultural challenges or classroom-centered understandings of ADHD) and undermining public trust in science by limiting diversity of viewpoints, sponsors, and contexts. The paper argues that search engines and bibliometric algorithms (e.g., Web of Science) can and should be updated to incorporate broader sources, mitigate English and venue biases, and buffer against gender and development-level inequities. Doing so would diversify the pool of recognized contributors, enhance the robustness and validity of scientific knowledge, and potentially improve public confidence in science, which is critical in contexts such as climate change and pandemics.

Conclusion

This paper demonstrates that among highly cited researchers, recognition is not primarily determined by human capital alone; instead, language, gender, funding, and institutional resources linked to national development exert substantial influence. The work challenges widely used bibliometric assumptions and the narrative that talent outpaces access to funding and facilities, showing that current algorithmic and systemic practices create a narrow scholar profile that may compromise the validity and inclusivity of science.

Main contributions:

  • Empirical evidence from a random HCR 2018 sample quantifying biases related to gender, language (English dominance), country development level, sponsorship, and journal concentration.
  • Identification of structural mechanisms—bibliometric algorithms and selection biases—that amplify these disparities.
  • Policy-relevant recommendation that search engines and databases revise algorithms and expand indexed sources to elevate diverse scholarship beyond elite English-language venues and very-high HDI institutions.

Future research directions:

  • Extend analyses across multiple HCR cohorts and other databases (e.g., Scopus, Google Scholar) to test generalizability over time and platforms.
  • Incorporate additional factors such as ethnicity, race, and religion where data permit, and examine subfield variations.
  • Evaluate specific algorithmic interventions (e.g., weighting for language diversity, inclusion of regional journals) and their impacts on visibility and citation patterns.
  • Investigate causality between sponsorship patterns and recognition, including industry and government influences, and explore interventions to support underrepresented scholars and venues.

Limitations

  • Variable coverage: Some potentially relevant factors (e.g., religion, ethnicity) were excluded due to data unavailability.
  • Sample frame: The study uses the HCR 2018 list; while randomly sampled within that frame, it reflects scholars already filtered by citation-based criteria and may not generalize to the broader scientific community.
  • Gender identification: Determined via Google Photos, which may introduce classification error.
  • Funding disclosure: Non-disclosure was treated as no sponsorship, which may underestimate actual funding.
  • Language proxy: English commonality was approximated by countries with ≥30% English speakers, which may not reflect individual scholars’ language proficiency.
  • Publication selection: Analysis focused on each scholar’s most recent highly cited paper up to Dec 2018, which might not capture their full publication profile.
  • Outlier handling: Some analyses reported with and without outliers (>3 SD), which affects summary statistics and interpretations.
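
The outlier rule named in the methodology (values more than three standard deviations above the mean) can be sketched as follows, to show how strongly one extreme value shifts the summary statistics. This is an illustrative sketch only: the co-author counts are made up apart from the reported maximum of 2834, the helper names `summarize` and `drop_high_outliers` are hypothetical, and the use of the sample standard deviation is an assumption, since the summary does not say which form was used.

```python
from statistics import mean, median, mode, stdev


def summarize(values):
    """Descriptive statistics of the kind listed in the methodology."""
    return {
        "n": len(values),
        "mean": mean(values),
        "sd": stdev(values),  # sample SD (assumption)
        "median": median(values),
        "mode": mode(values),
        "min": min(values),
        "max": max(values),
    }


def drop_high_outliers(values, k=3.0):
    """Remove values more than k standard deviations above the mean."""
    cutoff = mean(values) + k * stdev(values)
    return [v for v in values if v <= cutoff]


# Hypothetical co-author counts; only the maximum (2834) comes from the study.
coauthors = [1, 2, 3, 4, 4, 4, 5, 6, 7, 8, 9, 10, 11, 11, 12, 14, 15, 18, 20, 2834]

before = summarize(coauthors)
trimmed = drop_high_outliers(coauthors)
after = summarize(trimmed)

print("with outliers:   ", before)
print("without outliers:", after)
```

In this made-up data the single large-collaboration paper pulls the mean far above the median, and removing it brings the two close together. The same pattern appears in the reported figures, where the mean number of authors falls from 56.3 to 36.5 once two outliers are excluded while the median is 11.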