Second-order Citations in Altmetrics: A Case Study Analyzing the Audiences of COVID-19 Research in the News and on Social Media

Interdisciplinary Studies

J. P. Alperin, A. Fleerackers, et al.

This study by Juan Pablo Alperin, Alice Fleerackers, Michelle Riedlinger, and Stefanie Haustein shows that social media posts sharing news stories about research (second-order citations) are roughly twice as numerous as posts linking directly to the research itself (first-order citations), and that they attract more engagement. The findings highlight the role of news media in bringing research to the public, particularly in the context of COVID-19.

Introduction
The study addresses whether altmetrics can capture societal impact beyond academic audiences and how second-order citations (social media posts linking to news stories that cite research) compare to first-order citations (posts linking directly to research). Prior work shows that social media mentions correlate only weakly with citations and that social media sharing of research is largely driven by academics. Given this, the authors propose examining venues where research circulates among non-academic audiences, particularly the news media. They argue that tracking news coverage of research and its subsequent spread on social platforms may better reveal public engagement with science. The research question is: How does social media engagement with news stories about research compare to engagement with the research articles themselves? The COVID-19 pandemic provides a salient case to explore this approach.
Literature Review
The paper reviews altmetrics literature indicating: (1) low correlations between social media mentions and traditional citations; (2) the predominance of academic users sharing research on platforms like Twitter and Facebook; and (3) the limitations of traditional altmetrics for capturing public audiences. It highlights the role of news media in shaping public discourse and as a key source of science information. While tools like Altmetric and PlumX track news mentions of research, few studies have leveraged these data to understand audiences, focusing instead on which outlets cover research and journalistic practices. The concept of second-order citations (Priem & Costello, 2010) is introduced as a means to observe indirect public engagement with research via news sharing, with prior indications that such audiences may be broader and more representative than those directly sharing research.
Methodology
Design: Case study of COVID-19-related research to measure and compare first-order versus second-order citations on Twitter and Facebook.
- Corpus of research: Identified all COVID-19-related articles (Jan 1, 2020–Dec 31, 2020) using National Library of Medicine search terms (Chen et al., 2020). Limited to two preprint servers (bioRxiv, medRxiv) and two journals (Journal of Virology, British Medical Journal), yielding 3,934 research articles.
- News stories: Queried Altmetric Explorer for news stories mentioning any of the 3,934 articles. Altmetric detects mentions via links, DOIs, or textual cues. To improve accuracy, analysis was restricted to five widely circulating outlets: BBC, MSN, The New York Times, The Guardian, The Washington Post. Result: 344 research articles (8.7%) were mentioned 1,406 times across 1,221 unique news stories. Each article was mentioned on average 4.1 times (SD=6.5).
- Social media collection windows and tools: Between Mar 9 and Apr 9, 2021, collected historical posts dated Jan 1, 2020–Jan 31, 2021.
  • Twitter: Python scripts using Twint to collect tweets containing links to (a) the 344 research articles (first-order) and (b) the 1,221 news stories (second-order). Resolved DOIs via dx.doi.org and doi.org patterns and final landing URLs; expanded shortened URLs from Altmetric for news.
  • Facebook: Used CrowdTangle to extract publicly accessible posts from pages, groups, and profiles (public spaces) linking to research articles or news stories. Note: public spaces represent a subset of total Facebook activity for ethical/data access reasons.
- Yields: 50,299 tweets linking to 325 (94.5%) research articles and 97,235 tweets linking to 486 (39.8%) news stories; 6,420 Facebook posts linking to 246 (71.5%) research articles and 14,081 posts linking to 516 (42.3%) news stories.
- Final dataset: (1) 344 research articles; (2) 1,221 news stories; (3) 50,299 tweets + 6,420 Facebook posts with first-order citations; (4) 97,235 tweets + 14,081 Facebook posts with second-order citations. Some posts cited multiple research articles or news stories.
- Researcher identification: Compared Twitter user accounts in the dataset to a reference dataset of 423,920 researcher accounts (Mongeon, Bowman & Costas 2022, 2023) to estimate the share of researchers among accounts posting first- vs second-order citations.
- Statistical analysis: Using Python/Pandas, computed counts of posts, accounts, and engagement (retweets/shares, likes/reactions, replies/comments). For each research article, summed first-order posts and corresponding second-order posts. Calculated Spearman correlations: within-platform and cross-platform comparisons of first- vs second-order citations. Compared overlaps of user IDs (Twitter) and Facebook public space IDs between first- and second-order shares. All scripts and data available (Alperin 2023a,b).
Key Findings
- Magnitude of sharing and engagement: Second-order citations substantially exceeded first-order citations on both platforms (approximately 2x for posts and accounts; Facebook engagement differed by larger factors).
  • Twitter posts: 50,299 (first-order) vs 97,235 (second-order)
  • Facebook posts: 6,420 (first-order) vs 14,081 (second-order)
  • Twitter accounts: 27,771 (first-order) vs 62,290 (second-order)
  • Facebook spaces: 3,976 (first-order) vs 8,191 (second-order)
  • Twitter engagement: 227,041 retweets, 512,308 likes, 39,788 replies (first-order) vs 412,509 retweets, 1,111,458 likes, 89,509 replies (second-order)
  • Facebook engagement: 89,422 shares, 176,890 reactions, 36,203 comments (first-order) vs 412,104 shares, 1,476,174 reactions, 304,614 comments (second-order)
- Correlation patterns:
  • High Spearman correlations across platforms for the same citation type: news tweets vs news Facebook posts ρ=0.95; research tweets vs research Facebook posts ρ=0.84.
  • Very low within-platform correlations between first- and second-order citations: ρ=0.13 (tweets); ρ=0.02 (Facebook posts).
  • Near-zero or null cross-platform cross-type correlations: ρ=−0.01 (first-order Facebook vs second-order tweets); ρ=0.00 (first-order tweets vs second-order Facebook).
- Audience overlap:
  • Twitter overlap small: 14.0% of 27,771 accounts that shared research also shared news stories; 6.4% of 60,296 accounts that shared news stories also shared research.
  • Facebook overlap small: 22.6% of 3,976 public spaces sharing research also shared news stories; 11.0% of 8,191 public spaces sharing news also shared research.
- Researcher presence among Twitter accounts:
  • Using the Mongeon et al. researcher dataset, 14.0% (n=3,899) of accounts posting first-order citations were identified as researchers vs 6.4% (n=3,830) for second-order citations. Only 718 researcher accounts appeared in both groups.
- Highly shared items: News about preprints dominated second-order citations; of the top 13 research articles by second-order citations, 11 were preprints at the time of news coverage (10 were later published in high-profile venues). Only one article that was highly shared through second-order citations was also highly shared through first-order citations. Topics with immediate practical value (e.g., interventions, treatments, risk factors) and controversial subjects (e.g., hydroxychloroquine, social restrictions) received disproportionate second-order attention.
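The magnitude and overlap figures above come down to simple ratios, which a short sketch can make concrete. The post and engagement counts below are the ones reported in this section; the account IDs and the `overlap_shares` helper are hypothetical and only illustrate how an overlap is expressed as a share of each group.

```python
# Post and engagement counts as reported above: (first-order, second-order).
reported = {
    "Twitter posts": (50_299, 97_235),
    "Facebook posts": (6_420, 14_081),
    "Twitter retweets": (227_041, 412_509),
    "Facebook shares": (89_422, 412_104),
}

# Ratio of second-order to first-order activity for each measure.
ratios = {name: second / first for name, (first, second) in reported.items()}


def overlap_shares(first_ids, second_ids):
    """Return the overlap as a share of each group of account IDs."""
    both = first_ids & second_ids
    return len(both) / len(first_ids), len(both) / len(second_ids)


# Hypothetical account IDs, not taken from the study's data.
research_sharers = {"u1", "u2", "u3", "u4"}
news_sharers = {"u4", "u5", "u6", "u7", "u8", "u9", "u10", "u11"}
share_of_research, share_of_news = overlap_shares(research_sharers, news_sharers)

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
print(f"overlap: {share_of_research:.1%} of research sharers, "
      f"{share_of_news:.1%} of news sharers")
```

The same set of overlapping accounts makes up a larger share of the smaller group, which is why each platform reports two different overlap percentages. The ratios also show that post counts roughly double, while Facebook shares of news stories exceed shares of research by more than a factor of four.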
Discussion
The findings confirm that direct sharing of research on social media predominantly reflects academic communities, while indirect sharing via news stories—second-order citations—captures larger and largely distinct audiences that are likely more representative of the broader public. The methodology provides a replicable framework to quantify and compare these audiences across platforms. The low correlations and minimal overlaps between first- and second-order citations suggest different audience compositions and dissemination mechanisms: journalistic selection and framing for relevance, usefulness, and emotional resonance likely drive broader engagement than academic notions of novelty or significance. Influencer and elite accounts (e.g., journalists, political figures, highly visible scientists) can substantially amplify second-order citations, as evidenced by outsized engagement metrics. Considering second-order citations alongside account classifications and network analyses could yield richer insights into public impacts of research than traditional first-order altmetrics alone.
Conclusion
Second-order citations offer a meaningful complement to traditional altmetrics, revealing substantial public engagement with research mediated through news coverage. The study provides strong empirical evidence that second-order citations can help identify non-academic audiences and measure societal reach beyond what first-order citations capture. It also pioneers a practical methodology for compiling and analyzing such data across Twitter and Facebook. Despite evolving platform access constraints, the overall approach remains valid and worthy of further development. Future research should expand beyond the five outlets studied here, cover multiple disciplines, and include additional platforms; integrate account-type classifications and network analyses; and refine data collection methods to better capture indirect research engagement.
Limitations
- Case specificity: The study focuses on early COVID-19 research (2020) and two preprint servers plus two journals; results may not generalize across time, topics, or venues.
- News outlet restriction: Analysis is limited to five major outlets (BBC, MSN, The New York Times, The Guardian, The Washington Post); broader media ecosystems may yield larger or different second-order patterns.
- Platform/data constraints: Facebook data includes only publicly accessible spaces via CrowdTangle, likely underestimating total engagement. Twitter data was collected via Twint; subsequent API and access changes may limit replication. URL resolution and Altmetric’s news-detection coverage/precision introduce potential measurement error.
- Researcher identification: Matching to Mongeon et al.’s dataset prioritizes precision over recall; the proportion of researchers among accounts likely underestimates true values.
- Overlap measurements: Some accounts/spaces may have been missed due to data access limitations, private content, or URL variance, potentially impacting overlap estimates.