Political polarization of news media and influencers on Twitter in the 2016 and 2020 US presidential elections

Political Science

J. Flamino, A. Galeazzi, et al.

Discover how Twitter's news media landscape evolved during the 2016 and 2020 US presidential elections with insights from nearly a billion tweets. This research, conducted by experts including James Flamino and Alessandro Galeazzi, highlights trends in politically biased content and the dynamics of user influence in echo chambers.

Introduction
The study investigates how the political news media landscape and patterns of information diffusion on Twitter changed from the 2016 to the 2020 US presidential elections. Motivated by evidence of rising political polarization in the United States—including issue polarization among elites and affective polarization among voters—the authors examine how polarized political information diffuses between influential actors and the broader population. Traditional methods lack relational measures needed to trace diffusion, whereas social media enables large-scale network analyses. Using nearly one billion tweets, the study measures volumes of politically biased content, identifies key influencers capable of spreading news widely, and assesses ideological polarization and echo chamber behavior among users and influencers. The central questions are whether the prevalence of fake or extremely biased content changed between elections, how the composition and affiliations of influencers shifted, and whether ideological polarization increased.
Literature Review
The paper situates its contribution within research documenting increasing political polarization in the US and the role of social media in shaping political communication and misinformation. Prior work shows rising partisan division among elites and affective polarization among voters, the spread of true and false news online, and the influence of social media platforms (Twitter, Facebook, YouTube) on election outcomes and disinformation. Studies have explored echo chambers and homophily, exposure to ideologically diverse news, and the effects of encountering opposing views. Earlier analyses assessed fake news influence in the 2016 US election on Twitter and identified influencers within polarized networks. This literature underpins the focus on news diffusion, influencer impact, and echo chamber formation across election cycles.
Methodology
Data collection: Two Twitter datasets were compiled using the Twitter search API, with only the presidential candidates' names as keywords, covering 1 June through election day (8 Nov 2016; 2 Nov 2020). The 2016 dataset contains 171 million tweets from 11 million users; the 2020 dataset contains 702 million tweets from 20 million users. Tweets containing URLs were parsed to extract the domains of linked news media outlets: 30.7 million tweets with news links (from 2.3 million users) were identified in 2016, and 72.7 million such tweets (from 3.7 million users) in 2020. A caveat is that Twitter's API sampling is not random (the firehose is not truly 100%, and the 10%/1% streams are not random samples), which limits standard sampling inference.

News outlet classification: Outlets were categorized using AllSides (AS) and, when absent from AS, Media Bias/Fact Check (MBFC; accessed 7 Jan 2021 for the 2020 classification). The categories are left, left-leaning, centre, right-leaning, right, extreme bias left, extreme bias right, and fake news. Fake news covers sources flagged for fabricated news or conspiracies; extremely biased covers sources that distort facts or rely on propaganda and misrepresented opinion. MBFC factual-reporting scores guided the 2020 disinformation categories: 'low' or 'very low' mapped to fake news, and 'mixed' to extremely biased. Outlets with uniformly insignificant volumes (<1% of the cumulative tweets of popular outlets in a category) were removed, for consistency with the 2016 procedure. The authors note that such classifications are opinions, not statements of fact.

Bot/activity filtering: As a proxy for automated or professional activity, tweets posted from unofficial clients (e.g., Hootsuite, SocialFlow, custom bots) were identified. The fraction of tweets from unofficial clients declined from 8% in 2016 to 1% in 2020. For the polarization analyses, retweets from unofficial clients were removed; for other analyses they were retained.

Retweet networks: For each news category, a directed retweet network was constructed in which an edge from v to u exists if u retweeted v at least once in tweets linking to that category (a single edge regardless of the retweet count). A user's in-degree equals the number of distinct accounts they retweeted; their out-degree equals the number of distinct users who retweeted them. These per-category networks capture diffusion structures.

Influencer identification: Influencers (best spreaders) were identified with the Collective Influence (CI) heuristic, consistent with the prior 2016 analyses, yielding a Clout measure per node. Top influencers were extracted separately within each news category to avoid dominance by the larger left and centre categories. The extreme bias left category was excluded from subsequent influencer analyses because of its sparsity, fragmentation, and low community standing. The top 25 influencers per category (2016 and 2020) were manually labelled as media-affiliated, political-affiliated, independent, or other (unidentified), based on account metadata and off-platform information, using a majority vote among three annotators.

Similarity networks and community detection: To assess polarization among influencers' audiences, a similarity network among influencers was built for each year (2016 and 2020). For each influencer i, a vector of per-user retweet counts was formed; pairwise cosine similarities yielded weighted, effectively fully connected networks, with zero-weight edges removed. Random equal-sized subsets (200 nodes) were repeatedly sampled (100 times) to compare the two years. Communities were detected with the Louvain algorithm, and their separation was quantified by modularity and normalized cut.
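The URL-parsing and outlet-classification step above can be illustrated with a minimal sketch. The OUTLET_BIAS mapping below is a hypothetical stand-in for the AllSides/MBFC labels, not the paper's actual lookup table.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for the AllSides / Media Bias/Fact Check labels used in the paper.
OUTLET_BIAS = {
    "cnn.com": "left-leaning",   # AS reclassified CNN from centre (2016) to left-leaning (2020)
    "foxnews.com": "right",
    "reuters.com": "centre",
}

def classify_tweet_urls(urls):
    """Map each URL found in a tweet to a news-bias category via its domain."""
    categories = []
    for url in urls:
        domain = urlparse(url).netloc.lower()
        if domain.startswith("www."):
            domain = domain[4:]
        category = OUTLET_BIAS.get(domain)
        if category is not None:
            categories.append((domain, category))
    return categories

print(classify_tweet_urls(["https://www.cnn.com/2020/11/02/politics/story.html"]))
# [('cnn.com', 'left-leaning')]
```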
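A minimal sketch of the per-category retweet-network construction and an adaptive Collective Influence ranking follows, assuming networkx. The ball radius l=2, the undirected projection, and other implementation details are assumptions; the paper's CI variant may differ.

```python
import networkx as nx

def build_retweet_network(retweets):
    """retweets: iterable of (retweeter, original_author) pairs for one news category.
    An edge v -> u means u retweeted v at least once (duplicates collapse to one edge)."""
    G = nx.DiGraph()
    for retweeter, author in retweets:
        G.add_edge(author, retweeter)
    return G

def collective_influence(G, node, l=2):
    """CI_l(i) = (k_i - 1) * sum of (k_j - 1) over nodes j on the frontier of the radius-l ball around i."""
    dist = nx.single_source_shortest_path_length(G, node, cutoff=l)
    frontier = [j for j, d in dist.items() if d == l]
    return (G.degree(node) - 1) * sum(G.degree(j) - 1 for j in frontier)

def top_spreaders(G, n=100, l=2):
    """Adaptive CI: repeatedly remove the current highest-CI node (simplified and non-optimized)."""
    H = G.to_undirected()
    ranked = []
    for _ in range(min(n, H.number_of_nodes())):
        best = max(H.nodes, key=lambda v: collective_influence(H, v, l))
        ranked.append(best)
        H.remove_node(best)
    return ranked
```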
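The audience-similarity network and its separation measures could be sketched as below. This assumes networkx's Louvain implementation, scikit-learn's cosine similarity, and one common definition of the normalized cut for a two-way split; the paper's exact formulas may differ in detail.

```python
import numpy as np
import networkx as nx
from sklearn.metrics.pairwise import cosine_similarity

def influencer_similarity_network(R):
    """R: (n_influencers x n_users) array, R[i, u] = times user u retweeted influencer i.
    Returns a weighted graph of pairwise cosine similarities, zero-weight edges dropped."""
    S = cosine_similarity(R)
    G = nx.Graph()
    G.add_nodes_from(range(S.shape[0]))
    for i in range(S.shape[0]):
        for j in range(i + 1, S.shape[0]):
            if S[i, j] > 0:
                G.add_edge(i, j, weight=float(S[i, j]))
    return G

def two_community_separation(G, seed=0):
    """Louvain communities, modularity, and a normalized cut between the two largest communities
    (assumes the partition contains at least two communities)."""
    comms = nx.community.louvain_communities(G, weight="weight", seed=seed)
    Q = nx.community.modularity(G, comms, weight="weight")
    A, B = sorted(comms, key=len, reverse=True)[:2]
    cut = sum(d["weight"] for u, v, d in G.edges(data=True)
              if (u in A and v in B) or (u in B and v in A))
    vol = lambda nodes: sum(w for _, w in G.degree(nodes, weight="weight"))
    ncut = cut / vol(A) + cut / vol(B)
    return Q, ncut
```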
Additional visualizations subsampled the similarity networks to the top 25 influencers per category, and quote-based similarity networks were also analyzed for robustness.

Latent ideology estimation: A user–influencer bipartite retweet matrix A was constructed using only tweets from official clients and only users who retweeted at least three different influencers. In 2016, A comprised 751,311 users × 593 influencers with 39,385,772 retweets; in 2020, 2,034,970 users × 591 influencers with 153,463,788 retweets. Correspondence analysis (via SVD of the standardized residuals) projected users onto a one-dimensional latent ideology scale (standardized to mean 0, SD 1). Each influencer's position was defined as the median ideology of their retweeters. Robustness checks included removing unit-weight ties, log-weighting retweets, and downsampling 2020 to the 2016 scale; results were highly stable (correlations >0.995). Users' latent ideology correlated strongly (>0.90) with the average leaning computed from the news categories they posted.

Anonymization: Personal account usernames were anonymized unless they belonged to verified major news organizations; aliases reflect affiliation and the year(s) of top-100 influence.
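A minimal sketch of the one-dimensional correspondence analysis behind the latent ideology follows, assuming a dense user-by-influencer count matrix small enough for a full SVD (the paper's full-scale matrices would require a sparse or truncated SVD).

```python
import numpy as np

def latent_ideology(A):
    """A: (n_users x n_influencers) retweet-count matrix with no all-zero rows or columns
    (the paper keeps only users who retweeted at least three different influencers).
    Returns user scores on the first correspondence-analysis dimension, standardized to mean 0, SD 1."""
    P = A / A.sum()
    r = P.sum(axis=1)                                     # row (user) masses
    c = P.sum(axis=0)                                     # column (influencer) masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))    # standardized residuals
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    x = U[:, 0] / np.sqrt(r)                              # first-dimension row coordinates
    return (x - x.mean()) / x.std()

def influencer_positions(A, user_scores):
    """Each influencer's position is the median latent ideology of the users who retweeted them."""
    return np.array([np.median(user_scores[A[:, j] > 0]) for j in range(A.shape[1])])
```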
Key Findings
- Volume and share changes: Despite overall growth in political tweeting (users +80%; tweet volume roughly quadrupled), the fraction of tweets linking to disinformation outlets declined. The fake news share of tweets fell from 10% (2016) to 6% (2020), and the extreme bias right share from 13% to 6%. The fraction of users sharing extreme bias right content fell from 6% to 3%; the fraction sharing fake news remained at about 3%.
- Category shifts: The left-leaning share of tweets increased from 24% to 45%, and the right-leaning share from 3% to 6%, while the centre share declined from 21% to 10%. Extreme bias left dropped from 2% to 0.05%. A large portion of the centre-to-left-leaning shift reflects CNN's AS reclassification from centre (2016) to left-leaning (2020). Users active in both years (14% of 2020 users) showed movement consistent with outlet reclassifications: many moved from centre/left to left-leaning, and many from fake/extreme bias right to right.
- Automation/proxy signals: Tweets from unofficial clients declined from 8% (2016) to 1% (2020), with reduced average activity among such accounts.
- Influencer composition and persistence: Among top influencers, retention between 2016 and 2020 was 29% for the top 100 per category and 36% for the top 25, versus 14% persistence among average users. Deleted accounts among the top 25 fake news influencers rose from 2 (2016) to 8 (2020). Extreme bias right influencers were primarily verified in 2020, increasing from 15 to 23 verified accounts among the top 25.
- Affiliations: Media-affiliated influencers decreased in most categories from 2016 to 2020, while media- and politically affiliated influencers increased in the extreme bias right and fake news categories; politically affiliated influencers increased in right-biased categories. This indicates a shift of right-biased influence toward extreme bias and fake news, and greater political affiliation among top right-side influencers.
- Ranking dynamics: Many highly influential users in 2020 were previously unranked or low-ranked (58% of the unique top 25 across categories came from outside the 2016 top 100; 28% of these new influencers were independent). The right/right-leaning and extreme bias right/fake news categories showed greater volatility, with more churn in the top 10 and more demotions below the top 50, while the centre remained relatively stable. Notable shifts include @CNN and @politico moving from centre to left-leaning, the emergence of @newsmax and @OANN, and shifts toward extreme bias right for @DailyMail and @JudicialWatch.
- Echo chambers and polarization (network measures): Influencer similarity networks consistently split into two communities (left/centre vs right/fake). Separation strengthened from 2016 to 2020: modularity increased from 0.401 (95% CI 0.392–0.409) to 0.465 (0.454–0.475), and normalized cut decreased from 0.285 (0.232–0.339) to 0.052 (0.046–0.058), indicating denser intracommunity ties and sparser intercommunity ties.
- Latent ideology and multimodality: Distributions of user and influencer latent ideologies became more bimodal. Hartigans' dip test (HDT) statistics increased for users from D=0.11074 (95% CI 0.11038–0.1112) in 2016 to D=0.14751 (0.1471–0.1477) in 2020, and for influencers from D=0.18328 (0.1672–0.195) to D=0.23251 (0.206–0.238), rejecting unimodality (P<2.2×10^-16); a minimal dip-test sketch appears after this list. Robustness analyses limited to persistent users and/or influencers also showed increased polarization; the largest user-level increase occurred when holding influencers constant across years, suggesting that 2020's new influencers were more ideologically extreme than persistent ones.
- Validation: Users’ latent ideology strongly correlated (>0.90) with their average leaning based on posted news categories, independently confirming observed outlet and influencer shifts.
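A minimal sketch of how the increase in bimodality could be checked with Hartigans' dip test, assuming the third-party diptest package (the paper may have used a different implementation) and latent-ideology scores like those from the Methodology sketch; the arrays below are toy stand-ins, not the study's data.

```python
import numpy as np
import diptest  # pip install diptest; any Hartigans' dip test implementation would do

# Toy stand-ins for the 1-D latent ideology score distributions of each year.
rng = np.random.default_rng(0)
user_scores_2016 = np.concatenate([rng.normal(-1.0, 0.8, 5000), rng.normal(1.0, 0.8, 5000)])
user_scores_2020 = np.concatenate([rng.normal(-1.5, 0.6, 5000), rng.normal(1.5, 0.6, 5000)])

for year, scores in [("2016", user_scores_2016), ("2020", user_scores_2020)]:
    dip, pval = diptest.diptest(scores)
    print(f"{year}: dip statistic D = {dip:.4f}, p = {pval:.3g}")

# A larger D with a small p-value indicates a stronger departure from unimodality,
# i.e., a more bimodal (more polarized) ideology distribution.
```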
Discussion
The analyses demonstrate that while the proportional volume of fake and extremely biased news declined on Twitter from 2016 to 2020, ideological polarization among users and influencers intensified. Retweet-based similarity networks show increasingly segregated communities, and latent ideology estimates reveal growing bimodality, indicating stronger echo chambers and reduced cross-ideological diffusion. Influencer ecosystems shifted: media-affiliated influence waned in most categories (except on the rightmost end), while politically affiliated influence grew within right-biased and extreme categories. Rankings and persistence patterns show substantial turnover, with many new, often more polarized influencers rising by 2020. These findings address the core questions about diffusion and polarization dynamics, highlighting that reductions in disinformation shares do not necessarily translate into reduced polarization. The strengthening of within-side ties and weakening of cross-side interactions suggests that users increasingly amplify ideologically aligned influencers and avoid opposing content, with implications for public discourse and exposure to diverse viewpoints.
Conclusion
This longitudinal analysis of nearly one billion tweets across the 2016 and 2020 US presidential elections maps major shifts in Twitter’s political news landscape. The study contributes: (1) category-specific influencer identification via retweet networks; (2) measurement of changing volumes of biased content; (3) characterization of influencer affiliations and turnover; and (4) multi-method polarization assessment using similarity networks and latent ideology. Key conclusions are that fake and extremely biased content shares declined, yet ideological polarization and echo chamber behavior increased among both users and influencers, with 2020’s new influencers more polarized than persistent ones. Future research directions include: applying natural language processing to distinguish sentiment and targets of quote tweets and to identify topics; refining user and organization classification to better capture affiliations; developing more granular polarization measures; and extending influencer and diffusion analyses to other social media platforms.
Limitations
- Data sampling: Twitter API data are non-random and incomplete (the firehose is not truly 100%, and the 10%/1% streams are not random samples), limiting generalizability and formal sampling inference.
- Outlet classification: Bias and factuality labels from AS and MBFC are opinion-based and may change over time; reclassifications (e.g., CNN from centre to left-leaning) affect category comparisons.
- Bot detection proxy: Using unofficial clients as a proxy for automated or professional accounts is crude; sophisticated bots using official clients may remain, and such accounts were removed only in the polarization analyses.
- Category coverage: The extreme bias left category was excluded from influencer analyses due to sparsity and low standing, potentially underrepresenting that segment.
- Influence identification: The CI heuristic approximates optimal influencer sets and covers less than 80% of potential cascades; results may depend somewhat on the choice of heuristic, though CI and PageRank correlate strongly.
- Interaction modality: The analyses treat retweets as endorsements; quotes and replies were less frequent but may carry nuanced opposition or support not fully captured, despite robustness checks.
- Anonymization: Personal accounts were anonymized unless they were verified major news organizations, which may limit external validation of specific influencer identities.
- Platform scope: Findings are specific to Twitter and the chosen time windows; cross-platform dynamics and off-platform effects are not captured.