Political Science
The role of bot squads in the political propaganda on Twitter
G. Caldarelli, R. De Nicola, et al.
The study investigates how automated accounts (social bots) contribute to political propaganda on Twitter, focusing on discourse around migration from Northern Africa to Italy. Motivated by evidence that bots shape online discussions and can amplify low-credibility content, the authors aim to move beyond bot detection to quantify bots’ effective role in disseminating messages. They frame the problem as extracting meaningful interaction patterns from noisy social data by using entropy-based null models to discount users’ activity levels and tweet virality. The research questions are: (i) which accounts are most effective at spreading messages once random activity is filtered out, and (ii) how bots interact with these hubs, including whether they act in coordinated formations (“bot squads”) to amplify specific political tendencies. The work is significant because it offers a rigorous, statistically grounded methodology to identify non-random communication backbones and assess bots’ role in political information flows.
Prior work documents widespread bot activity in political contexts (e.g., US 2016 elections, Brexit) and their efficiency in spreading low-credibility content and targeting influential users. Estimates suggest 9–15% of active Twitter users can be bots, and evolving bots can evade detection. Numerous detection approaches exist based on profile, network, and posting behavior features; Cresci et al. proposed cost-effective lightweight classifiers with high accuracy. However, most studies do not explicitly filter random noise in interaction data. Entropy-based null models (maximum-entropy approaches) have been applied across domains (trade, finance, social networks) to extract statistically significant structures and early warning signals. Within political Twitter analysis, related work inferred political standings and cross-group communication using similar null-model-based projections. This study merges bot detection with entropy-based network validation to quantify bots’ effective contribution in message diffusion and to uncover coordinated bot formations supporting aligned hubs.
Data collection: Tweets were collected via the Twitter public Filter API over one month (23 January–22 February 2019) using a keyword list relevant to migration from North Africa to Italy (e.g., “Immigrati”, “Migranti”, “Ong”, “Seawatch”, “Guardia costiera libica”, etc.). The corpus comprises 1,082,029 tweets from 127,275 unique accounts. Data were stored in Elasticsearch for retrieval. The period coincided with intense public debate concerning the ‘Diciotti’ case and related political decisions.
Bot detection: Accounts were classified using a reconstruction of Cresci et al.’s lightweight supervised classifier, implemented with Weka’s J48 (C4.5) on the original public training set, achieving comparable performance. Features include profile- and count-based signals (friends, followers, tweets, account age, following rate; presence of name, image, address, bio, URL; list membership; simple thresholds like 2×followers>friends, friends≥100, followers≥50).
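The feature set above can be illustrated with a minimal sketch. This is not the authors’ implementation (they retrained Cresci et al.’s classifier with Weka’s J48); it only shows, in plain Python, the kind of profile- and count-based signals and threshold rules the summary lists. Field names such as `friends_count` and `age_days` are assumptions standing in for the raw account record.

```python
# Hypothetical sketch of lightweight, profile-based bot-detection features.
# Feature names and thresholds follow the summary above; the actual J48
# (C4.5) model learns its decision rules from the labeled training set.

def extract_features(account: dict) -> dict:
    """Map a raw account record to count- and rule-based signals."""
    friends = account.get("friends_count", 0)
    followers = account.get("followers_count", 0)
    age_days = max(account.get("age_days", 1), 1)  # avoid division by zero
    return {
        "friends": friends,
        "followers": followers,
        "tweets": account.get("statuses_count", 0),
        "following_rate": friends / age_days,      # friends gained per day
        "has_name": bool(account.get("name")),
        "has_image": bool(account.get("profile_image")),
        "has_bio": bool(account.get("description")),
        "has_url": bool(account.get("url")),
        "in_lists": account.get("listed_count", 0) > 0,
        # simple threshold rules mentioned in the summary:
        "rule_2f_gt_friends": 2 * followers > friends,
        "rule_friends_ge_100": friends >= 100,
        "rule_followers_ge_50": followers >= 50,
    }
```

In a trained classifier these signals would feed the decision tree rather than be applied as standalone rules.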
Political polarization inference: The analysis centers on retweets (as a clearer endorsement signal than replies, mentions, or quotes). A bipartite undirected network was built with verified and unverified users as the two layers; a link exists if at least one retweet occurred between them. To infer communities among verified users, the Bipartite Configuration Model (BiCM) was used to compute a validated monopartite projection on the verified layer by assessing statistical significance of common neighbors (V-motifs) with Poisson-binomial distributions and multiple hypothesis control via FDR. The Louvain algorithm was applied repeatedly (reshuffling node order) and the partition with maximum modularity was selected. Unverified users’ polarization was then assigned using a polarization index p_i = max_c k_ic/k_i, where k_ic is the number of user i’s retweet links to verified users in community c and k_i is i’s total number of such links, and extended via an iterative label propagation (“contagion of polarization”) that propagates labels through unverified–unverified retweet links until no further assignments are possible.
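The two labeling steps can be sketched as follows. This is a simplified illustration, not the paper’s code: the polarization index matches the definition p_i = max_c k_ic/k_i, while the propagation step assumes a plain majority vote over already-labeled neighbors (the paper does not fully specify the assignment rule, so that choice is mine).

```python
from collections import Counter, defaultdict

def polarization(retweet_counts: dict) -> tuple:
    """p_i = max_c k_ic / k_i over verified communities c retweeted by
    unverified user i. Returns (dominant community, p_i)."""
    total = sum(retweet_counts.values())
    community, k_ic = max(retweet_counts.items(), key=lambda kv: kv[1])
    return community, k_ic / total

def propagate_labels(labels: dict, edges: list) -> dict:
    """'Contagion of polarization' sketch: repeatedly assign each unlabeled
    user the majority label of its labeled retweet neighbours, until no
    further assignments are possible."""
    labels = dict(labels)
    neighbours = defaultdict(list)
    for u, v in edges:  # undirected unverified-unverified retweet links
        neighbours[u].append(v)
        neighbours[v].append(u)
    changed = True
    while changed:
        changed = False
        for node, nbrs in neighbours.items():
            if node in labels:
                continue
            votes = Counter(labels[n] for n in nbrs if n in labels)
            if votes:  # assumption: majority vote; ties broken arbitrarily
                labels[node] = votes.most_common(1)[0][0]
                changed = True
    return labels
```

For example, a user with three retweets into the blue community and one into red gets p_i = 0.75 toward blue; a chain of unverified users connected to one labeled account inherits its label step by step.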
Backbone of content exchange: To extract significant content flows while discounting user activity and tweet virality, a directed bipartite network between users and tweets was constructed: user→tweet (authoring), tweet→user (retweet). The Bipartite Directed Configuration Model (BiDCM) constrained each user’s number of original tweets, each tweet’s retweet count, and each account’s retweet activity. A validated directed projection onto users was obtained by testing, for each ordered pair (u,u′), whether the number of tweets written by u and retweeted by u′ exceeds BiDCM expectations (FDR threshold ≈ 3.0×10^−7 for α=0.01; Bonferroni threshold ≈ 8.8×10^−12). The resulting validated directed network represents significant author→retweeter relations. Self-loops (significant self-retweets) were identified (~1.2% of links; <3% of validated nodes) and discarded for subsequent analyses.
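The multiple-hypothesis step of the validation can be sketched in isolation. In the paper each ordered pair’s p-value comes from a Poisson-binomial distribution under the BiDCM; the snippet below only shows the Benjamini–Hochberg FDR stage that turns those p-values into the reported threshold, under the assumption of a flat list of independent tests.

```python
def fdr_threshold(p_values: list, alpha: float = 0.01) -> float:
    """Benjamini-Hochberg: the largest sorted p-value p_(k) satisfying
    p_(k) <= k * alpha / m. Ordered pairs (author, retweeter) whose
    p-value is at or below this threshold become validated links.
    Returns 0.0 if no test passes."""
    m = len(p_values)
    threshold = 0.0
    for k, p in enumerate(sorted(p_values), start=1):
        if p <= k * alpha / m:
            threshold = p
    return threshold
```

With the paper’s huge number of ordered pairs, this procedure yields the very small thresholds quoted above (≈ 3.0×10^−7 at α = 0.01), far less conservative than the Bonferroni cutoff α/m.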
Hubs and bot-squad analysis: Hubs (effective message sources) were quantified using the Hubs-Authorities (HITS) algorithm on the validated directed network; focus was on hub scores. The fraction of bots among each hub’s significant retweeters was computed and compared to the network-wide bot incidence. Overlaps among bots following top hubs were measured to detect shared automated followers (bot squads). For bot squads, activity types (tweets, retweets, quotes, replies), targets of retweets and mentions, and URLs in original tweets were analyzed to assess content and coordination patterns.
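The hub-scoring and bot-fraction measurements can be sketched with a small power-iteration version of HITS. This is an illustrative reimplementation (the paper does not specify its HITS code), run here on edge lists with author→retweeter direction, so a high hub score marks an effective message source.

```python
def hits(edges: list, iters: int = 100) -> tuple:
    """Power-iteration HITS on a directed author->retweeter network:
    good hubs point to good authorities, and vice versa."""
    nodes = {n for e in edges for n in e}
    hub = {n: 1.0 for n in nodes}
    for _ in range(iters):
        auth = {n: sum(hub[u] for u, v in edges if v == n) for n in nodes}
        norm = sum(a * a for a in auth.values()) ** 0.5 or 1.0
        auth = {n: a / norm for n, a in auth.items()}
        hub = {n: sum(auth[v] for u, v in edges if u == n) for n in nodes}
        norm = sum(h * h for h in hub.values()) ** 0.5 or 1.0
        hub = {n: h / norm for n, h in hub.items()}
    return hub, auth

def bot_fraction(hub_node, edges: list, bots: set) -> float:
    """Fraction of bots among a hub's significant retweeters
    (out-neighbours), to be compared with the network-wide incidence."""
    retweeters = {v for u, v in edges if u == hub_node}
    return len(retweeters & bots) / len(retweeters) if retweeters else 0.0
```

Comparing `bot_fraction(hub, ...)` against the overall bot share of the validated network is the comparison behind the paper’s finding that key right-wing hubs attract above-average bot retweeting; bot-squad detection then intersects the bot sets of different top hubs.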
- Community structure and polarization: The verified-users projection (via BiCM) reveals three main communities: (1) pro-government/right-wing and M5S (blue), (2) Democratic Party/center-left (red), and (3) NGOs, media, left-wing politicians (purple), with smaller groups (e.g., Maltese PM). Unverified users exhibit strong polarization; the polarization index distribution peaks near 1. Label propagation increases the share of polarized unverified users by 27% overall, and by ~58% within the validated-unverified subset.
- Validated content-exchange backbone: The BiDCM-validated directed network comprises 14,883 users and 34,302 links, with connectance p ≈ 3×10^−5. Bots constitute ~2.5% of validated nodes versus ~7% in the full network. Significant self-retweet loops account for ~1.2% of links; <3% of nodes self-retweet significantly (loops removed thereafter).
- Hubs: The most effective hubs (by HITS hub score) are predominantly from the right-wing/pro-government community. Matteo Salvini ranks first; Giorgia Meloni ranks fourth. Two journalists from a CasaPound-supported site rank second and third. The first non-blue hub (TgLa7 newscast) appears at rank 176. An Italian NGO assisting migrants shows high out-degree (k_out=1104; 5th) but very low hub score (~4×10^−4; rank 452), indicating limited effective amplification despite activity.
- Bots’ contribution and concentration: For key right-wing leaders (Salvini, Meloni), the fraction of bots among their significant retweeters exceeds the network’s average bot incidence. Other hubs with similar hub scores tend to have lower bot fractions.
- Bot squads (shared automated followers): Coordinated formations of bots follow and significantly retweet multiple strong hubs. The largest bot-squad structure includes 22 genuine accounts (9 among the top-10 hubs) sharing 22 bots; this subgraph has 172 nodes and is almost entirely in the blue community. A second, smaller squad (58 nodes) centers on TgLa7 (purple) with much lower hub scores and a skewed hub-score distribution.
- Bot activity patterns and sources: In the largest bot squad, retweeting is the dominant activity. Bots mostly retweet the strong hubs they follow. Mentions span across political sides, including right-wing leaders and the official Democratic Party account, as well as institutional accounts (Quirinale, Roberto_Fico). Of bots’ original tweets, 89% contain a URL, and 97% of those URLs point to voxnews.info, a site blacklisted by fact-checkers (butac.it, bufale.net).
- Polarization in validated vs full network: Validated users are markedly more polarized than the full user set; in the full network, >40% of accounts (and >50% of automated accounts) remain unpolarized, compared to ~10% (and ~5% for bots) in the validated network.
Filtering out random activity and tweet virality via entropy-based null models exposes the backbone of non-random, effective content diffusion. Within this backbone, right-wing/pro-government accounts dominate as hubs, and their visibility is amplified by bots at rates exceeding average bot prevalence. The discovery of bot squads—sets of automated accounts shared among multiple aligned hubs—indicates coordinated amplification strategies that differ from previously reported star-like bot-following patterns. These squads primarily engage in retweeting, directly boosting the reach of selected hubs, and their original content frequently links to a single low-credibility source, suggesting targeted agenda amplification. Although bots constitute a smaller share in the validated backbone than in the overall dataset, those that remain have disproportionate influence due to their statistically significant retweet behavior. The approach demonstrates that entropy-based validation paired with bot detection can robustly reveal non-trivial, politically salient structures in social media propaganda, helping to differentiate genuine influence from orchestrated amplification.
This work integrates a lightweight, high-accuracy bot detector with entropy-based null-model projections to isolate significant Twitter content flows and assess bots’ roles in Italian migration-related political discourse. Main contributions include: (i) identification of the effective diffusion backbone and its dominant hubs (largely right-wing/pro-government), (ii) evidence that key hubs attract above-average shares of bots among significant retweeters, and (iii) first reported observation of coordinated bot formations (“bot squads”) shared across multiple aligned hubs, which focus on retweet amplification and often cite a single low-credibility source. The findings underscore the importance of statistically validating interaction patterns to separate meaningful signals from noise. Future research should generalize the analysis across topics, time periods, and countries; further characterize bot-squad coordination and evolution; integrate richer content semantics; and develop proactive, adversarially robust bot detection that anticipates evolving behaviors.
- Topical and temporal scope: The dataset covers one month and a specific Italian political topic; findings may not generalize across issues or time.
- Interaction modality: Only retweets were analyzed for polarization and backbone extraction; replies, mentions, and quotes (which can convey support or dissent) were excluded from validation.
- Bot detection constraints: The classifier relies on lightweight profile/activity features and known training data; evolving bots or sophisticated human-managed accounts may evade detection or be misclassified.
- Incomplete interaction coverage: Nearly half of unverified users did not interact with verified accounts during the window, requiring label propagation; some accounts remained unpolarized.
- Observational design: The analysis reveals coordination patterns and amplification but cannot establish intent, control, or causality behind bot squads.
- Platform/API limitations: Data availability and policy constraints (e.g., rate limits, content removals) may affect completeness and reproducibility.