Exploring the construction and infiltration strategies of social bots in sina microblog

Computer Science

W. Wang, X. Chen, et al.

This study examines how social bots can be constructed and used to infiltrate Sina Microblog (Weibo). The authors (Wenxian Wang, Xingshu Chen, Shuyu Jiang, Haizhou Wang, Mingyong Yin, and Peiming Wang) deployed 96 bots, all of which survived the 42-day experiment, and together the bots gained 5,546 followers. The results expose weaknesses in the platform's defense mechanisms and offer insights for improving bot detection.

Introduction

The study addresses whether and how benign social bots can be automatically constructed to infiltrate Sina Microblog (Weibo) effectively and safely, gaining influence while evading platform defenses. With the rapid growth of social media usage, platforms like Weibo are key venues for information sharing but also for malicious content such as rumors, hate speech, and fake news. Existing research largely focuses on bot detection, which is reactive and often delayed. The authors propose a proactive approach: construct social bots and evaluate infiltration strategies to identify vulnerabilities and inform better defenses, while ensuring bots generate positive or neutral content and avoid sensitive topics. The research questions include: Can social bots evade Weibo’s defenses? Which profile attributes and behavioral strategies (gender, profile photo type, activity level, following strategy, posting strategy and interaction types) maximize infiltration and influence? What network mechanisms (e.g., homophily) enhance infiltration?

Literature Review

Related work on social bots in OSNs largely emphasizes Twitter and Facebook. On Twitter, early efforts (e.g., Realboy) enabled automated posting and following; later studies constructed large cohorts of bots to evaluate factors such as gender, activity level, tweet generation, and target selection, showing significant infiltration success and highlighting platform vulnerabilities (Freitas et al.; Moghaddam et al.). Studies also examined homophily, finding shared characteristics increase follow-back probability, and showed Twitter’s anti-spam often targets publishers but not forwarders. Other works demonstrated influence gains via simple strategies, phishing risks through social bots, and benefits of automated chatting. Facebook-focused research explored infiltration of organizations and privacy breaches, showing bots can discover informal organizational links and gather large volumes of personal data (Huber et al.; Elyashar et al.; Boshmaf et al.). Factors affecting user susceptibility include demographics, network size, behavior, and security awareness. In contrast, work on Sina Microblog is limited, with prior studies leaning toward intelligent information agents or malicious botnets rather than large-scale construction and infiltration strategy analysis. This study fills that gap by systematically constructing Weibo bots and evaluating infiltration strategies specific to the platform’s user base, structure, and policies.

Methodology

Framework: The system comprises (1) data collection, (2) corpus preparation, and (3) social bot construction and OSN infiltration via a Botmaster controller.

Data Collection: Because official Weibo APIs are rate-limited, the authors reverse-engineered password encryption and login flows to implement simulated logins and combined them with visitor cookies in a fusion strategy, coordinated by a concurrent adaptive mechanism. Crawlers collect four data types (social relationships, personal information, microblogs, and comments) using a breadth-first expansion from seed users. Fusion crawling was empirically faster than the official APIs and gave access to social relationship data not available through them.

Corpus Preparation: To ensure benign behavior, an LSTM-based sentiment classifier filters positive/neutral comments. A Char-RNN text generator (two LSTM layers, Adam optimizer, categorical cross-entropy loss) produces positive comments, with temperature-controlled softmax sampling to balance diversity and coherence.

Profile Initialization: Each bot's profile includes basic info (nickname; real name via Faker; gender; birthdate 1980–2000; location tied to server locale; hobbies aligned with the assigned topic of technology, news, or games; additional attributes such as sexuality and blood type), contact info (random QQ; email derived from the name with a random domain), career (media/tech companies via Faker), and education (local university with enrollment year at age 17–20). Nicknames were sourced from Zhihu and NetEase Cloud Music, deduplicated, and validated via a Weibo nickname check endpoint after simulated login. Each bot received 4–8 initial follower bots, then posted at least 10 microblogs over 5 days before infiltration began.

Activity and Command System: Actions are divided into Social-Interaction (SI: post, forward, comment, like text/comment, reply, send message) and Social-Structure (SS: follow, unfollow). All actions are executed via HTTP in simulated login state.
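The temperature-controlled softmax sampling mentioned under Corpus Preparation can be illustrated with a minimal sketch. The function names, the example logits, and the use of plain Python lists are assumptions made for illustration; the summary does not describe the authors' actual generator code or the temperature values they used.

```python
import math
import random


def temperature_softmax(logits, temperature=1.0):
    """Convert raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def sample_index(logits, temperature=1.0, rng=None):
    """Draw one index (for example, a character id) from the tempered distribution."""
    rng = rng or random.Random()
    probs = temperature_softmax(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

A low temperature concentrates probability on the highest-scoring character (more coherent, more repetitive text), while a high temperature moves the distribution toward uniform (more diverse, less coherent text).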
Commands (atomic and combined) are issued as key-value JSON with randomized short sleeps to mimic human delays; each bot maintains a fixed User-Agent. Activity scheduling avoids 00:00–08:00 and rate spikes per IP (enforcing 30–120 s delays if actions on the same IP are too close).

Infiltration Factors (balanced 50/50 across bots unless specified):
  • Gender: 48 male, 48 female.
  • Activity level: High (20–150 min between actions) vs Low (60–300 min).
  • Profile photo: Real human vs unreal (landscapes/cartoons/animals). When uploading, cropping parameters are ax=0, ay=0, aw=min(height,width,900); images are base64-encoded.
  • Following strategy: (a) follow users with the same topic; (b) follow random users. Targets are filtered to avoid low-value/zombie accounts: activity within the past month, >20 followers, follower/following ratio ≤0.01, and complete profiles.
  • Posting strategy: (a) subjective content: repost/forward authenticated users' (yellow/blue V) topic-related microblogs with synonym replacement (jieba + HIT-CIR Tongyici Cilin) and positive comments; (b) objective content: authoritative news on relevant topics.

Experimental Setup: 96 social bots were deployed on 9 cloud servers (10–12 bots/server; 2 GB RAM; ~1 Mbps bandwidth) over 6 weeks.
  • Phase 1 (4 weeks): All bots operated under their assigned strategies to measure infiltration and influence.
  • Phase 2 (2 weeks): Controlled comparisons of interaction types and following tactics.
  • Phase 2a: 12 selected bots (matched by topic and attributes) each performed only one action type 30 times/day: follow, comment, forward, or like. Targets were active within 3 days and drawn from interactions with V users' topic microblogs.
  • Phase 2b: 9 selected bots (3 per topic) performed 30 follows/day using one of three strategies: random targets, topic-related targets, or followers-of-followers.
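The 50/50 balance across the five binary factors can be illustrated with a small experimental-design sketch. It assumes a full factorial crossing of the five factors with the three topics, which happens to give 2^5 × 3 = 96 cells; the summary does not state that the authors assigned factors this way, and the level labels below are hypothetical names chosen for illustration.

```python
from collections import Counter
from itertools import product

# Factor levels as named in the summary; the identifiers are illustrative.
FACTORS = {
    "gender": ("male", "female"),
    "activity": ("high", "low"),
    "photo": ("real", "unreal"),
    "following": ("same_topic", "random"),
    "posting": ("subjective", "objective"),
}
TOPICS = ("technology", "news", "games")


def build_assignments():
    """Cross every factor combination with every topic (assumed design)."""
    names = list(FACTORS)
    rows = []
    for topic in TOPICS:
        for combo in product(*FACTORS.values()):
            rows.append({"topic": topic, **dict(zip(names, combo))})
    return rows


def level_counts(rows, factor):
    """Count how many experimental units fall under each level of one factor."""
    return Counter(row[factor] for row in rows)
```

Under this assumed design, every factor level covers exactly 48 of the 96 units, which matches the reported 48 male and 48 female bots.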
Influence Quantification: Follower quality was assessed using an influence score based on (a) microblog influence (average comments + likes + forwards per microblog), (b) direct proliferation influence (number of microblogs × follower count), and (c) activity level (mean microblogs/day). Indicators were normalized; weights were λ1=0.34, λ2=0.53, λ3=0.13. Average follower influence per strategy was computed by summing follower influence per bot and averaging across bots in each strategy.

Network Visualization: Gephi was used to illustrate the emerging bot–follower network.
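The influence score described above can be sketched as a weighted sum of three normalized indicators. Several details here are assumptions, not statements from the source: that the weights λ1, λ2, λ3 map to indicators (a), (b), (c) in that order, that normalization is min-max over the follower set, and that the combination is linear. The `Follower` record and its field names are hypothetical.

```python
from dataclasses import dataclass

# Weights reported in the summary; their order of assignment is assumed.
WEIGHTS = (0.34, 0.53, 0.13)


@dataclass
class Follower:
    avg_engagement: float      # (a) mean comments + likes + forwards per microblog
    microblogs: int            # number of microblogs posted
    followers: int             # follower count
    microblogs_per_day: float  # (c) activity level


def min_max(values):
    """Scale values to [0, 1]; a constant column maps to all zeros (assumed choice)."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(value - low) / (high - low) for value in values]


def influence_scores(followers):
    """Return one weighted influence score per follower."""
    engagement = min_max([f.avg_engagement for f in followers])
    proliferation = min_max([f.microblogs * f.followers for f in followers])  # (b)
    activity = min_max([f.microblogs_per_day for f in followers])
    w1, w2, w3 = WEIGHTS
    return [
        w1 * a + w2 * b + w3 * c
        for a, b, c in zip(engagement, proliferation, activity)
    ]
```

Summing these scores over a bot's followers and averaging across the bots in a strategy group would give the per-strategy averages the methodology describes.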

Key Findings
  • Evasion and Survival: All 96 bots operated for 6 weeks with a 100% survival rate, indicating successful evasion of Weibo's spam detection under the designed activity patterns and fixed UA.
  • Scale of Infiltration: Over 42 days, bots gained a total of 5,546 followers. Per-bot follower counts ranged roughly from 20 to 110; all bots exceeded 20 followers within 42 days, and 50% exceeded 50 followers (comparable to average human follower counts).
  • Follower Composition and Reach: 14.53% of followers were authenticated (V) users. Of these, 89 had over 10,000 followers each, implying a potential reach of more than 890,000 users (89 × 10,000), assuming little overlap between their audiences. During infiltration, bots received 951 interactions: 60.46% likes, 38.60% comments, 0.74% forwards.
  • Data Collection Efficacy: The fusion crawling strategy outperformed official APIs in speed and coverage, including collection of social relationships lacking API endpoints.
  • Strategy Effects (Phase 1):
      - Gender: Female bots slightly outperformed male bots, but gender had no significant effect overall.
      - Activity Level: High-activity bots acquired substantially more followers; activity proved the dominant factor for influence gains while maintaining stealth with careful scheduling.
      - Profile Photo: Real human photos yielded a small advantage overall (share of followers 51.08% vs 48.92%); female bots with real photos had 2–6% more followers than others.
      - Following Strategy: Following topic-specific users produced more followers than random following; however, follower quality (influence) was lower for topic-specific following, likely due to zombie/bot accounts in those clusters.
      - Posting Strategy: No meaningful difference between subjective opinions and objective news, likely because both were topic-aligned.
  • Influence Quantification (average follower influence per bot):
      - Gender: Female 10.0729; Male 10.2937 (similar).
      - Activity: High 12.2426; Low 8.1240 (high markedly better).
      - Profile photo: Real 10.3464; Unreal 10.0202 (similar).
      - Following: Specified users 9.7199; Random 10.6467 (random higher quality).
      - Posting: Facts 10.1794; Opinions 10.1873 (no difference).
  • Interaction Type Comparison (Phase 2a): Following produced the largest daily follower increases, followed by commenting, then forwarding; liking had negligible effect.
  • Following Strategy Comparison (Phase 2b): Following followers of followers yielded nearly double the daily follower gains compared to random or topic-based following, highlighting strong homophily effects.
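
The qualitative labels attached to the influence figures above ("similar", "markedly better") can be checked with simple arithmetic. The sketch below computes, for each factor, the relative gap between the higher and lower reported average follower influence. The numbers are copied from the findings; the gap definition, (high − low) / low, is an illustrative choice and not a metric used in the source.

```python
# Average follower influence per bot, as reported in the findings above.
REPORTED = {
    "gender": {"female": 10.0729, "male": 10.2937},
    "activity": {"high": 12.2426, "low": 8.1240},
    "photo": {"real": 10.3464, "unreal": 10.0202},
    "following": {"specified": 9.7199, "random": 10.6467},
    "posting": {"facts": 10.1794, "opinions": 10.1873},
}


def relative_gap(levels):
    """Gap between the best and worst level, relative to the worst level."""
    low, high = min(levels.values()), max(levels.values())
    return (high - low) / low


def rank_factors(reported):
    """Return (factor, gap) pairs sorted from largest to smallest gap."""
    gaps = {name: relative_gap(levels) for name, levels in reported.items()}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)
```

By this measure the activity gap is about 51%, the following-strategy gap is under 10%, and the gender, photo, and posting gaps are each below 4%, which is consistent with activity level being described as the dominant factor.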
Discussion

The study demonstrates that carefully engineered social bots can infiltrate Weibo at scale while evading platform defenses. By reverse-engineering login flows and simulating human-like behavior (sleep cycles, inter-action delays, fixed UA), bots remained undetected for six weeks and accrued substantial followers. Among evaluated factors, activity level most strongly influenced infiltration success and follower influence, indicating that visibility through frequent actions is critical when balanced against detection risk. Gender and profile photo type had only marginal effects; content subjectivity (opinions vs facts) also made little difference when topic alignment was maintained. Following strategies and interaction types materially shaped growth: direct following, particularly targeting followers-of-followers, most effectively expanded reach, evidencing homophily in Weibo similar to findings on Twitter. However, targeting topic-specific clusters can attract more followers but lower-quality ones (e.g., zombies), reducing overall influence quality. These insights directly answer the research questions by identifying tactics that maximize infiltration and reveal weaknesses in Weibo’s defenses, informing both mitigation strategies for platforms and operational guidelines for benign, guidance-oriented bots.

Conclusion

The authors developed a scalable, automated pipeline to construct and operate 96 social bots on Weibo, leveraging a fusion data-collection strategy and a Botmaster-controlled action system. Over 42 days, bots achieved 5,546 followers with 100% survival, evidencing vulnerabilities in Weibo’s defense mechanisms. Empirical evaluation identified that high activity levels, following actions (especially followers-of-followers), and leveraging homophily are the most effective levers for rapid influence gains; gender, profile photo type, and posting subjectivity have minimal impact. The work offers concrete guidance for platform defenders to harden against such infiltration and for the design of benign bots to promote positive discourse. Future research should expand bot scale and temporal coverage, evaluate additional factors such as content polarity and linguistic features, and refine target selection to avoid low-value accounts, thereby generalizing and strengthening findings.

Limitations
  • Ethical and misuse risks: Although bots were constrained to positive/neutral topics (technology, news, games) and avoided sensitive domains, the techniques could be repurposed for manipulation.
  • Factor coverage: Only five intuitive factors were studied; other variables (e.g., content polarity, linguistic style, multimedia usage, temporal patterns beyond the defined windows) may also affect infiltration.
  • Scale and duration: Experiments involved 96 bots over six weeks; larger-scale and longer-term studies could reveal additional dynamics and detection responses.
  • Target selection noise: Topic-specific following attracted zombie/bot accounts, affecting follower quality estimates.
  • Generalizability: Findings are specific to Weibo's ecosystem, defenses, and user behaviors during the study period and may not fully transfer to other platforms or times.