Sinophobia was popular in Chinese language communities on Twitter during the early COVID-19 pandemic

Y. Zhang, H. Lin, et al.

This research by Yongjun Zhang, Hao Lin, Yi Wang, and Xinguang Fan examines the dynamics of Sinophobia in Chinese-language Twitter communities during the early COVID-19 pandemic. Drawing on an analysis of over 25 million tweets, it traces how sentiments shifted and how issues such as US-China relations took center stage.

Introduction
The study investigates how Chinese-language users on a Western social media platform (Twitter) discussed COVID-19 and China during the early pandemic. Motivated by observed global increases in xenophobia and Sinophobia and prior work focusing primarily on English-language users or Chinese users on censored domestic platforms, the authors assess sentiment, targets of sentiment, content, and interaction dynamics among Chinese-language users on Twitter. The research questions are: (RQ1) Who were the Chinese Twitter users mentioning China-related issues during the early pandemic? (RQ2) What was the overall pattern of public sentiments? (RQ3) Who were the main targets of positive and negative sentiments? (RQ4) Were there conversations between pro-China and anti-China users? (RQ5) What was the content of these Chinese tweets? (RQ6) Were sentiment patterns driven by specific topics? (RQ7) Did topics vary by user type (pro- vs anti-China)? The study is important because Twitter provides a less censored public sphere compared to Chinese domestic platforms, offering insight into transnational Chinese-language discourse, the spread of misinformation, and the dynamics of digital nationalism and anti-state activism during a global crisis.
Literature Review
Two strands of literature frame the study. First, work on public sentiment toward China during COVID-19, including analyses of English-language Twitter and global media, documents sharper anti-China attitudes and increased use of racial slurs, often despite WHO guidance against stigmatizing language. Second, research on Chinese citizens’ responses on domestic platforms (e.g., Weibo) notes more supportive sentiment toward the Chinese government, attributed to perceived effective pandemic responses. Additional literature addresses censorship circumvention in China during crises, computational propaganda on Twitter by groups opposed to the Chinese state, the role of Chinese state media and cyber-nationalists (e.g., “Little Pink,” Diba Expedition), and the use of social media to mobilize around contentious issues (Hong Kong protests, Xinjiang, Taiwan). This background motivates examining Chinese-language communities on a Western platform to fill a gap between studies of English-language audiences and studies of domestic Chinese users.
Methodology
Data: The authors compiled CNTweets, a dataset of 25.30 million tweets by 1.32 million users (2019–2021), retrieved via Twitter’s academic API using Chinese keywords (simplified and traditional) related to China, Chinese, CCP, and Asians. Of these, 16.64M tweets mention China (中国), 7.46M mention the CCP (共产党), and 0.28M mention Asians or people of Chinese descent (亚裔/华裔).
Training data and annotation: A labeled set of 10,000 tweets was created. Tweets were sourced from known pro- and anti-China accounts and their networks and from relevant hashtags and keywords, plus random samples to increase the number of neutral cases. Stratified sampling yielded 7,000 potentially positive or negative tweets and 3,000 random tweets. Each tweet was double-annotated for sentiment (positive/neutral/negative), targets (Chinese people, Chinese government, China in general), and topics (COVID-19, politics and subtypes, economy, culture, religion, US, US-China relations), with adjudication resolving disagreements.
Models: A fine-tuned Chinese-RoBERTa-wwm-ext model was used for sentiment and target classification, outperforming baselines (e.g., MacBERT, Multilingual BERT). Reported performance (F1/accuracy):
- Sentiment: 0.81/0.81
- Target, China in general: 0.70/0.86
- Target, Chinese people: 0.61/0.94
- Target, Chinese government: 0.84/0.89
Topic classification: Supervised RoBERTa classifiers identified broad topics and subtopics, with reported performance (F1/accuracy):
- COVID-19: 0.91/0.91
- Culture: 0.93/0.97
- Democracy: 0.16/0.98
- Economy: 0.63/0.90
- Politics: 0.23/0.98
- US Politics: 0.92/0.92
- Taiwan Politics: 0.70/0.96
- HK Politics: 0.68/0.99
- Religion: 0.70/0.98
- US: 0.27/0.99
- US-China relations: 0.86/0.96
To address the lower F1 scores for some topics, structural topic modeling (STM) with document-level metadata (e.g., month) was also applied; models with K=30 (main), K=50, and K=100 were explored.
User location and account type: Users’ self-reported locations were parsed with regular expressions for countries/regions and major cities; the authors acknowledge the sensitivity of these data and the potential for misreporting. An additional classifier categorized tweet type (personal opinions, news content, government/institution announcements, ads/spam, others).
Network analysis: Pro- and anti-China users were identified by their user-level positive sentiment rate (above 0.6 for pro, below 0.4 for anti). A bipartite conversation network was built using conversation_id to capture reply-thread participation, supplemented by a retweet network (reported in the SI). Segregation was measured with the E-I index (between-group versus within-group ties).
Topic–sentiment linkage: A logistic regression with monthly fixed effects modeled the probability that a tweet is positive toward China as a function of its topics.
Temporal/keyword analyses: Time-series trends for sentiments, targets, and keywords (China, CCP, Asians; Hong Kong, Taiwan, Xinjiang, Tibet, US, COVID-19) were examined to contextualize events (e.g., Wuhan cases, the “Chinese Virus” tweet, the US 2020 election, the WHO origins report, the Atlanta spa shootings).
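The user labeling and segregation measure described in the network analysis can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the standard E-I index definition, (external - internal) / (external + internal), and simplifies the bipartite conversation network to plain user-to-user ties. The function names, user IDs, and toy data are all hypothetical.

```python
def label_user(positive_rate, pro_cutoff=0.6, anti_cutoff=0.4):
    """Assign a camp from a user's share of positive tweets; None if in between."""
    if positive_rate > pro_cutoff:
        return "pro"
    if positive_rate < anti_cutoff:
        return "anti"
    return None


def ei_index(ties, camp_of):
    """E-I index = (external - internal) / (external + internal).

    `ties` is an iterable of (user, user) pairs; ties that involve an
    unlabeled user are ignored.
    """
    external = internal = 0
    for u, v in ties:
        cu, cv = camp_of.get(u), camp_of.get(v)
        if cu is None or cv is None:
            continue  # skip ties involving users outside both camps
        if cu == cv:
            internal += 1
        else:
            external += 1
    total = external + internal
    if total == 0:
        raise ValueError("no ties between labeled users")
    return (external - internal) / total


# Toy data (hypothetical, for illustration only).
positive_rates = {"a": 0.9, "b": 0.8, "c": 0.1, "d": 0.2, "e": 0.5}
camp_of = {user: label_user(rate) for user, rate in positive_rates.items()}
ties = [("a", "b"), ("b", "a"), ("c", "d"), ("a", "c"), ("e", "a")]

toy_ei = ei_index(ties, camp_of)
print(toy_ei)  # three within-camp ties, one cross-camp tie
```

The index ranges from -1 (all ties within camps) to +1 (all ties across camps), so a negative value indicates that within-group ties outnumber between-group ties.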
Key Findings
- Content concentration: 1% of users generated 62% of Chinese-language tweets; 10% produced 90% of content.
- Geolocation: Of 1.32M users, 0.58M (43.83%) reported a location; among those, 0.33M (58%) had identifiable countries/regions. Among identified users: Mainland China 31.62%, US 18.09%, Taiwan 8.95%, Hong Kong 8.59%.
- Tweet type: 68.4% personal opinions; 27.6% news content; 0.71% government/institution announcements; 2.16% ads/spam.
- Sentiment distribution (N≈25.3M): Negative 62.2% (15.74M), Neutral 21.9% (5.54M), Positive 15.9% (4.02M). Negativity surged following key events (e.g., Trump’s “Chinese Virus” tweet on Mar 16, 2020) and remained dominant throughout Dec 2019–Apr 2021.
- Targets overall: Chinese government 60.0% (15.19M), Chinese people 11.0% (2.79M), China in general 25.0% (6.32M). Keyword mentions favored China/CCP over Asians; mentions of Asians increased after the Mar 2021 Atlanta spa shootings.
- Targets by sentiment: Among negative tweets, 80% targeted the Chinese government, 11% the Chinese people, and 19% China in general. Among positive tweets, 20% targeted the government, 34% the people, and 46% China in general.
- Engagement and segregation: The authors identified 459,821 anti-China and 496,504 pro-China users. Of 19.82M unique conversations, 96.4% received no replies (anti: 17.32M; pro: 1.78M). Cross-camp conversations accounted for 0.83M threads (4.7%). E-I index: conversation network −0.33; retweet network −0.906, indicating strong within-group clustering and limited cross-boundary engagement.
- Topics (RoBERTa): 73% politics; 31% democracy; 27% US; 22% US politics; 20% COVID-19; 14% US–China relations; 9% Hong Kong politics; 6% Taiwan politics; 6% culture; 5% economy; 2% religion. STM (K=30) highlighted themes such as freedom/democracy (~8%), US election (~6.9%), COVID-19 (~4.9%), Hong Kong–NSL (~4.8%), Wuhan outbreak (~4.8%), human rights/Xinjiang (~3.7%), and pro-CCP/“50-cent party” (~5.4%).
- Topic–sentiment linkage (logit coefficients): Tweets on COVID-19 (−0.558), politics (−0.877), religion (−0.906), and US–China relations (−0.093) were less likely to be positive; tweets on culture (0.225) and economy (0.112) were more likely to be positive (all p<0.001).
- Pro vs anti topic focus: Both camps engaged heavily with politics, the US, and COVID-19. Anti-China users focused more on democracy/freedom and Hong Kong politics and were more active overall; pro-China users devoted a relatively larger share of their tweets to economy, culture, COVID-19, and US issues. Average per-user tweet counts illustrate the higher activity among anti-China users (e.g., politics: anti 35.07 vs pro 2.88).
Overall, Sinophobia and anti-China sentiment were prevalent, largely targeting the government/CCP rather than Chinese people.
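The logit coefficients listed above are on the log-odds scale, which is hard to read directly. One common way to interpret them is to exponentiate each coefficient into an odds ratio. The sketch below does only that arithmetic on the coefficients quoted in this summary; it does not refit the authors' model, and the variable names are illustrative.

```python
import math

# Coefficients as quoted in the Key Findings (log-odds of a tweet being
# positive toward China, from a model with monthly fixed effects).
coefficients = {
    "COVID-19": -0.558,
    "politics": -0.877,
    "religion": -0.906,
    "US-China relations": -0.093,
    "culture": 0.225,
    "economy": 0.112,
}

# exp(beta) is the multiplicative change in the odds of a positive tweet
# when the topic is present, holding the other covariates fixed.
odds_ratios = {topic: math.exp(beta) for topic, beta in coefficients.items()}

for topic, ratio in sorted(odds_ratios.items(), key=lambda item: item[1]):
    direction = "lower" if ratio < 1 else "higher"
    print(f"{topic:20s} odds ratio {ratio:.3f} ({direction} odds of positive)")
```

Read this way, a COVID-19 tweet has roughly 0.57 times the odds of being positive compared with an otherwise similar tweet, while a culture tweet has roughly 1.25 times the odds.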
Discussion
Findings address the research questions by showing that a small, highly active subset of users, mainly reporting locations in Mainland China, the US, Hong Kong, and Taiwan, drove Chinese-language discourse about China on Twitter. Sentiment analysis indicates sustained negativity toward China across the early pandemic, with negative content primarily aimed at the Chinese government/CCP rather than at Chinese people, clarifying the targets of online Sinophobia. Network analyses reveal polarized, segregated communities with minimal cross-group engagement, suggesting echo chambers where pro- and anti-China users primarily interact within camp. Topic analyses show politics (including democracy/freedom, Hong Kong, Taiwan) and COVID-19 dominated attention; topic–sentiment modeling demonstrates that political and pandemic-related discussions drove negativity, while cultural and economic content correlated with more positive sentiment. These results nuance understandings of anti-China discourse by distinguishing targets and showing how agenda differences between pro- and anti-China communities map onto topic emphases, with implications for platform governance and the protection of minority communities exposed to propaganda and hate during crises.
Conclusion
The study contributes the first large-scale analysis of Chinese-language Twitter discourse on China during the early COVID-19 period, combining supervised deep learning and unsupervised topic modeling across 25M tweets. It documents that discourse was dominated by a small share of highly active accounts, was predominantly negative toward China, targeted government/CCP more than people, and was organized into segregated pro- and anti-China communities with limited cross-engagement. Content centered on politics, COVID-19, and US-related issues; cultural and economic topics were associated with more positive sentiments. The work highlights Twitter as a contested arena for competing propaganda efforts in Chinese-language spaces and underscores the need for platform interventions attentive to minority-language communities. Future research directions include improving classifier performance (e.g., via semi-supervised learning and expanded labeled data), validating user location beyond self-reports, expanding coverage beyond keyword-based sampling and beyond the early pandemic period, and examining cross-lingual dynamics and off-platform linkages to better understand the broader information ecosystem.
Limitations
- Classifier performance: Some topic classifiers had relatively low F1 scores (the Methodology reports values as low as 0.16), potentially affecting topic prevalence estimates.
- Location data: User locations are self-reported and may be missing, outdated, or intentionally misreported; true geolocations were not available.
- Sampling frame: The data consist of keyword-matched Chinese-language tweets about China/CCP/Asians during Dec 2019–Apr 2021; results do not generalize to the entire Twitterverse or to other time periods or topics.
- Representativeness: Chinese-language Twitter users are a selective population (e.g., users circumventing censorship, activists, nationalists) and are not representative of broader Chinese or diaspora communities.
- Engagement measures: The conversation network focuses on replies among identified users and may underestimate other forms of interaction; the high proportion of tweets without replies limits inference about deliberative engagement.
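A low F1 score can coexist with high accuracy when a topic is rare, which is one way to read pairs such as the 0.16/0.98 (F1/accuracy) reported for Democracy in the Methodology. The sketch below computes both metrics from a hypothetical confusion matrix; the counts are invented for illustration and are not the authors' evaluation data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (precision, recall, f1, accuracy) for one binary topic label."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy


# Hypothetical rare topic: 20 of 1,000 test tweets are on-topic, and the
# classifier finds only 2 of them while raising 3 false alarms.
precision, recall, f1, accuracy = classification_metrics(tp=2, fp=3, fn=18, tn=977)

print(f"precision={precision:.2f} recall={recall:.2f} "
      f"f1={f1:.2f} accuracy={accuracy:.3f}")
```

Because the 977 true negatives dominate the accuracy calculation, the classifier looks accurate even though it misses most on-topic tweets, which is why F1 is the more informative figure when judging topic prevalence estimates.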