Introduction
The COVID-19 pandemic triggered a global surge in Sinophobia and anti-Chinese sentiment. Existing research has explored this phenomenon through analyses of English-language social media and global news outlets, revealing a significant increase in anti-China attitudes and racial slurs. However, limited research exists on how Chinese-language users on Western social media platforms, such as Twitter, responded to the pandemic and expressed their sentiments toward China. This study addresses this gap by examining the sentiments and topics discussed by Chinese-language Twitter users during the early stages of the pandemic, focusing on the interplay between pro- and anti-China sentiments and the targets of these sentiments. The researchers aim to understand the sources of these tweets, the prevalent sentiments, the target entities of these sentiments, and the dynamics of online conversations between pro- and anti-China users. The study also investigates the content of tweets, the relationship between topics and sentiments, and potential variations in topics among different user groups. The importance of this study lies in its potential to shed light on the complexities of online discourse surrounding China during a global crisis, contributing valuable insights to the fields of China studies and social media analysis.
Literature Review
Prior studies have documented the rise in anti-Chinese sentiment during the COVID-19 pandemic, particularly in English-language social media and news. Research analyzing English-language tweets revealed a sharp increase in anti-China attitudes in the US. Similarly, analyses of global web news demonstrated a significant rise in racial slurs targeting China. In contrast, studies focusing on domestic Chinese social media platforms, like Sina Weibo, showed predominantly supportive sentiments toward the Chinese government's COVID-19 response. This contrast highlights the need to explore how Chinese-language users on Western social media platforms, like Twitter, navigated these competing sentiments during the pandemic, particularly given that Twitter is blocked in Mainland China and its Chinese-language user base is therefore distinct from that of domestic platforms.
Methodology
The researchers compiled a unique database, CNTweets, comprising over 25 million Chinese-language tweets that mention keywords related to China, the CCP, Chinese people, and Asians, posted between December 2019 and April 2021. Geographic location data was extracted from user profiles, though the researchers acknowledged the limitations of self-reported locations. To analyze sentiments and topics, they employed RoBERTa (Robustly Optimized BERT Pretraining Approach) and structural topic modeling (STM). A training dataset of 10,000 manually annotated tweets was used to fine-tune the RoBERTa models for sentiment and target entity classification (Chinese people, Chinese government, or China in general). The models were evaluated using F1 scores and accuracy. Social network analysis (SNA), using Twitter's conversation_id, was used to study interactions between pro- and anti-China users. STM, which uses document-level metadata to estimate topic prevalence, served as a robustness check for topic classification. Logistic regression models with monthly fixed effects were used to examine the relationship between topics and overall sentiment.
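As an illustration of the evaluation step, the following is a minimal sketch of how accuracy and per-class F1 scores can be computed from gold and predicted labels. It is not the study's code: the function names, the label set ("neg", "neu", "pos"), and the six example tweets are all hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Share of predictions that match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_per_class(y_true, y_pred):
    """Return {label: F1}, where F1 = 2*TP / (2*TP + FP + FN) for each label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label p, but it was wrong
            fn[t] += 1  # gold label t was missed
    scores = {}
    for label in set(y_true) | set(y_pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


# Hypothetical gold and predicted sentiment labels for six tweets.
gold = ["neg", "neg", "pos", "neu", "neg", "pos"]
pred = ["neg", "pos", "pos", "neu", "neg", "neg"]

print(accuracy(gold, pred))
print(f1_per_class(gold, pred))
```

Reporting F1 per class alongside accuracy matters when one class dominates, as a classifier can reach high accuracy while performing poorly on rarer categories.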
Key Findings
The analysis revealed that a small percentage of Twitter users (1%) generated a large proportion (61.8%) of the tweets in the dataset, with 10% of users contributing 90% of the tweets. The majority of Chinese-language Twitter users identified themselves as being located in Mainland China, the US, Taiwan, or Hong Kong. Sentiment analysis indicated that 62.2% of tweets exhibited negative sentiment toward China. Importantly, these negative sentiments were predominantly directed at the Chinese government and CCP, rather than the Chinese people. The most frequent topics were politics (including Hong Kong protests and Taiwan issues), COVID-19, and US-China relations. Pro-China users focused more on cultural and economic topics, while anti-China users emphasized political issues such as democracy and freedom. SNA revealed significant segregation between pro- and anti-China users, with limited in-depth cross-group engagement. The E-I index was -0.33 for the conversation network and -0.906 for the retweet network. Both values are negative, meaning that within-group ties outnumbered cross-group ties, and the retweet network in particular showed strong homophily. Keyword analysis demonstrated a consistent pattern of mentioning China and the CCP more frequently than Asians or Chinese people. Time series analysis showed predominantly negative sentiment throughout the study period, with spikes coinciding with significant events like Trump's "Chinese Virus" tweet and the WHO report on COVID-19 origins. Structural topic modeling identified key themes such as democracy-freedom, US elections, global issues, and COVID-19.
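To make the E-I index values easier to interpret, here is a minimal sketch of the standard Krackhardt-Stern formula, (E - I) / (E + I), where E counts ties between groups and I counts ties within groups. The index ranges from -1 (all ties within groups) to +1 (all ties between groups). The toy network, user names, and group labels below are hypothetical and are not drawn from the study's data.

```python
def ei_index(edges, group):
    """Krackhardt-Stern E-I index: (E - I) / (E + I).

    edges: iterable of (user_a, user_b) ties
    group: dict mapping each user to a group label
    """
    edges = list(edges)
    external = sum(1 for u, v in edges if group[u] != group[v])
    internal = len(edges) - external
    total = external + internal
    if total == 0:
        raise ValueError("network has no edges")
    return (external - internal) / total


# Hypothetical toy network: two pro-China and three anti-China users.
group = {"a": "pro", "b": "pro", "c": "anti", "d": "anti", "e": "anti"}
edges = [("a", "b"), ("c", "d"), ("d", "e"), ("c", "e"), ("b", "c")]

# One cross-group tie and four within-group ties give (1 - 4) / 5.
print(ei_index(edges, group))
```

On this reading, a value such as -0.906 means that nearly all ties connect users in the same camp, while -0.33 indicates a weaker but still within-group tilt.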
Discussion
The findings highlight the prevalence of Sinophobia within Chinese-language Twitter communities during the early pandemic, emphasizing that it was primarily directed at the Chinese government and the CCP. This suggests that the online discourse was significantly shaped by political considerations and geopolitical tensions rather than generalized anti-Chinese sentiment. The high level of segregation between pro- and anti-China users suggests the existence of echo chambers and limited opportunities for constructive dialogue. The observed difference in topic focus between pro- and anti-China users reflects differing agendas and perspectives on China's role in global affairs. The study's findings contribute to the understanding of how online platforms can amplify existing political divisions and fuel nationalistic sentiments during times of crisis. The results also underscore the limitations of self-reported geographic data and the potential influence of computational propaganda on social media.
Conclusion
This study provides novel insights into the dynamics of Sinophobia within Chinese-language Twitter communities during the early COVID-19 pandemic. It reveals the prevalence of negative sentiment, primarily directed toward the Chinese government, and the high level of segregation between pro- and anti-China user groups. Future research could explore the impact of these online sentiments on real-world outcomes and the role of algorithmic amplification in shaping online discourse. Further investigation into the use of semi-supervised machine learning to improve the accuracy of sentiment and topic classification is also warranted.
Limitations
Several limitations should be considered. First, the reliance on self-reported location data may lead to underestimation or misrepresentation of user locations. Second, some classifiers, particularly those for culture, religion, and economy, had relatively low F1 scores. Third, the study only analyzes tweets mentioning specific keywords, potentially missing relevant conversations. Finally, the data covers only the early phase of the pandemic, limiting the scope of analysis. Future research should address these limitations using alternative methods and more comprehensive data.