Online public opinion during the first epidemic wave of COVID-19 in China based on Weibo data

W. Shi, F. Zeng, et al.

This paper by Wen-zhong Shi, Fanxin Zeng, Anshu Zhang, Chengzhuo Tong, Xiaoqi Shen, Zhewei Liu, and Zhicheng Shi examines the emotional responses of the Chinese public during the first COVID-19 wave by analyzing more than 45 million Weibo posts. It traces how emotions shifted around significant events and how sentiment differed between central cities and their neighboring cities.

Introduction
Since December 2019, the COVID-19 pandemic prompted social distancing and lockdowns, shifting interpersonal communication and information sharing heavily onto social media platforms such as Weibo. Social media thus became both a conduit for following epidemic developments and a venue for expressing opinions and seeking help. Public opinion, understood as the collective sentiment or emotion on public issues, can be observed through these online traces. Prior to this work, most analyses emphasized topic trends and coarse sentiment (positive/negative/neutral), with limited attention to fine-grained emotion categories. This study asks how online public opinion in China evolved during the first COVID-19 wave when viewed through fine-grained emotions, and how those emotions varied over time and across space, particularly around key events (e.g., announcements, lockdowns, festivals). Its purpose is to provide a detailed picture of emotional dynamics that can support emergency response and public opinion management. Large-scale social media data can reveal shifts in population sentiment that guide timely, targeted interventions during major public health crises.
Literature Review
Public opinion formation has historically been linked to mass media agenda setting and, more recently, to participatory social media environments. Researchers have used social media to infer public opinion, identify topic trends, and analyze sentiment. In the COVID-19 context, studies have examined topic evolution, spatiotemporal distributions of discussions, misinformation diffusion, and psychological impacts using platforms like Weibo and Twitter. Methods include LDA topic modeling, random forests, semantic analyses, and deep learning (e.g., BERT, BiGRU). However, prior work largely focused on topic discovery and coarse sentiment polarity, often missing fine-grained emotions such as happiness, fear, sadness, and disgust. This study addresses that gap by employing a lexicon-based emotional ontology tailored to Chinese social media to extract seven distinct emotions and link them to topics, enabling nuanced temporal-spatial analyses during the epidemic.
Methodology
Data: The study analyzes more than 45 million Weibo posts from December 1, 2019 to April 30, 2020, including national and city-level subsets (with special focus on Wuhan and major central cities such as Guangzhou, Shenzhen, Shanghai, Beijing, and Chengdu, along with their neighboring cities). A subset of posts explicitly mentioning 'lockdown' (封城) around the Wuhan city closure (January 23, 2020) and a user-level subset (1149 users with posts before and after lockdown within January 20–26) were examined for emotional shifts.
Emotion extraction: A lexicon-based approach using the emotional ontology from the Information Retrieval Laboratory of Dalian University of Technology was adopted. The lexicon defines seven categories: like (喜好), happiness (高兴), sadness (悲伤), anger (愤怒), fear (恐惧), disgust (厌恶), surprise (惊讶), with subcategories and example terms (e.g., happiness: joyful, hilarity; fear: timid, afraid). For each cleaned Weibo text, words are matched to the lexicon and counts per emotion category are computed, forming an emotion count vector. Example: if two words map to happiness, then the happiness count is 2. Vectors are normalized to give proportional emotion vectors per post (each post weighted equally). City-day averages yield daily emotion distributions. To mitigate volume effects when comparing central and neighboring cities, neighboring cities' posts were merged before computing emotion distributions.
Topic modeling: LDA (bag-of-words assumption) was used to discover latent topics, treating text generation and topic extraction as inverse processes. Model selection considered Perplexity (lower is better) and topic Semantic Consistency/Coherence (C_UMass; higher indicates better word co-occurrence semantics). Strategy: minimize perplexity while selecting near peaks of coherence to choose topic numbers.
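The selection strategy for the number of topics can be illustrated with a minimal sketch. The summary does not give the exact rule, so this assumes one plausible reading: keep the topic counts where coherence is a local maximum, then order those candidates by perplexity. The function names and all scores below are hypothetical and are not values from the study.

```python
from typing import Dict, List


def coherence_peaks(coherence: Dict[int, float]) -> List[int]:
    """Return topic counts whose coherence exceeds both neighbouring counts."""
    ks = sorted(coherence)
    peaks = []
    for i, k in enumerate(ks):
        left = coherence[ks[i - 1]] if i > 0 else float("-inf")
        right = coherence[ks[i + 1]] if i < len(ks) - 1 else float("-inf")
        if coherence[k] > left and coherence[k] > right:
            peaks.append(k)
    return peaks


def rank_candidates(coherence: Dict[int, float],
                    perplexity: Dict[int, float]) -> List[int]:
    """Order the coherence peaks by ascending perplexity (lower is better)."""
    return sorted(coherence_peaks(coherence), key=lambda k: perplexity[k])


# Made-up scores for illustration only; NOT results from the paper.
coherence = {4: -2.9, 5: -2.6, 6: -2.1, 7: -2.5, 8: -2.4, 9: -2.0, 10: -2.7}
perplexity = {4: 910.0, 5: 880.0, 6: 850.0, 7: 845.0, 8: 840.0, 9: 820.0, 10: 815.0}

print(coherence_peaks(coherence))              # [6, 9]
print(rank_candidates(coherence, perplexity))  # [9, 6]
```

A ranking like this only narrows the candidates; the final topic count still depends on whether the resulting topics are interpretable.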
On Wuhan posts for January 23, 2020, coherence peaked at certain topic counts (6, 9, 13, 15); six-topic results were presented with topic names assigned from top keywords.
Evaluation metrics:
- Perplexity on test corpus: $$\mathrm{perplexity}(D_{\mathrm{test}}) = \exp\left\{ -\frac{\sum_{d} \sum_{i} \log p(w_{di})}{\sum_{d} N_{d}} \right\}$$
- UMass coherence: $$C_{\mathrm{UMass}} = \frac{2}{N(N-1)} \sum_{i} \sum_{j>i} \log P(w_i, w_j)$$ where $P(w_i, w_j)$ is the joint occurrence probability, reflecting contextual relatedness.
Analytical procedures:
- Temporal emotion analysis at national and Wuhan levels, highlighting key dates: Christmas (Dec 22–25), New Year's Day (Jan 1), announcement of human-to-human transmission (Jan 20), Wuhan lockdown (Jan 23), Chinese New Year (Jan 25), Lantern Festival (Feb 8), and Qingming Festival/National mourning (Apr 4).
- Spatial comparison of the emotion fear between central cities and aggregated neighboring cities, tracking peak timing and magnitude, and post-peak persistence.
- Lockdown-focused analyses: (i) hourly emotion distribution within two hours post-announcement on Jan 23; (ii) user-level dominant emotion transitions before vs. after lockdown (Jan 20–26) using the average emotion distribution per user across posts and comparing dominant categories.
- Topic-emotion integration: For Jan 23 Wuhan posts, LDA topic classification with six categories and keyword-based naming; subsequent aggregation of emotion distributions per topic category to interpret sentiment associated with each topic.
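The emotion-extraction steps in this section (lexicon matching, per-post normalization, and city-day averaging) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `TOY_LEXICON` is a tiny English stand-in for the Dalian University of Technology ontology, posts are assumed to be already cleaned and tokenized, and skipping posts with no lexicon match is an assumption the summary does not confirm.

```python
from collections import Counter, defaultdict

EMOTIONS = ["like", "happiness", "sadness", "anger", "fear", "disgust", "surprise"]

# Toy English lexicon for illustration; the study used a Chinese ontology.
TOY_LEXICON = {
    "joyful": "happiness", "hilarity": "happiness",
    "afraid": "fear", "timid": "fear",
    "grief": "sadness",
}


def emotion_vector(tokens, lexicon=TOY_LEXICON):
    """Count lexicon matches per emotion and normalize to proportions."""
    counts = Counter(lexicon[t] for t in tokens if t in lexicon)
    total = sum(counts.values())
    if total == 0:
        return None  # assumption: posts without any emotion word are skipped
    return [counts[e] / total for e in EMOTIONS]


def daily_distribution(posts):
    """posts: iterable of (city, day, tokens) -> {(city, day): mean vector}.

    Each post contributes one normalized vector, so posts are weighted equally.
    """
    sums = defaultdict(lambda: [0.0] * len(EMOTIONS))
    n_posts = Counter()
    for city, day, tokens in posts:
        vec = emotion_vector(tokens)
        if vec is None:
            continue
        key = (city, day)
        sums[key] = [s + v for s, v in zip(sums[key], vec)]
        n_posts[key] += 1
    return {key: [s / n_posts[key] for s in total] for key, total in sums.items()}


# Hypothetical posts, invented for the example.
posts = [
    ("Wuhan", "2020-01-23", ["afraid", "afraid", "joyful", "today"]),
    ("Wuhan", "2020-01-23", ["joyful", "hilarity"]),
    ("Wuhan", "2020-01-23", ["subway", "closed"]),
]
for key, vec in daily_distribution(posts).items():
    print(key, dict(zip(EMOTIONS, (round(v, 3) for v in vec))))
```

Normalizing each post before averaging keeps a single long, emotion-heavy post from dominating a city-day distribution, which matches the equal-weighting note above.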
Key Findings
Temporal emotion dynamics:
- National and Wuhan emotion distributions were broadly stable day-to-day but showed sharp fluctuations around key dates. Positive emotions (like, happiness) rose during holidays (Dec 22–25 for Christmas; Jan 1 New Year's Day; Jan 25 Chinese New Year; Feb 8 Lantern Festival). On Apr 4 (Qingming Festival with national mourning), sadness surged nationwide and in Wuhan.
- Following Jan 20, 2020 (announcement of human-to-human transmission), fear spiked rapidly in Wuhan, peaking on Jan 21 and then gradually declining; nationally, the fear rise was smaller and lagged by about two days. Post-peak, positive emotions, especially happiness, increased, with a notable happiness peak on Jan 25 (Chinese New Year).
Spatial patterns of fear:
- In all examined cities, fear surged between Jan 20–27, with peak levels reaching 4–5 times typical baselines; Wuhan exceeded 6 times its usual level. Peaks aligned with Jan 20 (announcement) and Jan 24 (post-lockdown), with neighboring cities peaking later than central cities. Thereafter, fear declined but remained 1–2 times above pre-epidemic levels even after Wuhan's reopening on Apr 8. Central cities consistently showed higher fear proportions than neighboring cities during Jan 20–Mar 1, indicating stronger reactions in urban hubs.
Lockdown-related sentiment:
- Within two hours after Wuhan's lockdown announcement (Jan 23, 3:00 a.m.), public sentiment toward the policy was relatively negative, with fear and sadness increasing notably.
- User-level analysis (n=1149 users with posts before and after lockdown, Jan 20–26) showed many users' dominant emotion category shifted; overall, the number of users with positive dominant emotions (like, happiness) increased, while those with negative emotions decreased. The number of users with fear as dominant emotion declined. This improvement may reflect confidence in policy and the Chinese New Year festival atmosphere.
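The user-level comparison of dominant emotions before and after the lockdown can be sketched as below. This is a hedged illustration: it averages each user's per-post emotion vectors in each period, takes the largest category as dominant, and counts (before, after) pairs. Breaking ties by the first category in `EMOTIONS` is an assumption, and the user IDs and vectors are invented.

```python
from collections import Counter

EMOTIONS = ["like", "happiness", "sadness", "anger", "fear", "disgust", "surprise"]


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of emotion vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def dominant_emotion(vectors):
    """Emotion with the largest average proportion (ties: first in EMOTIONS)."""
    avg = mean_vector(vectors)
    return EMOTIONS[avg.index(max(avg))]


def transitions(before, after):
    """before/after: {user_id: [vectors]}. Counts (before, after) dominant pairs.

    Only users with posts in both periods are included.
    """
    pairs = Counter()
    for user in before.keys() & after.keys():
        pairs[(dominant_emotion(before[user]), dominant_emotion(after[user]))] += 1
    return pairs


# Hypothetical users, invented for the example.
before = {
    "u1": [[0, 0, 0, 0, 1.0, 0, 0]],
    "u2": [[0, 0, 1.0, 0, 0, 0, 0]],
    "u3": [[0, 1.0, 0, 0, 0, 0, 0]],  # no posts afterwards, so excluded
}
after = {
    "u1": [[0, 1.0, 0, 0, 0, 0, 0], [0.5, 0.5, 0, 0, 0, 0, 0]],
    "u2": [[0, 0, 1.0, 0, 0, 0, 0]],
}
for (src, dst), count in sorted(transitions(before, after).items()):
    print(f"{src} -> {dst}: {count}")
```

Summing the pairs by source or destination category gives the before and after counts of users per dominant emotion, which is the comparison reported above.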
Topic modeling and topic-emotion linkage (Jan 23, Wuhan):
- LDA coherence indicated suitable topic counts at 6, 9, 13, 15; six topics were presented: (0) Traffic measures, (1) Chinese New Year, (2) Epidemic, (3) Life, (4) Materials, (5) Social relationship.
- Keyword examples included lockdown, subway, bus (Traffic); hope, go home (Chinese New Year); COVID, confirmed, hospital (Epidemic); life, peace, hold on (Life); masks, supermarket, supplies, cannot buy (Materials); city, friends, government, care (Social relationship).
- Emotion distributions by topic showed: categories 0, 3, 5 had higher proportions of like/happiness due to supportive/encouraging language (e.g., "God bless", "cheer up"). Category 2 (Epidemic) had the highest fear. Categories 1 (Chinese New Year) and 4 (Materials) had slightly elevated fear; Materials also showed higher disgust, reflecting anxiety and dissatisfaction about supply shortages. Social relationship (5) showed elevated disgust reflecting mixed views on government actions (both supportive and critical posts).
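The topic-emotion linkage amounts to grouping per-post emotion vectors by assigned topic and averaging within each group. The sketch below assumes each post already carries one topic ID (for example, its most probable LDA topic, which the summary does not specify) and one normalized emotion vector. The topic names come from the list above; the vectors are invented and do not reproduce the study's distributions.

```python
from collections import defaultdict

EMOTIONS = ["like", "happiness", "sadness", "anger", "fear", "disgust", "surprise"]
TOPICS = {0: "Traffic measures", 1: "Chinese New Year", 2: "Epidemic",
          3: "Life", 4: "Materials", 5: "Social relationship"}


def emotion_by_topic(assigned):
    """assigned: iterable of (topic_id, emotion_vector) -> {topic_id: mean vector}."""
    grouped = defaultdict(list)
    for topic_id, vec in assigned:
        grouped[topic_id].append(vec)
    return {t: [sum(col) / len(vecs) for col in zip(*vecs)]
            for t, vecs in grouped.items()}


def top_topic_for(emotion, profile):
    """Topic ID whose mean proportion of the given emotion is largest."""
    idx = EMOTIONS.index(emotion)
    return max(profile, key=lambda t: profile[t][idx])


# Hypothetical (topic, vector) pairs, invented for the example.
assigned = [
    (2, [0, 0, 0, 0, 1.0, 0, 0]),
    (2, [0, 0, 0.5, 0, 0.5, 0, 0]),
    (4, [0, 0, 0, 0, 0.5, 0.5, 0]),
    (0, [0.5, 0.5, 0, 0, 0, 0, 0]),
]
profile = emotion_by_topic(assigned)
print(TOPICS[top_topic_for("fear", profile)])     # Epidemic
print(TOPICS[top_topic_for("disgust", profile)])  # Materials
```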
Discussion
Monitoring fine-grained emotions on social media during a public health emergency enables timely insights into public concerns and needs. The observed spikes in fear following critical announcements and lockdowns, coupled with holiday-related positivity and mourning-related sadness, illustrate how collective emotions respond to events. Spatially, central cities exhibited stronger and earlier fear reactions than neighboring cities, suggesting that public opinion management should prioritize urban hubs within city clusters. Lockdown-induced negativity appeared short-lived; over ensuing days users shifted toward more positive dominant emotions, hinting at adaptation, policy confidence, and the effect of festive periods. Topic-emotion integration indicates that ensuring material supplies and communicating traffic/containment measures clearly can help mitigate fear and disgust, while enhancing policy publicity and efficient information sharing may bolster trust and positive sentiment. These findings address the study's research questions by showing when, where, and around which topics specific emotions intensified, offering actionable guidance for authorities to tailor responses and communication strategies during crises.
Conclusion
This study introduces a fine-grained emotion extraction framework, based on a Chinese emotional ontology, applied to over 45 million Weibo posts from the first COVID-19 wave. Combined with LDA topic modeling, it reveals temporal spikes of fear around key announcements and lockdowns, elevated happiness during festivals, a sadness surge during national mourning, and persistent but diminishing fear post-peak. Central cities reacted more strongly than neighboring cities, and users' dominant emotions around the lockdown shifted from short-term negativity toward positive categories over subsequent days. Topic-level analyses highlighted concerns with movement restrictions, epidemic developments, material supplies, daily life, and social relationships/governance. The approach provides theoretical and technical support for public opinion monitoring and can be extended to other regions and events. Future research could enhance emotion detection with richer, context-aware lexicons and more granular short-text topic methods to further improve precision and interpretability.
Limitations
The lexicon-based emotion extraction emphasizes adjective-based cues and may miss emotions expressed through other linguistic forms, idioms, sarcasm, or context-dependent expressions common in social media. A more comprehensive, domain- and platform-specific vocabulary or hybrid models (lexicon plus machine learning) could improve coverage and accuracy. Topic modeling granularity for short texts remains limited; more fine-grained, context-aware topic methods could better capture nuanced discussions. Additionally, while normalization mitigates volume effects, residual biases due to user demographics, bot activity, or regional posting behaviors may affect generalizability.